diff --git a/Algorithm/AI/Deep_Learning_Haar_Cascade_Explained.md b/Algorithm/AI/Deep_Learning_Haar_Cascade_Explained.md
index 8947ace..83841a3 100755
--- a/Algorithm/AI/Deep_Learning_Haar_Cascade_Explained.md
+++ b/Algorithm/AI/Deep_Learning_Haar_Cascade_Explained.md
@@ -122,13 +122,22 @@ The best-known application of Haar cascade classifiers is detecting faces or bodies in images, but
 * To the right of the stage we see how well it performed in identifying the face.
 * Notice as it gets closer and closer to identifying the face, the number of stages increases into the 20s (around the 1-minute mark). This demonstrates the cascading effect where the early stages are discarding the input as it has identified them as irrelevant. As it gets closer to finding a face it pays closer attention.”
+The video in this post helps illustrate how the algorithm works. We can observe the following:
+
+* Notice how the algorithm methodically slides the detection window across the image, applying Haar features as it tries to find a face. This process is shown by the green rectangle.
+* Notice, within the red rectangular boundary, how quickly the classifier discards windows that do not match.
+* In the remaining stages we can see how well the classifier detects the face.
+* Notice that the closer the detector gets to the face, the longer each window takes. The classifier works more carefully as it approaches the target; in the early stages a coarse rejection is enough.
+
 ![Integral Images](./img/Deep_Learning_Haar_Cascade_Explained/005.gif)

 [Click to view the original video](https://www.youtube.com/watch?v=hPCTwxF0qf4&feature=youtu.be)

-“Let me know if you have any questions or have any comments below.”
+“Let me know if you have any questions or have any comments below.

-“I want to make sure I got this post right. It will be critical that you understand this before we go into the next section where we will implement a full Custom Object Haar Cascade detector.”
+I want to make sure I got this post right. It will be critical that you understand this before we go into the next section where we will implement a full Custom Object Haar Cascade detector.”
+
+If you have any questions about this post, please let me know; I want to make sure it is correct. Understanding this material is essential before moving on to the next stage of learning: building a custom Haar cascade detector.

 ## Next Steps

@@ -148,19 +157,19 @@ The best-known application of Haar cascade classifiers is detecting faces or bodies in images, but

 [1] Haar

-“A Haar-like feature considers adjacent rectangular regions at a specific location in a detection window, sums up the pixel intensities in each region and calculates the difference between these sums.“
+A Haar-like feature considers adjacent rectangular regions at a specific location in a detection window, sums up the pixel intensities in each region and calculates the difference between these sums.

-“This difference is then used to categorize subsections of an image. For example, let us say we have an image database with human faces. It is a common observation that among all faces the region of the eyes is darker than the region of the cheeks. Therefore a common Haar feature for face detection is a set of two adjacent rectangles that lie above the eye and the cheek region. The position of these rectangles is defined relative to a detection window that acts like a bounding box to the target object (the face in this case).“
+This difference is then used to categorize subsections of an image. For example, let us say we have an image database with human faces. It is a common observation that among all faces the region of the eyes is darker than the region of the cheeks. Therefore a common Haar feature for face detection is a set of two adjacent rectangles that lie above the eye and the cheek region. The position of these rectangles is defined relative to a detection window that acts like a bounding box to the target object (the face in this case).

 [2] Integral Images

-“An integral image is summed-area table is a data structure and algorithm for quickly and efficiently generating the sum of values in a rectangular subset of a grid. To understand this look at image 1 and image 2. Image 1 is the source table, Image 2 is the summation table. Notice in Image 2 row 1, col 2 value 33 is sum of Image 1 row 1 (col 1 + col 2).“
+An integral image, or summed-area table, is a data structure and algorithm for quickly and efficiently generating the sum of values in a rectangular subset of a grid. To understand this, look at image 1 and image 2. Image 1 is the source table, image 2 is the summation table. Notice that in image 2, row 1, col 2, the value 33 is the sum of image 1, row 1 (col 1 + col 2).

 ![Integral Images](./img/Deep_Learning_Haar_Cascade_Explained/006.png)
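+
+To make the summed-area table in footnote [2] concrete, here is a minimal NumPy sketch. The 3×3 values are hypothetical (chosen so that the first two entries of row 1 also happen to sum to 33, mirroring the note above); they are not the actual numbers in the figure, and `rect_sum` is just an illustrative helper name.
+
+```python
+import numpy as np
+
+# Hypothetical source grid (playing the role of "image 1" in the note above).
+image = np.array([
+    [31,  2,  4],
+    [12, 26,  9],
+    [13, 17, 21],
+], dtype=np.int64)
+
+# Integral image / summed-area table: each cell holds the sum of every source
+# value above and to the left of it, inclusive.
+integral = image.cumsum(axis=0).cumsum(axis=1)
+
+def rect_sum(ii, top, left, bottom, right):
+    """Sum of the source values in the inclusive rectangle, via four lookups."""
+    total = ii[bottom, right]
+    if top > 0:
+        total -= ii[top - 1, right]
+    if left > 0:
+        total -= ii[bottom, left - 1]
+    if top > 0 and left > 0:
+        total += ii[top - 1, left - 1]
+    return total
+
+print(integral[0, 1])                  # 33 == 31 + 2: row 1, (col 1 + col 2)
+print(rect_sum(integral, 1, 0, 2, 2))  # 98: the bottom two rows in one O(1) query
+```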
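+
+Footnote [1]'s two-rectangle feature can be expressed the same way: two rectangle sums read from an integral image, followed by a subtraction. This is a rough, self-contained sketch with made-up pixel values and an arbitrary 4×4 window; it is not the actual Viola–Jones feature layout.
+
+```python
+import numpy as np
+
+# A tiny hypothetical "face window": the top half (eyes) is darker,
+# i.e. has lower intensities, than the bottom half (cheeks).
+window = np.array([
+    [ 40,  45,  42,  41],
+    [ 38,  44,  43,  40],
+    [120, 130, 125, 128],
+    [122, 131, 127, 126],
+], dtype=np.int64)
+
+# Summed-area table of the window, so each half can be summed in O(1).
+ii = window.cumsum(axis=0).cumsum(axis=1)
+
+def area(top, left, bottom, right):
+    """Inclusive rectangle sum taken from the integral image."""
+    s = ii[bottom, right]
+    if top > 0:
+        s -= ii[top - 1, right]
+    if left > 0:
+        s -= ii[bottom, left - 1]
+    if top > 0 and left > 0:
+        s += ii[top - 1, left - 1]
+    return s
+
+# Two adjacent rectangles stacked vertically: the upper half sits over the
+# "eyes", the lower half over the "cheeks". The feature is the difference.
+upper = area(0, 0, 1, 3)   # eye region
+lower = area(2, 0, 3, 3)   # cheek region
+haar_feature = lower - upper
+print(haar_feature)        # large positive value -> eyes darker than cheeks
+```
+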
 [3] Adaboost

-“Problems in machine learning often suffer from the curse of dimensionality — each sample may consist of a huge number of potential features (for instance, there can be 162,336 Haar features, as used by the Viola–Jones object detection framework, in a 24×24 pixel image window), and evaluating every feature can reduce not only the speed of classifier training and execution, but in fact reduce predictive power, per the Hughes Effect.[3] Unlike neural networks and SVMs, the AdaBoost training process selects only those features known to improve the predictive power of the model, reducing dimensionality and potentially improving execution time as irrelevant features need not be computed.“
+Problems in machine learning often suffer from the curse of dimensionality — each sample may consist of a huge number of potential features (for instance, there can be 162,336 Haar features, as used by the Viola–Jones object detection framework, in a 24×24 pixel image window), and evaluating every feature can reduce not only the speed of classifier training and execution, but in fact reduce predictive power, per the Hughes Effect. Unlike neural networks and SVMs, the AdaBoost training process selects only those features known to improve the predictive power of the model, reducing dimensionality and potentially improving execution time as irrelevant features need not be computed.

 ## Translator's Notes