# Demo 3: Training a Classifier
OpenFace's core provides a feature extraction method to
obtain a low-dimensional representation of any face.
[demos/classifier.py](https://github.com/cmusatyalab/openface/blob/master/demos/classifier.py)
shows a demo of how these representations can be
used to create a face classifier.

There is a distinction between training the deep neural network (DNN)
model for feature representation
and training a model for classifying people with the DNN model.
This shows how to use a pre-trained DNN model to train and use
a classification model.

## Creating a Classification Model

### 1. Create a Raw Image Directory
Create a directory for your raw images so that images from different
people are in different subdirectories. The names of the labels and
images do not matter, and each person can have a different number of images.
The images should be in `jpg` or `png` format and have
a lowercase extension.

```
$ tree data/mydataset/raw
person-1
├── image-1.jpg
├── image-2.png
...
└── image-p.png

...

person-m
├── image-1.png
├── image-2.jpg
...
└── image-q.png
```
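
Before preprocessing, it can help to check that a raw directory follows this layout. The following is a minimal sketch, not part of OpenFace: `scan_raw_dir` and `ALLOWED_SUFFIXES` are hypothetical names, and the throwaway dataset is created only for illustration. It lists the images found for each person and reports files that would not match the lowercase `jpg`/`png` rule.

```python
import os
import tempfile
from pathlib import Path

# Lowercase extensions only, as the text above requires.
ALLOWED_SUFFIXES = {".jpg", ".png"}


def scan_raw_dir(raw_dir):
    """Return ({person: [image names]}, [skipped relative paths])."""
    raw_dir = Path(raw_dir)
    people, skipped = {}, []
    for person_dir in sorted(p for p in raw_dir.iterdir() if p.is_dir()):
        images = []
        for f in sorted(person_dir.iterdir()):
            # The suffix comparison is case-sensitive, so ".JPG" is rejected.
            if f.is_file() and f.suffix in ALLOWED_SUFFIXES:
                images.append(f.name)
            else:
                skipped.append(os.path.join(person_dir.name, f.name))
        people[person_dir.name] = images
    return people, skipped


# Build a tiny throwaway dataset to show the result.
with tempfile.TemporaryDirectory() as tmp:
    for rel in ["person-1/image-1.jpg", "person-1/image-2.png", "person-2/image-1.JPG"]:
        path = Path(tmp, rel)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()
    people, skipped = scan_raw_dir(tmp)

print(people)   # images accepted for each person
print(skipped)  # files with an unsupported or uppercase extension
```

A person with an empty image list, or any entry in `skipped`, is worth fixing before moving on to alignment.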


### 2. Preprocess the Raw Images
Align the raw images with the command below. Change `8` to however many
separate processes you want to run:
`for N in {1..8}; do ./util/align-dlib.py <path-to-raw-data> align affine <path-to-aligned-data> --size 96 & done`
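
The same fan-out can be driven from Python when you want to wait for every process and inspect exit codes. This is only a sketch: `build_align_commands` and `run_all` are hypothetical helpers, the `data/mydataset/aligned` path is an assumed output location, and the command-line arguments are copied from the shell loop above.

```python
import subprocess


def build_align_commands(raw_dir, aligned_dir, num_procs=8, size=96):
    """Return one identical align command per requested process."""
    cmd = ["./util/align-dlib.py", raw_dir, "align", "affine",
           aligned_dir, "--size", str(size)]
    return [list(cmd) for _ in range(num_procs)]


def run_all(commands):
    """Start every command in the background, then wait for all of them."""
    procs = [subprocess.Popen(c) for c in commands]
    return [p.wait() for p in procs]  # exit codes, in start order


# Assumed paths for illustration; substitute your own directories.
commands = build_align_commands("data/mydataset/raw",
                                "data/mydataset/aligned", num_procs=4)
print(commands[0])
# From the OpenFace checkout, run_all(commands) would launch the processes.
```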

### 3. Create the Classification Model
Use `./demos/classifier.py train <path-to-aligned-data>` to produce
the classification model, which is an SVM saved to disk as
a Python pickle.

Training uses [scikit-learn](http://scikit-learn.org) to perform
a grid search over SVM parameters.
For thousands of images, training the SVMs takes seconds.
Our trained model obtains 87% accuracy on this set of data.
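
To make the training step concrete, here is a minimal sketch of a grid search over SVM parameters with scikit-learn. It is an illustration under stated assumptions, not a copy of `classifier.py`: the synthetic 128-dimensional vectors stand in for real face representations, and the parameter grid, `cv=5`, and the pickled `(label encoder, classifier)` tuple are all choices made for this example.

```python
import pickle

import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.preprocessing import LabelEncoder
from sklearn.svm import SVC

# Stand-in data: 3 hypothetical people, 20 synthetic vectors each.
rng = np.random.RandomState(0)
names = ["person-1", "person-2", "person-3"]
centers = rng.randn(len(names), 128)
X = np.vstack([c + 0.1 * rng.randn(20, 128) for c in centers])
y_names = np.repeat(names, 20)

# Map person names to integer class labels.
le = LabelEncoder().fit(y_names)
y = le.transform(y_names)

# Assumed grid: linear and RBF kernels over a few values of C and gamma.
param_grid = [
    {"C": [1, 10, 100, 1000], "kernel": ["linear"]},
    {"C": [1, 10, 100, 1000], "gamma": [0.001, 0.0001], "kernel": ["rbf"]},
]
search = GridSearchCV(SVC(probability=True), param_grid, cv=5)
search.fit(X, y)

# Round-trip the fitted model through pickle, as a file on disk would.
blob = pickle.dumps((le, search.best_estimator_))
le2, clf = pickle.loads(blob)

predicted = le2.inverse_transform(clf.predict(X[:1]))[0]
probs = clf.predict_proba(X[:1])[0]  # one probability per known person
print(search.best_params_, predicted)
```

Passing `probability=True` is what makes `predict_proba` available, which is one way to obtain a per-person confidence score at inference time.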

## Classifying New Images
We have released a `celeb-classifier.nn4.v1.pkl` classification model
trained on about 6000 images of the following people,
who have the most images in our dataset.
Classifiers can be created with far fewer images per
person.

+ America Ferrera
+ Amy Adams
+ Anne Hathaway
+ Ben Stiller
+ Bradley Cooper
+ David Boreanaz
+ Emily Deschanel
+ Eva Longoria
+ Jon Hamm
+ Steve Carell

As an example, consider the following small set of images
that the model has not seen.
The classifier always makes a prediction, even for an unknown person, but
the confidence score is then usually lower.

Run the classifier on your images with:

```
./demos/classifier.py infer ./models/openface/celeb-classifier.nn4.v1.pkl ./your-image.png
```

| Person | Image | Prediction | Confidence |
|---|---|---|---|
| Carell | <img src='https://raw.githubusercontent.com/cmusatyalab/openface/master/images/examples/carell.jpg' width='200px'></img> | SteveCarell | 0.78 |
| Adams | <img src='https://raw.githubusercontent.com/cmusatyalab/openface/master/images/examples/adams.jpg' width='200px'></img> | AmyAdams | 0.87 |
| Lennon 1 (Unknown) | <img src='https://raw.githubusercontent.com/cmusatyalab/openface/master/images/examples/lennon-1.jpg' width='200px'></img> | DavidBoreanaz | 0.28 |
| Lennon 2 (Unknown) | <img src='https://raw.githubusercontent.com/cmusatyalab/openface/master/images/examples/lennon-2.jpg' width='200px'></img> | DavidBoreanaz | 0.56 |
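
Because a prediction is always returned, one common way to handle unknown people is to reject predictions whose confidence falls below a cutoff. The sketch below applies that idea to the values in the table above. `label_or_unknown` and the `0.6` threshold are assumptions made for illustration and are not part of OpenFace; a real threshold should be tuned on held-out images.

```python
def label_or_unknown(prediction, confidence, threshold=0.6):
    """Keep the prediction only if its confidence reaches the threshold."""
    return prediction if confidence >= threshold else "Unknown"


# (person, prediction, confidence) rows copied from the table above.
results = [
    ("Carell", "SteveCarell", 0.78),
    ("Adams", "AmyAdams", 0.87),
    ("Lennon 1", "DavidBoreanaz", 0.28),
    ("Lennon 2", "DavidBoreanaz", 0.56),
]

decisions = {who: label_or_unknown(pred, conf) for who, pred, conf in results}
for who, decision in decisions.items():
    print(who, "->", decision)
```

With this cutoff both Lennon images are rejected, but the second one (0.56) is close to the threshold, so a fixed cutoff will not separate known from unknown people perfectly.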