
# FaceNet

This is a Python and Torch implementation of the CVPR 2015 paper
[FaceNet: A Unified Embedding for Face Recognition and Clustering](http://www.cv-foundation.org/openaccess/content_cvpr_2015/app/1A_089.pdf)
by Florian Schroff, Dmitry Kalenichenko, and James Philbin at Google
using publicly available libraries and datasets.
Torch allows the network to be executed on a CPU or with CUDA.

**Crafted by [Brandon Amos](http://bamos.github.io) in the
[Elijah](http://elijah.cs.cmu.edu) research group at
Carnegie Mellon University.**

---

The following example shows the workflow for a single input
image of Sylvester Stallone from the publicly available
[LFW dataset](http://vis-www.cs.umass.edu/lfw/person/Sylvester_Stallone.html).

1. Detect faces with pre-trained models from
   [dlib](http://blog.dlib.net/2014/02/dlib-186-released-make-your-own-object.html) or
   [OpenCV](http://docs.opencv.org/master/d7/d8b/tutorial_py_face_detection.html).
2. Transform the face for the neural network.
   This repository uses dlib's
   [real-time pose estimation](http://blog.dlib.net/2014/08/real-time-face-pose-estimation.html)
   with OpenCV's
   [affine transformation](http://docs.opencv.org/doc/tutorials/imgproc/imgtrans/warp_affine/warp_affine.html)
   to try to make the eyes and nose appear in
   the same location on each image.
3. Use a deep neural network to represent (or embed) the face on
   a 128-dimensional hypersphere.
   The embedding is a generic representation for anybody's face.
   Unlike other face representations, this embedding has the nice property
   that a larger distance between two face embeddings means
   that the faces are likely not of the same person.
   This trivializes clustering, similarity detection,
   and classification tasks.
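
The alignment in step 2 can be illustrated with a minimal sketch, assuming
three landmarks (both eyes and the nose tip) are mapped onto fixed template
positions. The function names and all coordinates below are hypothetical;
this repository uses dlib's landmarks with OpenCV's affine warp rather than
this code.

```python
def affine_from_points(src, dst):
    """Solve for the 2x3 affine matrix that maps three src points onto dst."""
    (x1, y1), (x2, y2), (x3, y3) = src
    # Determinant of [[x1, y1, 1], [x2, y2, 1], [x3, y3, 1]].
    det = x1 * (y2 - y3) - y1 * (x2 - x3) + (x2 * y3 - x3 * y2)
    if abs(det) < 1e-12:
        raise ValueError("source points are collinear")

    def solve(v1, v2, v3):
        # Cramer's rule for a*x + b*y + c = v at each of the three points.
        a = (v1 * (y2 - y3) - y1 * (v2 - v3) + (v2 * y3 - v3 * y2)) / det
        b = (x1 * (v2 - v3) - v1 * (x2 - x3) + (x2 * v3 - x3 * v2)) / det
        c = (x1 * (y2 * v3 - y3 * v2) - y1 * (x2 * v3 - x3 * v2)
             + v1 * (x2 * y3 - x3 * y2)) / det
        return [a, b, c]

    row_x = solve(dst[0][0], dst[1][0], dst[2][0])
    row_y = solve(dst[0][1], dst[1][1], dst[2][1])
    return [row_x, row_y]


def apply_affine(matrix, point):
    """Apply a 2x3 affine matrix to a single (x, y) point."""
    x, y = point
    return (matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
            matrix[1][0] * x + matrix[1][1] * y + matrix[1][2])


# Hypothetical detected landmarks: left eye, right eye, nose tip.
detected = [(30.0, 40.0), (70.0, 42.0), (50.0, 70.0)]
# Hypothetical template positions in a 96x96 aligned image.
template = [(24.0, 30.0), (72.0, 30.0), (48.0, 60.0)]

matrix = affine_from_points(detected, template)
mapped = [apply_affine(matrix, p) for p in detected]
```

After the transform, each detected landmark lands on its template position,
which is what makes the eyes and nose appear in the same place in every image.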

![](./images/summary.jpg)
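
As a rough illustration of the distance property described in step 3, the
sketch below compares embeddings by squared Euclidean distance. It assumes
the embeddings are normalized to unit length; the 4-dimensional toy vectors
stand in for 128-dimensional embeddings, and the `threshold` value is
hypothetical and would need to be tuned on real data.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length so that it lies on the hypersphere."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0:
        raise ValueError("cannot normalize the zero vector")
    return [x / norm for x in vector]


def squared_l2_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def same_person(a, b, threshold=1.0):
    """Predict 'same person' when the distance is below a threshold.

    The default threshold is a hypothetical value, not one from this project.
    """
    return squared_l2_distance(a, b) < threshold


# Toy embeddings (assumed values, 4 dimensions instead of 128).
face_a = l2_normalize([0.9, 0.1, 0.3, 0.2])
face_b = l2_normalize([0.8, 0.2, 0.3, 0.1])   # close to face_a
face_c = l2_normalize([-0.2, 0.9, -0.3, 0.1])  # far from face_a

close_pair = same_person(face_a, face_b)
far_pair = same_person(face_a, face_c)
```

For unit-length vectors the squared distance always falls between 0 and 4,
so a single threshold can be applied to any pair of faces.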

# Help Wanted!

As the following table shows, the forefront of deep learning research
is driven by large private datasets.
In face recognition, there are no open source implementations or
models trained on these datasets.
If you have access to a large dataset, we are very interested
in training a new FaceNet model with it.
Please contact Brandon Amos at [bamos@cs.cmu.edu](mailto:bamos@cs.cmu.edu).

| Dataset | Public | #Photos | #People |
|---|---|---|---|
| [DeepFace](https://research.facebook.com/publications/480567225376225/deepface-closing-the-gap-to-human-level-performance-in-face-verification/) (Facebook) | No | 4.4 Million | 4k |
| [Web-Scale Training...](http://arxiv.org/abs/1406.5266) (Facebook) | No | 500 Million | 10 Million |
| FaceNet (Google) | No | 100-200 Million | 8 Million |
| [FaceScrub](http://vintage.winklerbros.net/facescrub.html) | Yes | 100k | 500 |
| [CASIA-WebFace](http://arxiv.org/abs/1411.7923) | Yes | 500k | 10k |

# Real-Time Web Demo
See [our YouTube video](TODO) of using this in a real-time web application
for face recognition.
The source is available in [demos/www](/demos/www).

TODO: Screenshot

# Cool demo, but I want numbers. What's the accuracy?
Even though the public datasets we trained on have orders of magnitude less data
than private industry datasets, the accuracy is remarkably high and
outperforms all other open-source face recognition implementations we
are aware of on the standard
[LFW](http://vis-www.cs.umass.edu/lfw/results.html)
benchmark.
We had to fall back to the deep funneled versions for
152 of 13233 images because dlib failed to detect a face or landmarks.

TODO: ROC Curve

This can be generated with the following commands from the root `facenet`
directory, assuming you have downloaded and placed the raw and
deep funneled LFW data from [here](http://vis-www.cs.umass.edu/lfw/)
in `./data/lfw/raw` and `./data/lfw/deepfunneled`.

1. Install prerequisites as below.
2. Preprocess the raw `lfw` images, changing `8` to however many
   separate processes you want to run:
   `for N in {1..8}; do ./util/align-dlib.py data/lfw/raw align affine data/lfw/dlib-affine-sz:96 --size 96 & done`.
   Then fall back to the deep funneled versions for images that dlib failed
   to align:
   `./util/align-dlib.py data/lfw/raw align affine data/lfw/dlib-affine-sz:96 --size 96 --fallbackLfw data/lfw/deepfunneled`.
3. Generate representations with `./batch-represent/main.lua -outDir evaluation/lfw.nn4.v1.reps -model models/facenet/nn4.v1.t7 -data data/lfw/dlib-affine-sz:96`
4. Generate the ROC curve from the `evaluation` directory with `./lfw-roc.py --workDir lfw.nn4.v1.reps`.
   This creates `roc.pdf` in the `lfw.nn4.v1.reps` directory.
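
For readers unfamiliar with ROC curves, here is a minimal sketch of how one
can be computed from pairwise embedding distances. This is not the code in
`lfw-roc.py`; the function name, the toy distances, and the thresholds are
all hypothetical, and a pair is assumed to be predicted "same person" when
its distance is below the threshold.

```python
def roc_points(distances, is_same, thresholds):
    """Return one (false positive rate, true positive rate) per threshold."""
    positives = sum(1 for same in is_same if same)
    negatives = len(is_same) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one same and one different pair")

    points = []
    for threshold in thresholds:
        # Pairs under the threshold are predicted to be the same person.
        true_pos = sum(1 for d, same in zip(distances, is_same)
                       if d < threshold and same)
        false_pos = sum(1 for d, same in zip(distances, is_same)
                        if d < threshold and not same)
        points.append((false_pos / negatives, true_pos / positives))
    return points


# Toy data: distance for each image pair and whether it shows the same person.
distances = [0.2, 0.4, 0.9, 1.3, 1.6, 2.1]
is_same = [True, True, False, True, False, False]

curve = roc_points(distances, is_same, thresholds=[0.0, 0.5, 1.0, 1.5, 4.0])
```

Sweeping the threshold from the smallest to the largest possible distance
traces the curve from (0, 0) to (1, 1); a better model pushes it toward the
top-left corner.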

# Setup

## Check out git submodules
Clone with `--recursive` or run `git submodule init && git submodule update`
after checking out.

## Download the models
Run `./models/download_models.sh` to download FaceNet models
pre-trained on the combined CASIA-WebFace and FaceScrub datasets.
This also downloads dlib's pre-trained model for face landmark detection.

## With Docker
TODO

This repo can be deployed as a container with [Docker](https://www.docker.com/)
for CPU mode:

```
./models/download_models.sh
sudo docker build -t facenet .
sudo docker run -t -i -v $PWD:/facenet facenet /bin/bash
cd /facenet
TODO
```

To use, place your images in `facenet` on your host and
access them from the shared Docker directory.

## By hand
TODO

Dependencies:
+ [torch7](https://github.com/torch/torch7)
+ [dpnn](https://github.com/nicholas-leonard/dpnn)
+ TODO

Optional dependencies:
+ CUDA 6.5+
+ [cudnn.torch](https://github.com/soumith/cudnn.torch)


# Usage
## Existing Models
TODO

# Training new models
TODO
[Atcold/torch-TripletEmbedding](https://github.com/Atcold/torch-TripletEmbedding)

# Licensing
This source is copyright Carnegie Mellon University
and licensed under the [Apache 2.0 License](./LICENSE).
Portions from the following third-party sources are included
in this repository, some of them modified.
These portions are noted in the source files and are
copyright their respective authors under
the licenses listed.

| Project | Modified | License |
|---|---|---|
| [Atcold/torch-TripletEmbedding](https://github.com/Atcold/torch-TripletEmbedding) | No | MIT |
| [facebook/fbnn](https://github.com/facebook/fbnn) | Yes | BSD |