Initial commit of stubbed README (commit 4eb080dce2, parent caa2caa6be).
# FaceNet

This is a Python and Torch implementation of the CVPR 2015 paper
[FaceNet: A Unified Embedding for Face Recognition and Clustering](http://www.cv-foundation.org/openaccess/content_cvpr_2015/app/1A_089.pdf)
by Florian Schroff, Dmitry Kalenichenko, and James Philbin at Google
using publicly available libraries and datasets.
Torch allows execution on both the CPU and CUDA-enabled GPUs.

**Crafted by [Brandon Amos](http://bamos.github.io) in the
[Elijah](http://elijah.cs.cmu.edu) research group at
Carnegie Mellon University.**

---

The following example shows the workflow for a single input
image of Sylvester Stallone from the publicly available
[LFW dataset](http://vis-www.cs.umass.edu/lfw/person/Sylvester_Stallone.html).

1. Detect faces with pre-trained models from
   [dlib](http://blog.dlib.net/2014/02/dlib-186-released-make-your-own-object.html) or
   [OpenCV](http://docs.opencv.org/master/d7/d8b/tutorial_py_face_detection.html).
2. Transform the face for the neural network.
   This repository uses dlib's
   [real-time pose estimation](http://blog.dlib.net/2014/08/real-time-face-pose-estimation.html)
   with OpenCV's
   [affine transformation](http://docs.opencv.org/doc/tutorials/imgproc/imgtrans/warp_affine/warp_affine.html)
   to try to make the eyes and nose appear in
   the same location on each image.
3. Use a deep neural network to represent (or embed) the face on
   a 128-dimensional hypersphere.
   The embedding is a generic representation for anybody's face.
   Unlike other face representations, this embedding has the nice property
   that a larger distance between two face embeddings means
   that the faces are likely not of the same person.
   This trivializes clustering, similarity detection,
   and classification tasks.
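The distance property in step 3 can be sketched with a short, self-contained NumPy example. The embeddings and the `1.0` threshold below are synthetic and purely illustrative, not values produced by this repository's models:

```python
import numpy as np

def normalize(v):
    # FaceNet constrains embeddings to the unit hypersphere.
    return v / np.linalg.norm(v)

def same_person(emb_a, emb_b, threshold=1.0):
    # Small squared Euclidean distance -> likely the same person.
    # The threshold is illustrative; a real system tunes it on held-out data.
    return float(np.sum((emb_a - emb_b) ** 2)) < threshold

rng = np.random.default_rng(0)
a = normalize(rng.standard_normal(128))
b = normalize(a + 0.1 * rng.standard_normal(128))  # perturbed copy: "same" face
c = normalize(rng.standard_normal(128))            # independent: "different" face

print(same_person(a, b))  # True: embeddings are close on the hypersphere
print(same_person(a, c))  # False: nearly orthogonal, squared distance near 2
```

Because all embeddings lie on the unit hypersphere, the squared distance between any two is bounded by 4, which makes a single global threshold meaningful.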

![](./images/summary.jpg)

# Help Wanted!

As the following table shows, the forefront of deep learning research
is driven by large private datasets.
In face recognition, there are no open source implementations or
models trained on these datasets.
If you have access to a large dataset, we are very interested
in training a new FaceNet model with it.
Please contact Brandon Amos at [bamos@cs.cmu.edu](mailto:bamos@cs.cmu.edu).

| Dataset | Public | #Photos | #People |
|---|---|---|---|
| [DeepFace](https://research.facebook.com/publications/480567225376225/deepface-closing-the-gap-to-human-level-performance-in-face-verification/) (Facebook) | No | 4.4 Million | 4k |
| [Web-Scale Training...](http://arxiv.org/abs/1406.5266) (Facebook) | No | 500 Million | 10 Million |
| FaceNet (Google) | No | 100-200 Million | 8 Million |
| [FaceScrub](http://vintage.winklerbros.net/facescrub.html) | Yes | 100k | 500 |
| [CASIA-WebFace](http://arxiv.org/abs/1411.7923) | Yes | 500k | 10k |

# Real-Time Web Demo
See [our YouTube video](TODO) of using this in a real-time web application
for face recognition.
The source is available in [demos/www](/demos/www).

TODO: Screenshot

# Cool demo, but I want numbers. What's the accuracy?
Even though the public datasets we trained on have orders of magnitude less data
than private industry datasets, the accuracy is remarkably high and
outperforms all other open-source face recognition implementations we
are aware of on the standard
[LFW](http://vis-www.cs.umass.edu/lfw/results.html)
benchmark.
We had to fall back to using the deep funneled versions for
152 of 13233 images because dlib failed to detect a face or landmarks.

TODO: ROC Curve

This can be generated with the following commands from the root `facenet`
directory, assuming you have downloaded and placed the raw and
deep funneled LFW data from [here](http://vis-www.cs.umass.edu/lfw/)
in `./data/lfw/raw` and `./data/lfw/deepfunneled`.

1. Install the prerequisites as described below.
2. Preprocess the raw `lfw` images; change `8` to however many
   separate processes you want to run:
   `for N in {1..8}; do ./util/align-dlib.py data/lfw/raw align affine data/lfw/dlib-affine-sz:96 --size 96 & done`.
   Fall back to the deep funneled versions for images that dlib failed
   to align:
   `./util/align-dlib.py data/lfw/raw align affine data/lfw/dlib-affine-sz:96 --size 96 --fallbackLfw data/lfw/deepfunneled`
3. Generate representations with `./batch-represent/main.lua -outDir evaluation/lfw.nn4.v1.reps -model models/facenet/nn4.v1.t7 -data data/lfw/dlib-affine-sz:96`.
4. Generate the ROC curve from the `evaluation` directory with `./lfw-roc.py --workDir lfw.nn4.v1.reps`.
   This creates `roc.pdf` in the `lfw.nn4.v1.reps` directory.
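Independently of the repository's scripts, the ROC computation in step 4 can be sketched as a threshold-on-distance classifier: for each distance threshold, predict "same person" when a pair's embedding distance falls below it, and record the resulting true- and false-positive rates. The distances, labels, and thresholds below are synthetic, chosen only to illustrate the sweep:

```python
import numpy as np

def roc_points(distances, labels, thresholds):
    # Predict "same person" when distance < threshold; collect
    # (false-positive rate, true-positive rate) for each threshold.
    d = np.asarray(distances, dtype=float)
    same = np.asarray(labels, dtype=bool)
    points = []
    for t in thresholds:
        pred = d < t
        tpr = float(np.mean(pred[same]))    # genuine pairs accepted
        fpr = float(np.mean(pred[~same]))   # impostor pairs accepted
        points.append((fpr, tpr))
    return points

# Synthetic pairs: genuine pairs cluster at low distance, impostors high.
distances = [0.3, 0.4, 0.5, 1.4, 1.6, 1.8]
labels    = [1,   1,   1,   0,   0,   0]
curve = roc_points(distances, labels, thresholds=[0.0, 1.0, 2.0])
print(curve)  # [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
```

A real evaluation sweeps many thresholds over the pairwise distances of the generated representations; plotting the collected points yields the ROC curve.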

# Setup

## Check out git submodules
Clone with `--recursive` or run `git submodule init && git submodule update`
after checking out.

## Download the models
Run `./models/download_models.sh` to download pre-trained FaceNet
models trained on the combined CASIA-WebFace and FaceScrub database.
This also downloads dlib's pre-trained model for face landmark detection.

## With Docker
TODO

This repo can be deployed as a container with [Docker](https://www.docker.com/)
for CPU mode:

```
./models/download_models.sh
sudo docker build -t facenet .
sudo docker run -t -i -v $PWD:/facenet facenet /bin/bash
cd /facenet
TODO
```

To use, place your images in `facenet` on your host and
access them from the shared Docker directory.

## By hand
TODO

Dependencies:
+ [torch7](https://github.com/torch/torch7)
+ [dpnn](https://github.com/nicholas-leonard/dpnn)
+ TODO

Optional dependencies:
+ CUDA 6.5+
+ [cudnn.torch](https://github.com/soumith/cudnn.torch)

# Usage
## Existing Models
TODO

# Training new models
TODO
[Atcold/torch-TripletEmbedding](https://github.com/Atcold/torch-TripletEmbedding)

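FaceNet models are trained with a triplet loss, which the Torch module linked above provides. A minimal NumPy sketch of the loss, using 2-dimensional toy embeddings and an illustrative margin `alpha`, looks like:

```python
import numpy as np

def triplet_loss(anchor, positive, negative, alpha=0.2):
    # Encourages the anchor-positive distance to be smaller than the
    # anchor-negative distance by at least the margin alpha.
    d_pos = np.sum((anchor - positive) ** 2)
    d_neg = np.sum((anchor - negative) ** 2)
    return float(max(0.0, d_pos - d_neg + alpha))

a = np.array([1.0, 0.0])   # anchor embedding
p = np.array([0.9, 0.1])   # same identity, near the anchor
n = np.array([-1.0, 0.0])  # different identity, far from the anchor
print(triplet_loss(a, p, n))  # 0.0: the margin is already satisfied
print(triplet_loss(a, n, p))  # positive: anchor is closer to the "wrong" face
```

Triplets whose loss is already zero contribute no gradient, which is why triplet selection during training focuses on pairs that violate the margin.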
# Licensing
This source is copyright Carnegie Mellon University
and licensed under the [Apache 2.0 License](./LICENSE).
Portions from the following third-party sources have
been modified and are included in this repository.
These portions are noted in the source files and are
copyright their respective authors with
the licenses listed.

Project | Modified | License
---|---|---
[Atcold/torch-TripletEmbedding](https://github.com/Atcold/torch-TripletEmbedding) | No | MIT
[facebook/fbnn](https://github.com/facebook/fbnn) | Yes | BSD