Commit Graph

586 Commits

Author SHA1 Message Date
Jakub Mareda 960e8a014f
Missing include for `dlib::loss_multiclass_log_per_pixel_` (#2432)
* Missing include for `dlib::loss_multiclass_log_per_pixel_::label_to_ignore`

I was trying to compile the examples and encountered this issue after moving `rgb_label_image_to_index_label_image` to a cpp file. Headers should include all the symbols they mention.

* Update pascal_voc_2012.h

Should use the official entrypoint for including dnn stuff.

Co-authored-by: Davis E. King <davis685@gmail.com>
2021-09-15 08:27:24 -04:00
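A minimal sketch of the pattern this commit enforces, assuming only what the entry states: `<dlib/dnn.h>` is the official entry point, so symbols like `dlib::loss_multiclass_log_per_pixel_::label_to_ignore` resolve without reaching into internal headers.

```cpp
#include <cstdint>
#include <dlib/dnn.h>  // official entry point: pulls in all the dnn headers

int main()
{
    // label_to_ignore resolves because the loss's header came in via dlib/dnn.h
    const std::uint16_t ignored = dlib::loss_multiclass_log_per_pixel_::label_to_ignore;
    (void)ignored;
    return 0;
}
```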
Adrià Arrufat 16500906b0
YOLO loss (#2376) 2021-07-29 20:05:54 -04:00
Abdolkarim Saeedi 7b5b375026
Update dnn_inception_ex.cpp (#2256)
Fixes a simple typo in the inception training example
2020-12-09 07:37:45 -05:00
Adrià Arrufat a7627cbd07
Rename function to disable_duplicative_biases (#2246)
* Rename function to disable_duplicative_biases

* rename also the functions in the tests... oops
2020-11-24 22:07:04 -05:00
Adrià Arrufat 3c82c2259c
Add Layer Normalization (#2213)
* wip: layer normalization on cpu

* wip: add cuda implementation, not working yet

* wip: try to fix cuda implementation

* swap grid_stride_range and grid_stride_range_y: does not work yet

* fix CUDA implementation

* implement cuda gradient

* add documentation, move layer_norm, update bn_visitor

* add tests

* use stddev instead of variance in test (they are both 1, anyway)

* add test for means and invstds on CPU and CUDA

* rename visitor to disable_duplicative_bias

* handle more cases in the visitor_disable_input_bias

* Add tests for visitor_disable_input_bias
2020-10-20 07:56:55 -04:00
Adrià Arrufat 5ec60a91c4
Show how to use the new visitors with lambdas (#2162) 2020-09-06 09:27:50 -04:00
Davis King afe19fcb8b Made the DNN layer visiting routines more convenient.
Now the user doesn't have to supply a visitor capable of visiting all
layers, but instead just the ones they are interested in.  Also added
visit_computational_layers() and visit_computational_layers_range()
since those capture a very common use case more concisely than
visit_layers().  That is, users generally want to mess with the
computational layers specifically as those are the stateful layers.
2020-09-05 18:33:04 -04:00
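A hedged sketch of the lambda-based visiting described in the two entries above; the toy network is illustrative only.

```cpp
#include <dlib/dnn.h>
#include <iostream>

// toy network, purely for illustration
using net_type = dlib::loss_multiclass_log<
    dlib::fc<10, dlib::relu<dlib::fc<32,
    dlib::input<dlib::matrix<float>>>>>>;

int main()
{
    net_type net;
    // The visitor no longer has to handle every layer type; a generic
    // lambda that only sees the computational layers is enough.
    dlib::visit_computational_layers(net, [](auto& layer) {
        std::cout << layer << '\n';  // each computational layer prints its parameters
    });
    return 0;
}
```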
Adrià Arrufat e7ec6b7777
Add visitor to remove bias from bn_ layer inputs (#closes 2155) (#2156)
* add visitor to remove bias from bn_ inputs (#closes 2155)

* remove unused parameter and make documentation more clear

* remove bias from bn_ layers too and use better name

* let the batch norm layers keep their bias, use even better name

* be more consistent with impl naming

* remove default constructor

* do not use method to prevent some errors

* add disable bias method to pertinent layers

* update dcgan example

- grammar
- print number of network parameters to be able to check bias is not allocated
- at the end, give feedback to the user about what the discriminator thinks about each generated sample

* fix fc_ logic

* add documentation

* add bias_is_disabled methods and update to_xml

* print use_bias=false when bias is disabled
2020-09-02 21:59:19 -04:00
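A minimal sketch of applying the visitor, under its eventual name `disable_duplicative_biases` (the rename is the #2246 entry above); the toy network is hypothetical.

```cpp
#include <dlib/dnn.h>

// toy network where the layer feeding a batch norm has a redundant bias
using net_type = dlib::loss_multiclass_log<
    dlib::fc<10, dlib::relu<dlib::bn_fc<dlib::fc<32,
    dlib::input<dlib::matrix<float>>>>>>>;

int main()
{
    net_type net;
    // bn_fc adds its own bias, so the fc<32> feeding it doesn't need one;
    // the visitor walks the net and disables such duplicative biases.
    dlib::disable_duplicative_biases(net);
    return 0;
}
```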
Adrià Arrufat 64ba66e1c7
fix receptive field comment (#2070) 2020-04-27 06:02:26 -04:00
ncoder-1 8055b8d19a
Update dnn_introduction_ex.cpp (#2066)
Changed C-style cast to static_cast.
2020-04-22 07:37:58 -04:00
Davis King fbb2db2188 fix example cmake script 2020-04-04 09:55:08 -04:00
Adrià Arrufat 5a715fe24d
Remove outdated comment from DCGAN example (#2048)
* Remove outdated comment

That comment was there from when I was using a dnn_trainer to train
the discriminator network.

* Fix case
2020-04-02 07:14:42 -04:00
Adrià Arrufat e9c56fb21a
Fix warnings while running the tests (#2046)
* fix some warnings when running tests

* revert changes in CMakeLists.txt

* update example to make use of newly promoted method

* update tests to make use of newly promoted methods
2020-03-31 19:35:23 -04:00
Adrià Arrufat 57bb5eb58d
use running stats to track losses (#2041) 2020-03-30 20:20:50 -04:00
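A hedged sketch of the idea, with made-up loss values: `dlib::running_stats` keeps a numerically stable mean and standard deviation, so it can replace ad-hoc loss averaging.

```cpp
#include <dlib/statistics.h>
#include <iostream>

int main()
{
    dlib::running_stats<double> loss_stats;
    // pretend these came from three training batches
    loss_stats.add(0.93);
    loss_stats.add(0.81);
    loss_stats.add(0.78);
    std::cout << "mean loss:   " << loss_stats.mean() << '\n'
              << "loss stddev: " << loss_stats.stddev() << '\n';
    return 0;
}
```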
Davis King 0057461a62 Promote some of the sub-network methods into the add_loss_layer interface so users don't have to write .subnet() so often. 2020-03-29 12:17:56 -04:00
Adrià Arrufat f42f100d0f
Add DCGAN example (#2035)
* wip: dcgan-example

* wip: dcgan-example

* update example to use leaky_relu and remove bias from net

* wip

* it works!

* add more comments

* add visualization code

* add example documentation

* rename example

* fix comment

* better comment format

* fix the noise generator seed

* add message to hit enter for image generation

* fix srand, too

* add std::vector overload to update_parameters

* improve training stability

* better naming of variables

make sure it is clear we update the generator with the discriminator's
gradient using fake samples and true labels

* fix comment: generator -> discriminator

* update leaky_relu docs to match the relu ones

* replace not with !

* add Davis' suggestions to make training more stable

* use tensor instead of resizable_tensor

* do not use dnn_trainer for discriminator
2020-03-29 11:07:38 -04:00
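A hedged sketch of two small details from this PR, seeding the noise generator and batching noise vectors in a `std::vector`; the 100-dimensional latent size is an assumption for illustration.

```cpp
#include <dlib/matrix.h>
#include <dlib/rand.h>
#include <vector>

const long noise_dims = 100;  // assumed latent dimension, for illustration
using noise_t = dlib::matrix<float, noise_dims, 1>;

std::vector<noise_t> make_noise_batch(dlib::rand& rnd, size_t batch_size)
{
    std::vector<noise_t> noises(batch_size);
    for (auto& noise : noises)
        for (long i = 0; i < noise.size(); ++i)
            noise(i) = rnd.get_random_gaussian();
    return noises;
}

int main()
{
    dlib::rand rnd(1234);  // fixed seed, so runs are reproducible
    const auto batch = make_noise_batch(rnd, 64);
    (void)batch;
    return 0;
}
```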
Adrià Arrufat c832d3b2fc
simplify resnet definition by reusing struct template parameter (#2010)
* simplify definition by reusing struct template parameter

* put resnet into its own namespace

* fix infer names

* rename struct impl to def
2020-03-09 21:21:04 -04:00
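A minimal sketch of the pattern, not the actual resnet definition: the struct takes the activation and batch-norm layer templates as parameters, so the train-time (`bn_con`) and infer-time (`affine`) definitions reuse one block instead of duplicating it.

```cpp
#include <dlib/dnn.h>

template <template <typename> class ACT, template <typename> class BN>
struct def
{
    // every block is written once, in terms of ACT and BN
    template <long num_filters, typename SUBNET>
    using block = ACT<BN<dlib::con<num_filters, 3, 3, 1, 1, SUBNET>>>;
};

using input_t    = dlib::input<dlib::matrix<dlib::rgb_pixel>>;
using train_head = def<dlib::relu, dlib::bn_con>::block<64, input_t>;  // training
using infer_head = def<dlib::relu, dlib::affine>::block<64, input_t>;  // inference

int main()
{
    train_head t;
    infer_head i;
    return 0;
}
```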
Davis King fc6992ac04 A little bit of cleanup 2020-02-07 08:12:18 -05:00
Adrià Arrufat 10d7f119ca
Add dnn_introduction3_ex (#1991)
* Add dnn_introduction3_ex
2020-02-07 07:59:36 -05:00
Juha Reunanen bd6994cc66 Add new loss layer for binary loss per pixel (#1976)
* Add new loss layer for binary loss per pixel
2020-01-20 07:47:47 -05:00
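A hedged sketch of a tiny network using the new loss; the layer sizes are arbitrary, and the single output filter is what a per-pixel binary loss consumes.

```cpp
#include <dlib/dnn.h>

// one output filter: each pixel gets a single logit for the binary decision
using net_type = dlib::loss_binary_log_per_pixel<
    dlib::con<1, 1, 1, 1, 1,
    dlib::relu<dlib::con<16, 3, 3, 1, 1,
    dlib::input<dlib::matrix<unsigned char>>>>>>;

int main()
{
    net_type net;
    return 0;
}
```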
Davis King f2cd9e3b1d use a time-based execution limit in example 2019-11-28 10:48:02 -05:00
Juha Reunanen d175c35074 Instance segmentation (#1918)
* Add instance segmentation example - first version of training code

* Add MMOD options; get rid of the cache approach, and instead load all MMOD rects upfront

* Improve console output

* Set filter count

* Minor tweaking

* Inference - first version, at least compiles!

* Ignore overlapped boxes

* Ignore even small instances

* Set overlaps_ignore

* Add TODO remarks

* Revert "Set overlaps_ignore"

This reverts commit 65adeff1f8.

* Set result size

* Set label image size

* Take ignore-color into account

* Fix the cropping rect's aspect ratio; also slightly expand the rect

* Draw the largest findings last

* Improve masking of the current instance

* Add some perturbation to the inputs

* Simplify ground-truth reading; fix random cropping

* Read even class labels

* Tweak default minibatch size

* Learn only one class

* Really train only instances of the selected class

* Remove outdated TODO remark

* Automatically skip images with no detections

* Print to console what was found

* Fix class index problem

* Fix indentation

* Allow to choose multiple classes

* Draw rect in the color of the corresponding class

* Write detector window classes to ostream; also group detection windows by class (when ostreaming)

* Train a separate instance segmentation network for each classlabel

* Use separate synchronization file for each seg net of each class

* Allow more overlap

* Fix sorting criterion

* Fix interpolating the predicted mask

* Improve bilinear interpolation: if output type is an integer, round instead of truncating

* Add helpful comments

* Ignore large aspect ratios; refactor the code; tweak some network parameters

* Simplify the segmentation network structure; make the object detection network more complex in turn

* Problem: CUDA errors not reported properly to console
Solution: stop and join data loader threads even in case of exceptions

* Minor parameters tweaking

* Loss may have increased, even if prob_loss_increasing_thresh > prob_loss_increasing_thresh_max_value

* Add previous_loss_values_dump_amount to previous_loss_values.size() when deciding if loss has been increasing

* Improve behaviour when loss actually increased after disk sync

* Revert some of the earlier change

* Disregard dumped loss values only when deciding if learning rate should be shrunk, but *not* when deciding if loss has been going up since last disk sync

* Revert "Revert some of the earlier change"

This reverts commit 6c852124ef.

* Keep enough previous loss values, until the disk sync

* Fix maintaining the dumped (now "effectively disregarded") loss values count

* Detect cats instead of aeroplanes

* Add helpful logging

* Clarify the intention and the code

* Review fixes

* Add operator== for the other pixel types as well; remove the inline

* If available, use constexpr if

* Revert "If available, use constexpr if"

This reverts commit 503d4dd335.

* Simplify code as per review comments

* Keep estimating steps_without_progress, even if steps_since_last_learning_rate_shrink < iter_without_progress_thresh

* Clarify console output

* Revert "Keep estimating steps_without_progress, even if steps_since_last_learning_rate_shrink < iter_without_progress_thresh"

This reverts commit 9191ebc776.

* To keep the changes to a bare minimum, revert the steps_since_last_learning_rate_shrink change after all (at least for now)

* Even empty out some of the previous test loss values

* Minor review fixes

* Can't use C++14 features here

* Do not use the struct name as a variable name
2019-11-14 22:53:16 -05:00
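A hedged sketch of the interpolation detail from the bullets above ("if output type is an integer, round instead of truncating"), kept to C++11 since a later bullet notes C++14 features were off-limits; the helper name is hypothetical.

```cpp
#include <cmath>
#include <iostream>
#include <type_traits>

// hypothetical helper: integer pixel types round, floating-point types pass through
template <typename T>
T from_interpolated(double value)
{
    return std::is_integral<T>::value
        ? static_cast<T>(std::lround(value))
        : static_cast<T>(value);
}

int main()
{
    // truncation would turn 99.6 into 99; rounding keeps it at 100
    std::cout << static_cast<int>(from_interpolated<unsigned char>(99.6)) << '\n';  // 100
    std::cout << from_interpolated<float>(99.6) << '\n';                            // 99.6
    return 0;
}
```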
Davis King 1b83016abd update docs 2019-10-24 20:15:34 -04:00
Davis King 39327e71b7 Added note about using cmake's new fetch content feature. 2019-10-24 07:50:30 -04:00
Davis King fced3587f1 fixing grammar 2019-07-27 09:03:14 -04:00
Davis King 5d03b99a08 Changed to avoid compiler warning. 2019-03-03 20:12:43 -05:00
Juha Reunanen f685cb4249 Add U-net style skip connections to the semantic-segmentation example (#1600)
* Add concat_prev layer, and U-net example for semantic segmentation

* Allow to supply mini-batch size as command-line parameter

* Decrease default mini-batch size from 30 to 24

* Resize t1, if needed

* Use DenseNet-style blocks instead of residual learning

* Increase default mini-batch size to 50

* Increase default mini-batch size from 50 to 60

* Resize even during the backward step, if needed

* Use resize_bilinear_gradient for the backward step

* Fix function call ambiguity problem

* Clear destination before adding gradient

* Works OK-ish

* Add more U-tags

* Tweak default mini-batch size

* Define a simpler network when using Microsoft Visual C++ compiler; clean up the DenseNet stuff (leaving it for a later PR)

* Decrease default mini-batch size from 24 to 23

* Define separate dnn filenames for MSVC++ and other compilers

* Add documentation for the resize_to_prev layer; move the implementation so that it comes after mult_prev

* Fix previous typo

* Minor formatting changes

* Reverse the ordering of levels

* Increase the learning-rate stopping criterion back to 1e-4 (was 1e-8)

* Use more U-tags even on Windows

* Minor formatting

* Latest MSVC 2017 builds fast, so there's no need to limit the depth any longer

* Tweak default mini-batch size again

* Even though latest MSVC can now build the extra layers, it does not mean we should add them!

* Fix naming
2019-01-06 09:11:39 -05:00
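The `concat_prev` layer is defined inside the example itself, so as a hedged stand-in this sketch only shows the tag/skip plumbing that such U-net connections build on, using the built-in `add_prev1` (element-wise add) in place of concatenation.

```cpp
#include <dlib/dnn.h>

// one illustrative "level": conv + relu, 32 channels, same spatial size
template <typename SUBNET>
using level = dlib::relu<dlib::con<32, 3, 3, 1, 1, SUBNET>>;

// tag1 marks the earlier output; add_prev1 later merges it back in, which is
// the skip-connection mechanism the example's concat_prev builds upon
using net_type = dlib::loss_multiclass_log_per_pixel<
    dlib::cont<2, 1, 1, 1, 1,   // toy 2-class head
    dlib::add_prev1<
    level<
    dlib::tag1<
    level<
    dlib::input<dlib::matrix<dlib::rgb_pixel>>>>>>>>;

int main()
{
    net_type net;
    return 0;
}
```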
Juha Reunanen cf5e25a95f Problem: integer overflow when calculating sizes (may happen e.g. with very large images) (#1148)
* Problem: integer overflow when calculating sizes (may happen e.g. with very large images)
Solution: change some types from (unsigned) long to size_t

# Conflicts:
#	dlib/dnn/tensor.h

* Fix the fact that std::numeric_limits<unsigned long>::max() isn't always the same number

* Revert serialization changes

* Review fix: use long long instead of size_t

* From long to long long all the way

* Change more types to (hopefully) make the compiler happy

* Change many more types to size_t

* Change even more types to size_t

* Minor type changes
2018-03-01 07:27:29 -05:00
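A minimal sketch of the failure mode, using `uint32_t` to reproduce deterministically what a 32-bit `long` (e.g. on MSVC) does when multiplying large image dimensions.

```cpp
#include <cstdint>
#include <iostream>

int main()
{
    const std::uint32_t nr = 100000, nc = 100000;                // a very large image
    const std::uint32_t wrapped = nr * nc;                       // wraps modulo 2^32
    const long long widened = static_cast<long long>(nr) * nc;   // widen before multiplying
    std::cout << wrapped << " vs " << widened << '\n';           // 1410065408 vs 10000000000
    return 0;
}
```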
Davis King e6fe1e0259 merged 2017-12-25 08:51:15 -05:00
Davis King c9faacce29 Fixed typos 2017-12-25 08:50:34 -05:00
Duc Thien Bui 9185e0a725 fix typo in train find cars example (#1028) 2017-12-25 07:56:42 -05:00
Davis King efd945618d Minor update 2017-12-18 16:20:21 -05:00
Davis King 603ebc2750 Changed this example to use repeat layers. This doesn't change the behavior of
the code, but it helps visual studio use less RAM when building the example,
and might make appveyor not crash.  It's also a
slightly cleaner way to write the code anyway.
2017-12-17 14:41:00 -05:00
Davis King 22f26ebe97 Improved visual studio compilation instructions 2017-12-16 23:17:37 -05:00
Davis King 46a1893534 Fixed spelling error in comment 2017-12-11 06:40:00 -05:00
visionworkz ac292309c1 Exposed jitter_image in Python and added an example (#980)
* Exposed jitter_image in Python and added an example

* Return Numpy array directly

* Require numpy during setup

* Added install of Numpy before builds

* Changed pip install for user only due to security issues.

* Removed malloc

* Made presence of Numpy during compile optional.

* Conflict

* Refactored get_face_chip/get_face_chips to use Numpy as well.
2017-12-08 09:59:27 -05:00
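A hedged sketch of the underlying C++ routine the new Python binding exposes; the flat gray chip stands in for a real aligned face chip.

```cpp
#include <dlib/image_transforms.h>
#include <dlib/rand.h>
#include <vector>

int main()
{
    dlib::matrix<dlib::rgb_pixel> face_chip(150, 150);
    dlib::assign_all_pixels(face_chip, dlib::rgb_pixel(128, 128, 128));

    dlib::rand rnd;
    std::vector<dlib::matrix<dlib::rgb_pixel>> jittered;
    for (int i = 0; i < 4; ++i)
        jittered.push_back(dlib::jitter_image(face_chip, rnd));  // random flip/rotate/zoom
    return 0;
}
```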
Davis King e273f5159d Added find_min_global() overloads. 2017-12-02 07:53:31 -05:00
Davis King 15c04ab224 This example still uses a lot of visual studio ram. 2017-12-01 00:26:31 -05:00
Davis King 2b3d8609e5 These examples compile now in visual studio due to the recent pragma directive added to core.h. 2017-11-30 22:38:29 -05:00
Davis King 4fa32903a6 clarified docs 2017-11-25 19:28:53 -05:00
Davis King 4d0b203541 Just moved the try block to reduce the indentation level. 2017-11-25 12:25:35 -05:00
Davis King 929870d3ad Updated example to use C++11 style code and also to show the new find_max_global() routine. 2017-11-25 12:23:43 -05:00
Davis King 1aa6667481 Switched this example to use the svm C instead of nu trainer. 2017-11-25 08:26:16 -05:00
Davis King 04991b7da6 Made this example program use the new find_max_global() instead of grid search
and BOBYQA.  This greatly simplifies the example.
2017-11-24 22:04:25 -05:00
Amin Cheloh 1798e8877c Update dnn_mmod_find_cars2_ex.cpp (#966) 2017-11-17 06:38:48 -05:00
Davis King b84e2123d1 Changed network filename to something more descriptive. 2017-11-15 07:10:50 -05:00
Juha Reunanen e48125c2a2 Add semantic segmentation example (#943)
* Add example of semantic segmentation using the PASCAL VOC2012 dataset

* Add note about Debug Information Format when using MSVC

* Make the upsampling layers residual as well

* Fix declaration order

* Use a wider net

* trainer.set_iterations_without_progress_threshold(5000); // (was 20000)

* Add residual_up

* Process entire directories of images (just easier to use)

* Simplify network structure so that builds finish even on Visual Studio (faster, or at all)

* Remove the training example from CMakeLists, because it's too much for the 32-bit MSVC++ compiler to handle

* Remove the probably-now-unnecessary set_dnn_prefer_smallest_algorithms call

* Review fix: remove the batch normalization layer from right before the loss

* Review fix: point out that only the Visual C++ compiler has problems.
Also expand the instructions how to run MSBuild.exe to circumvent the problems.

* Review fix: use dlib::match_endings

* Review fix: use dlib::join_rows. Also add some comments, and instructions where to download the pre-trained net from.

* Review fix: make formatting comply with dlib style conventions.

* Review fix: output training parameters.

* Review fix: remove #ifndef __INTELLISENSE__

* Review fix: use std::string instead of char*

* Review fix: update interpolation_abstract.h to say that extract_image_chips can now take the interpolation method as a parameter

* Fix whitespace formatting

* Add more comments

* Fix finding image files for inference

* Resize inference test output to the size of the input; add clarifying remarks

* Resize net output even in calculate_accuracy

* After all crop the net output instead of resizing it by interpolation

* For clarity, add an empty line in the console output
2017-11-15 07:01:52 -05:00
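A hedged, toy-sized sketch of the example's shape; the real VOC2012 network is far deeper and is deserialized from disk, but the per-pixel loss and one-label-per-pixel output are as described.

```cpp
#include <dlib/dnn.h>
#include <dlib/image_transforms.h>

// 21 output filters: the 20 VOC2012 classes plus background
using toy_seg_net = dlib::loss_multiclass_log_per_pixel<
    dlib::con<21, 1, 1, 1, 1,
    dlib::relu<dlib::con<16, 3, 3, 1, 1,
    dlib::input<dlib::matrix<dlib::rgb_pixel>>>>>>;

int main()
{
    toy_seg_net net;  // the example loads a trained net instead
    dlib::matrix<dlib::rgb_pixel> img(64, 64);
    dlib::assign_all_pixels(img, dlib::rgb_pixel(0, 0, 0));
    const auto labels = net(img);  // one class index per output pixel
    (void)labels;
    return 0;
}
```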
Davis King 978da26ed0 Fixed grammar in comment 2017-11-05 07:37:29 -05:00
Davis King 50de3da992 Updated comments to reflect recent API changes. 2017-11-02 05:43:15 -04:00
Davis King dc0245af05 Changed graph construction for chinese_whispers() so that each face is always
included in the edge graph.  If it isn't then the output labels from
chinese_whispers would be missing faces in this degenerate case.
2017-10-27 19:29:52 -04:00
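A minimal sketch of the fix's idea: give every face its own entry in the edge graph (here via a self-edge) so `chinese_whispers()` assigns it a label even when it matches no other face; the one similarity edge is hypothetical.

```cpp
#include <dlib/clustering.h>
#include <iostream>
#include <vector>

int main()
{
    const unsigned long num_faces = 5;
    std::vector<dlib::sample_pair> edges;
    for (unsigned long i = 0; i < num_faces; ++i)
        edges.push_back(dlib::sample_pair(i, i));  // keep isolated faces in the graph
    edges.push_back(dlib::sample_pair(0, 1));      // hypothetical "same person" match

    std::vector<unsigned long> labels;
    const unsigned long num_clusters = dlib::chinese_whispers(edges, labels);
    std::cout << num_clusters << " clusters, " << labels.size() << " faces labeled\n";
    return 0;
}
```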