From cd376f8ac4a327a9101a447e5d2bcb6c9308a69c Mon Sep 17 00:00:00 2001
From: Ludovic Hussonnois <ludovic.hussonnois@c-s.fr>
Date: Thu, 27 Apr 2017 17:09:41 +0100
Subject: [PATCH] DOC: Update vector training and classification recipes.

---
 .../Cookbook/rst/recipes/featextract.rst      |   3 +-
 .../Cookbook/rst/recipes/pbclassif.rst        | 116 +++++++++++++-----
 2 files changed, 90 insertions(+), 29 deletions(-)

diff --git a/Documentation/Cookbook/rst/recipes/featextract.rst b/Documentation/Cookbook/rst/recipes/featextract.rst
index 8e4c73c386..3d5e1af59d 100644
--- a/Documentation/Cookbook/rst/recipes/featextract.rst
+++ b/Documentation/Cookbook/rst/recipes/featextract.rst
@@ -6,7 +6,8 @@ refers to techniques aiming at extracting added value information from
 images. These extracted items named *features* can be local statistical
 moments, edges, radiometric indices, morphological and textural
 properties. For example, such features can be used as input data for
-other image processing methods like *Segmentation* and *Classification*.
+other image processing methods like *Segmentation* and
+`Classification <https://www.orfeo-toolbox.org/CookBook/recipes/pbclassif.html#feature-classification>`_.
 
 Local statistics extraction
 ---------------------------
diff --git a/Documentation/Cookbook/rst/recipes/pbclassif.rst b/Documentation/Cookbook/rst/recipes/pbclassif.rst
index 5189fd24c5..156b8b9c63 100644
--- a/Documentation/Cookbook/rst/recipes/pbclassif.rst
+++ b/Documentation/Cookbook/rst/recipes/pbclassif.rst
@@ -1,6 +1,90 @@
 Classification
 ==============
 
+Feature classification and training
+-----------------------------------
+
+The Orfeo ToolBox provides applications to train a supervised
+or unsupervised classifier from different sets of *features*
+and to use the generated classifier for vector data classification.
+Those *features* can be information extracted from images
+(see the `feature extraction <https://www.orfeo-toolbox.org/CookBook/recipes/featextract.html#feature-extraction>`_ section)
+or other attributes, such as the perimeter, width, or area of a
+surface, stored in a vector data file in an OGR-compatible
+format.
+
+Train a classifier with features
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The *TrainVectorClassifier* application provides a way to train a classifier
+from an input set of labeled geometries and a list of *features* to consider
+for classification.
+
+::
+
+   otbcli_TrainVectorClassifier -io.vd samples.sqlite
+                                -cfield CODE
+                                -io.out model.rf
+                                -classifier rf
+                                -feat perimeter area width
+
+The ``-classifier`` parameter selects which machine learning
+algorithm to train. Unsupervised classification is also possible:
+to use it, choose the Shark kmeans classifier. For details, please
+refer to the ``TrainVectorClassifier`` application reference documentation.
+
+If you have several sample files, you can pass them all to the ``-io.vd``
+parameter.
+
+The features to be used for training must be explicitly listed using
+the ``-feat`` parameter. The order of the list matters.
+
+If you want to use a statistics file for feature normalization, you
+can pass it using the ``-io.stats`` parameter. Make sure that the
+order of the feature statistics in the statistics file matches the order
+of the features passed to the ``-feat`` option.
+
+The vector data field that holds the label of each sample
+can be set using the ``-cfield`` option.
+
+By default, the application estimates the performance of the trained
+classifier on the same set of samples that was used for
+training. The ``-valid.vd`` parameter lets you specify a different
+samples file for this purpose, for a fairer estimation of the
+performance. Note that the performance can also
+be estimated afterward (see the `Validating the classification model`_
+section).
+
+
+Features classification
+~~~~~~~~~~~~~~~~~~~~~~~
+
+Once the classifier has been trained, one can apply the model to
+classify a set of features on a new vector data file using the
+*VectorClassifier* application:
+
+::
+
+    otbcli_VectorClassifier -in      vectorData.shp
+                            -model   model.rf
+                            -feat    perimeter area width
+                            -cfield  predicted
+                            -out     classifiedData.shp
+
+This application outputs a vector data file storing the sample values
+and classification labels. The output file is optional: if it is omitted,
+the classification label field of the input vector data is updated.
+
+Validating classification
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The performance of the model generated by the *TrainVectorClassifier*
+or *TrainImagesClassifier* applications is directly estimated by the
+application itself, which displays the precision, recall and F-score
+of each class, and can generate the global confusion matrix for
+supervised algorithms. For unsupervised algorithms, a contingency table
+is generated instead. These results are written to a \*.CSV file.
+
 Pixel based classification
 --------------------------
 
@@ -346,33 +430,11 @@ using the ``TrainVectorClassifier`` application.
                                 -classifier rf
                                 -feat band_0 band_1 band_2 band_3 band_4 band_5 band_6
 
-The ``-classifier`` parameter allows to choose which machine learning
-model algorithm to train. You have the possibility to do the unsupervised
-classification,for it, you must to choose the Shark kmeans classifier.
-Please refer to the ``TrainVectorClassifier`` application reference documentation.
-
 In case of multiple samples files, you can add them to the ``-io.vd``
 parameter (see  `Working with several images`_ section).
 
-The feature to be used for training must be explicitly listed using
-the ``-feat`` parameter. Order of the list matters.
-
-If you want to use a statistic file for features normalization, you
-can pass it using the ``-io.stats`` parameter. Make sure that the
-order of feature statistics in the statistics file matches the order
-of feature passed to the ``-feat`` option.
-
-The field in vector data allowing to specify the label of each sample
-can be set using the ``-cfield`` option.
-
-By default, the application will estimate the trained classifier
-performances on the same set of samples that has been used for
-training. The ``-io.vd`` parameter allows to specify a different
-samples file for this purpose, for a more fair estimation of the
-performances. Note that this performances estimation scheme can also
-be estimated afterward (see `Validating the classification model`_
-section).
-                     
+For more information about training a classifier with features,
+please refer to the `Train a classifier with features`_ section.
 
 Using the classification model
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -395,10 +457,8 @@ with value >0.
 Validating the classification model
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-The performance of the model generated by the *TrainImagesClassifier*
-application is directly estimated by the application itself, which
-displays the precision, recall and F-score of each class, and can
-generate the global confusion matrix as an output \*.CSV file.
+The Orfeo ToolBox training applications provide information about the performance
+of the generated model (see `Validating classification`_).
 
 With the *ConputeConfusionMatrix* application, it is also possible to
 estimate the performance of a model from a classification map generated
-- 
GitLab