Commit cd376f8a authored by Ludovic Hussonnois

DOC: Update vector training and classification recipes.

parent 6ff84331
......@@ -6,7 +6,8 @@ refers to techniques aiming at extracting added value information from
images. These extracted items named *features* can be local statistical
moments, edges, radiometric indices, morphological and textural
properties. For example, such features can be used as input data for
other image processing methods like *Segmentation* and *Classification*.
other image processing methods like *Segmentation* and
`Classification <https://www.orfeo-toolbox.org/CookBook/recipes/pbclassif.html#feature-classification>`_.
Local statistics extraction
---------------------------
......
Classification
==============
Feature classification and training
-----------------------------------
The Orfeo ToolBox provides applications to train a supervised
or unsupervised classifier from different sets of *features*
and to use the generated classifier for vector data classification.
Those *features* can be information extracted from images
(see `feature extraction <https://www.orfeo-toolbox.org/CookBook/recipes/featextract.html#feature-extraction>`_ section)
or other types of *features* such as the perimeter, width,
or area of a surface present in a vector data file in an OGR-compatible
format.
Train a classifier with features
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The *TrainVectorClassifier* application provides a way to train a classifier
with an input set of labeled geometries and a list of *features* to consider
for classification.
::

    otbcli_TrainVectorClassifier -io.vd samples.sqlite
                                 -cfield CODE
                                 -io.out model.rf
                                 -classifier rf
                                 -feat perimeter area width

The ``-classifier`` parameter selects the machine learning algorithm
to train. Unsupervised classification is also possible: to use it,
choose the Shark k-means classifier. For details, please refer to the
``TrainVectorClassifier`` application reference documentation.
If there are several samples files, you can pass them all to the
``-io.vd`` parameter.
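As an illustration of passing several samples files, the following minimal Python sketch assembles the command line shown above from a list of files and a list of features. The helper ``build_train_command`` and the file names ``samples_1.sqlite`` and ``samples_2.sqlite`` are hypothetical; only the flags already used in this section appear in the command.

```python
import shlex


def build_train_command(sample_files, label_field, model_path, classifier, features):
    """Assemble an otbcli_TrainVectorClassifier argument list.

    Hypothetical helper: it only uses the flags shown in this section.
    """
    if not sample_files:
        raise ValueError("at least one samples file is required")
    if not features:
        raise ValueError("at least one feature name is required")
    # Several samples files are all passed to the same -io.vd parameter.
    cmd = ["otbcli_TrainVectorClassifier", "-io.vd", *sample_files]
    cmd += ["-cfield", label_field]
    cmd += ["-io.out", model_path]
    cmd += ["-classifier", classifier]
    # The features are listed explicitly; their order is preserved.
    cmd += ["-feat", *features]
    return cmd


# Hypothetical input files, for illustration only.
command = build_train_command(
    ["samples_1.sqlite", "samples_2.sqlite"],
    "CODE",
    "model.rf",
    "rf",
    ["perimeter", "area", "width"],
)
print(shlex.join(command))
```

To actually run the training, the list could be handed to ``subprocess.run``, assuming the OTB command line tools are installed and available on the ``PATH``.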
The features to be used for training must be explicitly listed using
the ``-feat`` parameter. The order of the list matters.
If you want to use a statistics file for feature normalization, you
can pass it using the ``-io.stats`` parameter. Make sure that the
order of the feature statistics in the statistics file matches the order
of the features passed to the ``-feat`` option.
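The following Python sketch illustrates why the two orders must agree. It assumes that normalization centers each feature on its mean and divides by its standard deviation, with statistics paired to features by position; the numeric values are invented for illustration, and the helper ``normalize`` is hypothetical, not part of the Orfeo ToolBox.

```python
def normalize(values, means, stddevs):
    """Center and scale values, pairing statistics with values by position."""
    if not (len(values) == len(means) == len(stddevs)):
        raise ValueError("values and statistics must have the same length")
    return [(v - m) / s for v, m, s in zip(values, means, stddevs)]


# Statistics stored in the order: perimeter, area, width (made-up numbers).
means = [40.0, 100.0, 5.0]
stddevs = [10.0, 50.0, 2.0]

sample = {"perimeter": 50.0, "area": 150.0, "width": 7.0}

# Feature list in the same order as the statistics.
matching_order = ["perimeter", "area", "width"]
right = normalize([sample[f] for f in matching_order], means, stddevs)

# Feature list with the first two names swapped: each value is now
# normalized with the statistics of another feature.
swapped_order = ["area", "perimeter", "width"]
wrong = normalize([sample[f] for f in swapped_order], means, stddevs)

print(right)
print(wrong)
```

With the swapped list, the area value is scaled with the perimeter statistics and vice versa, so the classifier would be trained on distorted inputs without any error being raised.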
The field of the vector data that holds the label of each sample
can be set using the ``-cfield`` option.
By default, the application estimates the performance of the trained
classifier on the same set of samples that was used for
training. The ``-valid.vd`` parameter lets you specify a different
samples file for this purpose, for a fairer estimation of the
performance. Note that the performance can also
be estimated afterward (see `Validating the classification model`_
section).
Features classification
~~~~~~~~~~~~~~~~~~~~~~~
Once the classifier has been trained, one can apply the model to
classify a set of features on a new vector data file using the
*VectorClassifier* application:
::

    otbcli_VectorClassifier -in vectorData.shp
                            -model model.rf
                            -feat perimeter area width
                            -cfield predicted
                            -out classifiedData.shp

This application outputs a vector data file storing sample values
and classification labels. The ``-out`` parameter is optional: if it is
omitted, the classification label field of the input vector data is updated.
Validating classification
~~~~~~~~~~~~~~~~~~~~~~~~~
The performance of the model generated by the *TrainVectorClassifier*
or *TrainImagesClassifier* applications is directly estimated by the
application itself, which displays the precision, recall and F-score
of each class, and can generate the global confusion matrix for
supervised algorithms. For unsupervised algorithms a contingency table
is generated. Those results are output as an \*.CSV file.
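To make the reported figures concrete, here is a minimal Python sketch that derives per-class precision, recall and F-score from a confusion matrix. It assumes that rows hold reference labels and columns hold predicted labels; the class names and counts are invented, and ``per_class_metrics`` is a hypothetical helper, not the code used by the applications.

```python
def per_class_metrics(matrix, labels):
    """Return {label: (precision, recall, f_score)} from a confusion matrix.

    Assumption: matrix[i][j] counts samples of reference class i
    that were predicted as class j.
    """
    size = len(labels)
    metrics = {}
    for i, label in enumerate(labels):
        true_positives = matrix[i][i]
        predicted_total = sum(matrix[row][i] for row in range(size))  # column sum
        reference_total = sum(matrix[i])  # row sum
        precision = true_positives / predicted_total if predicted_total else 0.0
        recall = true_positives / reference_total if reference_total else 0.0
        denominator = precision + recall
        f_score = 2 * precision * recall / denominator if denominator else 0.0
        metrics[label] = (precision, recall, f_score)
    return metrics


# Made-up two-class example.
labels = ["water", "forest"]
matrix = [
    [8, 2],  # reference water: 8 predicted water, 2 predicted forest
    [1, 9],  # reference forest: 1 predicted water, 9 predicted forest
]

results = per_class_metrics(matrix, labels)
for label, (precision, recall, f_score) in results.items():
    print(f"{label}: precision={precision:.3f} recall={recall:.3f} f-score={f_score:.3f}")
```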
Pixel based classification
--------------------------
......@@ -346,33 +430,11 @@ using the ``TrainVectorClassifier`` application.
-classifier rf
-feat band_0 band_1 band_2 band_3 band_4 band_5 band_6
The ``-classifier`` parameter selects the machine learning algorithm
to train. Unsupervised classification is also possible: to use it,
choose the Shark k-means classifier. For details, please refer to the
``TrainVectorClassifier`` application reference documentation.
If there are several samples files, you can pass them all to the
``-io.vd`` parameter (see `Working with several images`_ section).
The features to be used for training must be explicitly listed using
the ``-feat`` parameter. The order of the list matters.
If you want to use a statistics file for feature normalization, you
can pass it using the ``-io.stats`` parameter. Make sure that the
order of the feature statistics in the statistics file matches the order
of the features passed to the ``-feat`` option.
The field of the vector data that holds the label of each sample
can be set using the ``-cfield`` option.
By default, the application estimates the performance of the trained
classifier on the same set of samples that was used for
training. The ``-valid.vd`` parameter lets you specify a different
samples file for this purpose, for a fairer estimation of the
performance. Note that the performance can also
be estimated afterward (see `Validating the classification model`_
section).
For more information about the training process for features,
please refer to the `Train a classifier with features`_ section.
Using the classification model
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
......@@ -395,10 +457,8 @@ with value >0.
Validating the classification model
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The performance of the model generated by the *TrainImagesClassifier*
application is directly estimated by the application itself, which
displays the precision, recall and F-score of each class, and can
generate the global confusion matrix as an output \*.CSV file.
The Orfeo ToolBox training applications provide information about the performance
of the generated model (see `Validating classification`_).
With the *ComputeConfusionMatrix* application, it is also possible to
estimate the performance of a model from a classification map generated
......