From cd376f8ac4a327a9101a447e5d2bcb6c9308a69c Mon Sep 17 00:00:00 2001
From: Ludovic Hussonnois <ludovic.hussonnois@c-s.fr>
Date: Thu, 27 Apr 2017 17:09:41 +0100
Subject: [PATCH] DOC: Update vector training and classification recipes.

---
 .../Cookbook/rst/recipes/featextract.rst      |   3 +-
 .../Cookbook/rst/recipes/pbclassif.rst        | 116 +++++++++++++-----
 2 files changed, 90 insertions(+), 29 deletions(-)

diff --git a/Documentation/Cookbook/rst/recipes/featextract.rst b/Documentation/Cookbook/rst/recipes/featextract.rst
index 8e4c73c386..3d5e1af59d 100644
--- a/Documentation/Cookbook/rst/recipes/featextract.rst
+++ b/Documentation/Cookbook/rst/recipes/featextract.rst
@@ -6,7 +6,8 @@ refers to techniques aiming at extracting added value information from
 images. These extracted items named *features* can be local statistical
 moments, edges, radiometric indices, morphological and textural
 properties. For example, such features can be used as input data for
-other image processing methods like *Segmentation* and *Classification*.
+other image processing methods like *Segmentation* and
+`Classification <https://www.orfeo-toolbox.org/CookBook/recipes/pbclassif.html#feature-classification>`_.
 
 Local statistics extraction
 ---------------------------
diff --git a/Documentation/Cookbook/rst/recipes/pbclassif.rst b/Documentation/Cookbook/rst/recipes/pbclassif.rst
index 5189fd24c5..156b8b9c63 100644
--- a/Documentation/Cookbook/rst/recipes/pbclassif.rst
+++ b/Documentation/Cookbook/rst/recipes/pbclassif.rst
@@ -1,6 +1,90 @@
 Classification
 ==============
 
+Feature classification and training
+-----------------------------------
+
+The Orfeo ToolBox provides applications to train a supervised
+or unsupervised classifier from different sets of *features*
+and to use the generated classifier for vector data classification.
+Those *features* can be information extracted from images
+(see `feature extraction <https://www.orfeo-toolbox.org/CookBook/recipes/featextract.html#feature-extraction>`_ section)
+or other types of *features* such as the perimeter, width,
+or area of a surface present in a vector data file in an OGR-compatible
+format.
+
+Train a classifier with features
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The *TrainVectorClassifier* application provides a way to train a classifier
+with an input set of labeled geometries and a list of *features* to consider
+for classification.
+
+::
+
+    otbcli_TrainVectorClassifier -io.vd samples.sqlite
+                                 -cfield CODE
+                                 -io.out model.rf
+                                 -classifier rf
+                                 -feat perimeter area width
+
+The ``-classifier`` parameter selects which machine learning
+algorithm to train. Unsupervised classification is also possible:
+to use it, choose the Shark kmeans classifier.
+Please refer to the ``TrainVectorClassifier`` application reference documentation.
+
+In case of multiple samples files, you can add them to the ``-io.vd``
+parameter.
+
+The features to be used for training must be explicitly listed using
+the ``-feat`` parameter. The order of the list matters.
+
+If you want to use a statistics file for feature normalization, you
+can pass it using the ``-io.stats`` parameter. Make sure that the
+order of feature statistics in the statistics file matches the order
+of the features passed to the ``-feat`` option.
+
+The field in the vector data that holds the label of each sample
+can be set using the ``-cfield`` option.
+
+By default, the application will estimate the trained classifier
+performance on the same set of samples that has been used for
+training. The ``-valid.vd`` parameter allows specifying a different
+samples file for this purpose, for a fairer estimation of the
+performance. Note that the performance of the model can also
+be estimated afterward (see `Validating the classification model`_
+section).
+
+
+Features classification
+~~~~~~~~~~~~~~~~~~~~~~~
+
+Once the classifier has been trained, one can apply the model to
+classify a set of features on a new vector data file using the
+*VectorClassifier* application:
+
+::
+
+    otbcli_VectorClassifier -in vectorData.shp
+                            -model model.rf
+                            -feat perimeter area width
+                            -cfield predicted
+                            -out classifiedData.shp
+
+This application outputs a vector data file storing sample values
+and classification labels. The output is optional: when it is omitted,
+the classification label field of the input vector data is updated.
+
+Validating classification
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The performance of the model generated by the *TrainVectorClassifier*
+or *TrainImagesClassifier* applications is directly estimated by the
+application itself, which displays the precision, recall and F-score
+of each class, and can generate the global confusion matrix for
+supervised algorithms. For unsupervised algorithms, a contingency table
+is generated. Those results are output as a \*.CSV file.
+
 Pixel based classification
 --------------------------
 
@@ -346,33 +430,11 @@ using the ``TrainVectorClassifier`` application.
                                  -classifier rf
                                  -feat band_0 band_1 band_2 band_3 band_4 band_5 band_6
 
-The ``-classifier`` parameter allows to choose which machine learning
-model algorithm to train. You have the possibility to do the unsupervised
-classification,for it, you must to choose the Shark kmeans classifier.
-Please refer to the ``TrainVectorClassifier`` application reference documentation.
-
 In case of multiple samples files, you can add them to the ``-io.vd``
 parameter (see `Working with several images`_ section).
 
-The feature to be used for training must be explicitly listed using
-the ``-feat`` parameter. Order of the list matters.
-
-If you want to use a statistic file for features normalization, you
-can pass it using the ``-io.stats`` parameter. Make sure that the
-order of feature statistics in the statistics file matches the order
-of feature passed to the ``-feat`` option.
-
-The field in vector data allowing to specify the label of each sample
-can be set using the ``-cfield`` option.
-
-By default, the application will estimate the trained classifier
-performances on the same set of samples that has been used for
-training. The ``-io.vd`` parameter allows to specify a different
-samples file for this purpose, for a more fair estimation of the
-performances. Note that this performances estimation scheme can also
-be estimated afterward (see `Validating the classification model`_
-section).
-
+For more information about the training process for features,
+please refer to the `Train a classifier with features`_ section.
 
 Using the classification model
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -395,10 +457,8 @@ with value >0.
 Validating the classification model
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-The performance of the model generated by the *TrainImagesClassifier*
-application is directly estimated by the application itself, which
-displays the precision, recall and F-score of each class, and can
-generate the global confusion matrix as an output \*.CSV file.
+The Orfeo ToolBox training applications provide information about the performance
+of the generated model (see `Validating classification`_).
 
 With the *ConputeConfusionMatrix* application, it is also possible to
 estimate the performance of a model from a classification map generated
-- 
GitLab