KMeans classification gives weird results when using masks
Mantis Issue 702, reported by sdinot, assigned to cpeyrega, created: 2013-04-16
I've tested the KMeans classification application on IMGTest3_stack.tif with IMGTest3_urbain_entrainement_masque.tif as a validity mask, with the settings specified below. The output gives only one class, label 0.
With the ENVI KMeans classification, I've used the same dataset and the same settings. The output gives the number of classes set in the parameters.
The application crashes if the training set size is set to 5.
The application doesn't seem to process the iterations; with the same settings, ENVI is much slower.
1371651238 - cpeyrega: Hi.
Could you please upload some sample data with which you observe this issue, so that we can reproduce and resolve the problem?
Thank you in advance.
1372253223 - cpeyrega: Before the KMeans learning, a subsampling step is applied to the input image (and to the mask, if present), so that the total number of samples remaining after subsampling equals the "Training set size" input parameter.
Consequently, choosing a Training set size that is too small relative to the size of the smallest valid connected components (i.e. those different from 0 in the mask) causes those small components to be "eroded" or even erased from the learning samples. This explains both observed issues: the wrong number of output classes, and the application crash (SEGFAULT) when the "Training set size" is too small.
To avoid this issue, the user is therefore advised to increase the "Training set size".
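The erosion effect described above can be sketched with a toy example (this is a hypothetical illustration, not OTB code: the `subsample` function and region sizes are invented to mimic uniform subsampling down to a fixed training set size):

```python
# Hypothetical illustration: uniform subsampling to a fixed "training set
# size" can drop every pixel of a small valid region, so that region
# contributes no learning samples at all.

def subsample(valid_indices, training_set_size, total_pixels):
    # Pick a uniform stride so that roughly training_set_size pixels
    # survive across the whole image (assumption about the strategy).
    stride = max(1, total_pixels // training_set_size)
    return [i for i in valid_indices if i % stride == 0]

total = 100_000
big_region = list(range(0, 90_000))          # large valid component
small_region = list(range(99_990, 100_000))  # tiny valid component (10 px)

samples = subsample(big_region + small_region,
                    training_set_size=100, total_pixels=total)
small_kept = [i for i in samples if i in small_region]
print(len(samples), len(small_kept))  # the small region keeps 0 samples
```

With a stride of 1000, the 10-pixel region falls entirely between sampling positions, so KMeans never sees it; increasing the training set size shrinks the stride and lets small components survive.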
For instance, an error is logged when the input learning mask is too badly eroded in the f