ENH: optimize input reading in `SampleAugmentation` (!988) · Merge requests · Main Repositories / otb

Laurențiu Nicola requested to merge sample-augmentation-reading into develop Jan 19, 2024

`SampleAugmentation`'s SMOTE implementation is somewhat inefficient. While looking into its memory usage, I noticed that it was slow even when using only 6700 rows out of 11M.

There are roughly two changes here:

- using `SetAttributeFilter` instead of iterating through every feature to filter by the selected class
- doing fewer field type checks

In my case, the input is a merged VRT (which is not ideal for GDAL), and the app takes about:

- 2125 seconds, originally
- 173 seconds, using `SetAttributeFilter`
- 168 seconds, with both changes
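
As a quick check of what these timings mean, the snippet below computes the speedup of each variant relative to the original run. The dictionary keys are just labels chosen for this example; the numbers are the ones reported above. Both variants come out at roughly a 12x speedup.

```python
# Wall-clock timings in seconds, as reported above.
timings_s = {
    "original": 2125,
    "SetAttributeFilter": 173,
    "both changes": 168,
}

baseline = timings_s["original"]

# Speedup factor of each variant relative to the original run.
speedups = {name: baseline / t for name, t in timings_s.items()}

for name, factor in speedups.items():
    print(f"{name}: {factor:.1f}x")
```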

Edited Jan 19, 2024 by Laurențiu Nicola

