Sentinel-2 masks for no-data
It seems that the masks (CLM, CLP, dataMask) that are available in the Terrascope OpenEO SENTINEL2_L2A collection but missing from the CDSE SENTINEL2_L2A collection are in fact produced by Sinergise using s2cloudless. They are not part of the standard Sen2Cor L2A product, so it is unlikely that CDSE will add them in the future.
To assess the impact of using only the bare SCL mask from CDSE, I ran the code that generated the LS2S2 dataset twice.
The first run uses the reference code, which combines SCL, CLM, CLP and dataMask:
    import numpy as np

    # Build no-data mask from the Scene Classification Layer (SCL)
    no_data_mask = s2_arr.SCL.astype(np.uint8).isin([0, 1, 2, 3, 7, 8, 9, 10])
    # Add the extra masks (dataMask, CLM, CLP)
    no_data_mask = np.logical_or(no_data_mask, s2_arr["dataMask"] == 0)
    no_data_mask = np.logical_or(no_data_mask, s2_arr["CLM"] > 0)
    no_data_mask = np.logical_or(no_data_mask, s2_arr["CLP"] > 150)

    # Dilate the mask of each date, since Sen2Cor masks are very tight
    mask_stack = []
    for t in s2_arr.t:
        current_mask = no_data_mask.sel(t=t).values
        mask_stack.append(
            mask_processing(current_mask, min_object_size=10, dilation=25)
        )
    mask_stack = np.stack(mask_stack)

    # A pixel is NaN-masked if any of the bands is NaN
    nan_mask = np.isnan(s2_arr["B02"].values)
    for b in ("B03", "B04", "B08", "B05", "B06", "B07", "B8A", "B11", "B12"):
        nan_mask = np.logical_or(nan_mask, np.isnan(s2_arr[b].values))

    # Introduce the NaN mask only here, since we do not want to dilate it
    s2_arr["no_data"] = ("t", "y", "x"), np.logical_or(nan_mask, mask_stack)
The second run uses the same code with all logic related to CLM, CLP and dataMask removed:
    import numpy as np

    # Build no-data mask from the Scene Classification Layer (SCL) only
    no_data_mask = s2_arr.SCL.astype(np.uint8).isin([0, 1, 2, 3, 7, 8, 9, 10])

    # Dilate the mask of each date, since Sen2Cor masks are very tight
    mask_stack = []
    for t in s2_arr.t:
        current_mask = no_data_mask.sel(t=t).values
        mask_stack.append(
            mask_processing(current_mask, min_object_size=10, dilation=25)
        )
    mask_stack = np.stack(mask_stack)

    # A pixel is NaN-masked if any of the bands is NaN
    nan_mask = np.isnan(s2_arr["B02"].values)
    for b in ("B03", "B04", "B08", "B05", "B06", "B07", "B8A", "B11", "B12"):
        nan_mask = np.logical_or(nan_mask, np.isnan(s2_arr[b].values))

    # Introduce the NaN mask only here, since we do not want to dilate it
    s2_arr["no_data"] = ("t", "y", "x"), np.logical_or(nan_mask, mask_stack)
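Both snippets call `mask_processing`, which is not shown in this post. For readers who want to reproduce the comparison, here is a minimal sketch of what such a helper might do. The name `mask_processing_sketch` and its semantics are assumptions, not the original implementation: it assumes that `min_object_size` is a pixel count below which connected components are dropped, and that `dilation` is a number of one-pixel binary dilation steps.

```python
import numpy as np
from scipy import ndimage


def mask_processing_sketch(mask, min_object_size=10, dilation=25):
    """Hypothetical stand-in for mask_processing (not the original helper).

    Assumed semantics: drop connected components smaller than
    min_object_size pixels, then grow what remains by `dilation`
    one-pixel binary dilation steps.
    """
    mask = np.asarray(mask, dtype=bool)

    # Label connected components (4-connectivity by default in 2D)
    labels, n_labels = ndimage.label(mask)
    if n_labels == 0:
        return mask

    # Pixel count per label; index 0 is the background
    sizes = np.bincount(labels.ravel())
    keep = sizes >= min_object_size
    keep[0] = False
    cleaned = keep[labels]

    # iterations < 1 would mean "repeat until stable" in scipy, so guard it
    if dilation > 0:
        cleaned = ndimage.binary_dilation(cleaned, iterations=dilation)
    return cleaned
```

With the default cross-shaped structuring element, repeated dilation grows the mask in a diamond shape; the original helper may well use a different footprint or connectivity.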
As expected, cloud masking is not as good with the bare SCL mask, especially for small clouds and at the edges of large clouds.
Is this good enough for RELEO?
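One way to put a number on this question would be to measure how many pixels flagged by the reference mask are left unmasked by the SCL-only mask. The sketch below is only a suggestion: the function name `missed_fraction` is hypothetical, and it assumes that both `no_data` arrays cover the same dates and the same grid.

```python
import numpy as np


def missed_fraction(reference_mask, scl_only_mask):
    """Fraction of reference-masked pixels that the SCL-only mask misses."""
    reference_mask = np.asarray(reference_mask, dtype=bool)
    scl_only_mask = np.asarray(scl_only_mask, dtype=bool)

    n_reference = reference_mask.sum()
    if n_reference == 0:
        # Nothing is masked in the reference, so nothing can be missed
        return 0.0

    # Pixels masked in the reference run but not in the SCL-only run
    missed = np.logical_and(reference_mask, ~scl_only_mask)
    return float(missed.sum() / n_reference)
```

It could be applied to the `no_data` variables of the two runs, either over the whole time series or per date, to see whether the gap is concentrated on a few cloudy acquisitions.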