Strip extended filenames and URL parameters from the pyotb.summarize() output
The idea is to let pyotb.summarize() modify resource paths by stripping everything after the ? character.
Rationale:
import planetary_computer
import pyotb
some_pc_asset = (
    "/vsicurl/https://sentinel1euwest.blob.core.windows.net/s1-"
    "grd/GRD/2020/12/27/IW/DV/S1B_IW_GRDH_1SDV_20201227T060759_"
    "20201227T060824_024884_02F5F6_847A/measurement/iw-vv.tiff"
)
signed_asset = planetary_computer.sign_inplace(some_pc_asset)
# Now signed_asset is:
# "/vsicurl/https://sentinel1euwest.blob.core.windows.net/s1-"
# "grd/GRD/2020/12/27/IW/DV/S1B_IW_GRDH_1SDV_20201227T060759_"
# "20201227T060824_024884_02F5F6_847A/measurement/iw-vv.tiff?"
# "st=2023-05-20T20%3A14%3A07Z&se=2023-05-21T20%3A59%3A07Z&sp"
# "=rl&sv=2021-06-08&sr=c&skoid=c85c15d6-d1ae-42d4-af60-e2ca0"
# "f81359b&sktid=72f988bf-86f1-41af-91ab-2d7cd011db47&skt=202"
# "3-05-20T17%3A22%3A17Z&ske=2023-05-27T17%3A22%3A17Z&sks=b&s"
# "kv=2021-06-08&sig=ww1ZfySpWebi3x6NJNJxdkqPBBHvPw%2B2qIqGp1"
# "UeGX4%3D"
app = pyotb.SomeApplication({"in": signed_asset, ...})
summary1 = pyotb.summarize(app)
Now, summary1 includes the whole Planetary Computer signed URL.
This is not very useful, since the signed URL has a short time to live and will expire soon. It is also inconvenient when the summary is meant to be reused, because the SAS token has to be removed manually before the original URL can be signed again.
{
"name": "SomeApplication",
"parameters": {
"in": "/vsicurl/https://sentinel1euwest.blob.core.windows.net/s1-grd/GRD/2020/12/27/IW/DV/S1B_IW_GRDH_1SDV_20201227T060759_20201227T060824_024884_02F5F6_847A/measurement/iw-vv.tiff?st=2023-05-20T20%3A14%3A07Z&se=2023-05-21T20%3A59%3A07Z&sp=rl&sv=2021-06-08&sr=c&skoid=c85c15d6-d1ae-42d4-af60-e2ca0f81359b&sktid=72f988bf-86f1-41af-91ab-2d7cd011db47&skt=2023-05-20T17%3A22%3A17Z&ske=2023-05-27T17%3A22%3A17Z&sks=b&skv=2021-06-08&sig=ww1ZfySpWebi3x6NJNJxdkqPBBHvPw%2B2qIqGp1UeGX4%3D",
...
}
}
Proposed change
We could add an option to summarize() to strip URL parameters, like this:
...
summary2 = pyotb.summarize(app, strip=True)
Which would result in:
{
"name": "SomeApplication",
"parameters": {
"in": "/vsicurl/https://sentinel1euwest.blob.core.windows.net/s1-grd/GRD/2020/12/27/IW/DV/S1B_IW_GRDH_1SDV_20201227T060759_20201227T060824_024884_02F5F6_847A/measurement/iw-vv.tiff",
...
}
}
It could also help remove extended filenames that we forgot to remove from application inputs.
In any case, the strip option would be opt-in, so users could do as they please.
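To make the proposal concrete, here is a minimal sketch of what the stripping could look like. It is not pyotb's actual implementation: strip_path and strip_summary are hypothetical helpers, the example.com URL is made up, and the summary is assumed to be a plain nested dict like the ones shown above. Since both signed URLs and OTB extended filenames put their extra part after a ?, cutting at the first ? covers both cases.

```python
from typing import Any


def strip_path(value: str) -> str:
    """Return the part of a path or URL that precedes the first '?'."""
    return value.split("?", 1)[0]


def strip_summary(summary: Any) -> Any:
    """Recursively strip every string found in a summary-like structure.

    Hypothetical helper: assumes the summary is made of dicts, lists,
    tuples, strings and scalars.
    """
    if isinstance(summary, str):
        return strip_path(summary)
    if isinstance(summary, dict):
        return {key: strip_summary(val) for key, val in summary.items()}
    if isinstance(summary, list):
        return [strip_summary(val) for val in summary]
    if isinstance(summary, tuple):
        return tuple(strip_summary(val) for val in summary)
    # Numbers, booleans, None, etc. are returned unchanged
    return summary


# Made-up summary mimicking the structure shown in this issue
summary = {
    "name": "SomeApplication",
    "parameters": {
        "in": "/vsicurl/https://example.com/image.tif?st=2023&sig=abc",
        "out": "output.tif?&gdal:co:COMPRESS=DEFLATE",
        "ram": 256,
    },
}
stripped = strip_summary(summary)
```

One limitation of this sketch: it strips every string value, so a non-path parameter that legitimately contains a ? would be truncated too. A real implementation would probably only touch parameters known to be filenames.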
Edited by Rémi Cresson