Detect and report saturated storage (for tmp/data_out/data_in)
To avoid saturating disk space, S1Tiling should detect disk saturation and report it appropriately. Program exit codes have been reserved in !33 (merged): use `exits.OUTPUT_DISK_FULL` and `exits.TMP_DISK_FULL` from `s1tiling.libs.exits.py`.
Several approaches are possible:
- Define an allocated space in MB/GB in the configuration file, and stop processing and report when the quota is reached.
  - pros: simple, portable
  - cons:
    - The quota is per S1Tiling execution: if two instances are running on the same disks, no load balancing is possible.
    - This approach cannot know whether the configured quota is compatible with the available disk space, be it allocated implicitly (e.g. we don't know how much space other processes on the same machine need) or explicitly on the storage (e.g. scratch has a limited capacity on HAL, per user!).
    - => We need to check whether the configured quota makes sense given the situation detected when the S1Tiling process starts.
- Define a percentage of the available disk space, or of the quota allocated to the user.
  - pros: load balancing becomes implicit
  - cons:
    - Having a portable way to detect how much space can be used may be complex.
    - On shared machines (qsubed nodes) where no quota is allocated to a job, we don't know how much can be used.
- Let S1Tiling saturate, but intercept the error and report the issue with the proper error code.
  - cons:
    - We may not be able to have a complete log file.
    - We may saturate a disk space shared with other people and projects (qsubed nodes).
  - pros: fail-safe mechanism that can be implemented on top of either of the two previous approaches.
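
To make the three approaches more concrete, here is a minimal Python sketch, not a proposed implementation. It assumes only the standard library (`shutil.disk_usage`, `errno.ENOSPC`). The function names (`quota_fits`, `budget_from_fraction`, `run_guarded`) and the two `PLACEHOLDER_*` constants are hypothetical: real code would use `exits.OUTPUT_DISK_FULL` and `exits.TMP_DISK_FULL`, whose numeric values are not stated in this issue. Deciding which disk is full from the path attached to the exception is also an assumption, and only a heuristic.

```python
import errno
import os
import shutil

# Placeholder values for illustration only. The real codes are the ones
# reserved in s1tiling.libs.exits; their numeric values are not given here.
PLACEHOLDER_OUTPUT_DISK_FULL = 101
PLACEHOLDER_TMP_DISK_FULL = 102

# errno values meaning "no space left" or "quota exceeded". EDQUOT is not
# defined on every platform, hence the getattr fallback.
_DISK_FULL_ERRNOS = {errno.ENOSPC, getattr(errno, "EDQUOT", errno.ENOSPC)}


def quota_fits(path, quota_bytes, usage=shutil.disk_usage):
    """Approach 1 sanity check: is the configured quota <= free space at `path`?"""
    return quota_bytes <= usage(path).free


def budget_from_fraction(path, fraction, usage=shutil.disk_usage):
    """Approach 2: derive a byte budget as a fraction of the currently free space."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    return int(usage(path).free * fraction)


def _is_inside(path, directory):
    """Return True if `path` is `directory` itself or located below it."""
    path = os.path.abspath(path)
    directory = os.path.abspath(directory)
    return path == directory or path.startswith(directory + os.sep)


def run_guarded(task, tmp_dir):
    """Approach 3: run `task()` and map a disk-full OSError to an exit code.

    Returns 0 on success. Any other exception propagates unchanged.
    """
    try:
        task()
    except OSError as exc:
        if exc.errno not in _DISK_FULL_ERRNOS:
            raise
        # Heuristic: blame tmp if the failing file is under tmp_dir, the
        # output disk otherwise. exc.filename may be missing or not a path.
        name = exc.filename
        if isinstance(name, (str, bytes)) and _is_inside(os.fsdecode(name), tmp_dir):
            return PLACEHOLDER_TMP_DISK_FULL
        return PLACEHOLDER_OUTPUT_DISK_FULL
    return 0
```

A caller could run `quota_fits` once at startup, then wrap the processing in `sys.exit(run_guarded(process, tmp_dir))`. Note that `shutil.disk_usage` reports free space at the filesystem level and may not reflect a per-user quota such as the one on scratch, which is exactly the portability concern listed above.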