Processing DAGs
These DAGs run after the IFS download DAGs and before the model run. They verify that all three IFS files (analysis, analysis-00, forecast) are present and then invoke the IFS-to-NEMO pre-processing script that converts ECMWF fields into the format expected by the ocean model.
ifs_process (Lucia)
| Property | Value |
|---|---|
| Schedule | 15 08 * * * (08:15 UTC) |
| Retries | 80 x 10 min (default), 3 x - for prepare_ifs only |
| File | LUCIA/process_ifs.py |
The four check tasks (check_ifs_an, check_ifs_an00, check_ifs_fc and check_free) run in parallel, and all of them must succeed before prepare_ifs starts.
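The gating rule can be modelled outside Airflow. The following is a plain-Python sketch, not the DAG code: `run_gate` and the lambda checks are hypothetical stand-ins, and the real dependency wiring lives in LUCIA/process_ifs.py, which is not reproduced here.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


def run_gate(checks: Dict[str, Callable[[], bool]],
             downstream: Callable[[], None]) -> bool:
    """Run all checks concurrently; call downstream only if every check passed."""
    with ThreadPoolExecutor(max_workers=len(checks)) as pool:
        futures = {name: pool.submit(fn) for name, fn in checks.items()}
        results = {name: fut.result() for name, fut in futures.items()}

    failed = sorted(name for name, ok in results.items() if not ok)
    if failed:
        print("blocked by:", ", ".join(failed))
        return False

    downstream()
    return True


# Hypothetical outcomes: the forecast file is missing, so prepare_ifs must not run.
started = []
checks = {
    "check_ifs_an": lambda: True,
    "check_ifs_an00": lambda: True,
    "check_ifs_fc": lambda: False,
    "check_free": lambda: True,
}
ran = run_gate(checks, lambda: started.append("prepare_ifs"))
```

In the sketch a single failing check is enough to block the downstream step, which mirrors the behaviour described above. In the real DAG a failing check is retried rather than blocking permanently.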
check_ifs_an / check_ifs_an00 / check_ifs_fc are SSHOperators that verify the expected IFS files exist on Lucia's GPFS filesystem:
```bash
# check_ifs_an
ssh frontal "test -f /gpfs/.../data/IFS/Analysis/\
{{ get_date(-1,0) }}-ECMWF---AM0100-MEDATL-b{{ get_date(0,0) }}_an-fv13.00.nc"

# check_ifs_fc
ssh frontal "test -f /gpfs/.../data/IFS/Forecast/\
{{ get_date(0,0) }}_{{ get_date(10,0) }}-ECMWF---AM0100-MEDATL-b{{ get_date(0,0) }}_fc00-fv13.00.nc"
```
test -f exits with code 0 if the path exists and is a regular file, and with code 1 otherwise. Airflow treats a non-zero exit code as a task failure, which triggers a retry.
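The exit-code behaviour is easy to confirm on any POSIX machine. The sketch below is only an illustration: `file_check_exit_code` is a hypothetical helper, and the file names are throwaway temporary files rather than real IFS products.

```python
import os
import subprocess
import tempfile


def file_check_exit_code(path: str) -> int:
    """Return the exit status of `test -f PATH`, run through sh."""
    # "$1" receives the path as a positional argument, so no quoting issues arise.
    return subprocess.run(["sh", "-c", 'test -f "$1"', "sh", path]).returncode


with tempfile.TemporaryDirectory() as tmp:
    present = os.path.join(tmp, "present.nc")
    open(present, "w").close()          # zero-byte regular file
    missing = os.path.join(tmp, "missing.nc")

    present_rc = file_check_exit_code(present)  # regular file
    missing_rc = file_check_exit_code(missing)  # path does not exist
    dir_rc = file_check_exit_code(tmp)          # exists, but is a directory
```

Note that the zero-byte file passes the check. test -f only establishes that a regular file exists at that path. It does not establish that a download has finished or that the NetCDF content is valid.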
check_free verifies that no previous V2025_an Slurm jobs are already running on Lucia, preventing the IFS preparation from starting while a leftover job is consuming cluster resources:
```bash
ssh frontal "count=\$(squeue -u lvdbk | grep V2025_an | wc -l); echo Jobs found: \$count; exit \$count"
```
If count > 0 the exit code is non-zero, triggering a retry. This is an important guard: if the previous day's run got stuck in the queue, the new day's IFS preparation would otherwise start and overwrite the namelist files that the stuck job is using.
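The counting logic of the one-liner can be reproduced in a few lines of Python. This is a sketch under assumptions: the sample squeue listings are invented (including the partition name and job IDs), and `count_jobs` and `check_free_status` are hypothetical helpers, not part of the DAG.

```python
def count_jobs(squeue_output: str, marker: str = "V2025_an") -> int:
    """Count lines containing the marker, like `grep V2025_an | wc -l`."""
    return sum(1 for line in squeue_output.splitlines() if marker in line)


def check_free_status(squeue_output: str) -> int:
    """Exit status the shell command would report: the count, kept to 8 bits."""
    return count_jobs(squeue_output) % 256


# Invented squeue listings, for illustration only.
SAMPLE_BUSY = """\
  JOBID PARTITION     NAME     USER ST  TIME  NODES NODELIST(REASON)
 123456     batch V2025_an    lvdbk PD  0:00      4 (Priority)
 123457     batch    other    lvdbk  R  5:12      1 node001
"""
SAMPLE_IDLE = """\
  JOBID PARTITION     NAME     USER ST  TIME  NODES NODELIST(REASON)
"""

busy_status = check_free_status(SAMPLE_BUSY)   # one matching job: task fails and retries
idle_status = check_free_status(SAMPLE_IDLE)   # no matching job: task succeeds
```

Because a process exit status is limited to 8 bits, using the count directly as the exit code would wrap to 0 at exactly 256 matching jobs. That is unlikely to matter in practice, but `[ "$count" -eq 0 ]` would avoid the edge case.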
prepare_ifs runs the conversion script on Lucia:
This script reads the three IFS NetCDF files and produces the ecmwf_{date}.nc files that NEMO expects. The output file ecmwf_{{ get_date(10,2) }}.nc is later checked by model_lucia_run before submitting the Slurm job.
ifs_process_local (LOCAL)
| Property | Value |
|---|---|
| Schedule | 15 08 * * * (08:15 UTC) |
| Retries | 80 x 10 min |
| File | LOCAL/process_ifs.py |
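As a rough guide to what "80 x 10 min" means in wall-clock terms, the arithmetic can be sketched as follows. It assumes a fixed retry delay with no exponential backoff, a first attempt at the scheduled time, and attempts that take negligible time, so the result is a lower bound. The calendar date is arbitrary and `last_retry_start` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone


def last_retry_start(first_attempt: datetime, retries: int,
                     retry_delay: timedelta) -> datetime:
    """Earliest possible start of the final retry, ignoring task runtime."""
    return first_attempt + retries * retry_delay


# Arbitrary example date; only the 08:15 UTC time of day comes from the schedule.
first = datetime(2025, 1, 1, 8, 15, tzinfo=timezone.utc)
last = last_retry_start(first, retries=80, retry_delay=timedelta(minutes=10))
window = last - first
```

Under these assumptions the checks keep polling for 800 minutes (13 h 20 min), so the final retry starts no earlier than 21:35 UTC on the same day.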
Same three-file check pattern as the Lucia version, but run locally inside the Docker worker container using BashOperator:
```bash
# check_ifs_an
test -f /opt/airflow/marines_data/data/IFS/Analysis/\
{{ get_date(-1,0) }}-ECMWF---AM0100-MEDATL-b{{ get_date(0,0) }}_an-fv13.00.nc
```
Files are in the locally mounted /opt/airflow/marines_data/data/IFS/ directory, populated by the LOCAL IFS download DAGs.
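To make the templated file name concrete, the sketch below builds the analysis path for a given bulletin date. It rests on assumptions that this page does not confirm: that the first argument of get_date is a day offset from the run date, and that the rendered dates use the YYYYMMDD format. The meaning of the second argument is not documented here and is not modelled. `analysis_path` is a hypothetical helper.

```python
from datetime import date, timedelta
from pathlib import PurePosixPath

IFS_ROOT = PurePosixPath("/opt/airflow/marines_data/data/IFS")


def analysis_path(bulletin: date, fmt: str = "%Y%m%d") -> PurePosixPath:
    """Path checked by check_ifs_an, under the assumed get_date semantics."""
    previous = bulletin + timedelta(days=-1)   # assumed meaning of get_date(-1, 0)
    name = (
        f"{previous.strftime(fmt)}-ECMWF---AM0100-MEDATL-"
        f"b{bulletin.strftime(fmt)}_an-fv13.00.nc"
    )
    return IFS_ROOT / "Analysis" / name


example = analysis_path(date(2025, 3, 1))
```

If the real get_date renders dates in a different format, only the `fmt` argument needs to change.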
prepare_ifs runs the same conversion script but from inside the Docker container:
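Note the quotes around the path. BashOperator renders bash_command through Jinja2, and Airflow treats a command string that ends in .sh or .bash as the path of a template file to load. A trailing space or closing quote after the script path prevents the command from being interpreted that way. The output feeds the GCP pipeline's forcing preparation step.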
No check_free in the LOCAL version
The local version does not check for a running Slurm job because it operates on the GCP pipeline, not on Lucia's Slurm cluster. The GCP job is managed differently.