Evaluation
This page describes what happens when you run DC1 evaluation and which configuration knobs matter in practice.
Run commands
Recommended command:
python -m dc1.submit run <data_path> --model-name <MODEL_NAME> --data-directory ./dc1_output
Validation only:
python -m dc1.submit validate <data_path> --model-name <MODEL_NAME>
Low-level runner:
python dc1/evaluate.py --model-name <MODEL_NAME>
evaluate.py injects default paths under dc1_output/ when they are not provided.
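The validate and run commands above can be chained from a script so that a full run only starts once validation passes. The sketch below is illustrative: the helper names `build_commands` and `validate_then_run` are hypothetical, and it assumes `dc1.submit` reports validation failure through a non-zero exit status.

```python
import subprocess
import sys


def build_commands(data_path, model_name, data_directory="./dc1_output"):
    """Return the validate and run command lines as argument lists."""
    base = [sys.executable, "-m", "dc1.submit"]
    validate_cmd = base + ["validate", data_path, "--model-name", model_name]
    run_cmd = base + [
        "run", data_path,
        "--model-name", model_name,
        "--data-directory", data_directory,
    ]
    return validate_cmd, run_cmd


def validate_then_run(data_path, model_name, data_directory="./dc1_output"):
    """Run validation first; start the full evaluation only if it succeeds."""
    validate_cmd, run_cmd = build_commands(data_path, model_name, data_directory)
    # check=True raises CalledProcessError on a non-zero exit status.
    # Assumption: dc1.submit signals validation failure via its exit status.
    subprocess.run(validate_cmd, check=True)
    subprocess.run(run_cmd, check=True)
```

Passing arguments as a list avoids shell quoting problems when the data path contains spaces.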
Main pipeline stages
Read submission files and normalize coordinates/aliases.
Enforce surface-only processing (surface_only = true in DC1Evaluation).
Fetch and prepare reference observations/reanalysis according to YAML config.
Interpolate predictions to observation support and compute configured metrics.
Write consolidated outputs under the chosen data directory.
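To make the interpolation and metric stage concrete, here is a minimal sketch of interpolating a gridded prediction to observation locations and scoring it. It is not DC1's implementation: bilinear interpolation and RMSE are chosen for illustration only (the actual method and metrics come from the YAML config), and the function names are hypothetical.

```python
import math


def bilinear(grid, lats, lons, lat, lon):
    """Bilinearly interpolate grid[i][j], defined at (lats[i], lons[j]).

    Assumes strictly increasing axes and a target point inside the grid.
    """
    # Index of the lower-left corner of the cell containing the point.
    i = max(k for k in range(len(lats) - 1) if lats[k] <= lat)
    j = max(k for k in range(len(lons) - 1) if lons[k] <= lon)
    ty = (lat - lats[i]) / (lats[i + 1] - lats[i])
    tx = (lon - lons[j]) / (lons[j + 1] - lons[j])
    south = grid[i][j] * (1 - tx) + grid[i][j + 1] * tx
    north = grid[i + 1][j] * (1 - tx) + grid[i + 1][j + 1] * tx
    return south * (1 - ty) + north * ty


def rmse(predicted, observed):
    """Root-mean-square error between two equally long sequences."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Toy 2x2 prediction grid and two observations given as (lat, lon, value).
grid = [[0.0, 1.0], [2.0, 3.0]]
lats, lons = [0.0, 1.0], [0.0, 1.0]
observations = [(0.5, 0.5, 1.5), (0.0, 1.0, 2.0)]

predicted = [bilinear(grid, lats, lons, la, lo) for la, lo, _ in observations]
observed = [value for _, _, value in observations]
score = rmse(predicted, observed)
```

Evaluating at the observation points, rather than gridding the observations, keeps the comparison on the observation support as the stage description says.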
Default outputs
Typical files produced in dc1_output/results/:
results_<MODEL_NAME>.json
results_<MODEL_NAME>_per_bins.jsonl.gz (when per-bin output is enabled)
coordinate_conformance_report.json
Logs are written in dc1_output/logs/ (default logfile name dc1.log).
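The result files can be read back with the standard library. The sketch below assumes only what the file extensions imply: a JSON document and a gzip-compressed JSON Lines file. The internal schema is not documented here, so the function (a hypothetical helper named `load_results`) returns the parsed objects as-is.

```python
import gzip
import json
from pathlib import Path


def load_results(results_dir, model_name):
    """Load the consolidated results and, if present, the per-bin records."""
    results_dir = Path(results_dir)
    with open(results_dir / f"results_{model_name}.json", encoding="utf-8") as fh:
        summary = json.load(fh)

    per_bins = []
    per_bins_path = results_dir / f"results_{model_name}_per_bins.jsonl.gz"
    if per_bins_path.exists():  # only written when per-bin output is enabled
        with gzip.open(per_bins_path, "rt", encoding="utf-8") as fh:
            # JSON Lines: one JSON document per non-empty line.
            per_bins = [json.loads(line) for line in fh if line.strip()]
    return summary, per_bins
```

For example, `load_results("dc1_output/results", "<MODEL_NAME>")` returns the summary and a list that is empty when per-bin output was disabled.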
Configuration profiles
DC1 ships two YAML profiles in dc1/config/:
dc1_wasabi.yaml
dc1_edito.yaml
Important keys to tune:
parallelism_presets and voluminous_parallelism_presets
restart_workers_per_batch
cleanup_between_batches
resume
max_worker_memory_fraction
per_bins_resolution
Surface-only behavior
DC1 is strictly 2-D at evaluation time. If the input data contains a depth dimension, the pipeline extracts the surface and evaluates only the top level.
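The idea of surface extraction can be illustrated with plain nested lists. This is a sketch, not the DC1 code path: it assumes a field indexed as `field[depth][lat][lon]` and treats the depth value closest to zero as the top level. The function name `extract_surface` is hypothetical.

```python
def extract_surface(field, depths):
    """Return the 2-D slice of field[depth][lat][lon] at the top level.

    Assumption: the level whose depth is closest to zero is the surface.
    """
    top = min(range(len(depths)), key=lambda k: abs(depths[k]))
    return field[top]


# Three depth levels on a 2x2 horizontal grid.
depths = [0.5, 10.0, 50.0]
field = [
    [[1.0, 2.0], [3.0, 4.0]],     # 0.5 m
    [[5.0, 6.0], [7.0, 8.0]],     # 10 m
    [[9.0, 10.0], [11.0, 12.0]],  # 50 m
]
surface = extract_surface(field, depths)
```

Selecting by depth value rather than by position keeps the result correct even if the levels are stored deepest-first.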
Temporal setup
Evaluation window: 2024-01-01 to 2025-01-01
Forecast horizon: 10 lead times (0..9)
Matching tolerance in configs: typically 12 hours for observation datasets
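The temporal settings above can be sketched as follows. Two points are assumptions rather than documented behavior: the spacing between lead times (taken here as one day) and whether the tolerance bound is inclusive. Function names are hypothetical.

```python
from datetime import datetime, timedelta

LEAD_STEP = timedelta(days=1)    # assumption: one lead time per day
TOLERANCE = timedelta(hours=12)  # typical tolerance quoted above


def valid_times(init_time, n_leads=10, step=LEAD_STEP):
    """Valid time for each lead index 0..n_leads-1."""
    return [init_time + k * step for k in range(n_leads)]


def match_observation(obs_time, candidates, tolerance=TOLERANCE):
    """Return the index of the closest candidate within tolerance, else None."""
    best = min(range(len(candidates)), key=lambda k: abs(candidates[k] - obs_time))
    # Assumption: an offset exactly equal to the tolerance still matches.
    return best if abs(candidates[best] - obs_time) <= tolerance else None


times = valid_times(datetime(2024, 1, 1))
near = match_observation(datetime(2024, 1, 3, 9), times)  # 9 h after lead 2
far = match_observation(datetime(2024, 1, 12), times)     # 2 days past lead 9
```

Here `near` resolves to lead index 2, while `far` is `None` because the closest valid time is outside the 12-hour window.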
Practical guidance
Use validate --quick before full runs for fast format checks.
Keep resume: true for long jobs.
Adjust worker counts and memory limits before increasing batch sizes.
Start with a short period/profile when benchmarking a new environment.