dctools.metrics.evaluator.Evaluator

class dctools.metrics.evaluator.Evaluator(dataset_manager, metrics, dataloader, ref_aliases, dataset_processor, dask_cfgs_by_dataset=None, results_dir=None, reduce_precision=False, surface_only=False, restart_workers_per_batch=False, restart_frequency=1, max_p_memory_increase=0.2, max_worker_memory_fraction=0.85, max_worker_rss_fraction=1.0, resume=False)

Class to evaluate metrics on datasets.

Parameters:
  • dataset_manager (MultiSourceDatasetManager)

  • metrics (Dict[str, List[MetricComputer]])

  • dataloader (EvaluationDataloader)

  • ref_aliases (List[str])

  • dataset_processor (oceanbench.core.distributed.DatasetProcessor)

  • dask_cfgs_by_dataset (Dict[str, Dict[str, Any]] | None)

  • results_dir (str | None)

  • reduce_precision (bool)

  • surface_only (bool)

  • restart_workers_per_batch (bool)

  • restart_frequency (int)

  • max_p_memory_increase (float)

  • max_worker_memory_fraction (float)

  • max_worker_rss_fraction (float)

  • resume (bool)

__init__(dataset_manager, metrics, dataloader, ref_aliases, dataset_processor, dask_cfgs_by_dataset=None, results_dir=None, reduce_precision=False, surface_only=False, restart_workers_per_batch=False, restart_frequency=1, max_p_memory_increase=0.2, max_worker_memory_fraction=0.85, max_worker_rss_fraction=1.0, resume=False)

Initializes the evaluator.

Parameters:
  • dataset_manager (MultiSourceDatasetManager) – Multi-source dataset manager.

  • metrics (Dict[str, List[MetricComputer]]) – Dictionary {ref_alias: [MetricComputer, …]}.

  • dataloader (EvaluationDataloader) – Dataloader for evaluation.

  • ref_aliases (List[str]) – List of reference aliases.

  • dataset_processor (DatasetProcessor) – Dataset processor for distribution.

  • dask_cfgs_by_dataset (Dict[str, Dict[str, Any]], optional) – Per-dataset Dask configuration (n_workers, threads_per_worker, memory_limit) extracted from the YAML config sources. Defaults to None.

  • results_dir (str, optional) – Folder to save results. Defaults to None.

  • reduce_precision (bool, optional) – Reduce floating-point precision to float32. Defaults to False.

  • surface_only (bool, optional) – When True, select only the surface depth level immediately after opening gridded datasets. This avoids carrying the full 3-D depth dimension through all subsequent transform and compute steps. Defaults to False.

  • restart_workers_per_batch (bool, optional) – Restart workers after each batch. Defaults to False.

  • restart_frequency (int, optional) – Number of batches between cleanup/restart cycles. Defaults to 1.

  • max_p_memory_increase (float, optional) – Relative increase in RAM usage that triggers a restart. Defaults to 0.2 (20%).

  • max_worker_memory_fraction (float, optional) – Absolute threshold (fraction of the Dask memory_limit) above which a restart is triggered. Defaults to 0.85 (85%).

  • resume (bool, optional) – When True, skip batches whose result file already exists and passes integrity checks. Defaults to False.

  • max_worker_rss_fraction (float)
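The restart-related parameters above describe three triggers: a periodic one (restart_workers_per_batch with restart_frequency), a relative one (max_p_memory_increase) and an absolute one (max_worker_memory_fraction). The following is a minimal sketch of how such triggers could be combined. It is not the Evaluator's actual implementation: RestartPolicy and should_restart are hypothetical names, and both the order of the checks and the use of ">=" are assumptions.

```python
from dataclasses import dataclass


@dataclass
class RestartPolicy:
    """Hypothetical container mirroring the Evaluator's restart parameters."""

    restart_workers_per_batch: bool = False
    restart_frequency: int = 1
    max_p_memory_increase: float = 0.2
    max_worker_memory_fraction: float = 0.85


def should_restart(policy, batch_idx, baseline_bytes, current_bytes, max_fraction):
    """Return True if any of the three (assumed) restart triggers fires.

    baseline_bytes / current_bytes: memory usage before and after the batch.
    max_fraction: max(memory_used / memory_limit) across workers.
    """
    # Absolute trigger: a worker is too close to its memory limit.
    if max_fraction >= policy.max_worker_memory_fraction:
        return True

    # Relative trigger: memory grew by more than the allowed proportion.
    if baseline_bytes > 0:
        growth = (current_bytes - baseline_bytes) / baseline_bytes
        if growth >= policy.max_p_memory_increase:
            return True

    # Periodic trigger: restart every `restart_frequency` batches when enabled.
    if policy.restart_workers_per_batch:
        return (batch_idx + 1) % policy.restart_frequency == 0

    return False
```

With the default values, a batch that raises memory usage from 100 to 130 bytes (a 30% increase) would trigger a restart under this sketch, while an increase to 110 bytes would not.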

Methods

__init__(dataset_manager, metrics, ...[, ...])

Initializes the evaluator.

clean_namespace(namespace)

Clean namespace by removing unpicklable objects.

evaluate()

Evaluates metrics on dataloader data for each reference.

get_max_memory_fraction()

Get max(memory_used / memory_limit) across workers.

get_max_memory_usage()

Get the maximum memory usage across all workers (in bytes).

log_cluster_memory_usage(batch_idx)

Log memory usage of each Dask worker.

clean_namespace(namespace)

Clean namespace by removing unpicklable objects.

Parameters:

namespace (Namespace)

Return type:

Namespace

evaluate()

Evaluates metrics on dataloader data for each reference.

Returns:

Metric results for each batch and each reference.

Return type:

List[Dict[str, Any]]

get_max_memory_fraction()

Get max(memory_used / memory_limit) across workers.

Returns:

Fraction in [0, +inf). Returns 0.0 if unavailable.

Return type:

float
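As an illustration of the quantity described above, the sketch below computes max(memory_used / memory_limit) from a plain dictionary of per-worker information. The dictionary layout (a "memory_limit" entry and a "metrics" entry containing "memory") is assumed to resemble what a Dask client's scheduler_info()["workers"] reports. The function max_memory_fraction is a hypothetical stand-in, not the method's real code.

```python
from typing import Any, Mapping


def max_memory_fraction(workers: Mapping[str, Mapping[str, Any]]) -> float:
    """Return max(memory_used / memory_limit) across workers, or 0.0 if unavailable."""
    fractions = []
    for info in workers.values():
        # Assumed layout: {"memory_limit": int, "metrics": {"memory": int}}
        limit = info.get("memory_limit") or 0
        used = info.get("metrics", {}).get("memory")
        if limit > 0 and used is not None:
            fractions.append(used / limit)
    # No usable worker entry: fall back to 0.0, matching the documented behaviour.
    return max(fractions, default=0.0)
```

The result can exceed 1.0 when a worker uses more memory than its configured limit, which is consistent with the documented range [0, +inf).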

get_max_memory_usage()

Get the maximum memory usage across all workers (in bytes).

Return type:

float

log_cluster_memory_usage(batch_idx)

Log memory usage of each Dask worker.

Parameters:

batch_idx (int)