dctools.data.datasets.dataloader.preprocess_argo_profiles

dctools.data.datasets.dataloader.preprocess_argo_profiles(profile_sources, open_func, alias, time_bounds, depth_levels, n_points_dim='N_POINTS')

Load ARGO data through ArgoManager for a single time window.

This is the fallback path used when the evaluator’s shared-Zarr prefetch (ArgoManager.prefetch_batch_shared_zarr) did not run or failed. The preferred pipeline is:

  1. Driver merges all batch time-windows and downloads all profiles once (prefetch_batch_shared_zarr).

  2. Workers open the shared Zarr and filter by time_bounds via searchsorted (fast, contiguous chunk reads).
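The time filter in step 2 can be sketched as below. This is a minimal illustration, not the dctools implementation: it assumes the time coordinate of the shared store is sorted in ascending order and that both ends of the window are inclusive. The helper name `window_slice` is hypothetical, and plain `np.datetime64` values stand in for the `pd.Timestamp` bounds to keep the example dependency-free.

```python
import numpy as np


def window_slice(sorted_times, start, end):
    """Return the contiguous slice of `sorted_times` that lies in [start, end].

    Hypothetical helper: assumes `sorted_times` is ascending and that
    both bounds are inclusive.
    """
    lo = np.searchsorted(sorted_times, start, side="left")   # first index >= start
    hi = np.searchsorted(sorted_times, end, side="right")    # first index > end
    return slice(int(lo), int(hi))


# Toy time coordinate standing in for the shared store's sorted time axis.
times = np.array(
    ["2024-01-01", "2024-01-03", "2024-01-05", "2024-01-07", "2024-01-09"],
    dtype="datetime64[ns]",
)

sel = window_slice(
    times,
    np.datetime64("2024-01-03", "ns"),
    np.datetime64("2024-01-07", "ns"),
)
window = times[sel]  # the three profiles dated 01-03, 01-05 and 01-07
```

Because the result is a single contiguous slice rather than a boolean mask, a worker only needs to read the chunks that overlap that index range.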

When the fallback is used, it opens the ArgoManager for the requested window, which downloads and interpolates the profiles on demand.

Parameters:
  • profile_sources (list[str]) – Monthly catalog keys (unused in the Kerchunk path; kept for API compatibility).

  • open_func (callable) – ArgoManager.open bound method (or the ArgoManager itself).

  • alias (str) – Dataset alias ("argo_profiles").

  • time_bounds (tuple of pd.Timestamp) – (start, end) time window.

  • depth_levels (array-like) – Target depth levels for interpolation.

  • n_points_dim (str) – Name of the points dimension (default "N_POINTS").
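To illustrate what interpolating onto depth_levels means for one profile, here is a hedged sketch. It assumes linear interpolation and NaN outside the observed depth range; the scheme ArgoManager actually uses is not stated here, and the helper name `interpolate_profile` and the sample values are hypothetical.

```python
import numpy as np


def interpolate_profile(obs_depth, obs_values, depth_levels):
    """Linearly interpolate one profile onto the target depth levels.

    Hypothetical helper: levels outside the observed depth range are
    set to NaN rather than extrapolated.
    """
    obs_depth = np.asarray(obs_depth, dtype=float)
    obs_values = np.asarray(obs_values, dtype=float)
    order = np.argsort(obs_depth)  # np.interp requires ascending x
    return np.interp(
        np.asarray(depth_levels, dtype=float),
        obs_depth[order],
        obs_values[order],
        left=np.nan,
        right=np.nan,
    )


# Made-up temperature profile (depth in metres, temperature in degrees C).
obs_depth = [5.0, 20.0, 50.0, 100.0]
obs_temp = [18.0, 17.0, 14.0, 10.0]
depth_levels = [0.0, 10.0, 50.0, 75.0, 200.0]

temp_on_levels = interpolate_profile(obs_depth, obs_temp, depth_levels)
```

In this sketch the 0 m and 200 m levels fall outside the observed range and come back as NaN, while the 75 m level lies halfway between the 50 m and 100 m observations.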

Return type:

xr.Dataset or None