dctools.data.connection.connection_manager.BaseConnectionManager
- class dctools.data.connection.connection_manager.BaseConnectionManager(connect_config, call_list_files=True, batch_size=64)
Abstract base connection manager.
Manages opening, closing and listing files for various protocols.
- Parameters:
connect_config (BaseConnectionConfig | Namespace)
call_list_files (bool)
batch_size (int | None)
- __init__(connect_config, call_list_files=True, batch_size=64)
- Parameters:
connect_config (BaseConnectionConfig | Namespace)
call_list_files (bool)
batch_size (int | None)
Methods
__init__(connect_config[, call_list_files, ...])
adjust_full_day(date_start, date_end) – Adjust date_end to cover a full day if dates are the same at midnight.
download_file(remote_path, local_path) – Download a file from the remote source to the local path.
estimate_resolution(ds, coord_system) – Estimate resolution from dataset based on coordinates.
extract_global_metadata() – Extract global metadata (common to all files) from a single file.
extract_metadata(path) – Extract metadata combining global/file-specific info.
extract_metadata_worker(path, ...[, argo_index]) – Extract metadata combining global/file-specific info.
get_config_clean_copy() – Return a clean copy of the configuration.
get_global_metadata() – Get global metadata for all files in the connection manager.
list_files() – List files matching the configuration.
list_files_with_metadata() – Version with integrated Dask client and optimized configuration.
open(path[, mode]) – Open a file, prioritizing local then remote access.
open_local(local_path) – Open a file locally if it exists.
open_remote(path[, mode]) – Open a file remotely if the source supports it.
set_global_metadata(global_metadata) – Sets the global metadata for the connection manager.
supports(path) – Check if path is supported by this manager.
- adjust_full_day(date_start, date_end)
Adjust date_end to cover a full day if dates are the same at midnight.
- Parameters:
date_start (pandas.Timestamp)
date_end (pandas.Timestamp)
- Return type:
tuple[pandas.Timestamp, pandas.Timestamp]
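The adjustment described above can be illustrated with a minimal sketch. It is not the dctools implementation: the function name is hypothetical, and the choice to extend date_end to the last second of the day is an assumption, since the source only says the range should "cover a full day". Plain datetime objects are used so the example runs without pandas; pandas.Timestamp subclasses datetime.datetime, so the same checks apply to it.

```python
from datetime import datetime, timedelta


def adjust_full_day_sketch(date_start, date_end):
    """Hypothetical stand-in for adjust_full_day (not the library code)."""
    at_midnight = date_start.time() == datetime.min.time()
    if date_start == date_end and at_midnight:
        # Assumption: "full day" means up to 23:59:59 of the same day.
        date_end = date_start + timedelta(days=1) - timedelta(seconds=1)
    return date_start, date_end


# Identical midnight timestamps: the end is pushed to the end of the day.
same_start, same_end = adjust_full_day_sketch(
    datetime(2024, 1, 1), datetime(2024, 1, 1)
)

# A range that already spans several days is returned unchanged.
range_start, range_end = adjust_full_day_sketch(
    datetime(2024, 1, 1), datetime(2024, 1, 3)
)
```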
- download_file(remote_path, local_path)
Download a file from the remote source to the local path.
- Parameters:
remote_path (str) – Remote path of the file.
local_path (str) – Local path to save the file.
- estimate_resolution(ds, coord_system)
Estimate resolution from dataset based on coordinates.
Only inspects coordinate values (small arrays). Handles both in-memory and dask-backed datasets safely: np.asarray() is used to materialise only the coordinate arrays (typically tiny).
- Parameters:
ds (xarray.Dataset) – Dataset whose coordinates are inspected.
coord_system (CoordinateSystem) – CoordinateSystem object.
- Returns:
Dictionary of estimated resolutions.
- Return type:
Dict[str, float | str]
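The source does not say how the resolution is derived from the coordinate values. One plausible approach, shown here purely as an assumption, is to take the median absolute spacing between consecutive values of each coordinate. The helper name and the input dictionary are hypothetical, and plain lists stand in for the coordinate arrays of an xarray.Dataset.

```python
from statistics import median


def estimate_spacing(values):
    """Median absolute spacing between consecutive coordinate values.

    Hypothetical helper; returns None when fewer than two values exist.
    """
    values = list(values)
    if len(values) < 2:
        return None
    return median(abs(b - a) for a, b in zip(values, values[1:]))


# Stand-in coordinate arrays (assumed names, not taken from dctools).
coords = {
    "lat": [10.0, 10.25, 10.5, 10.75],
    "lon": [0.0, 0.5, 1.0],
}
resolution = {name: estimate_spacing(vals) for name, vals in coords.items()}
```

Using the median rather than the mean keeps a single irregular gap in the coordinate from skewing the estimate.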
- extract_global_metadata()
Extract global metadata (common to all files) from a single file.
- Returns:
Global metadata including spatial bounds and variable names.
- Return type:
Dict[str, Any]
- extract_metadata(path)
Extract metadata combining global/file-specific info.
- Parameters:
path (str) – Path to the file.
global_metadata (Dict[str, Any]) – Global metadata to apply to all files.
- Returns:
Metadata for the specific file as a CatalogEntry.
- Return type:
CatalogEntry
- static extract_metadata_worker(path, global_metadata, connection_params, class_name, argo_index=None)
Extract metadata combining global/file-specific info.
Thread-safe version to avoid conflicts.
- Parameters:
path (str) – Path to the file.
global_metadata (Dict[str, Any]) – Global metadata.
connection_params (dict)
class_name (Any)
argo_index (Any | None)
- Returns:
Metadata for the specific file as a CatalogEntry.
- Return type:
CatalogEntry
- get_config_clean_copy()
Return a clean copy of the configuration.
- get_global_metadata()
Get global metadata for all files in the connection manager.
- Returns:
Global metadata including spatial bounds and variable names.
- Return type:
Dict[str, Any]
- abstractmethod list_files()
List files matching the configuration.
- Return type:
List[str]
- list_files_with_metadata()
List files with their metadata, using an integrated Dask client and optimized configuration.
- Return type:
List[CatalogEntry]
- open(path, mode='rb')
Open a file, prioritizing local then remote access.
If the file cannot be opened either way, attempt to download it to a local path and open it from there.
- Parameters:
path (str) – Remote path of the file.
mode (str) – Mode to open the file (default is “rb”).
- Returns:
Opened dataset.
- Return type:
xr.Dataset
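The access order described for open (local first, then remote, then download and reopen locally) can be sketched as follows. This is an illustration under assumptions, not the library code: the function open_with_fallback is hypothetical, the three callables stand in for open_local, open_remote and download_file, and the local path is passed explicitly because the source does not say how it is derived from the remote path.

```python
from typing import Callable, Optional


def open_with_fallback(
    path: str,
    local_path: str,
    open_local: Callable[[str], Optional[object]],
    open_remote: Callable[[str], Optional[object]],
    download_file: Callable[[str, str], None],
) -> object:
    """Try local access, then remote access, then download and reopen."""
    dataset = open_local(local_path)
    if dataset is not None:
        return dataset

    dataset = open_remote(path)
    if dataset is not None:
        return dataset

    # Last resort: fetch the file, then open the local copy.
    download_file(path, local_path)
    dataset = open_local(local_path)
    if dataset is None:
        raise FileNotFoundError(f"could not open {path}")
    return dataset


# Fake backends for demonstration: a dict plays the role of the local disk.
local_store = {}
calls = []


def fake_open_local(local_path):
    calls.append("local")
    return local_store.get(local_path)


def fake_open_remote(path):
    calls.append("remote")
    return None  # pretend the source does not support remote opening


def fake_download(remote_path, local_path):
    calls.append("download")
    local_store[local_path] = f"dataset from {remote_path}"


result = open_with_fallback(
    "remote/a.nc", "/tmp/a.nc", fake_open_local, fake_open_remote, fake_download
)
```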
- open_local(local_path)
Open a file locally if it exists.
- Parameters:
local_path (str) – Path to the local file.
- Returns:
Opened dataset, or None if the file does not exist.
- Return type:
Optional[xr.Dataset]
- open_remote(path, mode='rb')
Open a file remotely if the source supports it.
- Parameters:
path (str) – Remote path of the file.
mode (str) – Mode to open the file (default is “rb”).
- Returns:
Opened dataset, or None if remote opening is not supported.
- Return type:
Optional[xr.Dataset]
- set_global_metadata(global_metadata)
Sets the global metadata for the connection manager.
Keeps only the keys listed in the global_metadata class variable.
- Parameters:
global_metadata (Dict[str, Any]) – Global metadata dictionary.
- Return type:
None
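The key filtering mentioned above can be pictured with a small sketch. The class, the attribute names and the allow-listed keys are all hypothetical; the source only states that keys outside a class-level list are dropped.

```python
from typing import Any, Dict, List


class MetadataHolderSketch:
    """Hypothetical stand-in showing allow-list filtering of metadata keys."""

    # Assumed allow-list; the real key names are not given in the source.
    allowed_keys: List[str] = ["variables", "lat_min", "lat_max"]

    def __init__(self) -> None:
        self._global_metadata: Dict[str, Any] = {}

    def set_global_metadata(self, global_metadata: Dict[str, Any]) -> None:
        # Keep only the entries whose key appears in the allow-list.
        self._global_metadata = {
            key: value
            for key, value in global_metadata.items()
            if key in self.allowed_keys
        }

    def get_global_metadata(self) -> Dict[str, Any]:
        return dict(self._global_metadata)


holder = MetadataHolderSketch()
holder.set_global_metadata(
    {"variables": ["sst"], "lat_min": -90.0, "unexpected": "dropped"}
)
stored = holder.get_global_metadata()
```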
- abstractmethod classmethod supports(path)
Check if path is supported by this manager.
- Parameters:
path (str)
- Return type:
bool
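Because list_files and supports are abstract, a concrete manager has to implement both before it can be instantiated. The sketch below shows that pattern with a stand-in base class that mirrors only those two members; ManagerSketch, InMemoryManager and the "mem://" prefix are hypothetical and do not come from dctools.

```python
from abc import ABC, abstractmethod
from typing import Iterable, List


class ManagerSketch(ABC):
    """Stand-in base class with the two abstract members documented above."""

    @abstractmethod
    def list_files(self) -> List[str]:
        """List files matching the configuration."""

    @classmethod
    @abstractmethod
    def supports(cls, path: str) -> bool:
        """Check if path is supported by this manager."""


class InMemoryManager(ManagerSketch):
    """Hypothetical concrete manager backed by a list of paths."""

    def __init__(self, paths: Iterable[str]) -> None:
        self._paths = list(paths)

    def list_files(self) -> List[str]:
        return sorted(self._paths)

    @classmethod
    def supports(cls, path: str) -> bool:
        # Assumed convention: this manager handles "mem://" paths only.
        return path.startswith("mem://")


manager = InMemoryManager(["mem://b.nc", "mem://a.nc"])
files = manager.list_files()
```

Since supports is a classmethod, a caller can ask each manager class whether it handles a path before constructing any instance.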