dctools.data.datasets.dataset.BaseDataset

class dctools.data.datasets.dataset.BaseDataset(config)

Base class for all datasets.

Parameters:

config (DatasetConfig)

__init__(config)

Initializes a dataset with a configuration.

Parameters:

config (DatasetConfig) – Dataset configuration.

Methods

__init__(config)

Initializes a dataset with a configuration.

build_catalog()

Builds a catalog for this dataset.

catalog_is_empty()

Checks if the catalog is empty.

download(index, local_path)

Downloads a file based on its index.

filter_catalog_by_date(start, end)

Filters the catalog by time range.

filter_catalog_by_region(region)

Filters the catalog by bounding box.

filter_catalog_by_variable(variables)

Filters the catalog by specified variables.

get_catalog()

Returns the dataset catalog.

get_connection_config()

Returns the connection configuration parameters.

get_connection_manager()

Returns the ConnectionManager instance associated with the dataset.

get_coord_system()

Returns the coordinate system of the dataset.

get_eval_variables()

Returns the list of standard evaluation variables.

get_global_metadata()

Returns the global metadata of the dataset.

get_metadata()

Returns the metadata of the dataset files.

get_path(index)

Returns the path of a file at a given index.

iter_data()

Iterates over the dataset files and loads them as Xarray datasets.

list_paths()

Returns the list of file paths in the dataset.

load_data(index)

Loads a dataset from a path.

standardize_names(coord_rename_dict, ...)

Standardizes coordinate and variable names using rename dictionaries.

to_json(path)

Exports the entire BaseDataset content to JSON format.

build_catalog()

Builds a catalog for this dataset.

Return type:

None

catalog_is_empty()

Checks if the catalog is empty.

Returns:

True if the catalog is empty, otherwise False.

Return type:

bool

download(index, local_path)

Downloads a file based on its index.

Parameters:
  • index (int) – File index.

  • local_path (str) – Local path where the file is saved.

filter_catalog_by_date(start, end)

Filters the catalog by time range.

Parameters:
  • start (datetime) – Start date.

  • end (datetime) – End date.
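As an illustration of time-range filtering, here is a minimal sketch that is not the dctools implementation. The Entry class, its date_start and date_end fields, and the filter_by_date helper are hypothetical. Whether the library treats the bounds as inclusive is not stated here, so the sketch assumes a closed interval and keeps every entry that overlaps it.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass
class Entry:
    """Hypothetical catalog entry; the real CatalogEntry fields may differ."""
    path: str
    date_start: datetime
    date_end: datetime


def filter_by_date(entries: List[Entry], start: datetime, end: datetime) -> List[Entry]:
    """Keep entries whose time span overlaps the closed interval [start, end]."""
    return [e for e in entries if e.date_start <= end and e.date_end >= start]


entries = [
    Entry("a.nc", datetime(2024, 1, 1), datetime(2024, 1, 2)),
    Entry("b.nc", datetime(2024, 1, 5), datetime(2024, 1, 6)),
    Entry("c.nc", datetime(2024, 1, 9), datetime(2024, 1, 10)),
]
selected = filter_by_date(entries, datetime(2024, 1, 2), datetime(2024, 1, 5))
print([e.path for e in selected])
```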

filter_catalog_by_region(region)

Filters the catalog by bounding box.

Parameters:

region (Any) – Region to keep, documented as a bounding box Tuple[float, float, float, float] of the form (lon_min, lat_min, lon_max, lat_max).
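A bounding-box filter of this kind typically reduces to a rectangle intersection test. The sketch below assumes the (lon_min, lat_min, lon_max, lat_max) ordering given above. The footprints mapping and both helper functions are hypothetical, and the sketch ignores boxes that cross the antimeridian.

```python
from typing import Dict, List, Tuple

BBox = Tuple[float, float, float, float]  # (lon_min, lat_min, lon_max, lat_max)


def bboxes_intersect(a: BBox, b: BBox) -> bool:
    """True if the two boxes overlap or touch; no antimeridian handling."""
    a_lon_min, a_lat_min, a_lon_max, a_lat_max = a
    b_lon_min, b_lat_min, b_lon_max, b_lat_max = b
    return (
        a_lon_min <= b_lon_max and b_lon_min <= a_lon_max
        and a_lat_min <= b_lat_max and b_lat_min <= a_lat_max
    )


def filter_by_region(footprints: Dict[str, BBox], region: BBox) -> List[str]:
    """Return the paths whose spatial footprint intersects the region."""
    return [path for path, box in footprints.items() if bboxes_intersect(box, region)]


# Hypothetical per-file footprints.
footprints = {
    "atlantic.nc": (-60.0, 20.0, -10.0, 60.0),
    "pacific.nc": (150.0, -20.0, 179.0, 20.0),
}
kept = filter_by_region(footprints, (-20.0, 40.0, 5.0, 55.0))
print(kept)
```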

filter_catalog_by_variable(variables)

Filters the catalog by specified variables.

Parameters:

variables (List[str]) – List of variable names to filter.
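This page does not say whether a file must contain all of the requested variables or only one of them. The sketch below assumes "at least one". It uses a hypothetical mapping from file path to variable names in place of the real catalog.

```python
from typing import Dict, List


def filter_by_variable(
    catalog: Dict[str, List[str]], variables: List[str]
) -> Dict[str, List[str]]:
    """Keep files that contain at least one of the requested variables."""
    wanted = set(variables)
    return {path: names for path, names in catalog.items() if wanted & set(names)}


# Hypothetical catalog: file path -> variable names found in that file.
catalog = {
    "sst.nc": ["sst"],
    "currents.nc": ["uo", "vo"],
    "ssh.nc": ["zos"],
}
filtered = filter_by_variable(catalog, ["zos", "uo"])
print(list(filtered))
```

To require every requested variable instead, replace the condition with `wanted <= set(names)`.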

get_catalog()

Returns the dataset catalog.

Returns:

Dataset catalog.

Return type:

DatasetCatalog

get_connection_config()

Returns the connection configuration parameters.

get_connection_manager()

Returns the ConnectionManager instance associated with the dataset.

Returns:

ConnectionManager instance.

Return type:

BaseConnectionManager

get_coord_system()

Returns the coordinate system of the dataset.

Returns:

Coordinate system.

Return type:

Dict[str, Any]

get_eval_variables()

Returns the list of standard evaluation variables.

get_global_metadata()

Returns the global metadata of the dataset.

Returns:

Global metadata.

Return type:

Dict[str, Any]

get_metadata()

Returns the metadata of the dataset files.

Returns:

List of metadata (CatalogEntry objects).

Return type:

List[Any]

get_path(index)

Returns the path of a file at a given index.

Parameters:

index (int) – File index.

Returns:

File path.

Return type:

str

iter_data()

Iterates over the dataset files and loads them as Xarray datasets.

Yields:

xr.Dataset – Loaded dataset.

Return type:

Iterator[xarray.Dataset]
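The lazy iteration pattern can be sketched with a generator. ToyDataset below is a stand-in that only mirrors the list_paths, load_data and iter_data names. It is not the dctools class, and a plain dict takes the place of the loaded xarray.Dataset so the example runs without extra dependencies.

```python
from typing import Any, Dict, Iterator, List


class ToyDataset:
    """Stand-in exposing list_paths, load_data and iter_data."""

    def __init__(self, paths: List[str]) -> None:
        self._paths = paths

    def list_paths(self) -> List[str]:
        return list(self._paths)

    def load_data(self, index: int) -> Dict[str, Any]:
        # A real implementation would open the file here; a dict stands in
        # for the loaded dataset.
        return {"source": self._paths[index]}

    def iter_data(self) -> Iterator[Dict[str, Any]]:
        # Generator: each file is loaded only when the consumer requests it.
        for index in range(len(self._paths)):
            yield self.load_data(index)


dataset = ToyDataset(["day1.nc", "day2.nc"])
sources = [item["source"] for item in dataset.iter_data()]
print(sources)
```

Because the generator yields one item at a time, a consumer that processes and discards each item never holds the full collection in memory.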

list_paths()

Returns the list of file paths in the dataset.

Returns:

List of file paths.

Return type:

List[str]

load_data(index)

Loads a dataset from a path.

Parameters:

index (int) – File index.

Returns:

Loaded dataset.

Return type:

xr.Dataset

standardize_names(coord_rename_dict, variable_rename_dict)

Standardizes coordinate and variable names using rename dictionaries.

Parameters:
  • coord_rename_dict (Dict[str, str])

  • variable_rename_dict (Dict[str, str])

Return type:

None
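A rename dictionary maps each original name to its standardized name. The sketch below shows that mapping on plain lists of names. The rename_all helper and the example names are assumptions, and how the method itself applies or stores the dictionaries is not described here.

```python
from typing import Dict, List


def rename_all(names: List[str], rename_dict: Dict[str, str]) -> List[str]:
    """Map each name through rename_dict; names without a mapping are kept."""
    return [rename_dict.get(name, name) for name in names]


# Illustrative mappings; real datasets may use other names.
coord_rename_dict = {"latitude": "lat", "longitude": "lon"}
variable_rename_dict = {"thetao": "temperature"}

coords = rename_all(["latitude", "longitude", "time"], coord_rename_dict)
variables = rename_all(["thetao", "so"], variable_rename_dict)
print(coords, variables)
```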

to_json(path)

Exports the entire BaseDataset content to JSON format.

Parameters:

path (str) – Path to save the JSON file.

Return type:

None
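Which fields the export contains is not documented on this page. As a rough sketch of the general approach, the example below writes a hypothetical content dictionary to JSON with the standard library, using default=str as a fallback for values such as datetime objects that JSON cannot encode natively.

```python
import json
import os
import tempfile
from datetime import datetime
from typing import Any, Dict


def export_json(content: Dict[str, Any], path: str) -> None:
    """Write content to path as indented JSON."""
    with open(path, "w", encoding="utf-8") as handle:
        # default=str converts values json cannot encode (e.g. datetime).
        json.dump(content, handle, indent=2, default=str)


# Hypothetical content; the real export may contain different fields.
content = {"name": "toy", "start": datetime(2024, 1, 1), "paths": ["day1.nc"]}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "dataset.json")
    export_json(content, path)
    with open(path, encoding="utf-8") as handle:
        restored = json.load(handle)

print(restored)
```

With this fallback, datetime values come back as strings when the file is read, so a round trip does not restore the original types.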