dctools.data.datasets.dc_catalog.DatasetCatalog
- class dctools.data.datasets.dc_catalog.DatasetCatalog(alias, global_metadata=None, entries=None, dataframe=None)
Structured catalog to hold and filter dataset metadata entries.
- Parameters:
alias (str)
global_metadata (Dict[str, Any] | None)
entries (Sequence[CatalogEntry | Dict[str, Any]] | None)
dataframe (geopandas.GeoDataFrame | None)
- __init__(alias, global_metadata=None, entries=None, dataframe=None)
Initialize the catalog with a list of entries.
- Parameters:
entries (List[Union[CatalogEntry, Dict[str, Any]]]) – List of dataset metadata.
alias (str)
global_metadata (Dict[str, Any] | None)
dataframe (geopandas.GeoDataFrame | None)
Methods
__init__(alias[, global_metadata, entries, ...]) – Initialize the catalog with a list of entries.
append(metadata) – Append an entry to the catalog.
check_geometries_compatibility(gdf, region) – Run a diagnostic on the geometries of gdf and region, with automatic corrections.
extend(other_catalog) – Extend the catalog with another catalog.
filter_attrs(filters) – Filter catalog entries by attribute values using filter functions.
filter_by_date(start, end) – Filter entries by time range.
filter_by_region(region) – Filter GeoDataFrame entries intersecting with the given region.
filter_by_variables(variables) – Filter entries by variable list.
from_json(path, alias, limit[, ignore_geometry]) – Reconstruct a DatasetCatalog instance from a GeoJSON file.
get_dataframe() – Return the internal GeoDataFrame.
get_global_metadata() – Get global metadata for the catalog.
list_paths() – List file paths in the catalog.
set_dataframe(gdf) – Set the internal GeoDataFrame.
to_geodataframe() – Return the complete GeoDataFrame.
to_json([path]) – Export the entire DatasetCatalog content to JSON format.
- append(metadata)
Append an entry to the catalog.
- Parameters:
metadata (Union[CatalogEntry, Dict[str, Any]]) – Metadata to append.
- check_geometries_compatibility(gdf, region)
Run a diagnostic on the geometries of gdf and region, with automatic corrections.
- Parameters:
gdf (geopandas.GeoDataFrame)
region (geopandas.GeoSeries | shapely.geometry.base.BaseGeometry)
- extend(other_catalog)
Extend the catalog with another catalog.
- Parameters:
other_catalog (DatasetCatalog) – Other catalog to merge.
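As a rough illustration of how append and extend relate, the sketch below uses plain Python containers rather than dctools itself. `EntrySketch` and `CatalogSketch` are hypothetical stand-ins, and the `path` and `variables` fields are assumptions; the real CatalogEntry fields and the GeoDataFrame-backed storage are not shown on this page.

```python
from dataclasses import asdict, dataclass, field, is_dataclass
from typing import Any, Dict, List, Optional, Sequence, Union


@dataclass
class EntrySketch:
    """Hypothetical stand-in for CatalogEntry (real fields are not documented here)."""
    path: str
    variables: List[str] = field(default_factory=list)


class CatalogSketch:
    """Hypothetical stand-in for DatasetCatalog that stores entries as dict rows."""

    def __init__(self, alias: str,
                 entries: Optional[Sequence[Union[EntrySketch, Dict[str, Any]]]] = None):
        self.alias = alias
        self.rows: List[Dict[str, Any]] = []
        for entry in entries or []:
            self.append(entry)

    def append(self, metadata: Union[EntrySketch, Dict[str, Any]]) -> None:
        # Accept either a dataclass entry or a plain dict, as the signature suggests.
        row = asdict(metadata) if is_dataclass(metadata) else dict(metadata)
        self.rows.append(row)

    def extend(self, other: "CatalogSketch") -> None:
        # Copy rows so later changes to `other` do not leak into this catalog.
        self.rows.extend(dict(row) for row in other.rows)


first = CatalogSketch("obs", [EntrySketch("a.nc", ["sst"])])
first.append({"path": "b.nc", "variables": ["ssh"]})
second = CatalogSketch("model", [{"path": "c.nc", "variables": []}])
first.extend(second)
paths = [row["path"] for row in first.rows]
print(paths)
```

In this sketch, extend is simply a repeated append over the other catalog's rows; whether dctools deduplicates or merges global metadata is not stated here.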
- filter_attrs(filters)
Filter catalog entries by attribute values using filter functions.
- Parameters:
filters (dict[str, Callable[[Any], bool] | geopandas.GeoSeries])
- Return type:
None
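The filters argument maps an attribute name to a predicate. The sketch below shows that idea over plain dictionaries; it is not the dctools implementation. The helper name `apply_attr_filters` and the keys `path` and `resolution` are hypothetical, and the GeoSeries case allowed by the signature is left out.

```python
from typing import Any, Callable, Dict, List

Entry = Dict[str, Any]


def apply_attr_filters(entries: List[Entry],
                       filters: Dict[str, Callable[[Any], bool]]) -> List[Entry]:
    """Keep entries for which every predicate accepts the matching attribute."""
    kept = []
    for entry in entries:
        # Entries missing a filtered attribute are dropped (an assumption of this sketch).
        if all(key in entry and predicate(entry[key])
               for key, predicate in filters.items()):
            kept.append(entry)
    return kept


entries = [
    {"path": "a.nc", "resolution": 0.25},
    {"path": "b.nc", "resolution": 1.0},
    {"path": "c.zarr"},
]
selected = apply_attr_filters(entries, {"resolution": lambda r: r <= 0.5})
print([e["path"] for e in selected])
```

The documented return type is None, which suggests the method updates the catalog in place; the sketch returns a new list only to keep the example easy to inspect.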
- filter_by_date(start, end)
Filter entries by time range.
- Parameters:
start (datetime) – Start date(s).
end (datetime) – End date(s).
- Returns:
Filtered GeoDataFrame.
- Return type:
gpd.GeoDataFrame
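One plausible reading of "filter by time range" is an interval-overlap test, sketched below with the standard library only. The keys `date_start` and `date_end` and the function names are hypothetical; whether dctools keeps overlapping entries or only fully contained ones is not stated on this page.

```python
from datetime import datetime
from typing import Any, Dict, List


def overlaps(entry_start: datetime, entry_end: datetime,
             start: datetime, end: datetime) -> bool:
    """True when [entry_start, entry_end] and [start, end] share at least one instant."""
    return entry_start <= end and entry_end >= start


def filter_by_date_sketch(entries: List[Dict[str, Any]],
                          start: datetime, end: datetime) -> List[Dict[str, Any]]:
    # Keep every entry whose time coverage overlaps the requested window.
    return [e for e in entries
            if overlaps(e["date_start"], e["date_end"], start, end)]


entries = [
    {"path": "jan.nc", "date_start": datetime(2024, 1, 1), "date_end": datetime(2024, 1, 31)},
    {"path": "feb.nc", "date_start": datetime(2024, 2, 1), "date_end": datetime(2024, 2, 29)},
    {"path": "mar.nc", "date_start": datetime(2024, 3, 1), "date_end": datetime(2024, 3, 31)},
]
selected = filter_by_date_sketch(entries, datetime(2024, 1, 20), datetime(2024, 2, 10))
print([e["path"] for e in selected])
```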
- filter_by_region(region)
Filter GeoDataFrame entries intersecting with the given region.
- Parameters:
region (gpd.GeoSeries) – A GeoSeries containing a polygon or a collection of polygons.
- Return type:
None
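The intersection test can be pictured with axis-aligned bounding boxes, a deliberate simplification of the polygon intersection that a GeoSeries region implies. The `bbox` key, the `BBox` alias, and the helper name are hypothetical, and the sketch ignores antimeridian wrapping.

```python
from typing import Tuple

# (min_lon, min_lat, max_lon, max_lat); a simplified stand-in for a polygon region.
BBox = Tuple[float, float, float, float]


def bbox_intersects(a: BBox, b: BBox) -> bool:
    """True when the two boxes overlap or touch on both axes."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


entries = [
    {"path": "north_atlantic.nc", "bbox": (-80.0, 0.0, 0.0, 65.0)},
    {"path": "med.nc", "bbox": (-6.0, 30.0, 36.0, 46.0)},
    {"path": "pacific.nc", "bbox": (120.0, -60.0, 180.0, 60.0)},
]
region = (-20.0, 35.0, 10.0, 50.0)
selected = [e["path"] for e in entries if bbox_intersects(e["bbox"], region)]
print(selected)
```

A bounding-box check of this kind is often used as a cheap pre-filter before an exact geometric intersection; whether dctools does so is not stated here.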
- filter_by_variables(variables)
Filter entries by variable list.
- Parameters:
variables (List[str]) – List of variables to filter.
- Returns:
Filtered GeoDataFrame.
- Return type:
gpd.GeoDataFrame
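The page does not say whether an entry must provide all requested variables or just one of them. The sketch below shows both readings over plain dictionaries; the `variables` key, the function name, and the `require_all` flag are hypothetical and not part of the documented signature.

```python
from typing import Any, Dict, List


def filter_by_variables_sketch(entries: List[Dict[str, Any]],
                               variables: List[str],
                               require_all: bool = False) -> List[Dict[str, Any]]:
    """Keep entries offering any (default) or all of the requested variables."""
    wanted = set(variables)
    kept = []
    for entry in entries:
        available = set(entry.get("variables", []))
        # Subset test for "all", non-empty intersection for "any".
        matched = wanted <= available if require_all else bool(wanted & available)
        if matched:
            kept.append(entry)
    return kept


entries = [
    {"path": "a.nc", "variables": ["sst", "ssh"]},
    {"path": "b.nc", "variables": ["ssh"]},
    {"path": "c.nc", "variables": ["chl"]},
]
any_match = [e["path"] for e in filter_by_variables_sketch(entries, ["sst", "ssh"])]
all_match = [e["path"] for e in filter_by_variables_sketch(entries, ["sst", "ssh"],
                                                           require_all=True)]
print(any_match, all_match)
```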
- classmethod from_json(path, alias, limit, ignore_geometry=False)
Reconstruct a DatasetCatalog instance from a GeoJSON file.
- Parameters:
path (str)
alias (str)
limit (int)
- Return type:
DatasetCatalog
- get_dataframe()
Return the internal GeoDataFrame.
- Returns:
The catalog GeoDataFrame.
- Return type:
gpd.GeoDataFrame
- get_global_metadata()
Get global metadata for the catalog.
- list_paths()
List file paths in the catalog.
- Returns:
List of paths.
- Return type:
List[str]
- set_dataframe(gdf)
Set the internal GeoDataFrame.
- Parameters:
gdf (gpd.GeoDataFrame) – The catalog GeoDataFrame.
- Return type:
None
- to_geodataframe()
Return the complete GeoDataFrame.
- Returns:
Catalog GeoDataFrame.
- Return type:
gpd.GeoDataFrame
- to_json(path=None)
Export the entire DatasetCatalog content to JSON format.
- Parameters:
path (Optional[str]) – Path to save the JSON file.
- Returns:
Complete JSON representation of the instance.
- Return type:
str
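A round trip between to_json and from_json might look roughly like the sketch below, written with the standard library only. The file layout (a GeoJSON-style FeatureCollection with extra `alias` and `global_metadata` keys), the function names, and the reading of limit as a cap on the number of loaded entries are all assumptions, not the dctools format.

```python
import json
import os
import tempfile
from typing import Any, Dict, List, Optional


def to_json_sketch(alias: str, global_metadata: Dict[str, Any],
                   entries: List[Dict[str, Any]],
                   path: Optional[str] = None) -> str:
    """Serialize a catalog-like structure to a GeoJSON-style string."""
    payload = {
        "type": "FeatureCollection",
        "alias": alias,                      # hypothetical top-level key
        "global_metadata": global_metadata,  # hypothetical top-level key
        "features": [
            {
                "type": "Feature",
                "geometry": entry.get("geometry"),
                "properties": {k: v for k, v in entry.items() if k != "geometry"},
            }
            for entry in entries
        ],
    }
    text = json.dumps(payload, indent=2)
    if path is not None:
        # Writing to disk is optional, mirroring the optional path parameter.
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(text)
    return text


def from_json_sketch(path: str, limit: int) -> Dict[str, Any]:
    """Read the file back, keeping at most `limit` features (assumed meaning)."""
    with open(path, encoding="utf-8") as handle:
        payload = json.load(handle)
    payload["features"] = payload["features"][:limit]
    return payload


entries = [
    {"path": "a.nc", "geometry": {"type": "Point", "coordinates": [0.0, 45.0]}},
    {"path": "b.nc", "geometry": {"type": "Point", "coordinates": [10.0, 50.0]}},
]
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "catalog.json")
    text = to_json_sketch("obs", {"source": "example"}, entries, path=target)
    restored = from_json_sketch(target, limit=1)

print(restored["alias"], len(restored["features"]))
```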