ERDDAP
The ERDDAPDataFetcher class retrieves gridded environmental data from
any ERDDAP server. It handles automatic dateline-crossing correction and
provides helpers for listing available datasets and variables.
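The dateline-crossing correction mentioned above is not specified in detail here. One common approach, shown as a minimal sketch, is to split a wrapping longitude box into two ordinary boxes that can be requested separately and joined afterwards. The helper name split_at_dateline is hypothetical and not part of ecosound, and the sketch assumes longitudes in the [-180, 180] convention, with lon_min > lon_max meaning that the box wraps.

```python
from typing import List, Tuple


def split_at_dateline(lon_min: float, lon_max: float) -> List[Tuple[float, float]]:
    """Split a longitude range into pieces that do not cross +/-180 degrees.

    Hypothetical helper (not part of ecosound). Assumes longitudes are in
    the [-180, 180] convention and that lon_min > lon_max signals a box
    wrapping across the dateline.
    """
    if lon_min <= lon_max:
        # Ordinary box: a single request covers it.
        return [(lon_min, lon_max)]
    # Wrapping box: one piece up to +180, one piece starting at -180.
    return [(lon_min, 180.0), (-180.0, lon_max)]


print(split_at_dateline(-71.0, -69.0))   # [(-71.0, -69.0)]
print(split_at_dateline(170.0, -170.0))  # [(170.0, 180.0), (-180.0, -170.0)]
```

Servers that publish longitudes as 0 to 360 would need a different split point, so the convention should be checked per dataset.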
- class ecosound.environment.erddap.ERDDAPDataFetcher(server: str, dataset_id: str | None = None)
Bases: object
Class for fetching data from ERDDAP servers.
Handles data retrieval, dateline crossing, quality masking, and spatial/temporal striding.
- __init__(server: str, dataset_id: str | None = None)
Initialize the ERDDAP data fetcher.
- Parameters:
server – ERDDAP server URL (e.g., “https://comet.nefsc.noaa.gov/erddap”)
dataset_id – Dataset ID (optional, can be set later for fetch operations)
- list_datasets(search_text: str | None = None) → DataFrame
List all datasets available on the ERDDAP server.
- Parameters:
search_text – Optional text to filter datasets (searches in title and summary)
- Returns:
DataFrame with columns dataset_id, title, summary, institution
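How list_datasets queries the server is not stated here. As a rough sketch, ERDDAP itself exposes a dataset listing at info/index.json and a full-text search at search/index.json, so a listing URL could be built as below. The helper build_listing_url is hypothetical, and whether ERDDAPDataFetcher uses these endpoints or filters titles and summaries locally is an assumption.

```python
from typing import Optional
from urllib.parse import urlencode


def build_listing_url(server: str,
                      search_text: Optional[str] = None,
                      items_per_page: int = 1000) -> str:
    """Build an ERDDAP URL that lists datasets as JSON.

    Hypothetical helper (not part of ecosound). Uses the server's
    full-text search when search_text is given, else the full listing.
    """
    base = server.rstrip("/")
    params = {"page": 1, "itemsPerPage": items_per_page}
    if search_text:
        params["searchFor"] = search_text
        return f"{base}/search/index.json?{urlencode(params)}"
    return f"{base}/info/index.json?{urlencode(params)}"


print(build_listing_url("https://comet.nefsc.noaa.gov/erddap"))
print(build_listing_url("https://comet.nefsc.noaa.gov/erddap", "sea surface"))
```

Note that ERDDAP's own search covers more metadata than the title and summary, so results from this endpoint may be broader than the filter described above.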
- get_dataset_info(dataset_id: str | None = None) → Dict
Get detailed information about a specific dataset.
- Parameters:
dataset_id – Dataset ID (uses instance dataset_id if not provided)
- Returns:
Dictionary with dataset metadata
- list_variables(dataset_id: str | None = None) → List[str]
List all variables available in a dataset.
- Parameters:
dataset_id – Dataset ID (uses instance dataset_id if not provided)
- Returns:
List of variable names
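To illustrate where such a list of variable names can come from: an ERDDAP server describes each dataset at info/<dataset_id>/index.json as a table of rows typed "attribute", "dimension", or "variable". The sketch below extracts the variable names from a payload of that shape. The function variable_names is hypothetical, the sample payload is invented for illustration, and nothing here is confirmed to match how list_variables is implemented.

```python
import json
from typing import List

# Invented sample in the shape of an ERDDAP info/<dataset_id>/index.json reply.
SAMPLE_INFO = json.loads("""
{"table": {
  "columnNames": ["Row Type", "Variable Name", "Attribute Name", "Data Type", "Value"],
  "rows": [
    ["attribute", "NC_GLOBAL", "title", "String", "Example SST dataset"],
    ["dimension", "time", "", "double", "nValues=365"],
    ["dimension", "latitude", "", "float", "nValues=100"],
    ["dimension", "longitude", "", "float", "nValues=100"],
    ["variable", "sea_surface_temperature", "", "float", "time, latitude, longitude"],
    ["attribute", "sea_surface_temperature", "units", "String", "degree_C"],
    ["variable", "quality_level", "", "byte", "time, latitude, longitude"]
  ]}}
""")


def variable_names(info: dict) -> List[str]:
    """Return names of rows typed 'variable', skipping dimensions and attributes."""
    table = info["table"]
    type_col = table["columnNames"].index("Row Type")
    name_col = table["columnNames"].index("Variable Name")
    return [row[name_col] for row in table["rows"] if row[type_col] == "variable"]


print(variable_names(SAMPLE_INFO))  # ['sea_surface_temperature', 'quality_level']
```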
- fetch_data(variables: str | List[str], *, date: str | None = None, start_date: str | None = None, end_date: str | None = None, lat_min: float, lat_max: float, lon_min: float, lon_max: float, dataset_id: str | None = None, include_quality: bool = False, quality_variable: str = 'quality_level', quality_mask_value: int | None = None, spatial_stride: int = 1, time_stride: int = 1, max_request_duration_days: int = 31) → Dataset
Fetch gridded data from ERDDAP for a single day or date range. Automatically splits large time ranges into chunks to avoid overwhelming the server.
- Parameters:
variables – Variable name(s) to fetch (string or list of strings)
date – Single date (YYYY-MM-DD) - mutually exclusive with start_date/end_date
start_date – Start date (YYYY-MM-DD) for range
end_date – End date (YYYY-MM-DD) for range
lat_min – Minimum latitude
lat_max – Maximum latitude
lon_min – Minimum longitude
lon_max – Maximum longitude
dataset_id – Dataset ID (uses instance dataset_id if not provided)
include_quality – Whether to include quality variable
quality_variable – Name of quality variable (default: “quality_level”)
quality_mask_value – Mask data by quality value (e.g., 5 for best quality)
spatial_stride – Spatial stride for thinning data (1=all, 2=every other point, etc.)
time_stride – Temporal stride for thinning data (1=all, 2=every other day, etc.)
max_request_duration_days – Maximum time range per request (default: 31 days, about one month). Larger ranges are split into multiple requests automatically.
- Returns:
xarray.Dataset with requested data (seamlessly combined from multiple requests if needed)
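To make the stride parameters concrete, the sketch below builds the subset part of an ERDDAP griddap request, where each dimension takes the form [(start):stride:(stop)]. The helper build_griddap_query is hypothetical and not part of ecosound. It assumes the dataset's dimensions are ordered time, latitude, longitude with ascending latitude; datasets with extra dimensions (such as altitude) or descending latitude would need adjusting.

```python
from typing import List


def build_griddap_query(variables: List[str],
                        time_start: str, time_stop: str,
                        lat_min: float, lat_max: float,
                        lon_min: float, lon_max: float,
                        spatial_stride: int = 1,
                        time_stride: int = 1) -> str:
    """Build the query part of a griddap URL (hypothetical helper).

    Assumes dimensions are (time, latitude, longitude), in that order.
    """
    time_part = f"[({time_start}T00:00:00Z):{time_stride}:({time_stop}T00:00:00Z)]"
    lat_part = f"[({lat_min}):{spatial_stride}:({lat_max})]"
    lon_part = f"[({lon_min}):{spatial_stride}:({lon_max})]"
    subset = time_part + lat_part + lon_part
    # Every requested variable carries the same subset specification.
    return ",".join(f"{name}{subset}" for name in variables)


query = build_griddap_query(["sst"], "2018-01-01", "2018-01-02",
                            42, 43, -71, -69, spatial_stride=2)
print(query)
# sst[(2018-01-01T00:00:00Z):1:(2018-01-02T00:00:00Z)][(42):2:(43)][(-71):2:(-69)]
```

The full request would then look like <server>/griddap/<dataset_id>.nc?<query>. Some HTTP clients require the brackets to be percent-encoded.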
Examples
Single day:
ds = fetcher.fetch_data(
    "sea_surface_temperature",
    date="2018-01-01",
    lat_min=42, lat_max=43,
    lon_min=-71, lon_max=-69,
)
Large date range (automatically chunked into monthly requests):
ds = fetcher.fetch_data(
    ["sst", "chlorophyll"],
    start_date="2018-01-01",
    end_date="2018-06-30",
    lat_min=42, lat_max=43,
    lon_min=-71, lon_max=-69,
    max_request_duration_days=31,
)
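The automatic chunking can be pictured with a small sketch that cuts a date range into consecutive windows of at most max_request_duration_days days. The generator chunk_date_range is hypothetical and not part of ecosound. It assumes inclusive start and end dates and fixed-length windows rather than calendar months, so the real chunk boundaries may differ.

```python
from datetime import date, timedelta
from typing import Iterator, Tuple


def chunk_date_range(start: date, end: date,
                     max_days: int = 31) -> Iterator[Tuple[date, date]]:
    """Yield inclusive (chunk_start, chunk_end) pairs of at most max_days days.

    Hypothetical helper; assumes both endpoints are inclusive.
    """
    if max_days < 1:
        raise ValueError("max_days must be at least 1")
    chunk_start = start
    while chunk_start <= end:
        # A window of max_days inclusive days ends max_days - 1 days later.
        chunk_end = min(chunk_start + timedelta(days=max_days - 1), end)
        yield chunk_start, chunk_end
        chunk_start = chunk_end + timedelta(days=1)


chunks = list(chunk_date_range(date(2018, 1, 1), date(2018, 6, 30)))
print(len(chunks))  # 6
print(chunks[0])    # (datetime.date(2018, 1, 1), datetime.date(2018, 1, 31))
print(chunks[-1])   # (datetime.date(2018, 6, 5), datetime.date(2018, 6, 30))
```

Each pair would correspond to one server request, with the resulting datasets concatenated along the time dimension.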