Access Layer¶
The access layer is the primary user-facing interface of pyBDL. It provides a clean, pandas DataFrame-based API that automatically handles data conversion and normalization.
Overview¶
The access layer sits on top of the raw API clients and provides:
Automatic DataFrame conversion: All responses are converted to pandas DataFrames
Column name normalization: camelCase API fields are converted to snake_case
Data type inference: Proper types (integers, floats, booleans) are automatically detected
Nested data normalization: Complex nested structures are flattened into tabular format
The main client provides two interfaces:
Access layer (default): Returns pandas DataFrames - use bdl.levels, bdl.data, etc.
API layer: Returns raw dictionaries - use bdl.api.levels, bdl.api.data, etc.
For most users, the access layer is recommended as it provides a more Pythonic and data-analysis-friendly interface.
Quick Start¶
from pybdl import BDL, BDLConfig
# Initialize client
bdl = BDL(BDLConfig(api_key="your-api-key"))
# Access layer returns DataFrames
levels_df = bdl.levels.list_levels()
print(levels_df.head())
# Data is ready for analysis
print(levels_df.dtypes)
print(levels_df.columns)
Key Features¶
DataFrame Conversion¶
All access layer methods return pandas DataFrames, making data immediately ready for analysis:
# Get variables as DataFrame
variables_df = bdl.variables.list_variables()
# Use pandas operations directly
filtered = variables_df[variables_df['name'].str.contains('population', case=False)]
sorted_vars = variables_df.sort_values('name')
Column Name Normalization¶
API responses use camelCase (e.g., variableId, unitName), but the access layer converts these to snake_case (e.g., variable_id, unit_name) for Pythonic access:
df = bdl.variables.get_variable("3643")
# Columns are: variable_id, name, description (not variableId, Name, Description)
print(df.columns)
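The renaming is mechanical, so it can be illustrated without the library. The following is a minimal sketch using a regular expression; `camel_to_snake` is a hypothetical helper written for this illustration, not pyBDL's own function, and the library's actual rules may differ for edge cases such as acronyms.

```python
import re


def camel_to_snake(name: str) -> str:
    """Convert a camelCase field name to snake_case (illustrative helper)."""
    # Insert an underscore at every boundary between a lowercase letter
    # (or digit) and an uppercase letter, then lowercase the result.
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name).lower()


api_fields = ["variableId", "unitName", "name"]
print([camel_to_snake(field) for field in api_fields])
# ['variable_id', 'unit_name', 'name']
```

Names that are already lowercase pass through unchanged, so applying the helper to every column is safe.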
Data Type Inference¶
The access layer automatically infers and converts data types:
df = bdl.data.get_data_by_variable("3643", years=[2021])
# year column is Int64, val column is float
print(df.dtypes)
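To see what this kind of inference looks like in plain pandas, the sketch below uses `DataFrame.convert_dtypes()` on a small hand-made frame. This is an assumption about the general technique, not a description of how pyBDL implements it, and the values are made up for illustration.

```python
import pandas as pd

# Made-up records standing in for a parsed API response; every column
# starts out with the generic ``object`` dtype.
raw = pd.DataFrame(
    {
        "year": [2021, 2022],
        "val": [1000.5, 2000.25],
        "unit_name": ["Unit A", "Unit B"],
    },
    dtype=object,
)

# convert_dtypes() picks a more specific (nullable) dtype per column.
typed = raw.convert_dtypes()
print(typed.dtypes)
```

Here the `year` column becomes the nullable `Int64` dtype and `val` becomes a floating-point dtype, which matches the behavior described above.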
Nested Data Normalization¶
The data endpoints return nested structures. The access layer automatically flattens them:
# API returns: [{"id": "1", "name": "Warsaw", "values": [{"year": 2021, "val": 1000}, ...]}]
# Access layer returns flat DataFrame:
df = bdl.data.get_data_by_variable("3643", years=[2021])
# Columns: unit_id, unit_name, year, val, attr_id
print(df.head())
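The flattening step can be sketched in a few lines of plain Python. `flatten_units` is a hypothetical helper, not part of pyBDL, and the second entry in `values` is made up to show that one unit yields several rows.

```python
from typing import Any


def flatten_units(records: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Turn one record per unit into one row per (unit, value) pair."""
    rows = []
    for unit in records:
        for value in unit.get("values", []):
            # Repeat the unit-level fields on every row, then add the
            # fields of the nested value entry.
            rows.append({"unit_id": unit["id"], "unit_name": unit["name"], **value})
    return rows


nested = [
    {
        "id": "1",
        "name": "Warsaw",
        "values": [{"year": 2021, "val": 1000}, {"year": 2022, "val": 1100}],
    }
]
for row in flatten_units(nested):
    print(row)
```

A list of flat dictionaries like this can be passed straight to `pandas.DataFrame(...)`. pandas also offers `pandas.json_normalize` for the same kind of reshaping; whether pyBDL uses it internally is not stated here.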
Available Endpoints¶
The access layer provides endpoints for all BDL API resources:
| Endpoint | Access Method | Description |
|---|---|---|
| Aggregates | bdl.aggregates | Aggregation level metadata |
| Attributes | bdl.attributes | Attribute metadata |
| Data | bdl.data | Statistical data access |
| Levels | bdl.levels | Administrative unit levels |
| Measures | bdl.measures | Measure unit metadata |
| Subjects | bdl.subjects | Subject hierarchy |
| Units | bdl.units | Administrative units |
| Variables | bdl.variables | Variable metadata |
| Years | bdl.years | Available years |
Endpoint Details¶
Levels¶
Administrative unit aggregation levels (country, voivodeship, county, municipality):
# List all levels
levels_df = bdl.levels.list_levels()
# Get specific level
level_df = bdl.levels.get_level(1) # Level 1 = country
# Get metadata
metadata_df = bdl.levels.get_levels_metadata()
Subjects¶
Subject categories and hierarchy:
# List all top-level subjects
subjects_df = bdl.subjects.list_subjects()
# Get subjects under a parent
child_subjects = bdl.subjects.list_subjects(parent_id="P0001")
# Search subjects
results = bdl.subjects.search_subjects(name="population")
# Get specific subject
subject_df = bdl.subjects.get_subject("P0001")
Variables¶
Statistical variables (indicators):
# List all variables
variables_df = bdl.variables.list_variables()
# Filter variables
filtered = bdl.variables.list_variables(
    category_id="P0001",
    name="population"
)
# Search variables
results = bdl.variables.search_variables(name="unemployment")
# Get specific variable
variable_df = bdl.variables.get_variable("3643")
Data¶
Statistical data retrieval:
# Get data by variable (most common)
df = bdl.data.get_data_by_variable(
    variable_id="3643",
    years=[2021],
    unit_level=2  # Voivodeship level
)
# Get data for multiple years
df = bdl.data.get_data_by_variable(
    variable_id="3643",
    years=[2020, 2021, 2022],
    unit_level=2
)
# Get data with aggregate filter
df = bdl.data.get_data_by_variable(
    variable_id="3643",
    years=[2021],
    aggregate_id=1
)
# Get data by administrative unit
df = bdl.data.get_data_by_unit(
    unit_id="020000000000",
    variable_ids=["3643"],
    years=[2021]
)
# Get data for a locality
df = bdl.data.get_data_by_variable_locality(
    variable_id="3643",
    unit_parent_id="1465011",
    years=[2021]
)
# Get data by unit locality
df = bdl.data.get_data_by_unit_locality(
    unit_id="1465011",
    variable_id="3643",
    years=[2021]
)
The data endpoints automatically normalize nested values arrays into flat rows.
Units¶
Administrative units (regions, cities, etc.):
# List units by level
voivodeships = bdl.units.list_units(level=2) # Level 2 = voivodeship
# Search units
warsaw = bdl.units.search_units(name="Warsaw")
# Get specific unit
unit_df = bdl.units.get_unit("020000000000")
# List localities (statistical localities)
localities = bdl.units.list_localities(level=6) # Level 6 = municipality
# Search localities
warsaw_localities = bdl.units.search_localities(name="Warsaw", level=6)
# Get specific locality
locality_df = bdl.units.get_locality("1465011")
Attributes¶
Data attributes (dimensions):
# List all attributes
attributes_df = bdl.attributes.list_attributes()
# Get specific attribute
attr_df = bdl.attributes.get_attribute("1")
Measures¶
Measure units:
# List all measures
measures_df = bdl.measures.list_measures()
# Get specific measure
measure_df = bdl.measures.get_measure(1)
Aggregates¶
Aggregation types:
# List all aggregates
aggregates_df = bdl.aggregates.list_aggregates()
# Get specific aggregate
aggregate_df = bdl.aggregates.get_aggregate("1")
Years¶
Available years for data:
# List all available years
years_df = bdl.years.list_years()
# Get specific year metadata
year_df = bdl.years.get_year(2021)
Pagination¶
Most list methods support pagination:
# Fetch all pages (default, max_pages=None)
all_data = bdl.variables.list_variables()
# Fetch only first page
first_page = bdl.variables.list_variables(max_pages=1, page_size=50)
# Limit number of pages
limited = bdl.variables.list_variables(max_pages=5, page_size=100)
Parameters:
max_pages: Maximum number of pages to fetch. None (default) fetches all pages, 1 fetches only the first page, N fetches up to N pages
page_size: Number of results per page (defaults to config.page_size or 100)
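The interaction between the two parameters can be sketched with a stand-in data source. `fetch_pages` and `fake_fetch` are hypothetical names for this illustration; the loop shows the semantics described above and is not pyBDL's actual implementation.

```python
from typing import Callable, Optional


def fetch_pages(
    fetch_page: Callable[[int, int], list],
    page_size: int = 100,
    max_pages: Optional[int] = None,
) -> list:
    """Collect results page by page until the data or the page budget runs out."""
    results = []
    page = 0
    while max_pages is None or page < max_pages:
        batch = fetch_page(page, page_size)
        results.extend(batch)
        page += 1
        # A short (or empty) page means there is nothing left to fetch.
        if len(batch) < page_size:
            break
    return results


# Stand-in data source with 250 items instead of a real HTTP call.
ITEMS = list(range(250))


def fake_fetch(page: int, page_size: int) -> list:
    start = page * page_size
    return ITEMS[start:start + page_size]


print(len(fetch_pages(fake_fetch)))                              # 250
print(len(fetch_pages(fake_fetch, page_size=50, max_pages=1)))   # 50
print(len(fetch_pages(fake_fetch, page_size=100, max_pages=2)))  # 200
```

Setting max_pages to a small value is a cheap way to preview a large listing before fetching all of it.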
Async Usage¶
All access layer methods have async versions (prefixed with a):
import asyncio
from pybdl import BDL
async def main():
    bdl = BDL()
    # Async methods return DataFrames
    levels_df = await bdl.levels.alist_levels()
    variables_df = await bdl.variables.alist_variables()
    # Can run multiple requests concurrently
    levels_task = bdl.levels.alist_levels()
    variables_task = bdl.variables.alist_variables()
    levels_df, variables_df = await asyncio.gather(levels_task, variables_task)
    return levels_df, variables_df

asyncio.run(main())
Available async methods:
alist_levels(), alist_variables(), alist_subjects(), etc.
aget_level(), aget_variable(), aget_subject(), etc.
aget_data_by_variable(), aget_data_by_unit(), etc.
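When several async calls are combined with `asyncio.gather`, the results come back in the order the awaitables were passed, not in the order they finish. The sketch below shows this with two stub coroutines; `fake_alist_levels` and `fake_alist_variables` are hypothetical stand-ins so the example runs without network access.

```python
import asyncio


async def fake_alist_levels() -> list[str]:
    await asyncio.sleep(0.02)  # simulate the slower request
    return ["level rows"]


async def fake_alist_variables() -> list[str]:
    await asyncio.sleep(0.01)  # finishes first, but is listed second
    return ["variable rows"]


async def main() -> tuple[list[str], list[str]]:
    # gather() runs both coroutines concurrently and returns the results
    # in argument order, regardless of which one completes first.
    levels, variables = await asyncio.gather(
        fake_alist_levels(), fake_alist_variables()
    )
    return levels, variables


levels, variables = asyncio.run(main())
print(levels, variables)
```

This is why unpacking the result of `asyncio.gather` into named variables, as in the example above, is safe.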
Examples¶
Basic Usage¶
from pybdl import BDL, BDLConfig
bdl = BDL(BDLConfig(api_key="your-api-key"))
# Get administrative levels
levels = bdl.levels.list_levels()
print(f"Found {len(levels)} administrative levels")
# Get variables related to population
population_vars = bdl.variables.search_variables(name="population")
print(f"Found {len(population_vars)} population-related variables")
# Get data for a specific variable
data = bdl.data.get_data_by_variable(
    variable_id="3643",
    years=[2021],
    unit_level=2  # Voivodeship level
)
print(data.head())
Filtering and Analysis¶
# Get all variables
variables = bdl.variables.list_variables()
# Filter using pandas
economic_vars = variables[variables['name'].str.contains('economic', case=False)]
# Get data for multiple variables
for var_id in economic_vars['id'].head(5):
    data = bdl.data.get_data_by_variable(var_id, years=[2021])
    print(f"Variable {var_id}: {len(data)} records")
Getting Data¶
# Get data
df = bdl.data.get_data_by_variable("3643", years=[2021])
# DataFrame includes IDs and values
print(df[['unit_name', 'attr_name', 'val']].head())
# Group by attribute name
by_attr = df.groupby('attr_name')['val'].mean()
print(by_attr)
Working with Nested Data¶
The data endpoints automatically normalize nested structures:
# API returns nested structure, but access layer flattens it
df = bdl.data.get_data_by_variable("3643", years=[2021])
# Each row represents one data point
# Columns: unit_id, unit_name, year, val, attr_id, attr_name
print(df.head())
# Easy to analyze
avg_by_unit = df.groupby('unit_name')['val'].mean()
print(avg_by_unit)
# Get data for multiple years
multi_year_df = bdl.data.get_data_by_variable("3643", years=[2020, 2021, 2022])
# Analyze trends over time
yearly_avg = multi_year_df.groupby('year')['val'].mean()
print(yearly_avg)
See Examples for more comprehensive real-world examples.
API Reference¶
Access layer for converting API responses to pandas DataFrames.
- class pybdl.access.AggregatesAccess(api_client: Any)[source]¶
Bases: BaseAccess
Access layer for aggregates API, converting responses to DataFrames.
Example column renaming:
_column_renames = {
    "list_aggregates": {
        "id": "aggregate_id",
        "name": "aggregate_name",
    },
    "get_aggregate": {
        "id": "aggregate_id",
    },
}
- async aget_aggregate(aggregate_id: str, **kwargs: Any) DataFrame[source]¶
Asynchronously retrieve metadata details for a specific aggregate as a DataFrame.
- Parameters:
aggregate_id – Aggregate identifier.
**kwargs – Additional parameters passed to API layer (e.g., lang, format, extra_query).
- Returns:
DataFrame with aggregate metadata.
- async alist_aggregates(page_size: int | None = None, max_pages: int | None = None, **kwargs: Any) DataFrame[source]¶
Asynchronously list all aggregates as a DataFrame.
- Parameters:
page_size – Number of results per page (defaults to config.page_size or 100).
max_pages – Maximum number of pages to fetch (None for all pages).
**kwargs – Additional parameters passed to API layer (e.g., sort, lang, format, extra_query).
- Returns:
DataFrame with aggregates data.
- get_aggregate(aggregate_id: str, **kwargs: Any) DataFrame[source]¶
Retrieve metadata details for a specific aggregate as a DataFrame.
- Parameters:
aggregate_id – Aggregate identifier.
**kwargs – Additional parameters passed to API layer (e.g., lang, format, extra_query).
- Returns:
DataFrame with aggregate metadata.
- list_aggregates(page_size: int | None = None, max_pages: int | None = None, **kwargs: Any) DataFrame[source]¶
List all aggregates as a DataFrame.
- Parameters:
page_size – Number of results per page (defaults to config.page_size or 100).
max_pages – Maximum number of pages to fetch (None for all pages).
**kwargs – Additional parameters passed to API layer (e.g., sort, lang, format, extra_query).
- Returns:
DataFrame with aggregates data.
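The `_column_renames` mapping shown for AggregatesAccess is keyed by method name, so each method can rename a different set of columns. How BaseAccess applies the mapping is not documented here; the following is only a sketch of the idea on plain dictionaries, with `rename_columns` as a hypothetical helper and a made-up record.

```python
from typing import Any

# Per-method rename map, copied from the AggregatesAccess example.
COLUMN_RENAMES = {
    "list_aggregates": {"id": "aggregate_id", "name": "aggregate_name"},
    "get_aggregate": {"id": "aggregate_id"},
}


def rename_columns(method: str, row: dict[str, Any]) -> dict[str, Any]:
    """Rename the keys of one record using the map for the given method."""
    renames = COLUMN_RENAMES.get(method, {})
    # Keys without an entry in the map keep their original name.
    return {renames.get(key, key): value for key, value in row.items()}


record = {"id": 1, "name": "example", "level": 2}  # made-up record
print(rename_columns("list_aggregates", record))
# {'aggregate_id': 1, 'aggregate_name': 'example', 'level': 2}
print(rename_columns("get_aggregate", record))
# {'aggregate_id': 1, 'name': 'example', 'level': 2}
```

With a DataFrame, the same per-method dictionary could be passed to `DataFrame.rename(columns=...)`.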
- class pybdl.access.AttributesAccess(api_client: Any)[source]¶
Bases: BaseAccess
Access layer for attributes API, converting responses to DataFrames.
- async aget_attribute(attribute_id: str, **kwargs: Any) DataFrame[source]¶
Asynchronously retrieve metadata details for a specific attribute as a DataFrame.
- Parameters:
attribute_id – Attribute identifier.
**kwargs – Additional parameters passed to API layer (e.g., lang, format, extra_query).
- Returns:
DataFrame with attribute metadata.
- async alist_attributes(page_size: int | None = None, max_pages: int | None = None, **kwargs: Any) DataFrame[source]¶
Asynchronously list all attributes as a DataFrame.
- Parameters:
page_size – Number of results per page (defaults to config.page_size or 100).
max_pages – Maximum number of pages to fetch (None for all pages).
**kwargs – Additional parameters passed to API layer (e.g., lang, format, extra_query).
- Returns:
DataFrame with attributes data.
- get_attribute(attribute_id: str, **kwargs: Any) DataFrame[source]¶
Retrieve metadata details for a specific attribute as a DataFrame.
- Parameters:
attribute_id – Attribute identifier.
**kwargs – Additional parameters passed to API layer (e.g., lang, format, extra_query).
- Returns:
DataFrame with attribute metadata.
- list_attributes(page_size: int | None = None, max_pages: int | None = None, **kwargs: Any) DataFrame[source]¶
List all attributes as a DataFrame.
- Parameters:
page_size – Number of results per page (defaults to config.page_size or 100).
max_pages – Maximum number of pages to fetch (None for all pages).
**kwargs – Additional parameters passed to API layer (e.g., lang, format, extra_query).
- Returns:
DataFrame with attributes data.
- class pybdl.access.DataAccess(api_client: Any)[source]¶
Bases: BaseAccess
Access layer for data API, converting responses to DataFrames with nested data normalization.
- async aget_data_by_unit(unit_id: str, variable_ids: list[str], years: list[int] | None = None, aggregate_id: int | None = None, return_metadata: bool = False, **kwargs: Any) DataFrame | tuple[DataFrame, dict[str, Any]][source]¶
Asynchronously retrieve statistical data for a specific administrative unit as a DataFrame.
- Parameters:
unit_id – Identifier of the administrative unit.
variable_ids – List of variable IDs (as strings) to get results.
years – Optional list of years to filter by.
aggregate_id – Optional aggregate ID.
return_metadata – If True, return tuple (DataFrame, metadata).
**kwargs – Additional parameters passed to API layer (e.g., format, lang, extra_query).
- Returns:
DataFrame with data, or tuple (DataFrame, metadata) if return_metadata=True.
- async aget_data_by_unit_locality(unit_id: str, variable_id: list[int] | int, years: list[int] | None = None, aggregate_id: int | None = None, page_size: int | None = None, max_pages: int | None = None, return_metadata: bool = False, **kwargs: Any) DataFrame | tuple[DataFrame, dict[str, Any]][source]¶
Asynchronously retrieve data for a single statistical locality by unit as a DataFrame.
- Parameters:
unit_id – Identifier of the statistical locality.
variable_id – Variable ID or list of variable IDs to filter.
years – Optional list of years to filter by.
aggregate_id – Optional aggregate ID.
page_size – Number of results per page (defaults to config.page_size or 100).
max_pages – Maximum number of pages to fetch (None for all pages).
return_metadata – If True, return tuple (DataFrame, metadata).
**kwargs – Additional parameters passed to API layer (e.g., format, lang, extra_query).
- Returns:
DataFrame with data, or tuple (DataFrame, metadata) if return_metadata=True.
- async aget_data_by_variable(variable_id: str, years: list[int] | None = None, unit_parent_id: str | None = None, unit_level: int | None = None, aggregate_id: int | None = None, page_size: int | None = None, max_pages: int | None = None, return_metadata: bool = False, **kwargs: Any) DataFrame | tuple[DataFrame, dict[str, Any]][source]¶
Asynchronously retrieve statistical data for a specific variable as a DataFrame.
- Parameters:
variable_id – Identifier of the variable.
years – Optional list of years to filter by.
unit_parent_id – Optional parent administrative unit ID.
unit_level – Optional administrative unit aggregation level.
aggregate_id – Optional aggregate ID.
page_size – Number of results per page (defaults to config.page_size or 100).
max_pages – Maximum number of pages to fetch (None for all pages).
return_metadata – If True, return tuple (DataFrame, metadata).
**kwargs – Additional parameters passed to API layer (e.g., format, lang, extra_query).
- Returns:
DataFrame with normalized data, or tuple (DataFrame, metadata) if return_metadata is True.
- async aget_data_by_variable_locality(variable_id: str, unit_parent_id: str, years: list[int] | None = None, page_size: int | None = None, max_pages: int | None = None, return_metadata: bool = False, **kwargs: Any) DataFrame | tuple[DataFrame, dict[str, Any]][source]¶
Asynchronously retrieve data for a variable within a specific locality as a DataFrame.
- Parameters:
variable_id – Identifier of the variable.
unit_parent_id – Parent unit ID (required).
years – Optional list of years to filter by.
page_size – Number of results per page (defaults to config.page_size or 100).
max_pages – Maximum number of pages to fetch (None for all pages).
return_metadata – If True, return tuple (DataFrame, metadata).
**kwargs – Additional parameters passed to API layer (e.g., format, lang, extra_query).
- Returns:
DataFrame with data, or tuple (DataFrame, metadata) if return_metadata=True.
- get_data_by_unit(unit_id: str, variable_ids: list[str], years: list[int] | None = None, aggregate_id: int | None = None, return_metadata: bool = False, **kwargs: Any) DataFrame | tuple[DataFrame, dict[str, Any]][source]¶
Retrieve statistical data for a specific administrative unit as a DataFrame.
- Parameters:
unit_id – Identifier of the administrative unit.
variable_ids – List of variable IDs (as strings) to get results.
years – Optional list of years to filter by.
aggregate_id – Optional aggregate ID.
return_metadata – If True, return tuple (DataFrame, metadata).
**kwargs – Additional parameters passed to API layer (e.g., format, lang, extra_query).
- Returns:
DataFrame with data, or tuple (DataFrame, metadata) if return_metadata=True.
- get_data_by_unit_locality(unit_id: str, variable_id: list[int] | int, years: list[int] | None = None, aggregate_id: int | None = None, page_size: int | None = None, max_pages: int | None = None, return_metadata: bool = False, **kwargs: Any) DataFrame | tuple[DataFrame, dict[str, Any]][source]¶
Retrieve data for a single statistical locality by unit as a DataFrame.
- Parameters:
unit_id – Identifier of the statistical locality.
variable_id – Variable ID or list of variable IDs to filter.
years – Optional list of years to filter by.
aggregate_id – Optional aggregate ID.
page_size – Number of results per page (defaults to config.page_size or 100).
max_pages – Maximum number of pages to fetch (None for all pages).
return_metadata – If True, return tuple (DataFrame, metadata).
**kwargs – Additional parameters passed to API layer (e.g., format, lang, extra_query).
- Returns:
DataFrame with data, or tuple (DataFrame, metadata) if return_metadata=True.
- get_data_by_variable(variable_id: str, years: list[int] | None = None, unit_parent_id: str | None = None, unit_level: int | None = None, aggregate_id: int | None = None, page_size: int | None = None, max_pages: int | None = None, return_metadata: bool = False, **kwargs: Any) DataFrame | tuple[DataFrame, dict[str, Any]][source]¶
Retrieve statistical data for a specific variable as a DataFrame.
The nested ‘values’ array is normalized into separate rows, with each row containing: unit_id, unit_name, year, val, attr_id.
- Parameters:
variable_id – Identifier of the variable.
years – Optional list of years to filter by.
unit_parent_id – Optional parent administrative unit ID.
unit_level – Optional administrative unit aggregation level.
aggregate_id – Optional aggregate ID.
page_size – Number of results per page (defaults to config.page_size or 100).
max_pages – Maximum number of pages to fetch (None for all pages).
return_metadata – If True, return tuple (DataFrame, metadata).
**kwargs – Additional parameters passed to API layer (e.g., format, lang, extra_query).
- Returns:
DataFrame with normalized data, or tuple (DataFrame, metadata) if return_metadata is True.
- get_data_by_variable_locality(variable_id: str, unit_parent_id: str, years: list[int] | None = None, page_size: int | None = None, max_pages: int | None = None, return_metadata: bool = False, **kwargs: Any) DataFrame | tuple[DataFrame, dict[str, Any]][source]¶
Retrieve data for a variable within a specific locality as a DataFrame.
- Parameters:
variable_id – Identifier of the variable.
unit_parent_id – Parent unit ID (required).
years – Optional list of years to filter by.
page_size – Number of results per page (defaults to config.page_size or 100).
max_pages – Maximum number of pages to fetch (None for all pages).
return_metadata – If True, return tuple (DataFrame, metadata).
**kwargs – Additional parameters passed to API layer (e.g., format, lang, extra_query).
- Returns:
DataFrame with data, or tuple (DataFrame, metadata) if return_metadata=True.
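Because the DataAccess methods return either a DataFrame or a (DataFrame, metadata) tuple depending on return_metadata, calling code that accepts both shapes needs a small branch. The sketch below shows one way to normalize them; `split_result` is a hypothetical helper, a plain list stands in for the DataFrame, and the metadata content is made up because its keys are not documented here.

```python
from typing import Any


def split_result(result: Any) -> tuple[Any, dict[str, Any]]:
    """Normalize both return shapes to a (data, metadata) pair."""
    if isinstance(result, tuple):
        data, metadata = result
        return data, metadata
    # Plain result (return_metadata=False): no metadata is available.
    return result, {}


# Stand-ins for the two return shapes.
plain = ["row 1", "row 2"]
with_meta = (["row 1", "row 2"], {"note": "made-up metadata"})

print(split_result(plain))
print(split_result(with_meta))
```

In code that always passes return_metadata=True, direct tuple unpacking is simpler and the helper is unnecessary.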
- class pybdl.access.LevelsAccess(api_client: Any)[source]¶
Bases: BaseAccess
Access layer for levels API, converting responses to DataFrames.
- async aget_level(level_id: int, **kwargs: Any) DataFrame[source]¶
Asynchronously retrieve metadata for a specific aggregation level as a DataFrame.
- Parameters:
level_id – Aggregation level identifier (integer).
**kwargs – Additional parameters passed to API layer (e.g., lang, format, extra_query).
- Returns:
DataFrame with level metadata.
- async alist_levels(page_size: int | None = None, max_pages: int | None = None, **kwargs: Any) DataFrame[source]¶
Asynchronously list all administrative unit aggregation levels as a DataFrame.
- Parameters:
page_size – Number of results per page (defaults to config.page_size or 100).
max_pages – Maximum number of pages to fetch (None for all pages).
**kwargs – Additional parameters passed to API layer (e.g., sort, lang, format, extra_query).
- Returns:
DataFrame with levels data.
- get_level(level_id: int, **kwargs: Any) DataFrame[source]¶
Retrieve metadata for a specific aggregation level as a DataFrame.
- Parameters:
level_id – Aggregation level identifier (integer).
**kwargs – Additional parameters passed to API layer (e.g., lang, format, extra_query).
- Returns:
DataFrame with level metadata.
- list_levels(page_size: int | None = None, max_pages: int | None = None, **kwargs: Any) DataFrame[source]¶
List all administrative unit aggregation levels as a DataFrame.
- Parameters:
page_size – Number of results per page (defaults to config.page_size or 100).
max_pages – Maximum number of pages to fetch (None for all pages).
**kwargs – Additional parameters passed to API layer (e.g., sort, lang, format, extra_query).
- Returns:
DataFrame with levels data.
- class pybdl.access.MeasuresAccess(api_client: Any)[source]¶
Bases: BaseAccess
Access layer for measures API, converting responses to DataFrames.
- async aget_measure(measure_id: int, **kwargs: Any) DataFrame[source]¶
Asynchronously retrieve metadata for a specific measure unit as a DataFrame.
- Parameters:
measure_id – Measure unit identifier (integer).
**kwargs – Additional parameters passed to API layer (e.g., lang, format, extra_query).
- Returns:
DataFrame with measure unit metadata.
- async alist_measures(page_size: int | None = None, max_pages: int | None = None, **kwargs: Any) DataFrame[source]¶
Asynchronously list all measure units as a DataFrame.
- Parameters:
page_size – Number of results per page (defaults to config.page_size or 100).
max_pages – Maximum number of pages to fetch (None for all pages).
**kwargs – Additional parameters passed to API layer (e.g., sort, lang, format, extra_query).
- Returns:
DataFrame with measures data.
- get_measure(measure_id: int, **kwargs: Any) DataFrame[source]¶
Retrieve metadata for a specific measure unit as a DataFrame.
- Parameters:
measure_id – Measure unit identifier (integer).
**kwargs – Additional parameters passed to API layer (e.g., lang, format, extra_query).
- Returns:
DataFrame with measure unit metadata.
- list_measures(page_size: int | None = None, max_pages: int | None = None, **kwargs: Any) DataFrame[source]¶
List all measure units as a DataFrame.
- Parameters:
page_size – Number of results per page (defaults to config.page_size or 100).
max_pages – Maximum number of pages to fetch (None for all pages).
**kwargs – Additional parameters passed to API layer (e.g., sort, lang, format, extra_query).
- Returns:
DataFrame with measures data.
- class pybdl.access.SubjectsAccess(api_client: Any)[source]¶
Bases: BaseAccess
Access layer for subjects API, converting responses to DataFrames.
- async aget_subject(subject_id: str, **kwargs: Any) DataFrame[source]¶
Asynchronously retrieve metadata for a specific subject as a DataFrame.
- Parameters:
subject_id – Subject identifier.
**kwargs – Additional parameters passed to API layer (e.g., lang, format, extra_query).
- Returns:
DataFrame with subject metadata.
- async alist_subjects(parent_id: str | None = None, page_size: int | None = None, max_pages: int | None = None, **kwargs: Any) DataFrame[source]¶
Asynchronously list all subjects as a DataFrame.
- Parameters:
parent_id – Optional parent subject ID. If not specified, returns all top-level subjects.
page_size – Number of results per page (defaults to config.page_size or 100).
max_pages – Maximum number of pages to fetch (None for all pages).
**kwargs – Additional parameters passed to API layer (e.g., sort, lang, format, extra_query).
- Returns:
DataFrame with subjects data.
- async asearch_subjects(name: str, page_size: int | None = None, max_pages: int | None = None, **kwargs: Any) DataFrame[source]¶
Asynchronously search for subjects by name as a DataFrame.
- Parameters:
name – Subject name to search for.
page_size – Number of results per page (defaults to config.page_size or 100).
max_pages – Maximum number of pages to fetch (None for all pages).
**kwargs – Additional parameters passed to API layer (e.g., sort, lang, format, extra_query).
- Returns:
DataFrame with matching subjects.
- get_subject(subject_id: str, **kwargs: Any) DataFrame[source]¶
Retrieve metadata for a specific subject as a DataFrame.
- Parameters:
subject_id – Subject identifier.
**kwargs – Additional parameters passed to API layer (e.g., lang, format, extra_query).
- Returns:
DataFrame with subject metadata.
- list_subjects(parent_id: str | None = None, page_size: int | None = None, max_pages: int | None = None, **kwargs: Any) DataFrame[source]¶
List all subjects as a DataFrame.
- Parameters:
parent_id – Optional parent subject ID. If not specified, returns all top-level subjects.
page_size – Number of results per page (defaults to config.page_size or 100).
max_pages – Maximum number of pages to fetch (None for all pages).
**kwargs – Additional parameters passed to API layer (e.g., sort, lang, format, extra_query).
- Returns:
DataFrame with subjects data.
- search_subjects(name: str, page_size: int | None = None, max_pages: int | None = None, **kwargs: Any) DataFrame[source]¶
Search for subjects by name as a DataFrame.
- Parameters:
name – Subject name to search for.
page_size – Number of results per page (defaults to config.page_size or 100).
max_pages – Maximum number of pages to fetch (None for all pages).
**kwargs – Additional parameters passed to API layer (e.g., sort, lang, format, extra_query).
- Returns:
DataFrame with matching subjects.
- class pybdl.access.UnitsAccess(api_client: Any)[source]¶
Bases: BaseAccess
Access layer for units API, converting responses to DataFrames.
- async aget_locality(locality_id: str, **kwargs: Any) DataFrame[source]¶
Asynchronously retrieve metadata details for a specific statistical locality as a DataFrame.
- Parameters:
locality_id – Locality identifier.
**kwargs – Additional parameters passed to API layer (e.g., lang, format, extra_query).
- Returns:
DataFrame with locality metadata.
- async aget_unit(unit_id: str, **kwargs: Any) DataFrame[source]¶
Asynchronously retrieve metadata details for a specific administrative unit as a DataFrame.
- Parameters:
unit_id – Administrative unit identifier.
**kwargs – Additional parameters passed to API layer (e.g., lang, format, extra_query).
- Returns:
DataFrame with unit metadata.
- async alist_localities(parent_id: str | None = None, page_size: int | None = None, max_pages: int | None = None, **kwargs: Any) DataFrame[source]¶
Asynchronously list all statistical localities as a DataFrame.
- Parameters:
parent_id – Optional parent unit ID.
page_size – Number of results per page (defaults to config.page_size or 100).
max_pages – Maximum number of pages to fetch (None for all pages).
**kwargs – Additional parameters passed to API layer (e.g., name, level, sort, lang, format, extra_query).
- Returns:
DataFrame with localities data.
- async alist_units(parent_id: str | None = None, level: int | list[int] | None = None, page_size: int | None = None, max_pages: int | None = None, **kwargs: Any) DataFrame[source]¶
Asynchronously list all administrative units as a DataFrame.
- Parameters:
parent_id – Optional parent unit ID.
level – Optional administrative level (integer or list of integers).
page_size – Number of results per page (defaults to config.page_size or 100).
max_pages – Maximum number of pages to fetch (None for all pages).
**kwargs – Additional parameters passed to API layer (e.g., name, sort, lang, format, extra_query).
- Returns:
DataFrame with units data.
- async asearch_localities(name: str | None = None, years: list[int] | None = None, page_size: int | None = None, max_pages: int | None = None, **kwargs: Any) DataFrame[source]¶
Asynchronously search for statistical localities by name and optional filters as a DataFrame.
- Parameters:
name – Optional substring to search in locality name.
years – Optional list of years to filter by.
page_size – Number of results per page (defaults to config.page_size or 100).
max_pages – Maximum number of pages to fetch (None for all pages).
**kwargs – Additional parameters passed to API layer (e.g., level, parent_id, sort, lang, format, extra_query).
- Returns:
DataFrame with matching localities.
- async asearch_units(name: str | None = None, level: int | list[int] | None = None, years: list[int] | None = None, kind: str | None = None, page_size: int | None = None, max_pages: int | None = None, **kwargs: Any) DataFrame[source]¶
Asynchronously search for administrative units by name and optional filters as a DataFrame.
- Parameters:
name – Optional substring to search in unit name.
level – Optional administrative level (integer or list of integers).
years – Optional list of years to filter by.
kind – Optional unit kind filter.
page_size – Number of results per page (defaults to config.page_size or 100).
max_pages – Maximum number of pages to fetch (None for all pages).
**kwargs – Additional parameters passed to API layer (e.g., sort, lang, format, extra_query).
- Returns:
DataFrame with matching units.
- get_locality(locality_id: str, **kwargs: Any) DataFrame[source]¶
Retrieve metadata details for a specific statistical locality as a DataFrame.
- Parameters:
locality_id – Locality identifier.
**kwargs – Additional parameters passed to API layer (e.g., lang, format, extra_query).
- Returns:
DataFrame with locality metadata.
- get_unit(unit_id: str, **kwargs: Any) DataFrame[source]¶
Retrieve metadata details for a specific administrative unit as a DataFrame.
- Parameters:
unit_id – Administrative unit identifier.
**kwargs – Additional parameters passed to API layer (e.g., lang, format, extra_query).
- Returns:
DataFrame with unit metadata.
- list_localities(parent_id: str | None = None, page_size: int | None = None, max_pages: int | None = None, **kwargs: Any) DataFrame[source]¶
List all statistical localities as a DataFrame.
- Parameters:
parent_id – Optional parent unit ID.
page_size – Number of results per page (defaults to config.page_size or 100).
max_pages – Maximum number of pages to fetch (None for all pages).
**kwargs – Additional parameters passed to API layer (e.g., name, level, sort, lang, format, extra_query).
- Returns:
DataFrame with localities data.
- list_units(parent_id: str | None = None, level: int | list[int] | None = None, page_size: int | None = None, max_pages: int | None = None, **kwargs: Any) DataFrame[source]¶
List all administrative units as a DataFrame.
- Parameters:
parent_id – Optional parent unit ID.
level – Optional administrative level (integer or list of integers).
page_size – Number of results per page (defaults to config.page_size or 100).
max_pages – Maximum number of pages to fetch (None for all pages).
**kwargs – Additional parameters passed to API layer (e.g., name, sort, lang, format, extra_query).
- Returns:
DataFrame with units data.
- search_localities(name: str | None = None, years: list[int] | None = None, page_size: int | None = None, max_pages: int | None = None, **kwargs: Any) -> DataFrame
Search for statistical localities by name and optional filters as a DataFrame.
- Parameters:
name – Optional substring to search in locality name.
years – Optional list of years to filter by.
page_size – Number of results per page (defaults to config.page_size or 100).
max_pages – Maximum number of pages to fetch (None for all pages).
**kwargs – Additional parameters passed to API layer (e.g., level, parent_id, sort, lang, format, extra_query).
- Returns:
DataFrame with matching localities.
- search_units(name: str | None = None, level: int | list[int] | None = None, years: list[int] | None = None, kind: str | None = None, page_size: int | None = None, max_pages: int | None = None, **kwargs: Any) -> DataFrame
Search for administrative units by name and optional filters as a DataFrame.
- Parameters:
name – Optional substring to search in unit name.
level – Optional administrative level (integer or list of integers).
years – Optional list of years to filter by.
kind – Optional unit kind filter.
page_size – Number of results per page (defaults to config.page_size or 100).
max_pages – Maximum number of pages to fetch (None for all pages).
**kwargs – Additional parameters passed to API layer (e.g., sort, lang, format, extra_query).
- Returns:
DataFrame with matching units.
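The page_size and max_pages parameters recur across the list and search methods above. The following is a minimal sketch of the semantics their descriptions suggest (fetch pages until the data runs out, or stop early once max_pages is reached). It is not pyBDL's implementation: fetch_all, fake_fetch_page, and FAKE_UNITS are hypothetical names, and the in-memory list stands in for a paginated endpoint.

```python
from __future__ import annotations

from typing import Any, Callable


def fetch_all(
    fetch_page: Callable[[int, int], list[dict[str, Any]]],
    page_size: int = 100,
    max_pages: int | None = None,
) -> list[dict[str, Any]]:
    """Collect rows page by page until the source is exhausted or max_pages is hit."""
    rows: list[dict[str, Any]] = []
    page = 0
    # max_pages=None means "keep going until the source runs dry".
    while max_pages is None or page < max_pages:
        batch = fetch_page(page, page_size)
        rows.extend(batch)
        if len(batch) < page_size:  # a short or empty page means no more data
            break
        page += 1
    return rows


# Hypothetical in-memory data standing in for a paginated endpoint.
FAKE_UNITS = [{"id": str(i), "name": f"Unit {i}"} for i in range(250)]


def fake_fetch_page(page: int, page_size: int) -> list[dict[str, Any]]:
    """Return one slice of FAKE_UNITS, as a paginated API would."""
    start = page * page_size
    return FAKE_UNITS[start:start + page_size]


all_rows = fetch_all(fake_fetch_page, page_size=100)
first_two_pages = fetch_all(fake_fetch_page, page_size=100, max_pages=2)
print(len(all_rows), len(first_two_pages))  # 250 200
```

Under this reading, leaving max_pages at None retrieves everything, while a small max_pages is a cheap way to preview a large result set before committing to a full download.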
- class pybdl.access.VariablesAccess(api_client: Any)
Bases: BaseAccess
Access layer for variables API, converting responses to DataFrames.
- async aget_variable(variable_id: str, **kwargs: Any) -> DataFrame
Asynchronously retrieve metadata details for a specific variable as a DataFrame.
- Parameters:
variable_id – Variable identifier.
**kwargs – Additional parameters passed to API layer (e.g., lang, format, extra_query).
- Returns:
DataFrame with variable metadata.
- async alist_variables(subject_id: str | None = None, level: int | None = None, years: list[int] | None = None, page_size: int | None = None, max_pages: int | None = None, **kwargs: Any) -> DataFrame
Asynchronously list all variables as a DataFrame.
- Parameters:
subject_id – Optional subject ID to filter variables.
level – Optional level to filter variables.
years – Optional list of years to filter variables.
page_size – Number of results per page (defaults to config.page_size or 100).
max_pages – Maximum number of pages to fetch (None for all pages).
**kwargs – Additional parameters passed to API layer (e.g., sort, lang, format, extra_query).
- Returns:
DataFrame with variables data.
- async asearch_variables(name: str | None = None, subject_id: str | None = None, level: int | None = None, years: list[int] | None = None, page_size: int | None = None, max_pages: int | None = None, **kwargs: Any) -> DataFrame
Asynchronously search for variables by name and optional filters as a DataFrame.
- Parameters:
name – Optional substring to search in variable name.
subject_id – Optional subject ID to filter variables.
level – Optional level to filter variables.
years – Optional list of years to filter variables.
page_size – Number of results per page (defaults to config.page_size or 100).
max_pages – Maximum number of pages to fetch (None for all pages).
**kwargs – Additional parameters passed to API layer (e.g., sort, lang, format, extra_query).
- Returns:
DataFrame with matching variables.
- get_variable(variable_id: str, **kwargs: Any) -> DataFrame
Retrieve metadata details for a specific variable as a DataFrame.
- Parameters:
variable_id – Variable identifier.
**kwargs – Additional parameters passed to API layer (e.g., lang, format, extra_query).
- Returns:
DataFrame with variable metadata.
- list_variables(subject_id: str | None = None, level: int | None = None, years: list[int] | None = None, page_size: int | None = None, max_pages: int | None = None, **kwargs: Any) -> DataFrame
List all variables as a DataFrame.
- Parameters:
subject_id – Optional subject ID to filter variables.
level – Optional level to filter variables.
years – Optional list of years to filter variables.
page_size – Number of results per page (defaults to config.page_size or 100).
max_pages – Maximum number of pages to fetch (None for all pages).
**kwargs – Additional parameters passed to API layer (e.g., sort, lang, format, extra_query).
- Returns:
DataFrame with variables data.
- search_variables(name: str | None = None, subject_id: str | None = None, level: int | None = None, years: list[int] | None = None, page_size: int | None = None, max_pages: int | None = None, **kwargs: Any) -> DataFrame
Search for variables by name and optional filters as a DataFrame.
- Parameters:
name – Optional substring to search in variable name.
subject_id – Optional subject ID to filter variables.
level – Optional level to filter variables.
years – Optional list of years to filter variables.
page_size – Number of results per page (defaults to config.page_size or 100).
max_pages – Maximum number of pages to fetch (None for all pages).
**kwargs – Additional parameters passed to API layer (e.g., sort, lang, format, extra_query).
- Returns:
DataFrame with matching variables.
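The async variants listed above (aget_variable, alist_variables, asearch_variables) are coroutines, so several requests can be awaited concurrently. The sketch below shows the general asyncio pattern only. StubVariablesAccess is a hypothetical stand-in for bdl.variables that returns plain dictionaries rather than DataFrames, and the identifiers "demo-1" and "demo-2" are placeholders, not real variable IDs.

```python
from __future__ import annotations

import asyncio
from typing import Any


class StubVariablesAccess:
    """Hypothetical stand-in for bdl.variables; returns dicts, not DataFrames."""

    async def aget_variable(self, variable_id: str, **kwargs: Any) -> dict[str, Any]:
        await asyncio.sleep(0.01)  # simulate network latency
        return {"variable_id": variable_id, "lang": kwargs.get("lang")}


async def fetch_many(access: StubVariablesAccess, ids: list[str]) -> list[dict[str, Any]]:
    """Request metadata for several variables at once."""
    # asyncio.gather runs the coroutines concurrently and keeps input order.
    return await asyncio.gather(*(access.aget_variable(i, lang="en") for i in ids))


results = asyncio.run(fetch_many(StubVariablesAccess(), ["3643", "demo-1", "demo-2"]))
print([r["variable_id"] for r in results])  # ['3643', 'demo-1', 'demo-2']
```

With the real access layer, each awaited call would yield a DataFrame, and the gathered results could then be combined with pandas.concat. Whether the real client needs additional setup or cleanup around async calls is not covered here.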
- class pybdl.access.YearsAccess(api_client: Any)
Bases: BaseAccess
Access layer for years API, converting responses to DataFrames.
- async aget_year(year_id: int, **kwargs: Any) -> DataFrame
Asynchronously retrieve metadata for a specific year as a DataFrame.
- Parameters:
year_id – Year identifier (integer, e.g. 2020).
**kwargs – Additional parameters passed to API layer (e.g., lang, format, extra_query).
- Returns:
DataFrame with year metadata.
- async alist_years(page_size: int | None = None, max_pages: int | None = None, **kwargs: Any) -> DataFrame
Asynchronously list all available years as a DataFrame.
- Parameters:
page_size – Number of results per page (defaults to config.page_size or 100).
max_pages – Maximum number of pages to fetch (None for all pages).
**kwargs – Additional parameters passed to API layer (e.g., sort, lang, format, extra_query).
- Returns:
DataFrame with available years.
- get_year(year_id: int, **kwargs: Any) -> DataFrame
Retrieve metadata for a specific year as a DataFrame.
- Parameters:
year_id – Year identifier (integer, e.g. 2020).
**kwargs – Additional parameters passed to API layer (e.g., lang, format, extra_query).
- Returns:
DataFrame with year metadata.
- list_years(page_size: int | None = None, max_pages: int | None = None, **kwargs: Any) -> DataFrame
List all available years as a DataFrame.
- Parameters:
page_size – Number of results per page (defaults to config.page_size or 100).
max_pages – Maximum number of pages to fetch (None for all pages).
**kwargs – Additional parameters passed to API layer (e.g., sort, lang, format, extra_query).
- Returns:
DataFrame with available years.
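The parameter descriptions above say that page_size "defaults to config.page_size or 100" and that extra options travel through **kwargs. One plausible reading of that fallback order is sketched below. This is an assumption about the behaviour, not the library's code: resolve_page_size, build_query, and the "page_size" query key are hypothetical names chosen for illustration.

```python
from __future__ import annotations

from typing import Any

DEFAULT_PAGE_SIZE = 100  # the fallback value named in the parameter descriptions


def resolve_page_size(page_size: int | None, config_page_size: int | None) -> int:
    """Explicit argument wins, then the configured value, then the default of 100."""
    if page_size is not None:
        return page_size
    if config_page_size is not None:
        return config_page_size
    return DEFAULT_PAGE_SIZE


def build_query(
    page_size: int | None = None,
    config_page_size: int | None = None,
    **kwargs: Any,
) -> dict[str, Any]:
    """Merge the resolved page size with extra options, skipping None values."""
    query: dict[str, Any] = {"page_size": resolve_page_size(page_size, config_page_size)}
    query.update({key: value for key, value in kwargs.items() if value is not None})
    return query


print(build_query(lang="en", sort=None))               # {'page_size': 100, 'lang': 'en'}
print(build_query(page_size=25, config_page_size=50))  # {'page_size': 25}
```

Treating None as "not specified" keeps call sites short: callers pass only the filters they care about and everything else falls back to configuration.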
See also
Main Client for main client usage
API Clients for low-level API access
Examples for real-world examples
Configuration for configuration options