Rate Limiting

pyBDL includes a built-in rate limiter that automatically enforces the BDL API provider’s quotas. It supports both synchronous and asynchronous operations, persistent quota tracking, and a choice between waiting and raising an exception when a limit is reached.

Overview

The rate limiting system enforces multiple quota periods simultaneously (per second, per 15 minutes, per 12 hours, per 7 days) as specified by the BDL API provider. It automatically tracks quota usage and can either wait for quota to become available or raise exceptions when limits are exceeded.
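To make "multiple quota periods simultaneously" concrete, here is a minimal sketch of a sliding-window limiter that checks every window before admitting a call. It is an illustration only, not pyBDL's actual implementation: the class name MultiWindowLimiter and its methods are hypothetical, and the current time is passed in explicitly so the behavior is deterministic.

```python
from collections import deque


class MultiWindowLimiter:
    """Illustrative sliding-window limiter; not pyBDL's actual implementation."""

    def __init__(self, quotas: dict[int, int]) -> None:
        # quotas maps period length in seconds -> max calls in that period
        self.quotas = quotas
        self.calls: dict[int, deque] = {period: deque() for period in quotas}

    def wait_time(self, now: float) -> float:
        """Return 0.0 if a call is allowed now, else seconds until it would be."""
        wait = 0.0
        for period, limit in self.quotas.items():
            window = self.calls[period]
            # Drop timestamps that have left this window
            while window and window[0] <= now - period:
                window.popleft()
            if len(window) >= limit:
                # The oldest call must age out before a new one fits
                wait = max(wait, window[0] + period - now)
        return wait

    def record(self, now: float) -> None:
        """Record one call in every window."""
        for window in self.calls.values():
            window.append(now)


# Two calls per second and three calls per ten seconds
limiter = MultiWindowLimiter({1: 2, 10: 3})
for t in (0.0, 0.1):
    limiter.record(t)

# The 1-second window is now full, so a call at t=0.2 has to wait
blocked_for = limiter.wait_time(0.2)
```

The longest wait across all windows wins, which is why a request can be blocked by the 7-day quota even when the per-second quota has room.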

Key Features

  • Automatic enforcement: Rate limiting is built into all API calls

  • Multiple quota periods: Enforces limits across different time windows simultaneously

  • Persistent cache: Quota usage survives process restarts

  • Sync & async support: Works seamlessly with both synchronous and asynchronous code

  • Configurable behavior: Choose to wait or raise exceptions when limits are exceeded

  • Shared state: Sync and async limiters share quota state via persistent cache

Default Quotas

The rate limiter enforces the following default quotas based on user registration status:

Period    Anonymous user    Registered user
1s        5                 10
15m       100               500
12h       1,000             5,000
7d        10,000            50,000

These limits are automatically applied based on whether you provide an API key (registered user) or not (anonymous user).

Registration Status Detection

The library automatically determines your registration status:

  • Anonymous user: When api_key is None or not provided in BDLConfig

  • Registered user: When api_key is provided in BDLConfig

The rate limiter uses separate quota tracking for registered and anonymous users, ensuring that each user type gets the correct limits.
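The selection rule can be sketched in a few lines. The table below is a local copy of the default quotas listed above, laid out as (anonymous, registered) pairs in the same way the examples in this guide index DEFAULT_QUOTAS; the function name select_quotas is hypothetical and exists only for this illustration.

```python
from typing import Optional

# Local copy of the default quota table: period in seconds -> (anonymous, registered).
# Defined here for illustration only; the library ships its own DEFAULT_QUOTAS.
QUOTA_TABLE = {
    1: (5, 10),
    900: (100, 500),
    43200: (1_000, 5_000),
    604800: (10_000, 50_000),
}


def select_quotas(api_key: Optional[str]) -> dict[int, int]:
    """Pick the per-period limits for the given registration status."""
    is_registered = api_key is not None
    index = 1 if is_registered else 0
    return {period: limits[index] for period, limits in QUOTA_TABLE.items()}


anonymous = select_quotas(None)
registered = select_quotas("your-api-key")
```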

User Guide

Basic Usage

Rate limiting is automatically handled by the library. Simply use the API client normally:

from pybdl import BDL, BDLConfig

config = BDLConfig(api_key="your-api-key")
bdl = BDL(config)

# Rate limiting is automatic - no extra code needed
data = bdl.api.data.get_data_by_variable(variable_id="3643", years=[2021])

The rate limiter will automatically:

  • Track your API usage across all calls

  • Enforce quota limits

  • Raise exceptions if limits are exceeded (default behavior)

Handling Rate Limit Errors

By default, the rate limiter raises a RateLimitError when quota is exceeded:

from pybdl.api.utils.rate_limiter import RateLimitError

try:
    data = bdl.api.data.get_data_by_variable(variable_id="3643", years=[2021])
except RateLimitError as e:
    print(f"Rate limit exceeded. Retry after {e.retry_after:.1f} seconds")
    print(f"Limit info: {e.limit_info}")

The exception includes:

  • retry_after: Number of seconds to wait before retrying

  • limit_info: Dictionary with detailed quota information
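One way to act on retry_after is a small retry helper. The sketch below is self-contained, so it defines a stand-in RateLimitError that carries only the retry_after attribute; in real code you would import the exception from pybdl.api.utils.rate_limiter instead. The helper name call_with_retry and its parameters are assumptions made for this example.

```python
import time


class RateLimitError(Exception):
    """Stand-in exposing the retry_after attribute described above."""

    def __init__(self, retry_after: float) -> None:
        super().__init__(f"retry after {retry_after:.1f}s")
        self.retry_after = retry_after


def call_with_retry(func, max_attempts: int = 3, sleep=time.sleep):
    """Call func, sleeping for retry_after between rate-limited attempts."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except RateLimitError as exc:
            if attempt == max_attempts:
                raise  # Out of attempts: let the caller see the error
            sleep(exc.retry_after)


# Demo: a fake call that is rate limited twice, then succeeds
attempts = {"count": 0}


def flaky_call():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RateLimitError(retry_after=0.01)
    return "ok"


result = call_with_retry(flaky_call)
```

Capping the number of attempts keeps a long quota window (12 hours or 7 days) from turning into an unbounded loop.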

Waiting Instead of Raising

You can configure the rate limiter to wait automatically instead of raising exceptions. This requires creating a custom rate limiter:

from pybdl.api.utils.rate_limiter import RateLimiter, PersistentQuotaCache
from pybdl.config import DEFAULT_QUOTAS

# Create a rate limiter that waits up to 30 seconds
cache = PersistentQuotaCache(enabled=True)
quotas = {k: v[1] for k, v in DEFAULT_QUOTAS.items()}  # Registered user quotas
limiter = RateLimiter(
    quotas=quotas,
    is_registered=True,
    cache=cache,
    raise_on_limit=False,  # Wait instead of raising
    max_delay=30.0  # Maximum wait time in seconds
)

# Use the limiter before making API calls
limiter.acquire()
data = bdl.api.data.get_data_by_variable(variable_id="3643", years=[2021])
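How raise_on_limit and max_delay interact can be summarized as a small decision function. This is a sketch based on the behavior documented for acquire() in the API Reference, not the library's code; the function name decide and the exact order of the checks are assumptions.

```python
from typing import Optional


def decide(wait_time: float, raise_on_limit: bool, max_delay: Optional[float]) -> str:
    """Return the action the documented rules imply for a pending request."""
    if wait_time <= 0:
        return "proceed"  # Quota is available right now
    if raise_on_limit:
        return "raise RateLimitError"
    if max_delay is not None and wait_time > max_delay:
        return "raise RateLimitDelayExceeded"
    return "wait"


# With the limiter configured above (raise_on_limit=False, max_delay=30.0)
short_wait = decide(wait_time=5.0, raise_on_limit=False, max_delay=30.0)
long_wait = decide(wait_time=120.0, raise_on_limit=False, max_delay=30.0)
```

Here short_wait is "wait" and long_wait is "raise RateLimitDelayExceeded", so a 30-second cap still surfaces an error when a longer quota window is exhausted.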

Using Context Managers

Rate limiters can be used as context managers for cleaner code:

from pybdl.api.utils.rate_limiter import RateLimiter, PersistentQuotaCache
from pybdl.config import DEFAULT_QUOTAS

cache = PersistentQuotaCache(enabled=True)
quotas = {k: v[1] for k, v in DEFAULT_QUOTAS.items()}
limiter = RateLimiter(quotas, is_registered=True, cache=cache)

# Automatically acquires quota when entering context
with limiter:
    data = bdl.api.data.get_data_by_variable(variable_id="3643", years=[2021])

Using Decorators

You can decorate functions to automatically rate limit them:

from pybdl.api.utils.rate_limiter import rate_limit
from pybdl.config import DEFAULT_QUOTAS

quotas = {k: v[1] for k, v in DEFAULT_QUOTAS.items()}

@rate_limit(quotas=quotas, is_registered=True, max_delay=10)
def fetch_data(variable_id: str, year: int):
    return bdl.api.data.get_data_by_variable(variable_id=variable_id, years=[year])

# Function is automatically rate limited
data = fetch_data("3643", 2021)

For async functions:

from pybdl.api.utils.rate_limiter import async_rate_limit
from pybdl.config import DEFAULT_QUOTAS

quotas = {k: v[1] for k, v in DEFAULT_QUOTAS.items()}

@async_rate_limit(quotas=quotas, is_registered=True)
async def async_fetch_data(variable_id: str, year: int):
    return await bdl.api.data.aget_data_by_variable(variable_id=variable_id, years=[year])
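Conceptually, a rate-limiting decorator just acquires a slot before each call. The following sketch shows that shape with a hypothetical stand-in limiter that only counts acquisitions; limited_by and CountingLimiter are illustrative names, and the real rate_limit decorator builds its own limiter from the quotas you pass.

```python
import functools


class CountingLimiter:
    """Hypothetical stand-in exposing only acquire(), for illustration."""

    def __init__(self) -> None:
        self.acquired = 0

    def acquire(self) -> None:
        self.acquired += 1


def limited_by(limiter):
    """Build a decorator that acquires a slot before every call."""

    def decorator(func):
        @functools.wraps(func)  # Keep the wrapped function's name and docstring
        def wrapper(*args, **kwargs):
            limiter.acquire()
            return func(*args, **kwargs)

        return wrapper

    return decorator


limiter = CountingLimiter()


@limited_by(limiter)
def add(a: int, b: int) -> int:
    return a + b


total = add(2, 3)
```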

Checking Remaining Quota

You can check how much quota remains before making API calls:

from pybdl import BDL, BDLConfig

bdl = BDL(BDLConfig(api_key="your-api-key"))

# Get remaining quota (requires accessing the internal limiter)
remaining = bdl._client._sync_limiter.get_remaining_quota()
print(f"Remaining requests per second: {remaining.get(1, 0)}")
print(f"Remaining requests per 15 minutes: {remaining.get(900, 0)}")
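Because the returned dictionary is keyed by period length in seconds, a small formatter can make it easier to read in logs. This sketch assumes only the {period_seconds: remaining} shape used in the example above; the helper names and the sample values are made up.

```python
def format_period(seconds: int) -> str:
    """Render a period length in the largest unit that divides it evenly."""
    for unit_seconds, suffix in ((86400, "d"), (3600, "h"), (60, "m")):
        if seconds % unit_seconds == 0:
            return f"{seconds // unit_seconds}{suffix}"
    return f"{seconds}s"


def summarize(remaining: dict[int, int]) -> list[str]:
    """One line per period, shortest period first."""
    return [
        f"{format_period(period)}: {count} left"
        for period, count in sorted(remaining.items())
    ]


# Sample values in the shape returned by get_remaining_quota(); made up here
sample = {900: 480, 1: 9, 43200: 4980, 604800: 49980}
lines = summarize(sample)
```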

Custom Quotas

You can override default quotas for testing or special deployments:

from pybdl import BDLConfig

# Custom quotas: period in seconds -> limit
custom_quotas = {
    1: 20,        # 20 requests per second
    900: 500,     # 500 requests per 15 minutes
    43200: 2000,  # 2000 requests per 12 hours
    604800: 20000 # 20000 requests per 7 days
}

config = BDLConfig(api_key="your-api-key", custom_quotas=custom_quotas)
bdl = BDL(config)

Or via environment variable:

export BDL_QUOTAS='{"1": 20, "900": 500}'
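JSON object keys are always strings, which is why the periods are quoted in the environment variable while the Python dictionary uses integers. How pyBDL parses the variable internally is not shown here; the sketch below only illustrates the conversion, and parse_quota_env is a hypothetical helper.

```python
import json
import os


def parse_quota_env(raw: str) -> dict[int, int]:
    """Convert a JSON object with string keys into {period_seconds: limit}."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("quota override must be a JSON object")
    return {int(period): int(limit) for period, limit in data.items()}


# Fall back to the value from the export example when the variable is unset
raw = os.environ.get("BDL_QUOTAS", '{"1": 20, "900": 500}')
quotas = parse_quota_env(raw)
```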

Persistent Cache

The rate limiter uses a persistent cache to track quota usage across process restarts. The cache is stored in:

  • Project-local: .cache/pybdl/quota_cache.json (default)

  • Global: Platform-specific cache directory (e.g., ~/.cache/pybdl/quota_cache.json on Linux)

You can disable persistent caching:

from pybdl import BDLConfig

config = BDLConfig(api_key="your-api-key", quota_cache_enabled=False)
bdl = BDL(config)
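For readers curious how a JSON-backed quota cache can survive restarts and tolerate a damaged file, here is a minimal sketch. It is not the library's cache: load_cache and save_cache are hypothetical helpers, the key "reg_1" is invented, and the demo writes to a temporary directory instead of the real cache location.

```python
import json
import os
import tempfile
from pathlib import Path


def load_cache(path: Path) -> dict:
    """Read the cache, treating a missing or unreadable file as empty."""
    try:
        return json.loads(path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return {}


def save_cache(path: Path, data: dict) -> None:
    """Write to a temporary file, then rename it over the target."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(data, handle)
    os.replace(tmp_name, path)  # Atomic rename on the same filesystem


with tempfile.TemporaryDirectory() as tmp:
    cache_file = Path(tmp) / "pybdl" / "quota_cache.json"
    save_cache(cache_file, {"reg_1": [1700000000.0]})
    restored = load_cache(cache_file)  # Same data a restarted process would see

    cache_file.write_text("{not json", encoding="utf-8")
    recovered = load_cache(cache_file)  # Corrupted file falls back to empty
```

Writing to a temporary file and renaming it means a crash mid-write leaves the previous cache intact rather than a half-written file.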

Sync and Async Sharing

Both synchronous and asynchronous rate limiters share the same quota state via the persistent cache. This means:

  • Sync and async API calls count toward the same limits

  • Quota usage persists across different execution contexts

  • Process restarts maintain quota state

Technical Details

For technical implementation details, including architecture, algorithm, thread safety, cache implementation, and configuration options, see Appendix: Technical Implementation Details.

API Reference

Rate limiting utilities for pyBDL API client.

This module provides thread-safe rate limiting for both synchronous and asynchronous API requests. It enforces multiple quota periods simultaneously and supports persistent quota tracking across process restarts.

Key Components:
  • RateLimiter: Thread-safe synchronous rate limiter

  • AsyncRateLimiter: Asyncio-compatible asynchronous rate limiter

  • PersistentQuotaCache: Thread-safe persistent storage for quota usage

  • rate_limit: Decorator for rate-limiting synchronous functions

  • async_rate_limit: Decorator for rate-limiting asynchronous functions

Exceptions:
  • GUSBDLError: Base exception for all GUS BDL API errors

  • RateLimitError: Raised when rate limit is exceeded

  • RateLimitDelayExceeded: Raised when required delay exceeds max_delay

Example

Basic usage with automatic rate limiting:

from pybdl import BDL, BDLConfig
bdl = BDL(BDLConfig(api_key="your-api-key"))
data = bdl.api.data.get_data_by_variable(variable_id="3643", years=[2021])

Using a custom rate limiter with wait behavior:

from pybdl.api.utils.rate_limiter import RateLimiter, PersistentQuotaCache
from pybdl.config import DEFAULT_QUOTAS

cache = PersistentQuotaCache(enabled=True)
quotas = {k: v[1] for k, v in DEFAULT_QUOTAS.items()}
limiter = RateLimiter(
    quotas=quotas,
    is_registered=True,
    cache=cache,
    raise_on_limit=False,
    max_delay=30.0
)

with limiter:
    # Make API call here
    pass

Using decorators:

from pybdl.api.utils.rate_limiter import rate_limit

@rate_limit(quotas={1: 10, 900: 500}, is_registered=True)
def fetch_data():
    return api_call()


class pybdl.api.utils.rate_limiter.AsyncRateLimiter(quotas: dict[int, int | tuple], is_registered: bool, cache: PersistentQuotaCache | None = None, max_delay: float | None = None, raise_on_limit: bool = True, buffer_seconds: float = 0.05)

Bases: object

Asyncio-compatible rate limiter for API requests.

Enforces multiple quota periods and persists usage if a cache is provided.

async acquire() -> None

Acquire a slot for an API request asynchronously.

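If the rate limit is exceeded:

  • If raise_on_limit=True: raises RateLimitError immediately

  • If raise_on_limit=False: waits until quota is available

  • If max_delay is set and wait_time > max_delay: raises RateLimitDelayExceeded

Raises:
  • RateLimitError – If the rate limit is exceeded and raise_on_limit is True.

  • RateLimitDelayExceeded – If the required wait exceeds max_delay.

get_remaining_quota() -> dict[int, int]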

Get remaining quota for each period.

async get_remaining_quota_async() -> dict[int, int]

Get remaining quota for each period (async version).

reset() -> None

Reset all quota counters.

async reset_async() -> None

Reset all quota counters (async version).

exception pybdl.api.utils.rate_limiter.GUSBDLError

Bases: Exception

Base exception for all GUS BDL API errors.

class pybdl.api.utils.rate_limiter.PersistentQuotaCache(enabled: bool = True)

Bases: object

Persistent cache for API quota usage, stored on disk.

This class provides thread-safe, persistent storage for quota usage data, allowing rate limiters to survive process restarts and share state between sessions.

get(key: str) -> Any

Retrieve a cached value by key.

Parameters:

key – Cache key.

Returns:

Cached value, or [] if not found or disabled.

set(key: str, value: Any) -> None

Set a cached value by key and persist it.

Parameters:
  • key – Cache key.

  • value – Value to store.

try_append_if_under_limit(key: str, value: float, max_length: int, cleanup_older_than: float | None = None) -> bool

Atomically try to append a value to a cached list if it wouldn’t exceed max_length.

This prevents race conditions when multiple limiters try to record calls simultaneously. The entire operation (get, check, append, save) happens atomically under the cache lock.

Parameters:
  • key – Cache key.

  • value – Value to append (typically a timestamp).

  • max_length – Maximum length allowed.

  • cleanup_older_than – If provided, remove values older than this timestamp.

Returns:

True if append succeeded, False if it would exceed the limit.
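The check-then-append pattern described above can be sketched with an in-memory store and a lock. This is an illustration under assumptions, not the library's code: InMemoryQuotaStore is a hypothetical stand-in that skips the disk persistence step, and it treats values equal to the cutoff as still current.

```python
import threading
from typing import Optional


class InMemoryQuotaStore:
    """Illustrative stand-in; the real cache also persists to disk."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._data: dict = {}

    def try_append_if_under_limit(
        self,
        key: str,
        value: float,
        max_length: int,
        cleanup_older_than: Optional[float] = None,
    ) -> bool:
        with self._lock:  # get, check, and append happen as one step
            entries = list(self._data.get(key, []))
            if cleanup_older_than is not None:
                # Keep only values at or after the cutoff timestamp
                entries = [t for t in entries if t >= cleanup_older_than]
            allowed = len(entries) < max_length
            if allowed:
                entries.append(value)
            self._data[key] = entries
            return allowed


store = InMemoryQuotaStore()
results = [
    store.try_append_if_under_limit("calls", float(t), max_length=2)
    for t in range(3)
]
# Dropping the timestamp at 0.0 frees one slot again
after_cleanup = store.try_append_if_under_limit(
    "calls", 10.0, max_length=2, cleanup_older_than=0.5
)
```

Without the lock, two limiters could both read a list with one free slot and both append, which is the race the atomic method is meant to prevent.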

exception pybdl.api.utils.rate_limiter.RateLimitDelayExceeded(actual_delay: float, max_delay: float, limit_info: dict[str, Any] | None = None)

Bases: RateLimitError

Raised when required delay exceeds max_delay setting.

exception pybdl.api.utils.rate_limiter.RateLimitError(retry_after: float, limit_info: dict[str, Any] | None = None, message: str | None = None)

Bases: GUSBDLError

Raised when rate limit is exceeded.

class pybdl.api.utils.rate_limiter.RateLimiter(quotas: dict[int, int | tuple], is_registered: bool, cache: PersistentQuotaCache | None = None, max_delay: float | None = None, raise_on_limit: bool = True, buffer_seconds: float = 0.05)

Bases: object

Thread-safe synchronous rate limiter for API requests.

Enforces multiple quota periods (e.g., per second, per minute) and persists usage if a cache is provided.

acquire() -> None

Acquire a slot for an API request.

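If the rate limit is exceeded:

  • If raise_on_limit=True: raises RateLimitError immediately

  • If raise_on_limit=False: sleeps until quota is available

  • If max_delay is set and wait_time > max_delay: raises RateLimitDelayExceeded

Raises:
  • RateLimitError – If the rate limit is exceeded and raise_on_limit is True.

  • RateLimitDelayExceeded – If the required wait exceeds max_delay.

get_remaining_quota() -> dict[int, int]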

Get remaining quota for each period.

reset() -> None

Reset all quota counters.

pybdl.api.utils.rate_limiter.async_rate_limit(quotas: dict[int, int | tuple], is_registered: bool, **limiter_kwargs: Any) -> Callable[[Callable[[...], Awaitable[T]]], Callable[[...], Awaitable[T]]]

Decorator for rate-limiting async functions.

Parameters:
  • quotas – Dictionary of {period_seconds: limit or (anon_limit, reg_limit)}.

  • is_registered – Whether the user is registered (affects quota).

  • **limiter_kwargs – Additional arguments to pass to AsyncRateLimiter (e.g., max_delay, raise_on_limit).

Returns:

Decorator function.

Example:

@async_rate_limit(quotas={60: 100}, is_registered=True)
async def async_fetch_dataset(dataset_id):
    return await api.get(f"/datasets/{dataset_id}")

pybdl.api.utils.rate_limiter.rate_limit(quotas: dict[int, int | tuple], is_registered: bool, **limiter_kwargs: Any) -> Callable[[Callable[[...], T]], Callable[[...], T]]

Decorator for rate-limiting functions.

Parameters:
  • quotas – Dictionary of {period_seconds: limit or (anon_limit, reg_limit)}.

  • is_registered – Whether the user is registered (affects quota).

  • **limiter_kwargs – Additional arguments to pass to RateLimiter (e.g., max_delay, raise_on_limit).

Returns:

Decorator function.

Example:

@rate_limit(quotas={60: 100}, is_registered=True, max_delay=30)
def fetch_dataset(dataset_id):
    return api.get(f"/datasets/{dataset_id}")

Examples

Example: Custom Rate Limiter with Wait Behavior

from pybdl.api.utils.rate_limiter import RateLimiter, PersistentQuotaCache
from pybdl.config import DEFAULT_QUOTAS

# Create cache
cache = PersistentQuotaCache(enabled=True)

# Get registered user quotas
quotas = {k: v[1] for k, v in DEFAULT_QUOTAS.items()}

# Create limiter that waits up to 30 seconds
limiter = RateLimiter(
    quotas=quotas,
    is_registered=True,
    cache=cache,
    raise_on_limit=False,
    max_delay=30.0
)

# Use limiter
limiter.acquire()  # Will wait if needed, up to 30 seconds
# Make your API call here

Example: Handling Rate Limit Errors

from pybdl import BDL, BDLConfig
from pybdl.api.utils.rate_limiter import RateLimitError, RateLimitDelayExceeded

bdl = BDL(BDLConfig(api_key="your-api-key"))

try:
    data = bdl.api.data.get_data_by_variable(variable_id="3643", years=[2021])
except RateLimitError as e:
    if isinstance(e, RateLimitDelayExceeded):
        print(f"Would need to wait {e.actual_delay:.1f}s, exceeds max {e.max_delay:.1f}s")
    else:
        print(f"Rate limit exceeded. Retry after {e.retry_after:.1f}s")
        print(f"Current limits: {e.limit_info}")

Example: Checking Quota Before Making Calls

from pybdl import BDL, BDLConfig

bdl = BDL(BDLConfig(api_key="your-api-key"))

# Check remaining quota
remaining = bdl._client._sync_limiter.get_remaining_quota()

if remaining.get(1, 0) < 5:
    print("Warning: Low quota remaining for 1-second period")
    # Consider waiting or reducing request rate

# Make API call
data = bdl.api.data.get_data_by_variable(variable_id="3643", years=[2021])

Example: Resetting Quota (for testing)

from pybdl import BDL, BDLConfig

bdl = BDL(BDLConfig(api_key="your-api-key"))

# Reset quota counters (useful for testing)
bdl._client._sync_limiter.reset()

# Now you can make fresh API calls

Best Practices

  1. Use default behavior: The default raise-on-limit behavior suits most applications

  2. Handle exceptions: Always catch RateLimitError and implement retry logic

  3. Monitor quota: Check remaining quota periodically to avoid hitting limits unexpectedly

  4. Use persistent cache: Keep quota_cache_enabled=True (default) to maintain quota state across restarts

  5. Custom quotas for testing: Use custom quotas when testing to avoid hitting production limits

  6. Async operations: Use async rate limiters for async code to avoid blocking the event loop
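The last point is easiest to see with two concurrent tasks. In this sketch, which uses only the standard library and hypothetical function names, each task waits with asyncio.sleep, so the shorter wait finishes first while the longer one is still pending. A blocking time.sleep inside a coroutine would instead stall every task on the loop.

```python
import asyncio


async def limited_call(name: str, wait: float, log: list) -> None:
    # asyncio.sleep yields to the event loop, so other tasks keep running
    await asyncio.sleep(wait)
    log.append(name)


async def main() -> list:
    log: list = []
    await asyncio.gather(
        limited_call("slow", 0.05, log),
        limited_call("fast", 0.01, log),
    )
    return log


order = asyncio.run(main())
```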

Troubleshooting

Q: I’m getting RateLimitError even though I haven’t made many calls

A: The persistent cache may contain old quota data. Try resetting the quota or clearing the cache file.

Q: Sync and async calls seem to have separate limits

A: Ensure both limiters share the same PersistentQuotaCache instance. This is automatic when using BDLConfig.

Q: Rate limiter is too slow

A: Consider using async operations or adjusting max_delay. The rate limiter adds minimal overhead (<1ms per call).

Q: Cache file is corrupted

A: The cache file is automatically recreated if corrupted. Old quota data will be lost, but this is usually fine.
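For the first question, clearing the cache file can be done with a few lines of standard-library code. The default path below is the project-local location named in the Persistent Cache section; clear_quota_cache is a hypothetical helper, and the demo runs against a throwaway directory so no real cache is touched.

```python
import tempfile
from pathlib import Path

# Default project-local location named in the Persistent Cache section
DEFAULT_CACHE_FILE = Path(".cache/pybdl/quota_cache.json")


def clear_quota_cache(path: Path = DEFAULT_CACHE_FILE) -> bool:
    """Delete the cache file if it exists; return True if one was removed."""
    if path.is_file():
        path.unlink()
        return True
    return False


# Demo against a temporary file rather than the real cache
with tempfile.TemporaryDirectory() as tmp:
    demo_file = Path(tmp) / "quota_cache.json"
    demo_file.write_text("{}", encoding="utf-8")
    first = clear_quota_cache(demo_file)
    second = clear_quota_cache(demo_file)
```

If the global cache location is in use, pass that path explicitly. Clearing the file discards the recorded usage, so the provider may still reject requests that the local limiter now allows.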
