Examples

Enrich millions of users through a rate-limited API

Make millions of API calls without lying about the rate cap

In this example we:

Split 2,000,000 ids into 2,000 chunks.
Run up to 1,000 Burla workers.
Sleep inside each worker so the global rate stays around 1,000 requests per second.

A 5,000-id test does not tell you much if it never hits the provider's real limit.

Step 1: Chunk the ids

Each chunk is large enough to amortize startup and small enough to stream results as it finishes.

with open("user_ids.txt") as f:
    user_ids = [line.strip() for line in f if line.strip()]

CHUNK = 1000
chunks = [user_ids[i:i + CHUNK] for i in range(0, len(user_ids), CHUNK)]

Step 2: Put pacing near the HTTP call

The worker owns local pacing and retry behavior.

def enrich_chunk(ids: list[str]) -> list[dict]:
    import time
    import httpx

    out = []
    with httpx.Client(timeout=30.0, headers={"Authorization": "Bearer $API_KEY"}) as client:
        for uid in ids:
            for attempt in range(5):
                r = client.get(f"https://api.example.com/v1/users/{uid}")
                if r.status_code == 429:
                    time.sleep(float(r.headers.get("Retry-After", 2 ** attempt)))
                    continue
                r.raise_for_status()
                out.append({"user_id": uid, **r.json()})
                break
            time.sleep(1.0)
    return out

Step 3: Cap live workers

max_parallelism is the global throttle.

from burla import remote_parallel_map

results = remote_parallel_map(
    enrich_chunk,
    chunks,
    func_cpu=1,
    func_ram=2,
    max_parallelism=1000,
    generator=True,
    grow=True,
)

What's the point?

Rate limits are where toy parallelism lies. A local async script can look great until it turns into a retry storm.

The useful question is: can I finish the whole backfill without breaking the provider's contract? That means chunking, local sleeps, global concurrency, and output streaming. Burla handles the worker fleet; your function still owns the API behavior.

Enrich millions of users through a rate-limited API

#Make millions of API calls without lying about the rate cap

#Step 1: Chunk the ids

#Step 2: Put pacing near the HTTP call

#Step 3: Cap live workers

#What's the point?

Make millions of API calls without lying about the rate cap

Step 1: Chunk the ids

Step 2: Put pacing near the HTTP call

Step 3: Cap live workers

What's the point?