Burla demo · April 2026

Every Airbnb,
looked at all at once.

Every public listing in Inside Airbnb's open dump, 119 cities, 4 quarterly snapshots. We scored 1.7M photos with CLIP (a model that turns an image into a vector you can compare to a text prompt), shortlisted the most suspicious ones, and had Claude Haiku Vision double-check each shortlist. We also scored every review and reranked the weirdest 12K with Haiku. Everything was parallelized on Burla: a single dynamic cluster scaled to ~1.7K CPU workers for photo download and CLIP, while 20 A100 GPUs on the same cluster handled the embedding-cluster work in parallel.

--Listings
--Photos scraped
--Reviews scored
--CLIP-scored
--GPU detections
--Peak workers

Listings, reviews, and calendars come straight from public Inside Airbnb dumps. The findings cards below use bootstrap 95% confidence intervals on each listing's 365-night calendar occupancy (how booked a listing is over the next year, our demand proxy). Click any photo to expand it. Click any review to read it in full.
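Roughly how the demand proxy comes out of the raw data, as a sketch against the standard Inside Airbnb calendar.csv columns (the exact cleaning steps aren't shown, and "unavailable" nights lump guest-booked and host-blocked together):

# Demand proxy: % of the next 365 calendar nights marked unavailable for each listing.
import pandas as pd

cal = pd.read_csv("calendar.csv.gz", parse_dates=["date"])  # Inside Airbnb calendar dump
occupancy_pct = (
    cal.assign(booked=cal["available"].eq("f"))   # "f" = that night can't be booked
       .groupby("listing_id")["booked"]
       .mean()                                    # fraction of the 365 nights
       .mul(100)
       .rename("occupancy_pct")
)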

Every flagged listing on a map

Each dot is a listing flagged by one of the Haiku-validated photo detectors below, color-coded by category. Drag, zoom, click for the listing.

Listings with drug-den vibes

CLIP shortlisted “messy room” candidates, then Claude Haiku Vision kept only the ones that look less like an Airbnb and more like an opium den. Bare bulb, mattress on the floor, peeling walls, you can almost smell it through the photo.
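Roughly how the CLIP shortlisting works, sketched for a single photo (the prompt wording and file path are illustrative, not the exact prompts used):

# Compare one photo to a "messy" prompt vs. a neutral one with open_clip.
import torch, open_clip
from PIL import Image

model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion2b_s34b_b79k")
tokenizer = open_clip.get_tokenizer("ViT-B-32")

prompts = ["a filthy room with a bare mattress on the floor and peeling walls",
           "a clean, tidy bedroom"]

with torch.no_grad():
    text = model.encode_text(tokenizer(prompts))
    text /= text.norm(dim=-1, keepdim=True)
    img = model.encode_image(preprocess(Image.open("photo.jpg")).unsqueeze(0))
    img /= img.norm(dim=-1, keepdim=True)
    sims = (img @ text.T).squeeze(0)  # cosine similarity to each prompt

# Photos where the "messy" prompt wins by the widest margin go on the Haiku shortlist.
messy_margin = (sims[0] - sims[1]).item()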

--

The most hectic kitchens

CLIP shortlisted “messy room” candidates, then Claude Haiku Vision said the photo is genuinely a chaotic kitchen, not just a small one.

--

Cats and dogs Claude said are actually real

CLIP shortlisted pet-shaped candidates from 1.7M photos, then Claude Haiku Vision said “yes, that is a real cat or dog.” Paintings, throw pillows, and rugs that looked vaguely animal-shaped were rejected.

--

Worst TV placements across every public Airbnb

CLIP shortlisted “TV mounted way too high” candidates from 1.7M photos, then Claude Haiku Vision confirmed each one as either above-fireplace or unusually-high.

--

Funniest reviews from 50 million

A 3-tier funnel: regex on every review, embedding cluster on the top 200K, Claude Haiku on the top 12K. Filter by category, city, or year, or just type any word to search. Click any card to read it in full.
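Tier 1 is the cheap lexical pass. A sketch with made-up patterns (the real list is longer and tuned per category):

# Tier 1: score every review by how many "funny" patterns fire, keep the top slice.
import re

FUNNY_PATTERNS = [
    r"\bnever again\b",
    r"\bcall(ed)? the police\b",
    r"\bworst (night|stay|airbnb)\b",
    r"!{3,}",
]
COMPILED = [re.compile(p, re.IGNORECASE) for p in FUNNY_PATTERNS]

def tier1_score(comment: str) -> int:
    # One point per distinct pattern that matches; embeddings break ties in tier 2.
    return sum(1 for p in COMPILED if p.search(comment))

reviews = [
    ("r1", "A raccoon got in, the host called the police, never again!!!"),
    ("r2", "Great location, clean, would stay again."),
]
shortlist = sorted(reviews, key=lambda r: tier1_score(r[1]), reverse=True)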

--

How it ran on Burla

Burla is a high-performance parallel processing library for data teams that iterate quickly. You write a Python function, you call remote_parallel_map, and it runs across a cluster with a shared filesystem mounted at ./shared. No Docker, no Kubernetes, no orchestration glue.
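A hello-world-sized sketch of that call shape, with a toy function and toy inputs (resource numbers borrowed from the real jobs below):

# Square 100 numbers across the cluster; every worker sees the same ./shared mount.
from burla import remote_parallel_map

def work(x):
    with open(f"./shared/square_{x}.txt", "w") as f:   # lands on the shared filesystem
        f.write(str(x * x))
    return x * x

results = remote_parallel_map(work, list(range(100)), func_cpu=1, func_ram=4)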

For this run, a single dynamic cluster scaled CPU workers up to ~1.7K for photo download and CLIP scoring, while the same cluster ran 20 A100 GPUs for embedding-cluster work in parallel with the CPU jobs. Claude Haiku validation ran on top, rate-limited to 64 workers.

-- concurrent workers at peak across photo download, CLIP scoring, and review tier-1. 20 A100 GPUs ran in parallel on the same cluster, while CPU jobs kept going.

Full writeup is on GitHub.
Burla docs are at docs.burla.dev.

# s02b: download every photo URL, score with CLIP,
# write parquet shards to ./shared. 6K batches.
from burla import remote_parallel_map
import open_clip

def score_batch(args):
    # Load CLIP once per worker; weights are cached on the shared filesystem.
    model, _, preprocess = open_clip.create_model_and_transforms(
        "ViT-B-32", pretrained="laion2b_s34b_b79k",
        cache_dir="./shared/clip_weights",
    )
    # Elided step: download photos -> encode -> cosine vs PROMPTS -> write parquet shard.
    shard, n_ok = score_and_write_shard(args, model, preprocess)
    return {"shard": shard, "n_ok": n_ok}

remote_parallel_map(
    score_batch, batch_args,
    func_cpu=2, func_ram=8,
    max_parallelism=1000,   # 1k concurrent at peak
    grow=True,
)
# s04 tier 2: embed top 200K reviews with SBERT,
# one parquet shard per worker on ./shared.
from burla import remote_parallel_map
from sentence_transformers import SentenceTransformer

def embed_batch(args):
    model = SentenceTransformer(
        "all-MiniLM-L6-v2",
        cache_folder="./shared/sbert",
    )
    rows = read_slice(
        args.input_path, args.row_start, args.row_end,
    )
    vecs = model.encode(
        rows["comments"].tolist(), batch_size=128,
    )
    write_shard(args.output_root, rows, vecs)
    return {"n_ok": len(rows)}

remote_parallel_map(
    embed_batch, embed_args,
    func_cpu=2, func_ram=8, max_parallelism=200,
    grow=True,
)
# s05c: Haiku Vision double-checks the CLIP
# shortlists. Rate-limited at 64 workers.
from burla import remote_parallel_map
import anthropic, json

def validate_pet(args):
    client = anthropic.Anthropic()
    rows = []
    for url, listing_id in args.batch:
        msg = client.messages.create(
            model="claude-haiku-4-5", max_tokens=200,
            messages=pet_prompt(fetch(url)),
        )
        verdict = json.loads(msg.content[0].text)
        rows.append({"listing_id": listing_id, **verdict})
    write_shard(args.output_path, rows)
    return {"n_ok": len(rows)}

remote_parallel_map(
    validate_pet, pet_batches,
    func_cpu=2, func_ram=8, max_parallelism=64,
    grow=True,
)

Does any of this actually predict demand?

For each idea below, we sort every listing into a few groups (like “darkest photos” vs “brightest photos”) and check whether the higher-occupancy listings really do land in one group. We accept an idea only when no two groups' confidence intervals overlap.

How to read these cards

The bar shows the median % booked over the next 365 nights for each group, our demand proxy. Further right means more booked. The tick is our best guess; the wider band is the range we're confident covers the real number.
n = 240.5K: n is how many listings ended up in that group; 240.5K means 240,500. Bigger groups give us tighter, more trustworthy bars.
ACCEPTED / REJECTED: Accepted means the bars in the card are clearly separated, so the groups really are different. Rejected means the bars overlap, so we cannot tell the groups apart.
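In code, the accept/reject rule is roughly this sketch (group names, the 2,000-resample count, and the data are illustrative):

# Bootstrap a 95% CI on each group's median occupancy; accept only if no two CIs overlap.
import numpy as np

def bootstrap_median_ci(occupancy_pct, n_boot=2000, seed=0):
    # occupancy_pct: one "% of next 365 nights booked" value per listing in the group.
    rng = np.random.default_rng(seed)
    values = np.asarray(occupancy_pct)
    medians = [np.median(rng.choice(values, size=len(values), replace=True))
               for _ in range(n_boot)]
    return np.percentile(medians, [2.5, 97.5])

def accepted(groups):
    # groups: {"darkest photos": [occupancy %...], "brightest photos": [...], ...}
    cis = {name: bootstrap_median_ci(vals) for name, vals in groups.items()}
    names = list(cis)
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            (lo_a, hi_a), (lo_b, hi_b) = cis[names[i]], cis[names[j]]
            if lo_a <= hi_b and lo_b <= hi_a:   # intervals overlap -> can't separate
                return False
    return True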