Every public listing in Inside Airbnb's open dump, 119 cities, 4 quarterly snapshots. We scored 1.7M photos with CLIP (a model that turns an image into a vector you can compare to a text prompt), shortlisted the most suspicious ones, and had Claude Haiku Vision double-check each shortlist. We also scored every review and reranked the weirdest 12K with Haiku. Everything was parallelized on Burla: a single dynamic cluster scaled to ~1.7K CPU workers for photo download and CLIP scoring, while 20 A100 GPUs on the same cluster handled the embedding-clustering work in parallel.
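For readers who haven't used CLIP: the core trick is that an image and a text prompt land in the same vector space, so one cosine similarity tells you how well a photo matches a phrase. A minimal sketch of scoring one photo against one prompt with open_clip; the URL and prompt string here are illustrative, not the actual detector prompts:

import io
import requests
import torch
import open_clip
from PIL import Image

# Same checkpoint as the scoring job below.
model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion2b_s34b_b79k"
)
tokenizer = open_clip.get_tokenizer("ViT-B-32")

# Illustrative inputs, not real listing data.
img = Image.open(io.BytesIO(
    requests.get("https://example.com/listing.jpg").content
)).convert("RGB")
image = preprocess(img).unsqueeze(0)                      # 1 x 3 x 224 x 224
text = tokenizer(["a messy room with a mattress on the floor"])

with torch.no_grad():
    img_vec = model.encode_image(image)
    txt_vec = model.encode_text(text)
    img_vec /= img_vec.norm(dim=-1, keepdim=True)
    txt_vec /= txt_vec.norm(dim=-1, keepdim=True)

score = (img_vec @ txt_vec.T).item()                      # cosine similarity, higher = closer match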
Each dot is a listing flagged by one of the Haiku-validated photo detectors below, color-coded by category. Drag, zoom, and click a dot to open the listing.
CLIP shortlisted “messy room” candidates, then Claude Haiku Vision kept only the ones that look less like an Airbnb and more like an opium den. Bare bulb, mattress on the floor, peeling walls; you can almost smell it through the photo.
CLIP shortlisted “messy room” candidates, then Claude Haiku Vision said the photo is genuinely a chaotic kitchen, not just a small one.
CLIP shortlisted pet-shaped candidates from 1.7M photos, then Claude Haiku Vision said “yes, that is a real cat or dog.” Paintings, throw pillows, and rugs that looked vaguely animal-shaped were rejected.
CLIP shortlisted “TV mounted way too high” candidates from 1.7M photos, then Claude Haiku Vision confirmed each one as either above-fireplace or unusually-high.
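Every detector above follows the same recipe: score all 1.7M photos against a prompt, keep only the extreme tail as a shortlist, and hand that shortlist to Haiku. A rough sketch of the shortlisting step, assuming the CLIP scores were written out as one long (listing_id, photo_url, prompt, score) table; the prompt strings and the 99.9th-percentile cutoff are illustrative, not the tuned values:

import pandas as pd

# Illustrative location for the CLIP score shards from the scoring job.
scores = pd.read_parquet("./shared/clip_scores/")

def shortlist(prompt: str, top_quantile: float = 0.999) -> pd.DataFrame:
    """Keep only the photos whose CLIP score for this prompt sits in the top slice."""
    hits = scores[scores["prompt"] == prompt]
    cutoff = hits["score"].quantile(top_quantile)
    return hits[hits["score"] >= cutoff].sort_values("score", ascending=False)

# Illustrative prompts, one per detector.
pet_candidates = shortlist("a photo of a cat or dog in a home")
tv_candidates = shortlist("a television mounted very high on a wall")
# These shortlists are what Claude Haiku Vision double-checks next.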
A 3-tier funnel: regex on every review, embedding clustering on the top 200K, Claude Haiku on the top 12K. Filter by category, city, or year, or just type any word to search. Click any card to read it in full.
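Tier 1 is plain pattern matching over every review. A sketch of what that first pass could look like; the category names and regexes here are illustrative, not the exact ones used in the run:

import re
import pandas as pd

# Illustrative tier-1 patterns; the real run used its own category list.
PATTERNS = {
    "pests":   re.compile(r"\b(cockroach|bed ?bugs?|mice|rats?)\b", re.I),
    "cameras": re.compile(r"\b(hidden camera|being (watched|recorded))\b", re.I),
    "smell":   re.compile(r"\b(mold|mildew|sewage|smell(ed|s)? like)\b", re.I),
}

def tag_review(text: str) -> list[str]:
    """Return every category whose pattern fires on this review."""
    return [name for name, rx in PATTERNS.items() if rx.search(text)]

reviews = pd.read_parquet("./shared/reviews/")
reviews["tags"] = reviews["comments"].map(tag_review)
tier1 = reviews[reviews["tags"].map(len) > 0]   # only these move on to the embedding tier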
Burla is a high-performance parallel processing library for data teams that iterate quickly. You write a Python function, you call remote_parallel_map, and it runs across a cluster with a shared filesystem mounted at ./shared. No Docker, no Kubernetes, no orchestration glue.
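A minimal sketch of that pattern, reusing the keyword arguments that appear in the real snippets below; the output directory and the assumption that remote_parallel_map hands back each call's return value are illustrative:

import os
from burla import remote_parallel_map

def process(item):
    # Every worker sees the same filesystem under ./shared.
    os.makedirs("./shared/demo", exist_ok=True)
    with open(f"./shared/demo/{item}.txt", "w") as f:
        f.write(f"processed {item}")
    return {"item": item, "ok": True}

# Runs process() once per input, fanned out across the cluster.
results = remote_parallel_map(
    process,
    list(range(100)),
    func_cpu=1,
    func_ram=2,
    max_parallelism=50,
)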
For this run, a single dynamic cluster scaled up to ~1.7K CPU workers for photo download and CLIP scoring, while 20 A100 GPUs on the same cluster handled the embedding-clustering work in parallel with the CPU jobs. Claude Haiku validation ran on top, rate-limited.
Full writeup is on GitHub. Burla docs are at docs.burla.dev.
# s02b: download every photo URL, score with CLIP,
# write parquet shards to ./shared. 6K batches.
from burla import remote_parallel_map
import open_clip

def score_batch(args):
    model, _, prep = open_clip.create_model_and_transforms(
        "ViT-B-32",
        pretrained="laion2b_s34b_b79k",
        cache_dir="./shared/clip_weights",
    )
    # download -> encode -> cosine vs PROMPTS -> parquet
    return {"shard": shard, "n_ok": n_ok}

remote_parallel_map(
    score_batch,
    batch_args,
    func_cpu=2,
    func_ram=8,
    max_parallelism=1000,  # 1k concurrent at peak
    grow=True,
)
# s04 tier 2: embed top 200K reviews with SBERT,
# one parquet shard per worker on ./shared.
from burla import remote_parallel_map
from sentence_transformers import SentenceTransformer

def embed_batch(args):
    model = SentenceTransformer(
        "all-MiniLM-L6-v2",
        cache_folder="./shared/sbert",
    )
    rows = read_slice(
        args.input_path,
        args.row_start,
        args.row_end,
    )
    vecs = model.encode(
        rows["comments"].tolist(),
        batch_size=128,
    )
    write_shard(args.output_root, rows, vecs)
    return {"n_ok": len(rows)}

remote_parallel_map(
    embed_batch,
    embed_args,
    func_cpu=2,
    func_ram=8,
    max_parallelism=200,
    grow=True,
)
# s05c: Haiku Vision double-checks the CLIP
# shortlists. Rate-limited at 64 workers.
from burla import remote_parallel_map
import anthropic, json

def validate_pet(args):
    client = anthropic.Anthropic()
    rows = []
    for url, listing_id in args.batch:
        msg = client.messages.create(
            model="claude-haiku-4-5",
            max_tokens=200,
            messages=pet_prompt(fetch(url)),
        )
        verdict = json.loads(msg.content[0].text)
        rows.append({"listing_id": listing_id, **verdict})
    write_shard(args.output_path, rows)
    return {"n_ok": len(rows)}

remote_parallel_map(
    validate_pet,
    pet_batches,
    func_cpu=2,
    func_ram=8,
    max_parallelism=64,
    grow=True,
)
For each idea below, we sort every listing into a few groups (like “darkest photos” vs “brightest photos”) and check whether the higher-occupancy listings really do pile up in one group. We accept an idea only when no two groups' occupancy ranges overlap.
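Concretely, the check could be run like this: bucket listings into quantile groups on the photo metric, bootstrap an occupancy interval per group, and accept only if the intervals are pairwise disjoint. The quartile split, bootstrap intervals, and the "occupancy" column name are assumptions about how the overlap rule is applied, not a description of the exact test:

import numpy as np
import pandas as pd

def occupancy_intervals(df: pd.DataFrame, metric: str, n_groups: int = 4,
                        n_boot: int = 2000, lo: float = 2.5, hi: float = 97.5):
    """Split listings into quantile groups on `metric`, then bootstrap a
    mean-occupancy interval per group. Group count and percentiles are
    illustrative choices."""
    df = df.assign(group=pd.qcut(df[metric], n_groups, labels=False))
    rng = np.random.default_rng(0)
    intervals = {}
    for g, sub in df.groupby("group"):
        occ = sub["occupancy"].to_numpy()
        boots = [rng.choice(occ, size=len(occ), replace=True).mean()
                 for _ in range(n_boot)]
        intervals[g] = (np.percentile(boots, lo), np.percentile(boots, hi))
    return intervals

def accept(intervals: dict) -> bool:
    """Accept the idea only if no two groups' intervals overlap."""
    spans = sorted(intervals.values())
    return all(a_hi < b_lo for (_, a_hi), (b_lo, _) in zip(spans, spans[1:]))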