Bugle
An AI agent that knows hunting law — and predicts your draw odds.
A conversational research assistant over the hunting & trapping regulations of four states, with cited sources and a Monte Carlo engine that simulates draw odds years into the future. Built solo, end to end.
- 4 states of law
- 348 eval cases
- ~70 regulation docs
- 7 draw mechanics modeled
Hunters navigate a maze of regulations that differ by state, species, weapon, and season, and the draw systems that hand out limited tags are opaque enough that companies charge for a human to explain them. Bugle addresses both problems: it reads the actual law and cites it, and it simulates your odds of drawing a tag.
The problem
Trustworthy answers over law that changes every season
Regulations are scattered across statutes, administrative code, and annual proclamations, in different formats, from different agencies, across four states. A wrong answer isn't a typo; for the hunter it can mean a ticket or a fine. The system has to be grounded in the real text, current, and honest about what it doesn't know.
Ingestion
A versioned corpus, built by a robust CLI
A five-stage, idempotent pipeline (fetch → parse → materialize → embed → validate) driven from the command line turns ~70 source documents into a searchable, embedded corpus. The knowledge base is treated as an immutable, versioned artifact: every ingest targets a candidate version, promotion to production is atomic, rollback is trivial, and superseded regulations keep an audit trail instead of vanishing.
- HTML + PDF parsers per source (admin rules, IDAPA, proclamations)
- Section-aware chunking with structural breadcrumbs for citations
- Topic / species / weapon metadata extracted per chunk
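The candidate, promote, and rollback lifecycle described above can be sketched as a small in-memory registry. This is an illustration under assumptions, not Bugle's actual implementation: `CorpusRegistry`, its method names, and the status values are all hypothetical, and a real system would make the promotion atomic in its datastore rather than in process memory.

```typescript
// Hypothetical sketch of a versioned corpus registry; none of these names
// come from Bugle itself.
type VersionStatus = "candidate" | "production" | "superseded";

interface CorpusVersion {
  id: string;
  status: VersionStatus;
  validated: boolean;
}

class CorpusRegistry {
  private versions = new Map<string, CorpusVersion>();
  private productionId: string | null = null;
  private promoted: string[] = []; // earlier production ids, oldest first

  // Idempotent: registering the same candidate twice changes nothing.
  createCandidate(id: string): void {
    if (!this.versions.has(id)) {
      this.versions.set(id, { id, status: "candidate", validated: false });
    }
  }

  // Stands in for the final "validate" stage of the pipeline.
  markValidated(id: string): void {
    this.lookup(id).validated = true;
  }

  // Promotion is a single pointer swap; the old version is kept, not deleted.
  promote(id: string): void {
    const next = this.lookup(id);
    if (!next.validated) throw new Error(`version ${id} is not validated`);
    if (this.productionId === id) return;
    if (this.productionId !== null) {
      this.lookup(this.productionId).status = "superseded";
      this.promoted.push(this.productionId);
    }
    next.status = "production";
    this.productionId = id;
  }

  // Rollback restores the most recently replaced production version.
  rollback(): void {
    const previousId = this.promoted.pop();
    if (previousId === undefined) throw new Error("no earlier version");
    if (this.productionId !== null) {
      this.lookup(this.productionId).status = "superseded";
    }
    this.lookup(previousId).status = "production";
    this.productionId = previousId;
  }

  current(): string | null {
    return this.productionId;
  }

  statusOf(id: string): VersionStatus {
    return this.lookup(id).status;
  }

  private lookup(id: string): CorpusVersion {
    const version = this.versions.get(id);
    if (!version) throw new Error(`unknown version ${id}`);
    return version;
  }
}
```

In this sketch a replaced version is only relabeled as superseded, never removed, which is one simple way to keep the audit trail the section describes.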
Retrieval
Six steps from question to cited answer
Each query is decomposed, run through hybrid search (vector + keyword + metadata), reranked, answered by Claude with streamed citations, and then verified for faithfulness — with every step traced. Conversations are bound to a state so retrieval can't cross jurisdictions, and guardrails refuse legal-advice and out-of-scope questions instead of guessing.
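A rough sketch of the state-bound hybrid search step follows. The source does not say how vector and keyword results are combined, so reciprocal rank fusion is used here purely as a stand-in; the `Chunk` shape, the function names, and the two-letter state codes are assumptions for illustration.

```typescript
// Hypothetical chunk shape; Bugle's real schema is not shown in the source.
interface Chunk {
  id: string;
  state: string; // jurisdiction the chunk belongs to, e.g. "ID"
}

// Reciprocal rank fusion: score(id) = sum over rankings of 1 / (k + rank),
// with rank starting at 1. k = 60 is a commonly used default.
function reciprocalRankFusion(rankings: string[][], k = 60): Map<string, number> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + index + 1));
    });
  }
  return scores;
}

// Filters both rankings to the conversation's state before fusing, so a
// chunk from another jurisdiction can never reach the answer step.
function hybridSearch(
  chunks: Chunk[],
  state: string,
  vectorRanking: string[],
  keywordRanking: string[],
  topN: number
): string[] {
  const allowed = new Set(chunks.filter((c) => c.state === state).map((c) => c.id));
  const inState = (ids: string[]) => ids.filter((id) => allowed.has(id));
  const scores = reciprocalRankFusion([inState(vectorRanking), inState(keywordRanking)]);
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1])
    .slice(0, topN)
    .map(([id]) => id);
}
```

Filtering before fusion, rather than after, means an out-of-state chunk cannot displace an in-state one from the top results. A reranker and the faithfulness check would sit downstream of this function.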
Evaluation
An eval harness that runs the real pipeline
Quality is gated by 348 expert-validated cases, including hallucination traps and must-refuse questions, run against the exact production retrieval code, not a copy. Faithfulness and citation accuracy are hard promotion gates; LLM judges are calibrated to human raters. A corpus version doesn't ship unless it clears the suite.
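The hard-gate idea can be sketched as a pure function over eval results. The record fields, the `evaluateGate` name, and any threshold values passed in are assumptions for illustration; the source does not state the actual thresholds or result schema.

```typescript
// Hypothetical result record for one eval case; field names are assumptions.
interface EvalResult {
  caseId: string;
  faithful: boolean; // answer is supported by the retrieved text
  citationsCorrect: boolean; // every citation points at the right source
}

interface GateThresholds {
  faithfulness: number; // minimum pass rate, 0..1
  citationAccuracy: number; // minimum pass rate, 0..1
}

interface GateReport {
  pass: boolean;
  faithfulness: number;
  citationAccuracy: number;
  failedCases: string[];
}

function evaluateGate(results: EvalResult[], thresholds: GateThresholds): GateReport {
  // An empty suite scores 0, so it cannot pass a non-zero threshold by accident.
  const rate = (ok: (r: EvalResult) => boolean) =>
    results.length === 0 ? 0 : results.filter(ok).length / results.length;
  const faithfulness = rate((r) => r.faithful);
  const citationAccuracy = rate((r) => r.citationsCorrect);
  return {
    // Both metrics must clear their threshold; neither can offset the other.
    pass:
      faithfulness >= thresholds.faithfulness &&
      citationAccuracy >= thresholds.citationAccuracy,
    faithfulness,
    citationAccuracy,
    failedCases: results
      .filter((r) => !r.faithful || !r.citationsCorrect)
      .map((r) => r.caseId),
  };
}
```

A promotion script could call this after running the suite against a candidate corpus version and refuse to promote when `pass` is false, listing `failedCases` for review.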
The marquee feature
A Monte Carlo draw-odds engine
The draw engine answers "how many years until I'm likely to draw?" It simulates the draw forward year by year, accounting for point creep, pool churn, resident vs. non-resident quotas, and rule changes (Colorado flips from preference to a hybrid split in 2028). Draw mechanics live as validated config data, not code — so a new state or a rule change is a row, not a deploy. The plan: wire it into the chat so a hunter can just describe a scenario in plain language.
- Seven mechanic types as pure functions (preference solved analytically; the rest via Monte Carlo)
- 5+ years of historical odds backfilled for calibration
- Research-framed projections — never presented as guarantees
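To make the year-by-year simulation concrete, here is a toy sketch with mechanics expressed as config rows and evaluated by pure functions. It does not reproduce Bugle's seven mechanic types: the two mechanics (`random` and a simplified `bonus`), the `DrawRule` shape, the constant `baseOdds`, and every function name are assumptions, and pool churn, point creep, and quotas are left out. A seeded generator (mulberry32) keeps runs reproducible.

```typescript
// Hypothetical config row: which mechanic applies from a given year onward.
type Mechanic = "random" | "bonus";

interface DrawRule {
  fromYear: number;
  mechanic: Mechanic;
  baseOdds: number; // single-ticket odds per year, assumed constant here
}

// Small seeded PRNG (mulberry32) returning values in [0, 1).
function mulberry32(seed: number): () => number {
  let a = seed | 0;
  return () => {
    a = (a + 0x6d2b79f5) | 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Pure function: one applicant's odds for one year under one rule.
function yearlyOdds(rule: DrawRule, points: number): number {
  if (rule.mechanic === "random") return rule.baseOdds;
  // Toy bonus model: each point acts as one extra independent ticket.
  return 1 - Math.pow(1 - rule.baseOdds, points + 1);
}

// Picks the latest rule whose fromYear is <= year (rows sorted by fromYear).
function ruleFor(rules: DrawRule[], year: number): DrawRule {
  let active: DrawRule | undefined;
  for (const rule of rules) {
    if (rule.fromYear <= year) active = rule;
  }
  if (!active) throw new Error(`no rule covers ${year}`);
  return active;
}

// Returns cumulative[i] = estimated P(drawn by the end of startYear + i).
function simulateDraw(
  rules: DrawRule[],
  startYear: number,
  horizon: number,
  trials: number,
  seed = 1
): number[] {
  const rng = mulberry32(seed);
  const drawnIn = new Array<number>(horizon).fill(0);
  for (let t = 0; t < trials; t++) {
    let points = 0;
    for (let i = 0; i < horizon; i++) {
      const p = yearlyOdds(ruleFor(rules, startYear + i), points);
      if (rng() < p) {
        drawnIn[i] = (drawnIn[i] ?? 0) + 1;
        break; // applicant drew a tag; this trial ends
      }
      points += 1; // unsuccessful year earns a point
    }
  }
  let running = 0;
  return drawnIn.map((n) => (running += n) / trials);
}
```

A rule change becomes a second row with a later `fromYear`, so the simulation picks up the new mechanic without any code change. For the flat `random` mechanic the estimate can be checked against the closed form 1 - (1 - p)^n, which is a cheap sanity test for the simulator.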
Why it matters
Senior-grade discipline, built fast
Corpus versioning, an eval harness over real code, config-as-data, and ESLint-enforced architecture — one codebase serving a web app, an ingestion CLI, and an eval suite. The kind of engineering that lets a single developer operate a knowledge-heavy AI product with zero downtime and a straight face about correctness.