Hotpot.ai
HotpotBio
HotpotBio is the data lab and research group of Hotpot.ai dedicated to biomedicine.
Due to publishing restrictions on GenAI research, we established this arm to advance science in other ways. We draw inspiration from open source, where ephemeral teams drive innovation by attracting talent across organizational boundaries.
We provide expert-verified datasets in clinical reasoning, general clinical AI, oncology, genomics, neurology, pediatrics, drug discovery and development, and other specialty areas. Datasets may include board-level challenges and tasks that require multimodal integration.
Data annotation is performed by a curated network of MDs, PhDs, and postdocs from Stanford, UCSF, and other top-tier institutions.
While some data errors are tolerable, perhaps even desirable, for general ML models, uncommon variants in biomedicine can drive pathology. Training on imprecise medical information may lead to misdiagnosis, clinical errors, misfolded proteins, or drugs with more major adverse events (MAEs).
Complicating matters, shifting medical facts may invalidate training data and model knowledge. What was true last year may be false today. For instance, in April 2024 the U.S. Preventive Services Task Force revised its longstanding advice and now recommends biennial mammograms for average-risk women starting at age 40, down from the previous benchmark of 50, citing rising breast-cancer incidence in younger patients.
Accurate annotation of medical data is challenging and demands verification by experts working from the latest guidelines. Even Google DeepMind's 2024 relabeling of MedQA contains errors, which we uncovered.
This is why HotpotBio exists: to provide rigorously validated, expert-curated datasets and benchmarks that advance ML/AI in clinical and broader biomedical applications.