Biggest Gene Data Drop Ever! 🧬

Daily news that is actually intellectually stimulating.

Anthony Ao
June 30, 2025

PhD Level: Daily Curated Tech News for Entrepreneurs, by Entrepreneurs

Dear reader,

If we want to merge AI and biotech, we need data. And just last week, the biggest gene perturbation dataset yet dropped!

Let’s get into it →

Xaira Drops Massive Perturb‑seq Dataset: X‑Atlas/Orion

AI-driven biotech firm Xaira Therapeutics has released X‑Atlas/Orion, the largest publicly available genome-wide Perturb‑seq dataset. Built using its scalable FiCS platform, the resource includes 8 million single cells across all human protein‑coding genes, with deep sequencing (~16,000 UMIs per cell) and dose-dependent perturbation measurements, an unprecedented scale for genetic effect profiling.

My take: If you haven’t followed Bo Wang, you need to follow Bo Wang. He is one of the researchers I have been following, and he did a great job releasing scGPT (single-cell GPT) and related work. Xaira is just the tip of the iceberg. This is a game-changer: by democratizing vast, high-resolution data and embracing dose-response dynamics, X‑Atlas/Orion empowers both AI researchers and wet‑lab scientists to simulate and prioritize experiments with far more precision than a simple on/off readout allows.

Takeaways

Why It Matters

Scale & open access: Vast data enabling the development of “virtual cell” AI models

Nuanced insight: Moves beyond binary gene edits by capturing continuous, dose-dependent changes

Boosts drug discovery: Models built on this dataset can simulate gene behavior under varied conditions—accelerating target identification and reducing lab workload
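
For readers who like to see the idea in code, here is a minimal Python sketch of the difference between a binary readout and a dose-dependent one. Everything in it is an assumption made for illustration: the numbers are invented toy values, not taken from X‑Atlas/Orion, and the function names binary_effect and dose_slope are hypothetical helpers, not part of any Xaira tooling.

```python
# Toy illustration only: all values below are invented, not real dataset values.

def binary_effect(control, perturbed):
    """Difference in mean expression between perturbed and control cells."""
    return sum(perturbed) / len(perturbed) - sum(control) / len(control)


def dose_slope(doses, responses):
    """Ordinary least-squares slope of response against perturbation dose."""
    n = len(doses)
    mean_d = sum(doses) / n
    mean_r = sum(responses) / n
    cov = sum((d - mean_d) * (r - mean_r) for d, r in zip(doses, responses))
    var = sum((d - mean_d) ** 2 for d in doses)
    return cov / var


# Hypothetical knockdown strength per cell (0 = none, 1 = complete)
knockdown = [0.0, 0.25, 0.5, 0.75, 1.0]
# Hypothetical expression of a downstream gene in the same cells
downstream = [10.0, 8.5, 7.0, 5.5, 4.0]

# Binary view: compare only the unperturbed and fully perturbed cells
print(binary_effect(control=[downstream[0]], perturbed=[downstream[-1]]))

# Dose-dependent view: use every intermediate knockdown level
print(dose_slope(knockdown, downstream))
```

The binary view keeps only the two endpoints, while the slope uses every intermediate knockdown level. On real data the intermediate points are where a non-linear response would show up, which a two-point comparison cannot reveal.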

Stay curious,
The PhDLevel Team
☕️🐻 Powered by caffeine & curiosity