Research Infrastructure for Understanding AI Progress

We develop tools and datasets that help researchers, policymakers, and industry understand how AI capabilities evolve. Our work provides the empirical foundation needed for informed decisions about AI development and deployment.

Introducing: ArXivDLInstruct

A new open-source dataset designed for instruction tuning on Python research code.

Code

ARIA Benchmarks

A set of benchmarks evaluating machine learning knowledge in advanced AI models.

Benchmarking AI Knowledge

Our mission is to build agentic tools and systems that expand the frontiers of innovation while keeping AI safety at the center of everything we create. We develop state-of-the-art models and agents designed to accelerate discovery and deepen understanding.

Our Team

Our team brings extensive experience in deploying machine learning models and conducting AI research. We have worked at leading organizations such as Apple, NASA, and Adobe, and conducted advanced research at Duke University and Northwestern University. This blend of industry and academic expertise informs our AI solutions.

Engineering & Consulting

We provide tailored consulting on AI strategy, research pipeline design, evaluation frameworks, and deployment risks. Whether you’re a startup, enterprise, or research lab, we help you build and assess AI systems with scientific rigor.
