
Our Research

Code

Our open-source baseline agent for the ML Research Benchmark. This agent provides a baseline for comparing and evaluating how agents perform machine learning research and development tasks.

The tasks for the ML Research Benchmark, a benchmark designed to evaluate how well AI agents can accelerate ML research and development. The benchmark consists of 9 competition-level tasks that span the activities typically undertaken by ML researchers.

The AI Agent State Library is a library designed to manage the state and decision-making processes of AI agents. At its core, it implements the concept of finite state machines, a computational model used to design systems with a finite number of states and transitions between those states. 
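As an illustration of the finite state machine concept described above, here is a minimal sketch in plain Python. It does not use or reproduce the AI Agent State Library's API; the `StateMachine` class, the `build_agent_machine` helper, and the state and event names are all hypothetical, chosen only to show how states and transitions fit together.

```python
from dataclasses import dataclass, field


@dataclass
class StateMachine:
    """A tiny finite state machine: a current state plus a transition table."""

    state: str
    # Maps (current_state, event) -> next_state.
    transitions: dict = field(default_factory=dict)

    def add_transition(self, source, event, target):
        """Register that `event` moves the machine from `source` to `target`."""
        self.transitions[(source, event)] = target

    def fire(self, event):
        """Apply `event` to the current state and return the new state."""
        key = (self.state, event)
        if key not in self.transitions:
            raise ValueError(f"no transition from {self.state!r} on {event!r}")
        self.state = self.transitions[key]
        return self.state


def build_agent_machine():
    """Build a hypothetical agent lifecycle (names are illustrative only)."""
    machine = StateMachine(state="idle")
    machine.add_transition("idle", "receive_task", "planning")
    machine.add_transition("planning", "plan_ready", "acting")
    machine.add_transition("acting", "task_done", "idle")
    return machine
```

Calling `build_agent_machine()` and then `fire("receive_task")` moves the machine from `idle` to `planning`. An event with no registered transition from the current state raises `ValueError`, which keeps the agent from drifting into an undefined state.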

ARIA Benchmarks is a suite of closed-book benchmarks designed to assess a model's knowledge and understanding of machine learning research and methodologies.

Datasets & Models

ArXiv DL Instruct Dataset

ArXivDLInstruct is a dataset for instruction tuning on Python research code. The dataset comprises 778,152 functions from research code on ArXiv and provides, for each function, a detailed prompt for generating it along with a short description.

ArXiv Research Code Dataset

The arxiv_research_code dataset contains over 21.8GB of source code files referenced strictly in ArXiv papers. It serves as a curated dataset for Code LLMs.

We have also split the dataset out by its most prominent languages.

ArXiv Instruct Tuning Dataset

A series of datasets consisting of 50,000 question-answer pairs derived from ArXiv abstracts. Questions are generated using the t5-base model, while the answers are generated using the GPT-3.5-turbo model.

ArXiv QA BEIR Datasets

A series of BEIR-style question-answer datasets derived from ArXiv.

ArXiv Semantic Search Models

A series of ArXiv semantic search models trained on ArtifactAI/arxiv-beir-500k-generated-queries, a large corpus of 500k question/abstract pairs extracted from the ArXiv dataset. The models are designed to encode sentences from academic papers for semantic similarity and information retrieval tasks. Each model maps sentences and paragraphs to a 768-dimensional dense vector space.
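To make the retrieval step concrete, the sketch below ranks documents by cosine similarity between dense vectors, which is a common way to use embeddings like these. It is an assumption-laden illustration, not the models' own code: the functions `cosine_similarity` and `rank_documents` are hypothetical helpers, and the short toy vectors stand in for the 768-dimensional embeddings a real model would produce.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted from most to least similar to the query.

    `doc_vecs` maps a document id to its embedding vector. In practice the
    vectors would come from an embedding model; here they are supplied directly.
    """
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy 3-dimensional vectors standing in for real 768-dimensional embeddings.
toy_docs = {
    "abstract_a": [0.9, 0.1, 0.0],
    "abstract_b": [0.0, 1.0, 0.0],
}
toy_ranking = rank_documents([1.0, 0.0, 0.0], toy_docs)
```

With these toy inputs, `abstract_a` ranks first because its vector points in nearly the same direction as the query vector.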


ArXiv LED Summarization Models

A led-large-16384 model for summarizing ArXiv papers. Inputs are paper abstracts and full documents; outputs are summaries of the papers.

Advancing AI Together

We value the power of collaboration and are actively seeking partnerships with academic institutions, AI research labs, and individual researchers to drive innovation together.

Algorithmic Research Group
5540 Centerview Dr Ste 204 PMB 296182 Raleigh, NC, 27606 US


©2024 Algorithmic Research Group. All Rights Reserved.
