CosineAI / experiments
Open-sourced predictions, execution logs, trajectories, and results from model inference and evaluation runs on the SWE-bench task.
☆15 Updated 9 months ago
Alternatives and similar repositories for experiments
Users who are interested in experiments are comparing it to the repositories listed below
- ☆21 Updated 3 weeks ago
- ☆23 Updated last year
- ☆22 Updated last year
- Exploration using DSPy to optimize modules to maximize performance on the OpenToM dataset ☆16 Updated last year
- BH hackathon ☆14 Updated last year
- ☆50 Updated 3 weeks ago
- ☆15 Updated last week
- Everything for the Paper: 'Evoke: Evoking Critical Thinking Abilities in LLMs via Reviewer-Author Prompt Editing' ☆16 Updated last year
- ☆10 Updated 2 months ago
- Proceedings of Innovative Use of NLP for Building Educational Applications 2023: SIGHT: A Large Annotated Dataset on Student Insights Gat… ☆9 Updated 11 months ago
- ☆13 Updated 3 months ago
- ☆14 Updated last year
- ☆21 Updated 7 months ago
- Tutorial for DSPy ☆23 Updated last year
- Advanced Coding AI Assistant that uses a Gradio interface to stream coding related responses. ChatRAG supports local and API inference an… ☆22 Updated last month
- OLMost every training recipe you need to perform data interventions with the OLMo family of models. ☆32 Updated last week
- Implementation of SelfExtend from the paper "LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning" from Pytorch and Zeta ☆13 Updated 7 months ago
- A fast, local, and secure approach for training LLMs for coding tasks using GRPO with WebAssembly and interpreter feedback. ☆30 Updated 2 months ago
- Streamlit app for recommending eval functions using prompt diffs ☆27 Updated last year
- LlamaWorksDB is a Retrieval Augmented Generation (RAG) product designed to interact with the documentation of various products such as Ll… ☆16 Updated last year
- Interactive Textbook Demo ☆44 Updated last year
- The official evaluation suite and dynamic data release for MixEval. ☆11 Updated 9 months ago
- QAlign is a new test-time alignment approach that improves language model performance by using Markov chain Monte Carlo methods. ☆23 Updated 2 months ago
- Large Language Model (LLM) powered evaluator for Retrieval Augmented Generation (RAG) pipelines. ☆28 Updated last year
- Lego for GRPO ☆28 Updated last month
- An agent to generate stunning images :) ☆19 Updated last month
- Apps that run on modal.com ☆12 Updated last year
- ☆1 Updated 11 months ago
- Verifiers for LLM Reinforcement Learning ☆60 Updated 2 months ago
- This repository implements DSPy programs to tasks in Indian Languages ☆13 Updated last year