SakanaAI / DiscoPOP
Code for Discovering Preference Optimization Algorithms with and for Large Language Models
☆159Updated 3 months ago
Related projects:
- ☆53Updated this week
- Finetune Llama-3-8b on the MathInstruct dataset☆91Updated 3 weeks ago
- ☆130Updated last week
- An automated tool for discovering insights from research paper corpora☆131Updated 3 months ago
- 2D Positional Embeddings for Webpage Structural Understanding 🦙👀☆93Updated 2 weeks ago
- GPT-2 (124M) quality in 5B tokens☆227Updated 2 weeks ago
- The GeoV model is a large language model designed by Georges Harik and uses Rotary Positional Embeddings with Relative distances (RoPER).…☆122Updated last year
- Evaluating LLMs with CommonGen-Lite☆83Updated 6 months ago
- ☆91Updated last month
- Alice in Wonderland code base for experiments and raw experiments data☆96Updated last week
- Video+code lecture on building nanoGPT from scratch☆64Updated 3 months ago
- Function Calling Benchmark & Testing☆73Updated 2 months ago
- Automated testing and benchmarking for code generation agents.☆17Updated last year
- Low-Rank adapter extraction for fine-tuned transformers model☆154Updated 4 months ago
- ☆109Updated last month
- Repository for the paper Stream of Search: Learning to Search in Language☆70Updated last month
- ReDel is a toolkit for researchers and developers to build, iterate on, and analyze recursive multi-agent systems.☆48Updated 3 weeks ago
- run paligemma in real time☆122Updated 4 months ago
- Banishing LLM Hallucinations Requires Rethinking Generalization☆253Updated 2 months ago
- ☆101Updated 6 months ago
- The history files when recording human interaction while solving ARC tasks☆91Updated this week
- σ-GPT: A New Approach to Autoregressive Models☆53Updated last month
- Code for the paper 'Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization'☆140Updated 3 months ago
- ☆75Updated 3 weeks ago
- Automating enterprise workflows with multimodal agents☆83Updated last month
- ☆59Updated last week
- Notus is a collection of fine-tuned LLMs using SFT, DPO, SFT+DPO, and/or any other RLHF techniques, while always keeping a data-first app…☆161Updated 8 months ago
- MiniHF is an inference, human preference data collection, and fine-tuning tool for local language models. It is intended to help the user…☆143Updated this week
- Generate Synthetic Data Using OpenAI, MistralAI or AnthropicAI☆223Updated 4 months ago
- ☆68Updated 2 months ago
- ☆85Updated 7 months ago