SalesforceAIResearch / LaTRO
☆46Updated last week
Related projects ⓘ
Alternatives and complementary repositories for LaTRO
- The official implementation of Self-Exploring Language Models (SELM)☆55Updated 5 months ago
- Flow of Reasoning: Training LLMs for Divergent Problem Solving with Minimal Examples☆39Updated last month
- A repository for research on medium sized language models.☆74Updated 5 months ago
- Resources for our paper: "EvoAgent: Towards Automatic Multi-Agent Generation via Evolutionary Algorithms"☆75Updated last month
- Minimal implementation of the Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models paper (ArXiv 20232401.01335)☆28Updated 8 months ago
- This is the official repository for Inheritune.☆105Updated last month
- Script for processing OpenAI's PRM800K process supervision dataset into an Alpaca-style instruction-response format☆27Updated last year
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment☆46Updated 2 months ago
- ☆62Updated 3 months ago
- Implementation of the paper: "AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?"☆40Updated last month
- ☆40Updated 2 weeks ago
- How to create rational LLM-based agents? Using game-theoretic workflows!☆20Updated this week
- Code for Paper: Autonomous Evaluation and Refinement of Digital Agents [COLM 2024]☆97Updated last month
- From GaLore to WeLore: How Low-Rank Weights Non-uniformly Emerge from Low-Rank Gradients. Ajay Jaiswal, Lu Yin, Zhenyu Zhang, Shiwei Liu,…☆43Updated 4 months ago
- ☆28Updated 7 months ago
- Code for the arXiv preprint "The Unreasonable Effectiveness of Easy Training Data"☆44Updated 10 months ago
- ☆22Updated 2 months ago
- Code for PHATGOOSE introduced in "Learning to Route Among Specialized Experts for Zero-Shot Generalization"☆78Updated 8 months ago
- Repository for NPHardEval, a quantified-dynamic benchmark of LLMs☆48Updated 7 months ago
- ☆54Updated last month
- ☆27Updated 5 months ago
- Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks☆129Updated 2 months ago
- [ACL'24] Code and data of paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator"☆49Updated 8 months ago
- ☆112Updated last month
- ☆78Updated 11 months ago
- Co-LLM: Learning to Decode Collaboratively with Multiple Language Models☆103Updated 6 months ago
- Q-Probe: A Lightweight Approach to Reward Maximization for Language Models☆37Updated 5 months ago
- Pytorch implementation for "Compressed Context Memory For Online Language Model Interaction" (ICLR'24)☆50Updated 7 months ago
- ☆63Updated last month
- Data preparation code for CrystalCoder 7B LLM☆42Updated 6 months ago