kevinwu23 / StanfordFineTuneBench
☆30Updated 7 months ago
Alternatives and similar repositories for StanfordFineTuneBench
Users who are interested in StanfordFineTuneBench are comparing it to the repositories listed below.
- ☆47Updated 4 months ago
- ☆51Updated 7 months ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment☆57Updated 9 months ago
- Truly flash implementation of DeBERTa disentangled attention mechanism.☆58Updated last month
- Matrix (Multi-Agent daTa geneRation Infra and eXperimentation framework) is a versatile engine for multi-agent conversational data genera…☆69Updated last week
- ☆23Updated last year
- Analysis on the cost of encoder based models☆11Updated 4 months ago
- Code for NeurIPS LLM Efficiency Challenge☆59Updated last year
- ☆60Updated last week
- Simple GRPO scripts and configurations.☆58Updated 4 months ago
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute…☆49Updated 11 months ago
- Model implementation for the contextual embeddings project☆33Updated 3 weeks ago
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna☆53Updated 4 months ago
- ☆60Updated 3 weeks ago
- Storing long contexts in tiny caches with self-study☆67Updated this week
- Source code for the collaborative reasoner research project at Meta FAIR.☆91Updated 2 months ago
- Simple repository for training small reasoning models☆33Updated 4 months ago
- Training an LLM to use a calculator with multi-turn reinforcement learning, achieving a **62% absolute increase in evaluation accuracy**.☆41Updated last month
- ☆39Updated 11 months ago
- Small, simple agent task environments for training and evaluation☆18Updated 7 months ago
- A framework for pitting LLMs against each other in an evolving library of games ⚔☆32Updated 2 months ago
- Codebase accompanying the Summary of a Haystack paper.☆78Updated 9 months ago
- 🤝 Trade any tensors over the network☆30Updated last year
- QAlign is a new test-time alignment approach that improves language model performance by using Markov chain Monte Carlo methods.☆23Updated 2 months ago
- QAmeleon introduces synthetic multilingual QA data using PaLM, a 540B large language model. This dataset was generated by prompt tuning P…☆34Updated last year
- ☆77Updated last year
- Pre-train Static Word Embeddings☆79Updated 3 weeks ago
- QLoRA for Masked Language Modeling☆22Updated last year
- Testing paligemma2 finetuning on reasoning dataset☆18Updated 5 months ago
- ☆47Updated 9 months ago