philschmid / MixEval
The official evaluation suite and dynamic data release for MixEval.
☆11 · Updated last year
Alternatives and similar repositories for MixEval
Users who are interested in MixEval are comparing it to the repositories listed below.
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna ☆59 · Updated 3 months ago
- Latent Large Language Models ☆19 · Updated last year
- ☆56 · Updated last year
- Train, tune, and infer Bamba model ☆137 · Updated 8 months ago
- Nexusflow function call, tool use, and agent benchmarks. ☆30 · Updated last year
- Simple repository for training small reasoning models ☆49 · Updated last year
- OLMost every training recipe you need to perform data interventions with the OLMo family of models. ☆64 · Updated this week
- XmodelLM ☆38 · Updated last year
- ☆82 · Updated 2 months ago
- Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models ☆115 · Updated 10 months ago
- ☆21 · Updated last year
- A public implementation of the ReLoRA pretraining method, built on Lightning-AI's Pytorch Lightning suite. ☆34 · Updated last year
- Unleash the full potential of exascale LLMs on consumer-class GPUs, proven by extensive benchmarks, with no long-term adjustments and min… ☆26 · Updated last year
- Using modal.com to process FineWeb-edu data ☆20 · Updated 10 months ago
- Repository containing awesome resources regarding Hugging Face tooling. ☆48 · Updated 2 years ago
- Repo hosting codes and materials related to speeding LLMs' inference using token merging. ☆37 · Updated 4 months ago
- ☆39 · Updated 6 months ago
- ☆86 · Updated 2 years ago
- ☆52 · Updated last year
- ☆80 · Updated last year
- Supercharge huggingface transformers with model parallelism. ☆78 · Updated 6 months ago
- Source code for the collaborative reasoner research project at Meta FAIR. ☆112 · Updated 9 months ago
- Very minimal (and stateless) agent framework ☆44 · Updated last year
- Tensor-Slayer : Manipulate weights and tensors of LLMs to achieve performance upgrades and introduce a novel inferenceless mechanistic in… ☆27 · Updated 8 months ago
- A clone of OpenAI's Tokenizer page for HuggingFace Models ☆46 · Updated 2 years ago
- KV Cache Steering for Inducing Reasoning in Small Language Models ☆46 · Updated 6 months ago
- Aana SDK is a powerful framework for building AI enabled multimodal applications. ☆56 · Updated 5 months ago
- ☆21 · Updated 2 weeks ago
- Training hybrid models for dummies. ☆29 · Updated 3 months ago
- A repository for research on medium sized language models. ☆77 · Updated last year