Zyphra / transformers_zamba2

☆48

Alternatives and similar repositories for transformers_zamba2

Users who are interested in transformers_zamba2 are comparing it to the libraries listed below.

s-smits / grpo-optuna
Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna
☆55 Updated 6 months ago
tiiuae / onebitllms
Lightweight toolkit package to train and fine-tune 1.58bit Language models
☆82 Updated 2 months ago
codelion / pts
Pivotal Token Search
☆118 Updated 3 weeks ago
huggingface / huggingface-inference-toolkit
Hugging Face Inference Toolkit used to serve transformers, sentence-transformers, and diffusers models.
☆83 Updated 2 weeks ago
arcee-ai / DAM
☆53 Updated 8 months ago
AtakanTekparmak / agento
Very minimal (and stateless) agent framework
☆45 Updated 6 months ago
matthewrenze / jhu-concise-cot
The Benefits of a Concise Chain of Thought on Problem Solving in Large Language Models
☆22 Updated 8 months ago
nahidalam / maya
Maya: An Instruction Finetuned Multilingual Multimodal Model using Aya
☆117 Updated 2 weeks ago
ElleLeonne / Lightning-ReLoRA
A public implementation of the ReLoRA pretraining method, built on Lightning-AI's Pytorch Lightning suite.
☆33 Updated last year
nyunAI / PruneGPT
☆51 Updated last year
AnswerDotAI / ModernBERT-Instruct-mini-cookbook
☆49 Updated 5 months ago
kubernetes-bad / reward-composer
Lego for GRPO
☆28 Updated 2 months ago
allenai / infinigram-api
☆73 Updated 2 weeks ago
ritabratamaiti / AnyModal
AnyModal is a Flexible Multimodal Language Model Framework for PyTorch
☆101 Updated 7 months ago
Cerebras / DocChat
GPT-4 Level Conversational QA Trained In a Few Hours
☆63 Updated 11 months ago
minosvasilias / simple_grpo
Simple GRPO scripts and configurations.
☆59 Updated 6 months ago
Zyphra / Zyda_processing
☆37 Updated last year
miralab-ai / autoreason
☆40 Updated 7 months ago
tanyuqian / cappy
NeurIPS 2023 - Cappy: Outperforming and Boosting Large Multi-Task LMs with a Small Scorer
☆43 Updated last year
du-nlp-lab / MLR-Copilot
☆66 Updated 4 months ago
StigLidu / DualDistill
The official implementation for paper "Agentic-R1: Distilled Dual-Strategy Reasoning"
☆84 Updated 2 weeks ago
SeunghyunSEO / optimized_hf_llama_class_for_training
☆48 Updated 11 months ago
agokrani / distillKitPlus
Easy to use, High Performant Knowledge Distillation for LLMs
☆88 Updated 3 months ago
SebastianBodza / EnsembleForecasting
Using multiple LLMs for ensemble Forecasting
☆16 Updated last year
LLM360 / crystalcoder-data-prep
Data preparation code for CrystalCoder 7B LLM
☆45 Updated last year
luyug / magix
Supercharge huggingface transformers with model parallelism.
☆77 Updated last week
louisbrulenaudet / ragoon
High level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ⚡
☆66 Updated 9 months ago
allenai / IFBench
☆62 Updated last month
VikParuchuri / classified
Score LLM pretraining data with classifiers
☆55 Updated last year
AlexBodner / How_Much_VRAM
☆102 Updated 11 months ago