facebookresearch / large_concept_model
Large Concept Models: Language modeling in a sentence representation space
★2,131, updated 3 months ago
Alternatives and similar repositories for large_concept_model:
Users who are interested in large_concept_model are comparing it to the repositories listed below:
- SONAR, a new multilingual and multimodal fixed-size sentence embedding space, with a full suite of speech and text encoders and decoders. (★739, updated last month)
- Code for BLT research paper (★1,558, updated this week)
- A Self-adaptation Framework that adapts LLMs for unseen tasks in real-time! (★1,048, updated 3 months ago)
- Training Large Language Model to Reason in a Continuous Latent Space (★1,094, updated 3 months ago)
- The simplest, fastest repository for training/finetuning small-sized VLMs. (★1,126, updated this week)
- Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends (★1,500, updated this week)
- Implementing DeepSeek R1's GRPO algorithm from scratch (★1,277, updated 2 weeks ago)
- Stanford NLP Python library for Representation Finetuning (ReFT) (★1,465, updated 3 months ago)
- Textbook on reinforcement learning from human feedback (★855, updated this week)
- Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL (★2,127, updated last week)
- Sky-T1: Train your own O1 preview model within $450 (★3,232, updated 2 weeks ago)
- Recipes to scale inference-time compute of open models (★1,066, updated 2 months ago)
- Minimalistic 4D-parallelism distributed training framework for education purpose (★1,445, updated 2 months ago)
- Curated list of datasets and tools for post-training. (★3,002, updated 3 months ago)
- Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verifi… (★2,671, updated last week)
- (★1,017, updated 4 months ago)
- nanoGPT style version of Llama 3.1 (★1,363, updated 9 months ago)
- NanoGPT (124M) in 3 minutes (★2,520, updated last week)
- MLE-bench is a benchmark for measuring how well AI agents perform at machine learning engineering (★692, updated 3 weeks ago)
- Synthetic data curation for post-training and structured data extraction (★1,290, updated this week)
- Everything about the SmolLM2 and SmolVLM family of models (★2,273, updated last month)
- A reading list on LLM based Synthetic Data Generation 🔥 (★1,259, updated 2 months ago)
- TextGrad: Automatic "Differentiation" via Text, using large language models to backpropagate textual gradients. (★2,505, updated last month)
- DataComp for Language Models (★1,292, updated last month)
- Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs. (★4,527, updated last month)
- AllenAI's post-training codebase (★2,942, updated this week)
- Fast State-of-the-Art Static Embeddings (★1,589, updated this week)
- Recipes for shrinking, optimizing, customizing cutting edge vision models. (★1,420, updated this week)
- LIMO: Less is More for Reasoning (★927, updated last month)
- [ICLR 2025] Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling (★867, updated last week)