facebookresearch / large_concept_model
Large Concept Models: Language modeling in a sentence representation space
☆108 Updated this week
Alternatives and similar repositories for large_concept_model:
Users who are interested in large_concept_model are comparing it to the repositories listed below.
- PyTorch implementation of models from the Zamba2 series. ☆164 Updated 3 weeks ago
- Code for BLT research paper ☆358 Updated this week
- Code for training & evaluating Contextual Document Embedding models ☆133 Updated 2 weeks ago
- Repo for "LoLCATs: On Low-Rank Linearizing of Large Language Models" ☆197 Updated 2 months ago
- Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients. ☆181 Updated 5 months ago
- Manage scalable open LLM inference endpoints in Slurm clusters ☆244 Updated 5 months ago
- Long context evaluation for large language models ☆192 Updated last week
- Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024 ☆242 Updated last week
- smolLM with Entropix sampler on PyTorch ☆147 Updated last month
- ☆119 Updated 3 months ago
- An introduction to LLM Sampling ☆66 Updated last month
- DeMo: Decoupled Momentum Optimization ☆147 Updated 2 weeks ago
- An easy-to-understand framework for LLM samplers that rewind and revise generated tokens ☆114 Updated last month
- ☆91 Updated last year
- Muon optimizer for neural networks: >30% extra sample efficiency, <3% wallclock overhead ☆174 Updated this week
- Public repository for "The Surprising Effectiveness of Test-Time Training for Abstract Reasoning" ☆251 Updated 3 weeks ago
- GRadient-INformed MoE ☆261 Updated 2 months ago
- Website for hosting the Open Foundation Models Cheat Sheet. ☆261 Updated 5 months ago
- Simple Transformer in JAX ☆120 Updated 5 months ago
- $100K or 100 Days: Trade-offs when Pre-Training with Academic Resources ☆117 Updated last month
- Automatic Evals for Instruction-Tuned Models ☆93 Updated last week
- EvolKit is an innovative framework designed to automatically enhance the complexity of instructions used for fine-tuning Large Language M… ☆187 Updated last month
- A comprehensive repository of reasoning tasks for LLMs (and beyond) ☆291 Updated 2 months ago
- ☆95 Updated 2 months ago
- Code for the Molmo Vision-Language Model ☆136 Updated this week
- Normalized Transformer (nGPT) ☆136 Updated 3 weeks ago
- Code for exploring Based models from "Simple linear attention language models balance the recall-throughput tradeoff" ☆218 Updated last week
- A compact LLM pretrained in 9 days by using high-quality data ☆274 Updated 2 weeks ago
- Fully fine-tune large models like Mistral, Llama-2-13B, or Qwen-14B completely for free ☆223 Updated last month