lamm-mit / Cephalo-Phi-3-Vision-MoE
☆11Updated 3 months ago
Related projects:
- IBM development fork of https://github.com/huggingface/text-generation-inference☆52Updated this week
- Small and Efficient Mathematical Reasoning LLMs☆69Updated 7 months ago
- ☆24Updated 2 months ago
- Repo hosting code and materials related to speeding up LLM inference using token merging.☆28Updated 4 months ago
- A toolkit for fine-tuning, inferencing, and evaluating GreenBitAI's LLMs.☆68Updated 2 months ago
- Data preparation code for CrystalCoder 7B LLM☆42Updated 4 months ago
- A pipeline for LLM knowledge distillation☆68Updated last month
- ☆75Updated 3 weeks ago
- ☆61Updated 2 months ago
- Block Transformer: Global-to-Local Language Modeling for Fast Inference (Official Code)☆118Updated 2 weeks ago
- Breaking Throughput-Latency Trade-off for Long Sequences with Speculative Decoding☆55Updated this week
- Set of scripts to finetune LLMs☆36Updated 5 months ago
- A repository for research on medium-sized language models.☆71Updated 3 months ago
- A single repo with all scripts and utils to train / fine-tune the Mamba model with or without FIM☆46Updated 5 months ago
- A toolkit that enhances PyTorch with specialized functions for low-bit quantized neural networks.☆24Updated 2 months ago
- Implementation of the paper: "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" from Google in pyTO…☆48Updated last week
- Self-host LLMs with vLLM and BentoML☆62Updated this week
- vLLM: A high-throughput and memory-efficient inference and serving engine for LLMs☆87Updated this week
- Official repository for the paper "SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention"☆87Updated 8 months ago
- ☆50Updated 2 months ago
- My fork of Allen AI's OLMo for educational purposes.☆27Updated 5 months ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment☆39Updated 2 weeks ago
- ☆59Updated last week
- ☆25Updated this week
- ☆85Updated 7 months ago
- Data preparation code for Amber 7B LLM☆76Updated 4 months ago
- The official repo for "LLoCo: Learning Long Contexts Offline"☆104Updated 3 months ago
- A public implementation of the ReLoRA pretraining method, built on Lightning-AI's PyTorch Lightning suite.☆33Updated 6 months ago
- Code for NeurIPS LLM Efficiency Challenge☆52Updated 5 months ago
- ☆35Updated last week