eqimp / hogwild_llm
Official PyTorch implementation for Hogwild! Inference: Parallel LLM Generation with a Concurrent Attention Cache
☆89 Updated this week
Alternatives and similar repositories for hogwild_llm:
Users who are interested in hogwild_llm are comparing it to the repositories listed below.
- OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding. ☆170 Updated 3 months ago
- EvaByte: Efficient Byte-level Language Models at Scale ☆87 Updated last month
- Train your own SOTA deductive reasoning model ☆86 Updated last month
- PyTorch implementation of models from the Zamba2 series. ☆179 Updated 2 months ago
- ☆129 Updated 8 months ago
- Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars… ☆317 Updated 4 months ago
- An easy-to-understand framework for LLM samplers that rewind and revise generated tokens ☆139 Updated 2 months ago
- Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients. ☆197 Updated 9 months ago
- ☆114 Updated 2 months ago
- Tree Attention: Topology-aware Decoding for Long-Context Attention on GPU clusters ☆126 Updated 4 months ago
- code for training & evaluating Contextual Document Embedding models ☆180 Updated this week
- Repo for "LoLCATs: On Low-Rank Linearizing of Large Language Models" ☆230 Updated 2 months ago
- Code for "LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding", ACL 2024 ☆286 Updated last week
- ☆176 Updated 4 months ago
- [ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning? ☆62 Updated last month
- ☆194 Updated last month
- DeMo: Decoupled Momentum Optimization ☆186 Updated 4 months ago
- an open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere) ☆96 Updated last month
- ☆65 Updated this week
- Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems ☆84 Updated last month
- ☆126 Updated 3 weeks ago
- Fine-tunes a student LLM using teacher feedback for improved reasoning and answer quality. Implements GRPO with teacher-provided evaluati… ☆41 Updated last month
- Lean implementation of various multi-agent LLM methods, including Iteration of Thought (IoT) ☆108 Updated 2 months ago
- Reward-guided Speculative Decoding (RSD) for efficiency and effectiveness. ☆23 Updated last month
- ☆50 Updated 10 months ago
- BABILong is a benchmark for LLM evaluation using the needle-in-a-haystack approach. ☆198 Updated last week
- LLM Inference on consumer devices ☆105 Updated last month
- Repo hosting code and materials related to speeding up LLM inference using token merging. ☆36 Updated 11 months ago
- ☆166 Updated 2 months ago
- Official implementation of the paper "Linear Transformers with Learnable Kernel Functions are Better In-Context Models" ☆159 Updated 3 months ago