Source code for the paper "LongGenBench: Long-context Generation Benchmark"
☆23Oct 8, 2024Updated last year
Alternatives and similar repositories for LongGenBench
Users that are interested in LongGenBench are comparing it to the libraries listed below.
- ☆36Oct 4, 2025Updated 8 months ago
- [WSDM 2026] LookAhead Tuning: Safer Language Models via Partial Answer Previews☆17Dec 14, 2025Updated 5 months ago
- Implementation of AdaCQR (COLING 2025)☆15Dec 30, 2024Updated last year
- ☆47Nov 25, 2024Updated last year
- ☆12Sep 1, 2023Updated 2 years ago
- PyTorch implementation of our paper accepted by ICML 2024 -- CaM: Cache Merging for Memory-efficient LLMs Inference☆50Jun 19, 2024Updated last year
- FS-DFM: Fast and Accurate Long Text Generation with Few-Step Diffusion Language Models. FS-DFM accepted for ICLR 2026☆43Jan 6, 2026Updated 5 months ago
- ☆27Apr 14, 2025Updated last year
- The Official Implementation of Ada-KV [NeurIPS 2025]☆134Nov 26, 2025Updated 6 months ago
- KV cache compression for high-throughput LLM inference☆157Feb 5, 2025Updated last year
- Ethnic culture dataset for large language models☆12May 26, 2025Updated last year
- [ICML 2024] Quest: Query-Aware Sparsity for Efficient Long-Context LLM Inference☆391Jul 10, 2025Updated 11 months ago
- Codes for the paper "∞Bench: Extending Long Context Evaluation Beyond 100K Tokens": https://arxiv.org/abs/2402.13718☆386Sep 25, 2024Updated last year
- ☆318Jul 10, 2025Updated 11 months ago
- Imperative deep learning framework with customized GPU and CPU backend☆30Jul 25, 2023Updated 2 years ago
- End to End steps for adding custom ops in PyTorch.☆24Aug 20, 2020Updated 5 years ago
- WavSpA: Wavelet Space Attention for Enhancing Transformer's Long Sequence Learning☆13Feb 24, 2024Updated 2 years ago
- [ICML 2026] code & model for arxiv paper "Autoregressive Image Generation with Masked Bit Modeling"☆56May 1, 2026Updated last month
- The official implementation of paper: SimLayerKV: A Simple Framework for Layer-Level KV Cache Reduction.☆51Oct 18, 2024Updated last year
- [NeurIPS 2024] Fast Best-of-N Decoding via Speculative Rejection☆55Oct 29, 2024Updated last year
- ☆47Sep 13, 2025Updated 8 months ago
- This repository provides an improved LLamaGen Model, fine-tuned on 500,000 high-quality images, each accompanied by over 300 token prompt…☆30Oct 21, 2024Updated last year
- Examples for MS-AMP package.☆30Jul 17, 2025Updated 10 months ago
- Repository for "Scaling Evaluation-time Compute with Reasoning Models as Process Evaluators"☆12Mar 25, 2025Updated last year
- A selective knowledge distillation algorithm for efficient speculative decoders☆40Nov 27, 2025Updated 6 months ago
- HALO: Hadamard-Assisted Low-Precision Optimization and Training method for finetuning LLMs. 🚀 The official implementation of https://arx…☆29Feb 17, 2025Updated last year
- Official code and resources for the paper "EXIT: Context-Aware Extractive Compression for Enhancing Retrieval-Augmented Generation."☆24Dec 23, 2024Updated last year
- ☆12Jan 17, 2024Updated 2 years ago
- [EMNLP 2024 Findings🔥] Official implementation of "LOOK-M: Look-Once Optimization in KV Cache for Efficient Multimodal Long-Context In…☆103Nov 9, 2024Updated last year
- ☆16Jul 12, 2024Updated last year
- ☆13Mar 9, 2024Updated 2 years ago
- [EMNLP 2023 Industry Track] A simple prompting approach that enables the LLMs to run inference in batches.☆76Mar 8, 2024Updated 2 years ago
- Fast and memory-efficient exact attention☆21Apr 10, 2026Updated 2 months ago
- Code for paper: [ICLR2025 Oral] FlexPrefill: A Context-Aware Sparse Attention Mechanism for Efficient Long-Sequence Inference☆168Oct 13, 2025Updated 7 months ago
- [ACL 25] SafeChain: Safety of Language Models with Long Chain-of-Thought Reasoning Capabilities☆30Apr 2, 2025Updated last year
- ☆24May 6, 2022Updated 4 years ago
- Official code for Guiding Language Model Math Reasoning with Planning Tokens☆19Feb 29, 2024Updated 2 years ago
- The official implementation of dLLM-Var☆35Nov 6, 2025Updated 7 months ago
- ☆37Feb 12, 2025Updated last year