☆57Apr 11, 2024Updated 2 years ago
Alternatives and similar repositories for in-context-pretraining
Users that are interested in in-context-pretraining are comparing it to the libraries listed below.
- This repo is to demo the concept of lossless compression with Transformers as encoder and decoder.☆14May 2, 2024Updated 2 years ago
- [ACL 2024] Long-Context Language Modeling with Parallel Encodings☆170Jun 13, 2024Updated last year
- Code for ACL2023 paper: Pre-Training to Learn in Context☆106Jul 26, 2024Updated last year
- Code for the paper "Stack Attention: Improving the Ability of Transformers to Model Hierarchical Patterns"☆18Mar 15, 2024Updated 2 years ago
- The official implementation of "Evidentiality-guided Generation for Knowledge-Intensive NLP Tasks" (NAACL 2022).☆44Dec 25, 2022Updated 3 years ago
- ☆25Dec 12, 2025Updated 5 months ago
- self-adaptive in-context learning☆45May 5, 2023Updated 3 years ago
- [ACL'24 Oral] Analysing The Impact of Sequence Composition on Language Model Pre-Training☆23Aug 18, 2024Updated last year
- ☆19Sep 1, 2025Updated 8 months ago
- Efficient retrieval head analysis with triton flash attention that supports topK probability☆13Jun 15, 2024Updated last year
- Learning adapter weights from task descriptions☆19Nov 12, 2023Updated 2 years ago
- REST: Retrieval-Based Speculative Decoding, NAACL 2024☆219Mar 5, 2026Updated 2 months ago
- MAIR: A Massive Benchmark for Evaluating Instructed Retrieval. Evaluate your retrieval models on 126 diverse tasks. [EMNLP 2024]☆26Nov 3, 2024Updated last year
- [ICML'24] Data and code for our paper "Training-Free Long-Context Scaling of Large Language Models"☆450Oct 16, 2024Updated last year
- ☆17Oct 11, 2023Updated 2 years ago
- Positional Skip-wise Training for Efficient Context Window Extension of LLMs to Extreme Length (ICLR 2024)☆209May 20, 2024Updated 2 years ago
- Jax implementation of "Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models"☆15May 10, 2024Updated 2 years ago
- ☆177Jul 24, 2024Updated last year
- Jaehyung Kim et al.'s ACL 2023 paper on "infoVerse: A Universal Framework for Dataset Characterization with Multidimensional Meta-informat…☆16Jun 28, 2023Updated 2 years ago
- Code for "Reasoning to Learn from Latent Thoughts"☆130Mar 28, 2025Updated last year
- DSIR large-scale data selection framework for language model training☆274Apr 7, 2024Updated 2 years ago
- The repository of the project "Fine-tuning Large Language Models with Sequential Instructions", code base comes from open-instruct and LA…☆30Nov 24, 2024Updated last year
- Repo for the paper "Large Language Models Struggle to Learn Long-Tail Knowledge"☆77Apr 12, 2023Updated 3 years ago
- The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling☆42Dec 29, 2025Updated 5 months ago
- Data and code for the preprint "In-Context Learning with Long-Context Models: An In-Depth Exploration"☆44Aug 20, 2024Updated last year
- InstructIR, a novel benchmark specifically designed to evaluate the instruction-following ability of information retrieval models. Our foc…☆32Jun 13, 2024Updated last year
- Implementation of the paper "Data Engineering for Scaling Language Models to 128K Context"☆496Mar 19, 2024Updated 2 years ago
- [Neurips2024] Source code for xRAG: Extreme Context Compression for Retrieval-augmented Generation with One Token☆180Jul 4, 2024Updated last year
- "FiD-ICL: A Fusion-in-Decoder Approach for Efficient In-Context Learning" (ACL 2023)☆15Jul 24, 2023Updated 2 years ago
- ☆10Jul 15, 2024Updated last year
- open-source code for paper: Retrieval Head Mechanistically Explains Long-Context Factuality☆239Aug 2, 2024Updated last year
- ☆13Oct 18, 2023Updated 2 years ago
- ☆138May 29, 2025Updated last year
- Introduction and scripts for ACL-2020 paper "On Exposure Bias, Hallucination and Domain Shift in Neural Machine Translation"☆21Jun 23, 2020Updated 5 years ago
- Code for COLING 2022 long paper: Answering Numerical Reasoning Questions in Table-Text Hybrid Contents with Graph-based Encoder and Tree-…☆22Dec 15, 2022Updated 3 years ago
- ☆28Jul 22, 2022Updated 3 years ago
- [EMNLP 2024] LongAlign: A Recipe for Long Context Alignment of LLMs☆260Dec 16, 2024Updated last year
- Code and Data for "Long-context LLMs Struggle with Long In-context Learning" [TMLR2025]☆114Feb 20, 2025Updated last year
- ☆110Jul 15, 2025Updated 10 months ago