google-deepmind / loft
LOFT: A 1 Million+ Token Long-Context Benchmark
☆146 Updated 3 weeks ago
Related projects
Alternatives and complementary repositories for loft
- Benchmarking LLMs with Challenging Tasks from Real Users ☆195 Updated 2 weeks ago
- ACL 2024 | LooGLE: Long Context Evaluation for Long-Context Language Models ☆167 Updated last month
- Code and Data for "Long-context LLMs Struggle with Long In-context Learning" ☆91 Updated 4 months ago
- Official code for "MAmmoTH2: Scaling Instructions from the Web" [NeurIPS 2024] ☆124 Updated 3 weeks ago
- Self-Alignment with Principle-Following Reward Models ☆148 Updated 8 months ago
- Homepage for ProLong (Princeton long-context language models) and paper "How to Train Long-Context Language Models (Effectively)" ☆118 Updated 3 weeks ago
- ☆112 Updated last month
- Open-source code for the paper "Retrieval Head Mechanistically Explains Long-Context Factuality" ☆160 Updated 3 months ago
- Official GitHub repo for the paper "Compression Represents Intelligence Linearly" [COLM 2024] ☆127 Updated 2 months ago
- [EMNLP 2024 (Oral)] Leave No Document Behind: Benchmarking Long-Context LLMs with Extended Multi-Doc QA ☆92 Updated last week
- Official repository for "Scaling Retrieval-Based Language Models with a Trillion-Token Datastore". ☆129 Updated this week
- LongEmbed: Extending Embedding Models for Long Context Retrieval (EMNLP 2024) ☆115 Updated last week
- Positional Skip-wise Training for Efficient Context Window Extension of LLMs to Extreme Length (ICLR 2024) ☆199 Updated 6 months ago
- Reformatted Alignment ☆112 Updated last month
- 🌍 Repository for "AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agent", ACL'24 Best Resource Pap… ☆110 Updated 3 weeks ago
- [EMNLP 2023] The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning ☆213 Updated last year
- A Survey on Data Selection for Language Models ☆182 Updated last month
- Official Repo for ICLR 2024 paper MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback by Xingyao Wang*, Ziha… ☆104 Updated 5 months ago
- The official repo for "LLoCo: Learning Long Contexts Offline" ☆113 Updated 5 months ago
- Code and data accompanying our paper on arXiv "Faithful Chain-of-Thought Reasoning". ☆155 Updated 6 months ago
- Scalable Meta-Evaluation of LLMs as Evaluators ☆41 Updated 9 months ago
- This project studies the performance and robustness of language models and task-adaptation methods. ☆141 Updated 6 months ago
- This is the repo for the paper Shepherd -- A Critic for Language Model Generation ☆213 Updated last year
- [ICML 2024] Selecting High-Quality Data for Training Language Models ☆145 Updated 5 months ago
- Code and example data for the paper: Rule Based Rewards for Language Model Safety ☆158 Updated 4 months ago
- "Improving Mathematical Reasoning with Process Supervision" by OpenAI ☆83 Updated last week
- A dataset of LLM-generated chain-of-thought steps annotated with mistake location. ☆73 Updated 3 months ago
- PASTA: Post-hoc Attention Steering for LLMs ☆108 Updated 2 months ago
- Codebase accompanying the Summary of a Haystack paper. ☆72 Updated 2 months ago
- Official implementation for 'Extending LLMs’ Context Window with 100 Samples' ☆74 Updated 10 months ago