AutoLLM / ArxivDigest
ArXiv Digest and Personalized Recommendations using Large Language Models
☆333 Updated 7 months ago
Alternatives and similar repositories for ArxivDigest:
Users who are interested in ArxivDigest are comparing it to the libraries listed below.
- ☆258 Updated this week
- GPT4 based personalized ArXiv paper assistant bot ☆500 Updated 9 months ago
- ☆264 Updated 6 months ago
- Build Hierarchical Autonomous Agents through Config. Collaborative Growth of Specialized Agents. ☆307 Updated last year
- A curated collection of LLM reasoning and planning resources, including key papers, limitations, benchmarks, and additional learning mate… ☆211 Updated 4 months ago
- NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1GPU + 1Day ☆253 Updated last year
- ☆205 Updated 5 months ago
- [NeurIPS 2023 D&B] Code repository for InterCode benchmark https://arxiv.org/abs/2306.14898 ☆203 Updated 8 months ago
- 🤖🌊 aiFlows: The building blocks of your collaborative AI ☆244 Updated 8 months ago
- The official evaluation suite and dynamic data release for MixEval. ☆233 Updated 2 months ago
- RuLES: a benchmark for evaluating rule-following in language models ☆215 Updated this week
- Public repository for "The Surprising Effectiveness of Test-Time Training for Abstract Reasoning" ☆278 Updated last month
- Generate textbook-quality synthetic LLM pretraining data ☆492 Updated last year
- ☆413 Updated last year
- [NeurIPS 2022] 🛒WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents ☆297 Updated 4 months ago
- Website for hosting the Open Foundation Models Cheat Sheet. ☆262 Updated 6 months ago
- ☆484 Updated last month
- Fast & more realistic evaluation of chat language models. Includes leaderboard. ☆183 Updated last year
- data cleaning and curation for unstructured text ☆328 Updated 5 months ago
- ☆255 Updated last month
- [ICLR 2024 Spotlight] FLASK: Fine-grained Language Model Evaluation based on Alignment Skill Sets ☆213 Updated last year
- An Analytical Evaluation Board of Multi-turn LLM Agents ☆270 Updated 7 months ago
- Code for NeurIPS'24 paper 'Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization' ☆175 Updated last month
- 🤠 Agent-as-a-Judge and DevAI dataset ☆308 Updated 3 weeks ago
- awesome synthetic (text) datasets ☆253 Updated 2 months ago
- A puzzle to learn about prompting ☆123 Updated last year
- Code and data for "Lumos: Learning Agents with Unified Data, Modular Design, and Open-Source LLMs" ☆459 Updated 9 months ago
- Scaling Data-Constrained Language Models ☆330 Updated 3 months ago
- A set of utilities for running few-shot prompting experiments on large-language models ☆116 Updated last year
- [EMNLP 2023] The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning ☆221 Updated last year