tencent-ailab / CogKernel
☆23Updated this week
Related projects:
- ☆105Updated this week
- 🌍 Repository for "AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agents", ACL'24 Best Resource Pap…☆81Updated last month
- Resources for our paper: "EvoAgent: Towards Automatic Multi-Agent Generation via Evolutionary Algorithms"☆73Updated 2 months ago
- DSBench: How Far are Data Science Agents Becoming Data Science Experts?☆20Updated this week
- Code for the paper "Learn Beyond The Answer: Training Language Models with Reflection for Mathematical Reasoning"☆30Updated 3 months ago
- This is the official repository for Inheritune.☆89Updated 4 months ago
- Co-LLM: Learning to Decode Collaboratively with Multiple Language Models☆89Updated 4 months ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment☆39Updated 3 weeks ago
- Code for PHATGOOSE introduced in "Learning to Route Among Specialized Experts for Zero-Shot Generalization"☆76Updated 6 months ago
- Meta-CoT: Generalizable Chain-of-Thought Prompting in Mixed-task Scenarios with Large Language Models☆84Updated 11 months ago
- Aligning with Human Judgement: The Role of Pairwise Preference in Large Language Model Evaluators (Liu et al.; arXiv preprint arXiv:2403.…☆34Updated 2 months ago
- [ACL'24] Code and data for the paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator"☆45Updated 6 months ago
- A simple GPT-based evaluation tool for multi-aspect, interpretable assessment of LLMs.☆73Updated 7 months ago
- Official implementation for the paper "LongEmbed: Extending Embedding Models for Long Context Retrieval"☆108Updated 4 months ago
- Source code for our paper: "Put Your Money Where Your Mouth Is: Evaluating Strategic Planning and Execution of LLM Agents in an Auction A…☆39Updated 7 months ago
- Cascade Speculative Drafting☆23Updated 5 months ago
- This is the official repository of the paper "OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI"☆79Updated last month
- ☆118Updated 5 months ago
- Code and data for CoachLM, an automatic instruction revision approach for LLM instruction tuning.☆56Updated 6 months ago
- Repository for the paper "Tools Are Instrumental for Language Agents in Complex Environments"☆32Updated 8 months ago
- "Improving Mathematical Reasoning with Process Supervision" by OpenAI☆55Updated last week
- Agent Planning with World Knowledge Model☆27Updated 2 months ago
- ☆31Updated 3 months ago
- 🚢 Data Toolkit for Sailor Language Models☆74Updated 2 months ago
- PASTA: Post-hoc Attention Steering for LLMs☆96Updated last week
- ☆16Updated 6 months ago
- Benchmarking LLMs with Challenging Tasks from Real Users☆182Updated last month
- ☆66Updated last year
- ☆87Updated 3 months ago
- Conifer: Improving Complex Constrained Instruction-Following Ability of Large Language Models☆73Updated 5 months ago