Bui1dMySea / MemLong
☆94Updated 5 months ago
Alternatives and similar repositories for MemLong:
Users who are interested in MemLong are comparing it to the libraries listed below
- ☆47Updated 4 months ago
- From Hours to Minutes: Lossless Acceleration of Ultra Long Sequence Generation☆89Updated last month
- ☆94Updated 4 months ago
- The source code and dataset mentioned in the paper Seal-Tools: Self-Instruct Tool Learning Dataset for Agent Tuning and Detailed Benchmar…☆48Updated 6 months ago
- Official Repo for "Programming Every Example: Lifting Pre-training Data Quality Like Experts at Scale"☆242Updated 2 weeks ago
- ☆57Updated 6 months ago
- ☆81Updated last year
- Advancing Language Model Reasoning through Reinforcement Learning and Inference Scaling☆101Updated 3 months ago
- FuseAI Project☆85Updated 3 months ago
- Reformatted Alignment☆115Updated 7 months ago
- ☆149Updated this week
- Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models☆132Updated 10 months ago
- ☆102Updated 4 months ago
- ☆143Updated 10 months ago
- ☆162Updated last month
- Code for "Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate"☆141Updated 2 weeks ago
- SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning. COLM 2024 Accepted Paper☆32Updated 11 months ago
- ☆46Updated 10 months ago
- [preprint] We propose a novel fine-tuning method, Separate Memory and Reasoning, which combines prompt tuning with LoRA.☆43Updated 4 months ago
- AutoCoA (Automatic generation of Chain-of-Action) is an agent model framework that enhances the multi-turn tool usage capability of reaso…☆103Updated last month
- ☆55Updated 6 months ago
- Unleashing the Power of Cognitive Dynamics on Large Language Models☆61Updated 7 months ago
- [ICLR 2025] Benchmarking Agentic Workflow Generation☆85Updated 2 months ago
- Code implementation of synthetic continued pretraining☆107Updated 4 months ago
- Repo for the paper "AgentRE: An Agent-Based Framework for Navigating Complex Information Landscapes in Relation Extraction".☆65Updated 9 months ago
- CLongEval: A Chinese Benchmark for Evaluating Long-Context Large Language Models☆40Updated last year
- ☆17Updated 3 weeks ago
- This is a personal reimplementation of Google's Infini-transformer, utilizing a small 2b model. The project includes both model and train…☆56Updated last year
- ☆49Updated last year
- ☆115Updated last week