terminal-agent / reptile
💻 Terminal-Agent with Human-in-the-Loop Learning
☆32 · Updated 2 weeks ago
Alternatives and similar repositories for reptile
Users who are interested in reptile are comparing it to the libraries listed below
- The repository of the project "Fine-tuning Large Language Models with Sequential Instructions", code base comes from open-instruct and LA… ☆30 · Updated last year
- [NeurIPS 2024] Fast Best-of-N Decoding via Speculative Rejection ☆53 · Updated last year
- The code and data for the paper JiuZhang3.0 ☆49 · Updated last year
- ☆49 · Updated 4 months ago
- ☆20 · Updated 3 months ago
- [AAAI 2025 oral] Evaluating Mathematical Reasoning Beyond Accuracy ☆76 · Updated 3 months ago
- ☆30 · Updated last year
- LongSpec: Long-Context Lossless Speculative Decoding with Efficient Drafting and Verification ☆72 · Updated 5 months ago
- The Good, The Bad, and The Greedy: Evaluation of LLMs Should Not Ignore Non-Determinism ☆30 · Updated last year
- Code for ICLR 2025 Paper "What is Wrong with Perplexity for Long-context Language Modeling?" ☆107 · Updated 3 months ago
- A curated list of awesome resources dedicated to Scaling Laws for LLMs ☆80 · Updated 2 years ago
- Long Context Extension and Generalization in LLMs ☆62 · Updated last year
- Code for the preprint "Cache Me If You Can: How Many KVs Do You Need for Effective Long-Context LMs?" ☆47 · Updated 5 months ago
- The rule-based evaluation subset and code implementation of Omni-MATH ☆26 · Updated last year
- The source code of "Merging Experts into One: Improving Computational Efficiency of Mixture of Experts (EMNLP 2023)" ☆44 · Updated last year
- Code for "[COLM'25] RepoST: Scalable Repository-Level Coding Environment Construction with Sandbox Testing" ☆22 · Updated 9 months ago
- [NeurIPS'24] Official code for *🎯DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving* ☆119 · Updated last year
- ☆34 · Updated last year
- [ICML 2025] Teaching Language Models to Critique via Reinforcement Learning ☆120 · Updated 8 months ago
- GSM-Plus: Data, Code, and Evaluation for Enhancing Robust Mathematical Reasoning in Math Word Problems. ☆64 · Updated last year
- Official repository for ACL 2025 paper "Model Extrapolation Expedites Alignment" ☆76 · Updated 7 months ago
- Use the tokenizer in parallel to achieve superior acceleration ☆20 · Updated last year
- LongProc: Benchmarking Long-Context Language Models on Long Procedural Generation ☆33 · Updated 3 months ago
- This repo is to demo the concept of lossless compression with Transformers as encoder and decoder. ☆14 · Updated last year
- [ACL 2024 Findings] CriticBench: Benchmarking LLMs for Critique-Correct Reasoning ☆29 · Updated last year
- [ICML 2025] M-STAR (Multimodal Self-Evolving TrAining for Reasoning) Project. Diving into Self-Evolving Training for Multimodal Reasoning ☆69 · Updated 5 months ago
- Official implementation for DenseMixer: Improving MoE Post-Training with Precise Router Gradient ☆63 · Updated 5 months ago
- ☆35 · Updated last year
- Efficient retrieval head analysis with triton flash attention that supports topK probability ☆13 · Updated last year
- ☆58 · Updated last year