NJUDeepEngine / CAEF
Code for paper: "Executing Arithmetic: Fine-Tuning Large Language Models as Turing Machines"
☆11 Updated 5 months ago
Alternatives and similar repositories for CAEF:
Users who are interested in CAEF are comparing it to the repositories listed below:
- [NAACL 2025] Source code for MMEvalPro, a more trustworthy and efficient benchmark for evaluating LMMs ☆23 Updated 6 months ago
- [NAACL 2025] Representing Rule-based Chatbots with Transformers ☆20 Updated last month
- ☆23 Updated 9 months ago
- Code for paper: Teaching Language Models to Critique via Reinforcement Learning ☆84 Updated last month
- [ICLR 2025] LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization ☆31 Updated last month
- ☆25 Updated last month
- ☆35 Updated last month
- The code and data for the paper JiuZhang3.0 ☆43 Updated 10 months ago
- Control LLM ☆13 Updated last week
- ☆29 Updated 4 months ago
- [ICLR 2025] Official PyTorch implementation of "Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN" by Pengxia… ☆19 Updated 3 months ago
- SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning. COLM 2024 accepted paper ☆30 Updated 10 months ago
- ☆16 Updated 8 months ago
- A Framework for Decoupling and Assessing the Capabilities of VLMs ☆41 Updated 9 months ago
- ☆44 Updated 3 weeks ago
- ☆39 Updated last month
- Conic10K: A large-scale dataset for closed-vocabulary math problem understanding. Accepted to EMNLP 2023 Findings. ☆25 Updated last year
- From GaLore to WeLore: How Low-Rank Weights Non-uniformly Emerge from Low-Rank Gradients. Ajay Jaiswal, Lu Yin, Zhenyu Zhang, Shiwei Liu,… ☆43 Updated 8 months ago
- ☆72 Updated last week
- The official implementation of the paper: SimLayerKV: A Simple Framework for Layer-Level KV Cache Reduction. ☆44 Updated 5 months ago
- Implementation of Dualformer ☆13 Updated last month
- Fast LLM training codebase with dynamic strategy choosing [Deepspeed+Megatron+FlashAttention+CudaFusionKernel+Compiler] ☆36 Updated last year
- [NeurIPS 2024] An Efficient Recipe for Long Context Extension via Middle-Focused Positional Encoding ☆15 Updated 5 months ago
- ☆17 Updated last month
- Official repository for Decentralized Arena via Collective LLM Intelligence ☆9 Updated 5 months ago
- ☆16 Updated 3 months ago
- This repo contains code for the paper "Both Text and Images Leaked! A Systematic Analysis of Data Contamination in Multimodal LLM" ☆13 Updated this week
- PreAct: Prediction Enhances Agent's Planning Ability (COLING 2025) ☆26 Updated 3 months ago
- Source code for GreaTer - Gradient Over Reasoning makes Smaller Language Models Strong Prompt Optimizers ☆17 Updated last month
- Official implementation of the paper "MMInA: Benchmarking Multihop Multimodal Internet Agents" ☆41 Updated last month