ZihanWang314 / coeCheck
☆16Updated 3 months ago
Alternatives and similar repositories for coeCheck
Users who are interested in coeCheck are comparing it to the repositories listed below.
- ☆20Updated this week
- AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories☆17Updated last month
- Code Implementation, Evaluations, Documentation, Links and Resources for Min P paper☆38Updated 3 months ago
- Official Repository for Task-Circuit Quantization☆20Updated 3 weeks ago
- Testing paligemma2 finetuning on reasoning dataset☆18Updated 5 months ago
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna☆53Updated 4 months ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment☆57Updated 9 months ago
- Router-R1: Teaching LLMs Multi-Round Routing and Aggregation via Reinforcement Learning☆19Updated this week
- Verifiers for LLM Reinforcement Learning☆60Updated 2 months ago
- ☆13Updated 6 months ago
- ☆10Updated last month
- How Do LLMs Acquire New Knowledge? A Knowledge Circuits Perspective on Continual Pre-Training☆36Updated 2 months ago
- A repository for research on medium sized language models.☆76Updated last year
- Repository for the Q-Filters method (https://arxiv.org/pdf/2503.02812)☆33Updated 3 months ago
- Lego for GRPO☆28Updated 3 weeks ago
- ☆65Updated 2 months ago
- The Benefits of a Concise Chain of Thought on Problem Solving in Large Language Models☆22Updated 6 months ago
- ☆24Updated 9 months ago
- The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling☆32Updated 3 months ago
- The official implementation of Regularized Policy Gradient (RPG) (https://arxiv.org/abs/2505.17508)☆35Updated last week
- An open-source replication of the strawberry method that leverages Monte Carlo Search with PPO and/or DPO☆29Updated last week
- Lottery Ticket Adaptation☆39Updated 7 months ago
- Nexusflow function call, tool use, and agent benchmarks.☆20Updated 6 months ago
- [ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning?☆68Updated 3 months ago
- The first dense retrieval model that can be prompted like an LM☆73Updated last month
- Code, results and other artifacts from the paper introducing the WildChat-50m dataset and the Re-Wild model family.☆29Updated 2 months ago
- A public implementation of the ReLoRA pretraining method, built on Lightning-AI's PyTorch Lightning suite.☆33Updated last year
- Open-Source LLM Coders with Co-Evolving Reinforcement Learning☆78Updated 2 weeks ago
- Official implementation of ECCV24 paper: POA☆24Updated 10 months ago
- Code for Paper: Harnessing Webpage UIs for Text-Rich Visual Understanding☆51Updated 6 months ago