woshixiaobai2019/agent-gym

☆47

Alternatives and similar repositories for agent-gym

Users who are interested in agent-gym are comparing it to the repositories listed below.

MozerWang / DEMO
[ACL 2025 (Findings)] DEMO: Reframing Dialogue Interaction with Fine-grained Element Modeling
☆22 · Dec 16, 2024 · Updated last year
THU-KEG / VerIF
[EMNLP 2025] Verification Engineering for RL in Instruction Following
☆57 · Mar 30, 2026 · Updated 3 months ago
MNeMoNiCuZ / ImageSortingManual
A tool that helps you manually sort images, and supplementary files with the same name, into folders
☆14 · Jun 15, 2024 · Updated 2 years ago
StarDewXXX / AdaR1
The official repository of NeurIPS'25 paper "Ada-R1: From Long-Cot to Hybrid-CoT via Bi-Level Adaptive Reasoning Optimization"
☆24 · May 6, 2026 · Updated 2 months ago
AChen-qaq / ProML
Code for paper "Prompt-Based Metric Learning for Few-shot NER".
☆22 · Nov 14, 2023 · Updated 2 years ago
AI45Lab / DEAN
☆11 · Oct 25, 2024 · Updated last year
Raj-08 / Q-Flow
Complete Reinforcement Learning Toolkit for Large Language Models!
☆21 · Aug 2, 2025 · Updated 11 months ago
Mryangkaitong / deepseek-r1-gsm8k
☆49 · Feb 10, 2025 · Updated last year
mxzheng / TrojViT
[CVPR 2023] "TrojViT: Trojan Insertion in Vision Transformers" by Mengxin Zheng, Qian Lou, Lei Jiang
☆15 · Jan 5, 2024 · Updated 2 years ago
tanganke / subspace_fusion
Code for paper "Concrete Subspace Learning based Interference Elimination for Multi-task Model Fusion"
☆14 · Mar 28, 2024 · Updated 2 years ago
linjh1118 / Llama3-Chinese-ORPO
A Chinese version of Llama3, obtained from Llama3 through further CPT, SFT, and ORPO
☆16 · Apr 24, 2024 · Updated 2 years ago
srush / mamba-scans
Blog post
☆17 · Feb 16, 2024 · Updated 2 years ago
BishopLiu / ETEGRec
[SIGIR'25] Code of "Generative Recommender with End-to-End Learnable Item Tokenization".
☆40 · Apr 23, 2025 · Updated last year
sayakpaul / big_vision_experiments
Contains my experiments with the `big_vision` repo to train ViTs on ImageNet-1k.
☆22 · Jan 16, 2023 · Updated 3 years ago
yingweima2022 / SWE-Reasoner
☆25 · Aug 2, 2025 · Updated 11 months ago
CLUEbenchmark / Math24o
Math24o: High School Olympiad Mathematics Chinese Benchmark (an evaluation set for high school mathematical olympiad competitions)
☆14 · Mar 27, 2025 · Updated last year
ZrW00 / MuScleLoRA
The code implementation of MuScleLoRA (Accepted in ACL 2024)
☆10 · Dec 1, 2024 · Updated last year
hills-code / open-instruct
☆16 · May 8, 2024 · Updated 2 years ago
XueruiSu / Trust-Region-Preference-Approximation
Trust Region Preference Approximation: A simple and stable reinforcement learning algorithm for LLM reasoning
☆15 · Jun 28, 2025 · Updated last year
dataSnail / RSpapers
Papers about recommender systems.
☆10 · May 18, 2021 · Updated 5 years ago
zjunlp / KnowRL
KnowRL: Exploring Knowledgeable Reinforcement Learning for Factuality
☆48 · May 19, 2026 · Updated 2 months ago
nambo / menu-rag
Beyond Basic RAG, Empowering Real-Time Deep Research
☆20 · Sep 12, 2025 · Updated 10 months ago
psunlpgroup / FoVer
This repository includes code and materials for the paper "Efficient PRM Training Data Synthesis via Formal Verification" (ACL 2026 Findi…
☆18 · Apr 7, 2026 · Updated 3 months ago
shaohao011 / MedCCO
[ACM MM2026] This is the official implementation of MedCCO
☆17 · Jul 12, 2026 · Updated last week
CryptoAILab / MergeGuard
[CCS-LAMPS'24] LLM IP Protection Against Model Merging
☆16 · Oct 14, 2024 · Updated last year
CalvinHaynes / MIT6.S081-2020Fall-LabSolution
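Lab notes for MIT 6.S081, with the environment set up using Docker + code-server (web-based VS Code) to provide a clean, ready-to-use lab environment; see the website below for detailed usage instructions
☆12 · Jan 28, 2024 · Updated 2 years ago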
srrcboy / dijkstra-CUDA
Dijkstra's Algorithm implemented in C/C++ using standard C, OpenMP and CUDA
☆13 · Dec 12, 2015 · Updated 10 years ago
Whu-Lambda / collection
Lambda portfolio (collection of works)
☆11 · Feb 28, 2023 · Updated 3 years ago
ai-zerolab / pydantic-ai-deepagent
Reasoning model integration for pydantic-ai's agent
☆15 · Oct 13, 2025 · Updated 9 months ago
Tongyi-CCAI / Complex-IF
☆34 · Jan 26, 2026 · Updated 5 months ago
fishiatee / Tumera
Yet another frontend for LLMs, written using .NET and WinUI 3
☆11 · Sep 14, 2025 · Updated 10 months ago
phosseini / GisPy
GisPy: A Tool for Measuring Gist Inference Score in Text https://aclanthology.org/2022.wnu-1.5/
☆13 · Jul 1, 2024 · Updated 2 years ago
PicoCreator / RWKV-LM-LoRA
RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best …
☆10 · Nov 3, 2023 · Updated 2 years ago
km1994 / nlp_paper_study_search_engine
This repository mainly records search engine study notes for NLP algorithm engineers
☆14 · Apr 9, 2022 · Updated 4 years ago
IBM / ColPret
Efficient Scaling laws and collaborative pretraining.
☆23 · Updated this week
InternLM / Condor
[ACL 2025] An official PyTorch implementation of the paper: Condor: Enhance LLM Alignment with Knowledge-Driven Data Synthesis and Refinement
☆40 · May 28, 2025 · Updated last year
zetang94 / ASE2023_kNM-LM
This is the official implementation for the paper 'Domain Adaptive Code Completion via Language Models and Decoupled Domain Databases'
☆14 · Oct 4, 2023 · Updated 2 years ago
GumGum10 / QuantumMerge-sdxl
☆28 · Feb 10, 2026 · Updated 5 months ago
Doraemonzzz / nanoTransNormer
☆11 · Oct 11, 2023 · Updated 2 years ago