git-disl / PokeLLMon

☆188

Alternatives and similar repositories for PokeLLMon

Users who are interested in PokeLLMon are comparing it to the repositories listed below.

RLHFlow / Self-rewarding-reasoning-LLM
Recipes to train the self-rewarding reasoning LLMs.
☆224 Updated 4 months ago
yongchao98 / CodeSteer-v1.0
Code and dataset of CodeSteer
☆58 Updated 3 months ago
DCDmllm / WorldGPT
WorldGPT: Empowering LLM as Multimodal World Model
☆117 Updated 11 months ago
dvlab-research / MoTCoder
This is the official code repository of MoTCoder: Elevating Large Language Models with Modular of Thought for Challenging Programming Tas…
☆83 Updated 3 months ago
tencent-ailab / Leopard
The repository for the paper titled "Leopard: A Vision Language Model For Text-Rich Multi-Image Tasks"
☆157 Updated 6 months ago
RLHFlow / Minimal-RL
☆216 Updated 2 months ago
zou-group / avatar
(NeurIPS 2024) AvaTaR: Optimizing LLM Agents for Tool Usage via Contrastive Reasoning
☆211 Updated last month
zhiyuanhubj / Meta-Ability-Alignment
Official code of paper "Beyond 'Aha!': Toward Systematic Meta-Abilities Alignment in Large Reasoning Models"
☆79 Updated last month
uclaml / SPPO
The official implementation of Self-Play Preference Optimization (SPPO)
☆569 Updated 5 months ago
Rafa-zy / QLASS
☆31 Updated this week
Baiqi-Li / NaturalBench
🚀 [NeurIPS24] Make Vision Matter in Visual-Question-Answering (VQA)! Introducing NaturalBench, a vision-centric VQA benchmark (NeurIPS'2…
☆84 Updated 3 weeks ago
IAAR-Shanghai / ICSFSurvey
Explore concepts like Self-Correct, Self-Refine, Self-Improve, Self-Contradict, Self-Play, and Self-Knowledge, alongside o1-like reasonin…
☆169 Updated 7 months ago
OPPO-PersonalAI / TaskCraft
A library for generating difficulty-scalable, multi-tool, and verifiable agentic tasks with execution trajectories.
☆110 Updated last week
GreenBitAI / green-bit-llm
A toolkit for fine-tuning, inferencing, and evaluating GreenBitAI's LLMs.
☆185 Updated last month
Hannibal046 / nanoRWKV
The nanoGPT-style implementation of RWKV Language Model - an RNN with GPT-level LLM performance.
☆188 Updated last year
Ledzy / StreamBP
Official code of "StreamBP: Memory-Efficient Exact Backpropagation for Long Sequence Training of LLMs".
☆69 Updated 3 weeks ago
aialt / AAGPT
AAGPT is another experimental open-source application showcasing the capabilities of large language models, such as GPT-3.5 and GPT-4.
☆137 Updated 2 years ago
luo-junyu / RobustFT
RobustFT: Robust Supervised Fine-tuning for Large Language Models under Noisy Response
☆42 Updated 6 months ago
smartyfh / LLM-Uncertainty-Bench
Benchmarking LLMs via Uncertainty Quantification
☆234 Updated last year
FanbinLu / STEVE-R1
R1-like Computer-use Agent
☆77 Updated 3 months ago
Ablustrund / MPLSandbox
MPLSandbox is an out-of-the-box multi-programming language sandbox designed to provide unified and comprehensive feedback from compiler a…
☆178 Updated 3 months ago
liuzuyan / ElasticCache
[ECCV 2024] Efficient Inference of Vision Instruction-Following Models with Elastic Cache
☆42 Updated 11 months ago
chatsci / Aeiva
A general AI agent framework that can be adapted to various tasks and environments.
☆100 Updated 5 months ago
HKUDS / SepLLM
[ICML 2025] "SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator"
☆249 Updated last week
ZrrSkywalker / MathVerse
[ECCV 2024] Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?
☆165 Updated 2 months ago
gersteinlab / ML-Bench
ML-Bench: Evaluating Large Language Models and Agents for Machine Learning Tasks on Repository-Level Code (https://arxiv.org/abs/2311.098…
☆301 Updated 2 weeks ago
YangLinyi / GLUE-X
We leverage 14 datasets as OOD test data and conduct evaluations on 8 NLU tasks over 21 popularly used models. Our findings confirm that …
☆93 Updated last year
dle666 / R-CoT
Reverse Chain-of-Thought Problem Generation for Geometric Reasoning in Large Multimodal Models
☆176 Updated 8 months ago
RLHFlow / Online-DPO-R1
Codebase for Iterative DPO Using Rule-based Rewards
☆252 Updated 3 months ago
IAAR-Shanghai / Grimoire
Grimoire is All You Need for Enhancing Large Language Models
☆116 Updated last year