Intelligent-Internet / ii-thought
II-Thought-RL is our initial attempt at developing a large-scale, multi-domain Reinforcement Learning (RL) dataset
☆17Updated last month
Alternatives and similar repositories for ii-thought
Users who are interested in ii-thought are comparing it to the libraries listed below
- Data preparation code for CrystalCoder 7B LLM☆44Updated last year
- Open-source examples and guides for building with Qwen. Browse a collection of snippets, advanced techniques and walkthroughs.☆21Updated 6 months ago
- ☆50Updated this week
- ☆56Updated 6 months ago
- ☆59Updated 2 weeks ago
- Nexusflow function call, tool use, and agent benchmarks.☆19Updated 5 months ago
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna☆53Updated 4 months ago
- ☆53Updated last year
- Train your own SOTA deductive reasoning model☆92Updated 3 months ago
- ☆41Updated 5 months ago
- Simple examples using Argilla tools to build AI☆53Updated 6 months ago
- A fast, local, and secure approach for training LLMs for coding tasks using GRPO with WebAssembly and interpreter feedback.☆24Updated 2 months ago
- Accompanying material for the sleep-time compute paper☆90Updated last month
- Enhanced fork of SWE-bench, tailored for OpenDevin's ecosystem.☆25Updated last year
- Code for ScribeAgent paper☆57Updated 3 months ago
- LLM reads a paper and produces a working prototype☆57Updated last month
- From Code to Correctness: Closing the Last Mile of Code Generation with Hierarchical Debugging☆71Updated last week
- ☆24Updated 4 months ago
- A collection of pre-built wrappers over common RAG systems like ChromaDB, Weaviate, Pinecone, and others!☆34Updated last week
- Verifiers for LLM Reinforcement Learning☆56Updated last month
- Reasoning by Communicating with Agents☆28Updated last month
- Small, simple agent task environments for training and evaluation☆18Updated 7 months ago
- ☆24Updated 8 months ago
- Lego for GRPO☆28Updated last week
- Computer Agent Arena: Test & compare AI agents in real desktop apps & web environments. Code/data coming soon!☆45Updated last month
- Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks☆31Updated last year
- Challenges for general-purpose web-browsing AI agents☆58Updated last week
- Pre-training code for CrystalCoder 7B LLM☆54Updated last year
- Simple GRPO scripts and configurations.☆58Updated 4 months ago
- 👷‍♂️ Minion is Agent's Brain. Minion is designed to execute any type of queries, offering a variety of features that demonstrate its flex…☆17Updated this week