II-Thought-RL is our initial attempt at developing a large-scale, multi-domain Reinforcement Learning (RL) dataset
☆31Apr 8, 2025Updated last year
Alternatives and similar repositories for ii-thought
Users that are interested in ii-thought are comparing it to the libraries listed below.
- II-Researcher: a new open-source framework designed to aid building search / research agents☆496Aug 4, 2025Updated 9 months ago
- ☆59Apr 17, 2026Updated 2 weeks ago
- ☆12Sep 26, 2019Updated 6 years ago
- ☆41Jan 25, 2026Updated 3 months ago
- langchain-streamlit demo with streaming llm, memory, and langsmith feedback☆17Feb 4, 2026Updated 3 months ago
- Based on the R1-Zero method, using rule-based rewards and GRPO on the Code Contests dataset.☆18Apr 22, 2025Updated last year
- A peer-to-peer communication system. BIT short-semester software development practicum.☆11Sep 7, 2018Updated 7 years ago
- ☆20Apr 24, 2025Updated last year
- Cog wrapper for FalconsAi / nsfw_image_detection☆18Aug 6, 2025Updated 8 months ago
- ☆11Mar 14, 2016Updated 10 years ago
- ☆11Nov 2, 2024Updated last year
- The official Python SDK for the Agentica agent framework from Symbolica☆172Feb 13, 2026Updated 2 months ago
- A Random Matrix Approach to Extreme Learning Machine☆15Feb 23, 2018Updated 8 years ago
- My solution code to parallel architecture and programming Spring 2016☆12Aug 15, 2016Updated 9 years ago
- Solving Physics Puzzles by Reasoning about Paths (NeurIPS 2020 workshop)☆14Jun 28, 2022Updated 3 years ago
- Cog template for Stable Diffusion 3 (ComfyUI implementation)☆17Jul 16, 2024Updated last year
- Binarizing Documents by Leveraging both Space and Frequency. (ICDAR 2024)☆16May 15, 2025Updated 11 months ago
- Async pipelined version of Verl☆125Apr 8, 2025Updated last year
- Code for our paper "Towards Principled Graph Transformers"☆13Oct 30, 2024Updated last year
- ☆43Mar 23, 2026Updated last month
- A benchmark for testing memorization abilities of LMs☆24Oct 15, 2024Updated last year
- ☆20Dec 14, 2024Updated last year
- ☆19Apr 18, 2025Updated last year
- Using open-source LLM Llama2 by Meta on local CPU inference for document question-and-answer☆15Oct 5, 2023Updated 2 years ago
- [JMLR] TRADES + random smoothing for certifiable robustness☆14Sep 13, 2020Updated 5 years ago
- My solution to Collaboration and Competition using MADDPG algorithm, Udacity 3rd project of Deep RL Nanodegree from the paper "Multi-Agen…☆10Oct 6, 2019Updated 6 years ago
- an open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere)☆111Mar 7, 2025Updated last year
- ☆49May 20, 2025Updated 11 months ago
- Open-source coding assistant for Visual Studio Code. Connect to LLMs from OpenAI or Google.☆18Aug 14, 2023Updated 2 years ago
- ☆10Mar 3, 2021Updated 5 years ago
- ☆21Oct 9, 2024Updated last year
- Small python package to measure OCR quality and other related metrics.☆27Feb 19, 2024Updated 2 years ago
- Lightning support for Intel Habana accelerators.☆27Aug 1, 2025Updated 9 months ago
- A simple, yet powerful utility to scan and generate QR-Codes.☆12Sep 1, 2023Updated 2 years ago
- Auto Thinking Mode switch for Qwen3 in Open webui☆71May 8, 2025Updated 11 months ago
- ☆13Jul 9, 2018Updated 7 years ago
- Support for multiple broker hosts and basic "failover" on the client side.☆23Feb 20, 2013Updated 13 years ago
- A list of papers regarding generalization in (deep) reinforcement learning☆11Aug 13, 2023Updated 2 years ago
- Code repository for On the interaction between supervision and self-play in emergent communication (ICLR 2020)☆15Feb 4, 2020Updated 6 years ago