CarlanLark / Lp-Reg-dev
Low-probability Tokens Sustain Exploration in Reinforcement Learning with Verifiable Reward
☆42Updated 2 months ago
Alternatives and similar repositories for Lp-Reg-dev
Users who are interested in Lp-Reg-dev are comparing it to the libraries listed below.
- [EMNLP 2024 Findings] Official PyTorch Implementation of "Adaptive Contrastive Search: Uncertainty-Guided Decoding for Open-Ended Text Ge…☆41Updated 11 months ago
- ☆62Updated last year
- StrategyLLM: Large Language Models as Strategy Generators, Executors, Optimizers, and Evaluators for Problem Solving☆21Updated last year
- Invoke Interfaces Only When Needed: Adaptive Invocation for Large Language Models in Question Answering☆42Updated 3 months ago
- An open-source highly heterogeneous entity alignment (HHEA) toolkit.☆32Updated last year
- This search engine leverages the Boost library for efficient document search, featuring data preprocessing, index creation, and advanced …☆59Updated last year
- ☆43Updated 2 years ago
- A 3D game with melee combat and a parkour system, built on UE5.☆28Updated last year
- A PyTorch quantization tool for machine learning models☆78Updated 11 months ago
- Modular multi-agent orchestration framework powered by LangGraph and FastAPI.☆26Updated 3 months ago
- Training and evaluation code of EGTLM model.☆22Updated last year
- ☆57Updated last year
- [ICME 2024] Official Datasets and example of LLM-SAP: Large Language Model Situational Awareness Based Planning☆33Updated 10 months ago
- ☆49Updated 2 years ago
- ☆104Updated last year
- ☆59Updated last year
- ☆40Updated 11 months ago
- This script monitors the remaining traffic of VMs on Vultr, DigitalOcean, and Linode. If the remaining traffic is zero, it shuts down the…☆33Updated last year
- ☆52Updated last year
- ☆33Updated 2 years ago
- Please visit our demonstration website for interactive demonstrations☆33Updated last year
- Voice-to-motion aerial robot using ESP32-S3, Node.js, Deepgram, ChatGPT, and Arduino.☆30Updated 7 months ago
- ☆98Updated 11 months ago
- Remote desktop based on C++☆30Updated last year
- Dianping-style review mini program for Shanghai Institute of Technology (上应大众点评)☆52Updated last year
- REM script that helps you re-add courses automatically.☆72Updated 2 years ago
- Musical Chain-of-Thoughts for Image Synthesis☆29Updated last year
- ☆12Updated 11 months ago
- A fun demo snake game created in https://aide.ink☆66Updated last year
- 🌱 A fully independent Large Language Model (LLM) inference engine, built leveraging cuBLAS and cub. 🧩☆32Updated 7 months ago