Hybrid Linear UCB Multi-armed Bandit library
☆14Oct 5, 2016Updated 9 years ago
Alternatives and similar repositories for hybrid-linucb
Users that are interested in hybrid-linucb are comparing it to the libraries listed below
- Role-playing based on the RWKV model; in practice, a version of RWKV_Role_Playing modified beyond recognition☆17Aug 17, 2023Updated 2 years ago
- Complete Reinforcement Learning Toolkit for Large Language Models!☆21Aug 2, 2025Updated 7 months ago
- Dynamic channel allocation in cellular networks by reinforcement learning☆18May 25, 2022Updated 3 years ago
- Capacity comparison between different power allocation schemes with arbitrary input distributions and different channel gains☆10Dec 19, 2018Updated 7 years ago
- RWKV is a RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best …☆10Nov 3, 2023Updated 2 years ago
- Chatbot that answers frequently asked questions in French, English, and Tunisian using the Rasa NLU framework and RWKV-4-Raven☆13May 19, 2023Updated 2 years ago
- Open-source Human Feedback Library☆11Oct 25, 2023Updated 2 years ago
- ☆12Mar 23, 2018Updated 7 years ago
- Reproduction of the paper "Soft Q-Learning with Mutual Information Regularization" CoRL 2019.☆10Jan 10, 2019Updated 7 years ago
- ☆11Jan 12, 2023Updated 3 years ago
- A fork of the optimization part of the RISO project (http://riso.sourceforge.net/)☆13Aug 30, 2015Updated 10 years ago
- A small introductory Redis example: no complicated configuration files, just extremely simple source-code samples☆10Aug 8, 2016Updated 9 years ago
- Communication is an important component in robotic systems. Application goals such as finding a victim or teleoperating a robot in an …☆12Aug 29, 2017Updated 8 years ago
- Code for abstracting, evaluating, and visualizing Markov Decision Processes.☆10Jan 12, 2017Updated 9 years ago
- A Toolkit for Fine-Tuning Large Language Models with LoRA and DeepSpeed☆11Apr 14, 2023Updated 2 years ago
- Vietnamese word segmenter for Python☆10Nov 14, 2019Updated 6 years ago
- GraphQL and Rest API rewrite of the current Open Targets platform API☆15Updated this week
- Learning bisimulation metrics for control, particularly suited to sparse reward settings☆10Feb 28, 2023Updated 3 years ago
- Do you even science, bro? Using RNNs to predict scientific titles.☆14Jun 5, 2017Updated 8 years ago
- A lightweight, dependency-free (besides `libcurl`) command-line tool written in C to download the transcript of any YouTube video. It dir…☆21Aug 25, 2025Updated 6 months ago
- ☆14Nov 11, 2024Updated last year
- ☆41Mar 14, 2024Updated last year
- Reproduced the DFT method without using Verl. https://arxiv.org/abs/2508.05629☆21Oct 14, 2025Updated 4 months ago
- ☆12Mar 23, 2025Updated 11 months ago
- Simple Fast API server that runs Dreambooth fine-tune jobs using Celery workers 🤙☆10Jun 18, 2024Updated last year
- ☆18Sep 17, 2025Updated 5 months ago
- Direct Preference Optimization for RWKV, aiming for RWKV-5 and 6.☆11Mar 1, 2024Updated 2 years ago
- ☆13Jan 22, 2025Updated last year
- Dockerfiles for building llama_index with anaconda/GPU/jupyter support☆13Mar 25, 2023Updated 2 years ago
- Code for Transformers are Adaptable Task Planners, CoRL 2022☆12Mar 28, 2023Updated 2 years ago
- Implementations of extended PCA methods, such as IPCA and EWMPCA☆15Aug 31, 2021Updated 4 years ago
- Companion notebooks for "Dog and human inflammatory bowel disease rely on overlapping yet distinct dysbiosis networks"☆11Jul 13, 2018Updated 7 years ago
- A tool for detecting anomalies in time series data☆11Dec 1, 2022Updated 3 years ago
- papers about reinforcement learning☆13Jan 4, 2021Updated 5 years ago
- ☆15Feb 25, 2018Updated 8 years ago
- Code for 'Deep Deterministic Information Bottleneck with Matrix-based entropy functional' in ICASSP 2021☆14Jul 27, 2022Updated 3 years ago
- Official code repo for NeurIPS 2025 Spotlight paper, "Debate or Vote: Which Yields Better Decisions in Multi-Agent LLMs?"☆50Oct 15, 2025Updated 4 months ago
- ☆12Apr 13, 2024Updated last year
- This is a multilabel classification layer for mxnet.☆12Apr 1, 2016Updated 9 years ago