☆19Aug 9, 2024Updated last year
Alternatives and similar repositories for Simple_TRL
Users who are interested in Simple_TRL are comparing it to the libraries listed below.
- ☆76Nov 13, 2023Updated 2 years ago
- Fine-tunes large language models with the DPO algorithm; simple and easy to get started.☆51Jul 3, 2024Updated last year
- A softmax multi-armed bandit algorithm☆12Dec 30, 2018Updated 7 years ago
- ☆46Aug 9, 2024Updated last year
- Crawls the gushiwen (古诗文网) site to build a Chinese poetry corpus☆14May 12, 2019Updated 6 years ago
- ☆35Mar 25, 2024Updated 2 years ago
- Hackintosh EFI for the Redmibook14 Enhanced Edition (i5-10120u)☆11Feb 22, 2021Updated 5 years ago
- Casande-RL☆11May 9, 2023Updated 2 years ago
- A Python module for extracting relevant tags from text documents.☆17May 13, 2011Updated 14 years ago
- code for☆11Apr 10, 2021Updated 4 years ago
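- Zeta implementation of a reusable, plug-and-play feedforward from the paper "Exponentially Faster Language Modeling"☆16Nov 11, 2024Updated last year
- At present, universities spread information across the portals of separate departments, a typical information-silo problem in which departmental information is not interconnected. Teachers and students need a unified entry point for intelligent Q&A about school documents, policies, activities, and more; for example, the finance office, personnel office, student affairs office, academic affairs office, and library each have their own policies and regulations, and currently faculty and students all…☆36Aug 5, 2024Updated last year
- Beijing Unicom IPTV playlist. Program guide sniffer tool: https://github.com/zzzz0317/beijing-unicom-iptv-playlist-sniffer☆36Updated this week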
- An implementation of Maximum Entropy model☆14Apr 28, 2012Updated 13 years ago
- R package for split test/one-armed bandit analysis☆16May 5, 2014Updated 11 years ago
- ROS-based cooperative control of multiple UAVs☆12May 8, 2021Updated 4 years ago
- Code and dataset for the paper 'Optimized Prediction of Weapon Effectiveness in BVR Air Combat Scenarios Using Enhanced Regression Models…☆17Jun 29, 2025Updated 8 months ago
- Multi-Agent Deep Recurrent Q-Learning with Bayesian epsilon-greedy on AirSim simulator☆13Apr 1, 2022Updated 3 years ago
- Interactive Multi-Agent Reinforcement Learning Environment for the board game Gobblet using PettingZoo.☆12Jul 2, 2023Updated 2 years ago
- Assignment for course AE4301P. Development of control system for an F16 model.☆11Dec 17, 2018Updated 7 years ago
- Study notes on the algorithm engineer's tech stack☆15Aug 22, 2022Updated 3 years ago
- A Challenge on Dialog Systems with Retrieval Augmented Generation (FutureDial-RAG), Co-located with SLT2024 FutureDial-RAG Challenge☆11Aug 10, 2024Updated last year
- ☆13Sep 12, 2024Updated last year
- The code for paper 'Hierarchical Policy for Non-prehensile Multi-object Rearrangement with Deep Reinforcement Learning and Monte Carlo Tr…☆21Aug 18, 2023Updated 2 years ago
- Deep learning☆13Feb 16, 2023Updated 3 years ago
- This repository implements the model from paper: Rationale-Augmented Convolutional Neural Networks for Text Classification☆12Sep 10, 2016Updated 9 years ago
- speaker-disentangled speech linguistic content quantizer☆24Mar 19, 2025Updated last year
- A Hackintosh - Opencore for Xiaomi RedmiBook 13 2019-2020☆15May 28, 2020Updated 5 years ago
- LLM Tokenizer with BPE algorithm☆48May 7, 2024Updated last year
- Topic Modelling for Humans☆11Jan 12, 2016Updated 10 years ago
- Reimplementation of xgboost☆15Oct 6, 2024Updated last year
- code for A Large-scale Dataset for Audio-Language Representation Learning☆14Sep 18, 2024Updated last year
- ☆23Apr 16, 2024Updated last year
- Java implementation of the BERT tokenizer; supports outputting ONNX tensors for ONNX model inference☆13Sep 4, 2023Updated 2 years ago
- A curated list of cutting-edge research papers and resources on Long Chain-of-Thought (CoT) Reasoning with Tools.☆46Dec 17, 2025Updated 3 months ago
- EfficientDet_anchor_free☆11Feb 19, 2020Updated 6 years ago
- ☆17Sep 17, 2023Updated 2 years ago
- ☆17Apr 23, 2025Updated 11 months ago
- Reinforcement learning for operation research problems with OpenAI Gym and CleanRL☆127Apr 13, 2023Updated 2 years ago