Baichenjia / COPO
Online Preference Alignment for Language Models via Count-based Exploration
☆17Jan 14, 2025Updated last year
Alternatives and similar repositories for COPO
Users who are interested in COPO are comparing it to the repositories listed below
- Official implementation of paper: LiNo: Advancing Recursive Residual Decomposition of Linear and Nonlinear Patterns for Robust Time Serie…☆17Dec 19, 2025Updated last month
- [NeurIPS' 24] The PyTorch implementation of our paper: "Kaleidoscope: Learnable Masks for Heterogeneous Multi-agent Reinforcement Learnin…☆21Oct 10, 2024Updated last year
- ☆33Jul 15, 2025Updated 7 months ago
- [ICML' 24] The PyTorch implementation of our paper: "Individual Contributions as Intrinsic Exploration Scaffolds for Multi-agent Reinforc…☆23May 29, 2024Updated last year
- LLM-Empowered State Representation for Reinforcement Learning (ICML2024 Accepted paper)☆37Jun 14, 2024Updated last year
- A feature-rich concurrency kit, yet another DAG framework☆10Jan 18, 2026Updated 3 weeks ago
- Zhihu crawler: questions and answers on Zhihu with more than 1,000 upvotes, Zhihu's wittiest replies☆23May 10, 2016Updated 9 years ago
- Some microbenchmarks and design docs before commencement☆12Feb 1, 2021Updated 5 years ago
- Paster core module using KiteX☆10Aug 30, 2023Updated 2 years ago
- Bambo is a new proxy framework. Compared with mainstream frameworks, it is more lightweight and flexible and can handle various load task…☆33Feb 10, 2025Updated last year
- Linear Attention Sequence Parallelism (LASP)☆88Jun 4, 2024Updated last year
- Implementation of BIMRL: Brain Inspired Meta Reinforcement Learning - Roozbeh Razavi et al. (IROS 2022)☆10Dec 1, 2022Updated 3 years ago
- ☆13May 13, 2025Updated 9 months ago
- ☆11Oct 31, 2024Updated last year
- Part of a research scholarship. I built a basic 2d driving sim with simulated lidar data to train Deep Q Neural Network. So far after abo…☆11Feb 15, 2017Updated 9 years ago
- Source code for journal paper "Multiagent Reinforcement Learning With Sparse Interactions by Negotiation and Knowledge Transfer"☆13Dec 26, 2017Updated 8 years ago
- ☆22Dec 11, 2025Updated 2 months ago
- ☆18Feb 16, 2025Updated 11 months ago
- Official PyTorch Implementation of Federated Learning with Positive and Unlabeled Data☆10Aug 12, 2022Updated 3 years ago
- Math24o: High School Olympiad Mathematics Chinese Benchmark☆11Mar 27, 2025Updated 10 months ago
- NuART-Py: Python Library of Adaptive Resonance Theory Neural Network☆10Jan 26, 2020Updated 6 years ago
- A job management system for python☆10Jan 16, 2026Updated 3 weeks ago
- A Caffe/C++ implementation of Deep Deterministic Policy Gradient☆10Feb 1, 2019Updated 7 years ago
- Official repo of paper LM2☆46Feb 13, 2025Updated last year
- Decoding Attention is specially optimized for MHA, MQA, GQA and MLA using CUDA core for the decoding stage of LLM inference.☆46Jun 11, 2025Updated 8 months ago
- Code Implementation, Evaluations, Documentation, Links and Resources for Min P paper☆46Aug 13, 2025Updated 6 months ago
- Easy to install Text to Speech system for Raspberry Pi 4☆13Mar 4, 2024Updated last year
- Code for paper: "Executing Arithmetic: Fine-Tuning Large Language Models as Turing Machines"☆11Oct 11, 2024Updated last year
- Multi-objective reinforcement learning for covid-19 control☆12Aug 12, 2021Updated 4 years ago
- ardrone simulation in gazebo(for kinetic and gazebo 7). Now it can run.☆10Oct 27, 2017Updated 8 years ago
- Benchmarking Deepseek R1 API response speeds across different providers for performance comparison.☆10Feb 15, 2025Updated 11 months ago
- ☆13Jun 4, 2025Updated 8 months ago
- Simple, Non authoritative Benchmarks for embedded databases running in Github Actions☆11Jul 11, 2024Updated last year
- ☆21Jun 16, 2025Updated 7 months ago
- trending repositories and news related to AI☆10Mar 22, 2019Updated 6 years ago
- ICML 2024 - Self-Driven Entropy Aggregation for Byzantine-Robust Heterogeneous Federated Learning☆10Jul 16, 2024Updated last year
- Long Context Research☆26Jan 26, 2026Updated 2 weeks ago
- Accurate counters with Kafka & RocksDB.☆16Jan 22, 2021Updated 5 years ago
- Task models for human robot collaboration☆12Jul 17, 2018Updated 7 years ago