liziniu/KnapsackRL

☆19

Alternatives and similar repositories for KnapsackRL

Users who are interested in KnapsackRL are comparing it to the repositories listed below.

liziniu / HyperDQN
Code for ICLR 2022 Paper (HyperDQN: A Randomized Exploration Method for Deep Reinforcement Learning)
☆12 · Nov 28, 2023 · Updated 2 years ago
liziniu / GEM
Code for Paper (Preserving Diversity in Supervised Fine-tuning of Large Language Models)
☆58 · May 12, 2025 · Updated last year
liziniu / policy_optimization
Code for Paper (Policy Optimization in RLHF: The Impact of Out-of-preference Data)
☆29 · Dec 19, 2023 · Updated 2 years ago
fjzzq2002 / random_transformers
Official code for "Algorithmic Capabilities of Random Transformers" (NeurIPS 2024)
☆15 · Sep 28, 2024 · Updated last year
mghasemi / Irene
Irene is a Python package that aims to be a toolkit for global optimization problems that can be realized algebraically. It generalizes L…
☆15 · Jul 10, 2026 · Updated 2 weeks ago
liziniu / ReMax
Code for Paper (ReMax: A Simple, Efficient and Effective Reinforcement Learning Method for Aligning Large Language Models)
☆202 · Dec 16, 2023 · Updated 2 years ago
stratisMarkou / sample-efficient-bayesian-rl
Source for the sample efficient tabular RL submission to the 2019 NIPS workshop on Biological and Artificial RL
☆25 · Apr 14, 2022 · Updated 4 years ago
Victorwz / LaViA
☆10 · Jul 13, 2024 · Updated 2 years ago
zyushun / Adam-mini
Code for Adam-mini: Use Fewer Learning Rates To Gain More https://arxiv.org/abs/2406.16793
☆457 · May 13, 2025 · Updated last year
zyushun / hessian-spectrum
Code for the paper: Why Transformers Need Adam: A Hessian Perspective
☆65 · Mar 11, 2025 · Updated last year
abbyvansoest / maxent
☆14 · May 30, 2019 · Updated 7 years ago
saschaschramm / MonteCarloTreeSearch
This project applies Monte Carlo Tree Search (MCTS) to a simple grid world.
☆10 · May 30, 2018 · Updated 8 years ago
ankitkv / TD-VAE
TD-VAE in PyTorch
☆10 · May 28, 2019 · Updated 7 years ago
zt95 / infinite-horizon-off-policy-estimation
☆13 · Apr 3, 2019 · Updated 7 years ago
robintyh1 / icml2021-pengqlambda
Revisiting Peng's Q(lambda) for Modern Reinforcement Learning
☆15 · Jul 23, 2021 · Updated 5 years ago
tangzhy / RealCritic
☆15 · Jan 27, 2025 · Updated last year
kzhai / Papers
☆15 · Feb 22, 2018 · Updated 8 years ago
ASU-APG / awesome_attribution_of_generative_models
☆10 · Oct 18, 2021 · Updated 4 years ago
WalterBabyRudin / Courseware
☆11 · Jan 12, 2021 · Updated 5 years ago
bhairavmehta95 / ant-env
Ant Gather and Ant Maze envs, separated from RLLab
☆11 · Aug 2, 2018 · Updated 7 years ago
neonwatty / spectral-clustering-demo
A fun review of spectral clustering with MATLAB demos I made for the NU machine learning meetup in 2014
☆12 · Mar 4, 2016 · Updated 10 years ago
THUDM / TreeRL
TreeRL: LLM Reinforcement Learning with On-Policy Tree Search in ACL'25
☆97 · Jun 16, 2025 · Updated last year
francescotescari / noiseprint2
noiseprint2 is a port of noiseprint to TensorFlow 2 and Keras
☆12 · Feb 20, 2021 · Updated 5 years ago
Equationliu / Attack-ImageNet
No. 3 solution of Tianchi ImageNet Adversarial Attack Challenge.
☆12 · Apr 22, 2020 · Updated 6 years ago
FreedomIntelligence / TinyDeepSeek
Reproduction of the complete process of DeepSeek-R1 on small-scale models, including Pre-training, SFT, and RL.
☆30 · Mar 11, 2025 · Updated last year
niopeng / dciknn_cuda
This is the CUDA GPU implementation + Python interface (using PyTorch) of DCI. The paper can be found at https://arxiv.org/abs/1512.00442…
☆13 · Dec 20, 2023 · Updated 2 years ago
joytou / joytou.github.io
JOYTOU is a Bootstrap blog template developed by Joytou Wu.
☆10 · Feb 5, 2020 · Updated 6 years ago
pkumusic / E-DRL
Exploration Strategies for Deep Reinforcement Learning
☆39 · Oct 31, 2018 · Updated 7 years ago
thuml / PAN
☆16 · Jul 6, 2020 · Updated 6 years ago
martinResearch / PySparseLP
Python algorithms to solve sparse linear programming problems
☆34 · Jul 6, 2023 · Updated 3 years ago
ClosedCharacter / Peach
We are the first character large language model that is fully available for commercial use.
☆39 · Aug 11, 2024 · Updated last year
vectozavr / llm-hessian
Using PyTorch autograd to compute Hessian of Perplexity for Large Language Models
☆29 · Apr 17, 2025 · Updated last year
rlberry-py / tutorials
Reinforcement learning tutorials using the rlberry library.
☆18 · Jan 9, 2023 · Updated 3 years ago
ICTMCG / Characterizing-Weibo-Multi-Domain-False-News
Code and Data for "Characterizing Multi-Domain False News on Weibo and the Underlying User Effects"
☆19 · Aug 24, 2022 · Updated 3 years ago
haolunc / iGSM-Replication-physics-LLM
This repository contains the replication of the iGSM dataset generation process from the Physics of LLM paper by Zeyuan Zhu.
☆17 · Sep 13, 2024 · Updated last year
DavidJanz / successor_uncertainties_atari
Code for paper "Successor Uncertainties: Exploration and Uncertainty in Temporal Difference Learning" by David Janz*, Jiri Hron*, Przemys…
☆21 · Feb 24, 2023 · Updated 3 years ago
shadyabh / IAGAN
☆17 · Dec 6, 2021 · Updated 4 years ago
liziniu / cold_start_rl
Code for Blog Post: Can Better Cold-Start Strategies Improve RL Training for LLMs?
☆20 · Mar 9, 2025 · Updated last year
MadryLab / journey-TRAK
Code for the paper "The Journey, Not the Destination: How Data Guides Diffusion Models"
☆25 · Dec 12, 2023 · Updated 2 years ago