Learning and research after DeepSeek-R1, around test-time computing, resurgence of RL, and new LLM learning/application paradigms.
☆23Apr 23, 2026Updated 2 months ago
Alternatives and similar repositories for Post-DeepSeek-R1_LLM-RL
Users that are interested in Post-DeepSeek-R1_LLM-RL are comparing it to the libraries listed below.
- [NeurIPS 2025] Official Implementation for "Enhancing Vision-Language Model Reliability with Uncertainty-Guided Dropout Decoding"☆22Dec 8, 2024Updated last year
- ☆56Mar 7, 2025Updated last year
- [ICML 2025] Official repository for paper "OR-Bench: An Over-Refusal Benchmark for Large Language Models"☆28Mar 4, 2025Updated last year
- ☆10Mar 1, 2025Updated last year
- [ICLR 26] The official code repository for the paper "Mirage or Method? How Model–Task Alignment Induces Divergent RL Conclusions".☆18Feb 9, 2026Updated 4 months ago
- 🧌 Live2d models for cnblog themes.☆15Apr 3, 2022Updated 4 years ago
- Forcing Diffuse Distributions out of Language Models☆18Sep 10, 2024Updated last year
- Official implementation for the paper "Transferring Visual Knowledge with Pre-trained Models for Multimodal Machine Translation", publish…☆20Jun 3, 2024Updated 2 years ago
- A guide to avoiding pitfalls in the Xidian University operating systems course project☆10Sep 7, 2020Updated 5 years ago
- Competitive Programming Code Template☆10Nov 6, 2022Updated 3 years ago
- Look Back to Reason Forward: Revisitable Memory for Long-Context LLM Agents☆42Apr 13, 2026Updated 2 months ago
- ☆20Nov 24, 2020Updated 5 years ago
- Accompanying repo for the DP2O paper accepted by AAAI 2024 main conference☆17Mar 28, 2024Updated 2 years ago
- [ICML 2026] Less Is More: Training-Free Sparse Attention with Global Locality for Efficient Reasoning☆33Sep 12, 2025Updated 9 months ago
- A collection of ChatGPT prompts for cross-border e-commerce☆13Oct 14, 2025Updated 8 months ago
- PyDictionary is an offline English dictionary made using Python along with the Wordnet Lexical Database and Enchant Spell Dictionary. The…☆20May 16, 2021Updated 5 years ago
- [ICML 2024] Official implementation for "HALC: Object Hallucination Reduction via Adaptive Focal-Contrast Decoding"☆115Dec 4, 2024Updated last year
- A simplified cache simulator for instructional purposes☆15Dec 30, 2020Updated 5 years ago
- ☆20May 14, 2026Updated last month
- [ACL2025 Best Paper] Language Models Resist Alignment☆51Jun 11, 2025Updated last year
- Official code repository for the main conference paper in ACL2023: COLA: Contextualized Commonsense Causality Reasoning from the Causal I…☆34May 12, 2023Updated 3 years ago
- Code for the AAAI 2023 Paper "Real or Fake Text?: Investigating Human Ability to Detect Boundaries Between Human-Written and Machine-Gene…☆16Oct 29, 2024Updated last year
- GenEnv: Difficulty-Aligned Co-Evolution Between LLM Agents and Environment Simulators☆62Dec 23, 2025Updated 6 months ago
- ☆27Jan 14, 2025Updated last year
- Lemon Agent☆61Feb 10, 2026Updated 4 months ago
- ☆26Nov 21, 2022Updated 3 years ago
- ☆20May 17, 2023Updated 3 years ago
- AbstainQA, ACL 2024☆29Feb 4, 2026Updated 5 months ago
- A list of post-GPT-era (2022-2026) Best Paper award winners from ICLR/NeurIPS/ICML/ACL/EMNLP/NAACL/AAAI/CVPR/ECCV.☆99Jun 27, 2026Updated last week
- A Python Commonsense Knowledge Inference Toolkit☆63Dec 13, 2023Updated 2 years ago
- Source code for the paper "Work Together: Correlation-Identity Reconstruction Hashing for Unsupervised Cross-modal Retrieval"☆19Dec 2, 2021Updated 4 years ago
- Course selection☆18Oct 12, 2022Updated 3 years ago
- Information on NLP PhD applications worldwide.☆37Aug 27, 2024Updated last year
- SCOPE: Self-evolving Context Optimization via Prompt Evolution - A framework for automatic prompt optimization☆78Mar 26, 2026Updated 3 months ago
- [EMNLP 2025 Main] ConceptVectors Benchmark and Code for the paper "Intrinsic Evaluation of Unlearning Using Parametric Knowledge Traces"☆40Aug 20, 2025Updated 10 months ago
- [ICML 2025] "From Debate to Equilibrium: Belief-Driven Multi-Agent LLM Reasoning via Bayesian Nash Equilibrium"☆39Nov 23, 2025Updated 7 months ago
- The official repository for "Rongsheng Wang's Arxiv Template"☆64May 7, 2025Updated last year
- [ICML 2025] "From Passive to Active Reasoning: Can Large Language Models Ask the Right Questions under Incomplete Information?"☆47Oct 8, 2025Updated 8 months ago
- This repository contains my solutions for the Coursera courses Algorithms I & II☆44May 3, 2023Updated 3 years ago