We propose Reinforcement Learning from Community Feedback (RLCF), a training paradigm that uses large-scale community signals as supervision, and formulate scientific taste learning as a preference modeling and alignment problem.
☆398 · May 13, 2026 · Updated last week
Alternatives and similar repositories for AI-Can-Learn-Scientific-Taste
Users who are interested in AI-Can-Learn-Scientific-Taste are comparing it to the repositories listed below.
- If you find this helpful, please give it a star ⭐! ☆51 · May 15, 2026 · Updated last week
- We introduce 'Thinking with Video', a new paradigm leveraging video generation for multimodal reasoning. Our VideoThinkBench shows that S… ☆298 · Mar 21, 2026 · Updated 2 months ago
- FediLive is a tool designed to quickly collect toots from all Mastodon instances during a specified time period. ☆13 · Apr 2, 2026 · Updated last month
- [CVPR 2026] FluxMem: Adaptive Hierarchical Memory for Streaming Video Understanding ☆61 · Mar 16, 2026 · Updated 2 months ago
- A curated list of awesome resources about reward construction for AI agents. This repository covers cutting-edge research, and practical … ☆59 · Sep 1, 2025 · Updated 8 months ago
- Automated bash script to set up a high-performance environment on Ubuntu Linux with RTX5090, including installations of PyTorch, Unsloth,… ☆18 · Apr 1, 2025 · Updated last year
- Codes for the paper "BAPO: Stabilizing Off-Policy Reinforcement Learning for LLMs via Balanced Policy Optimization with Adaptive Clipping… ☆93 · Jan 29, 2026 · Updated 3 months ago
- KDD2024: This is the code for the paper "Propagation Structure-aware Graph Transformer for Robust and Interpretable Fake News Detection" ☆12 · Aug 31, 2024 · Updated last year
- ☆128 · Mar 16, 2026 · Updated 2 months ago
- code for Scaling Laws of RoPE-based Extrapolation ☆73 · Oct 16, 2023 · Updated 2 years ago
- The code for “PromptNER: A Prompting Method for Few-shot Named Entity Recognition via k Nearest Neighbor Search” ☆19 · Mar 13, 2024 · Updated 2 years ago
- ☆31 · Jul 24, 2025 · Updated 9 months ago
- ☆20 · Jul 5, 2024 · Updated last year
- ☆142 · Apr 14, 2026 · Updated last month
- An official implementation of Style-Talker for Spoken Dialogue Generation ☆23 · Jan 12, 2025 · Updated last year
- Coming soon! ☆28 · Oct 23, 2025 · Updated 6 months ago
- Game-RL: Synthesizing Multimodal Verifiable Game Data to Boost VLMs' General Reasoning ☆153 · May 13, 2026 · Updated last week
- ☆22 · Apr 22, 2025 · Updated last year
- ☆31 · Jun 9, 2025 · Updated 11 months ago
- PyTorch Implementation for the paper "Let Me Help You! Neuro-Symbolic Short-Context Action Anticipation" accepted to RA-L'24. ☆12 · Nov 27, 2024 · Updated last year
- ☆23 · May 25, 2023 · Updated 2 years ago
- Code for ICML 2025 paper | Joint Localization and Activation Editing for Low-Resource Fine-Tuning ☆28 · Jun 18, 2025 · Updated 11 months ago
- SGLang Kernel Wheel Index ☆22 · May 15, 2026 · Updated last week
- ICDCS 2021, "StripeMerge: Efficient Wide-Stripe Generation for Large-Scale Erasure-Coded Storage" ☆13 · Jul 19, 2021 · Updated 4 years ago
- Excellence Cup project: video calling based on a new principle (at satellite-level bandwidth) ☆17 · Jun 18, 2025 · Updated 11 months ago
- Can LLMs Predict Their Own Failures? Self-Awareness via Internal Circuits ☆41 · Jan 8, 2026 · Updated 4 months ago
- ☆20 · Jan 7, 2024 · Updated 2 years ago
- Registration of 3D triangular meshes onto a 2D image can be performed using optimisation and fast X-ray simulation on GPU. Automatic esti… ☆11 · Aug 28, 2019 · Updated 6 years ago
- Adapts Google's official ROUGE implementation for use with Korean text ☆17 · Jan 3, 2024 · Updated 2 years ago
- Extend bert-nmt to context-aware translation. ☆11 · May 24, 2021 · Updated 4 years ago
- FileGram: Grounding Agent Personalization in File-System Behavioral Traces ☆64 · Apr 12, 2026 · Updated last month
- ☆15 · Nov 18, 2025 · Updated 6 months ago
- [ACL 2026] Psyche-R1 (Chinese Psychological Reasoning LLM) ☆26 · Apr 18, 2026 · Updated last month
- The OBMO module embedded in PatchNet ☆10 · Feb 21, 2024 · Updated 2 years ago
- PaiNN in jax ☆11 · Jan 14, 2025 · Updated last year
- ☆74 · Apr 1, 2026 · Updated last month
- ToMBench: Benchmarking Theory of Mind in Large Language Models, ACL 2024. ☆68 · Jun 24, 2024 · Updated last year
- [AAMAS 2026] Don’t Blind Your VLA: Aligning Visual Representations for OOD Generalization. https://blind-vla-paper.github.io ☆60 · Jan 25, 2026 · Updated 3 months ago
- [ICML 2025] Streamline Without Sacrifice - Squeeze out Computation Redundancy in LMM ☆20 · May 22, 2025 · Updated last year