We propose Reinforcement Learning from Community Feedback (RLCF), a training paradigm that uses large-scale community signals as supervision, and formulate scientific taste learning as a preference modeling and alignment problem.
☆394Mar 29, 2026Updated last month
Alternatives and similar repositories for AI-Can-Learn-Scientific-Taste
Users who are interested in AI-Can-Learn-Scientific-Taste are comparing it to the repositories listed below.
- An open-source personal academic homepage template characterized by its user-friendly design and extensive scalability.☆37Oct 6, 2025Updated 6 months ago
- Use the tokenizer in parallel to achieve superior acceleration☆20Mar 21, 2024Updated 2 years ago
- ☆112Apr 19, 2026Updated last week
- [NeurIPS 2024] Can Language Models Learn to Skip Steps?☆22Jan 25, 2025Updated last year
- ACL 2026 - Muse: Towards Reproducible Long-Form Song Generation with Fine-Grained Style Control☆109Apr 11, 2026Updated 2 weeks ago
- Codes for the paper "BAPO: Stabilizing Off-Policy Reinforcement Learning for LLMs via Balanced Policy Optimization with Adaptive Clipping…☆92Jan 29, 2026Updated 3 months ago
- KDD2024: This is the code for the paper "Propagation Structure-aware Graph Transformer for Robust and Interpretable Fake News Detection"☆12Aug 31, 2024Updated last year
- code for Scaling Laws of RoPE-based Extrapolation☆73Oct 16, 2023Updated 2 years ago
- The code for “PromptNER: A Prompting Method for Few-shot Named Entity Recognition via k Nearest Neighbor Search”☆19Mar 13, 2024Updated 2 years ago
- ☆30Jul 24, 2025Updated 9 months ago
- [ICLR 2025 Spotlight] Weak-to-strong preference optimization: stealing reward from weak aligned model☆18Feb 24, 2025Updated last year
- ☆20Jul 5, 2024Updated last year
- An official implementation of Style-Talker for Spoken Dialogue Generation☆23Jan 12, 2025Updated last year
- Game-RL: Synthesizing Multimodal Verifiable Game Data to Boost VLMs' General Reasoning☆150Updated this week
- ☆22Apr 22, 2025Updated last year
- Code for M4LE: A Multi-Ability Multi-Range Multi-Task Multi-Domain Long-Context Evaluation Benchmark for Large Language Models☆23Jul 27, 2024Updated last year
- PyTorch Implementation for the paper "Let Me Help You! Neuro-Symbolic Short-Context Action Anticipation" accepted to RA-L'24.☆12Nov 27, 2024Updated last year
- ☆13Jan 14, 2021Updated 5 years ago
- Code for ICML 2025 paper | Joint Localization and Activation Editing for Low-Resource Fine-Tuning☆28Jun 18, 2025Updated 10 months ago
- Java interview summary☆19May 11, 2020Updated 5 years ago
- This repository contains setup instructions and the notebooks for the AI in drug discovery workshops☆22Mar 1, 2023Updated 3 years ago
- 卓越杯 (Zhuoyue Cup) project: video calling based on a new principle (at satellite-level bandwidth)☆17Jun 18, 2025Updated 10 months ago
- Can LLMs Predict Their Own Failures? Self-Awareness via Internal Circuits☆40Jan 8, 2026Updated 3 months ago
- Hy3 preview (295B A21B), a leading reasoning and agent model in its size, with great cost efficiency☆234Apr 23, 2026Updated last week
- "Multi-Domain Dialogue State Tracking" contest: 1st place on both the public and private leaderboards☆11Jun 26, 2021Updated 4 years ago
- [ACL 2026] A large-scale longitudinal study on robust and fair evaluation of LLMs — 200K+ generative questions across 13 disciplines☆37Apr 13, 2026Updated 2 weeks ago
- ☆20Jan 7, 2024Updated 2 years ago
- Registration of 3D triangular meshes onto a 2D image can be performed using optimisation and fast X-ray simulation on GPU. Automatic esti…☆11Aug 28, 2019Updated 6 years ago
- Bayesian Learning-Based Adaptive Control for Safety Critical Systems☆20Nov 23, 2019Updated 6 years ago
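- An STM32-based intelligent quadruped robot dog: a desktop robotic pet with voice interaction, autonomous obstacle avoidance, and facial expression display☆52Mar 28, 2026Updated last month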
- a survey of long-context LLMs from four perspectives, architecture, infrastructure, training, and evaluation☆61Mar 31, 2025Updated last year
- An unofficial Fudan slide theme for Typst.☆16Mar 19, 2024Updated 2 years ago
- Managed Prompt Engineering☆18Oct 5, 2023Updated 2 years ago
- Extend bert-nmt to context-aware translation.☆11May 24, 2021Updated 4 years ago
- ☆15Nov 18, 2025Updated 5 months ago
- multicast learning in network programming course☆10Oct 30, 2020Updated 5 years ago
- FileGram: Grounding Agent Personalization in File-System Behavioral Traces☆64Apr 12, 2026Updated 2 weeks ago
- [EMNLP 2024] Official code for "Beyond Embeddings: The Promise of Visual Table in Multi-Modal Models"☆20Oct 17, 2024Updated last year
- An Ultra-Long Output Reinforcement Learning Approach☆23Jul 31, 2025Updated 9 months ago