This repository contains some of the code used in the paper "Training Language Models with Language Feedback at Scale"
☆27 · Mar 30, 2023 · Updated 3 years ago
Alternatives and similar repositories for imitation_learning_from_language_feedback
Users who are interested in imitation_learning_from_language_feedback are comparing it to the libraries listed below.
- Code for EMNLP 2022 Paper: On the Calibration of Massively Multilingual Language Models ☆15 · Jun 12, 2023 · Updated 2 years ago
- [ICLR 2022] "Bayesian Modeling and Uncertainty Quantification for Learning to Optimize: What, Why, and How" by Yuning You, Yue Cao, Tianl… ☆14 · Aug 19, 2022 · Updated 3 years ago
- ☆11 · Oct 3, 2021 · Updated 4 years ago
- A publishing website of a table collecting meta-learning-related papers in the area of human language processing. ☆17 · Aug 2, 2021 · Updated 4 years ago
- Code for NAACL 2025 paper "AdaCAD: Adaptively Decoding to Balance Conflicts between Contextual and Parametric Knowledge" ☆17 · Mar 2, 2026 · Updated last month
- GCN and BERT for relation extraction ☆18 · Jun 29, 2020 · Updated 5 years ago
- [NAACL 2022] This is the code repo for our paper `ACTUNE: Uncertainty-based Active Self-Training for Active Fine-Tuning of Pretrained Lan… ☆15 · Nov 16, 2022 · Updated 3 years ago
- ☆11 · Jun 7, 2023 · Updated 2 years ago
- Official implementation of the paper "IteraTeR: Understanding Iterative Revision from Human-Written Text" (ACL 2022) ☆82 · Nov 15, 2023 · Updated 2 years ago
- ☆16 · Nov 30, 2022 · Updated 3 years ago
- In-Context Learning User Simulators for Task-Oriented Dialog Systems ☆30 · Jun 2, 2023 · Updated 2 years ago
- (Personal project) Pruning algorithm for DNNs using "lottery ticket" pruning ☆10 · Dec 8, 2022 · Updated 3 years ago
- ☆15 · Feb 21, 2024 · Updated 2 years ago
- Python code to automatically produce a summary of a piece of text. ☆12 · Sep 8, 2016 · Updated 9 years ago
- Merging Generated and Retrieved Knowledge for Open-Domain QA (EMNLP 2023) ☆22 · Oct 8, 2023 · Updated 2 years ago
- Implementation of the most important parts of the Lottery Ticket Hypothesis Paper ☆12 · Jul 2, 2018 · Updated 7 years ago
- ☆12 · Aug 15, 2023 · Updated 2 years ago
- Example of android app written in Qt/Qml which uses MXNet for plant image recognition. ☆10 · Nov 4, 2017 · Updated 8 years ago
- TPLinker: Single-stage Joint Extraction of Entities and Relations Through Token Pair Linking ☆18 · Apr 15, 2021 · Updated 5 years ago
- Some python scripts for drawing figures in scientific papers ☆28 · Jun 26, 2019 · Updated 6 years ago
- The Intermediate Goal of the project is to train a GPT like architecture to learn to summarise reddit posts from human preferences, as th… ☆12 · Jul 14, 2021 · Updated 4 years ago
- machine translation data process tools ☆10 · Apr 29, 2024 · Updated last year
- Active and Sample-Efficient Model Evaluation ☆27 · May 22, 2025 · Updated 10 months ago
- Anti exploration in offline reinforcement learning ☆11 · May 17, 2021 · Updated 4 years ago
- A basic analyser for github repos that allows you to question the whole repo. ☆17 · Mar 31, 2026 · Updated 2 weeks ago
- RENT (Reinforcement Learning via Entropy Minimization) is an unsupervised method for training reasoning LLMs. ☆43 · Oct 31, 2025 · Updated 5 months ago
- [ACL'24] Code and data of paper "When is Tree Search Useful for LLM Planning? It Depends on the Discriminator" ☆54 · Feb 23, 2024 · Updated 2 years ago
- KnowMAN: Weakly Supervised Multinomial Adversarial Networks ☆12 · Nov 9, 2021 · Updated 4 years ago
- An implementation of the paper "Solving the Rubik's Cube without Human Knowledge" ☆14 · Dec 9, 2018 · Updated 7 years ago
- Source code for the paper "Exploiting Excessive Invariance caused by Norm-Bounded Adversarial Robustness" ☆25 · Feb 12, 2020 · Updated 6 years ago
- {DeepL, Google, WMT-Best, davinci-003, turbo, gpt-4} × {En-De, En-Cs, En-Ru, En-Zh, De-Fr, En-Ja, Uk-En, Uk-Cs, En-Hr, En-Ha, En-Is} ☆14 · Jun 18, 2023 · Updated 2 years ago
- ☆15 · Sep 10, 2023 · Updated 2 years ago
- Code of "Instruction Multi-Constraint Molecular Generation Using a Teacher-Student Large Language Model" ☆14 · Jul 8, 2025 · Updated 9 months ago
- A course on Hugging Face land ☆33 · Apr 8, 2026 · Updated last week
- EMNLP 2024 | Style-Specific Neurons for Steering LLMs in Text Style Transfer ☆13 · Mar 23, 2025 · Updated last year
- Pytorch implementation of BEAR in "Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction" ☆11 · Oct 29, 2019 · Updated 6 years ago
- Hierarchical Story Generation based on (https://arxiv.org/abs/1805.04833) ☆13 · May 6, 2020 · Updated 5 years ago
- A Framework to Automatically Extract Indicators of Compromise (IoCs) from Twitter ☆16 · Dec 9, 2019 · Updated 6 years ago
- ☆25 · Jun 10, 2025 · Updated 10 months ago