An experimental implementation of the retrieval-enhanced language model
☆74Dec 29, 2022Updated 3 years ago
Alternatives and similar repositories for mengzi-retrieval-lm
Users who are interested in mengzi-retrieval-lm are comparing it to the libraries listed below.
- Implementation of RETRO, DeepMind's Retrieval-based Attention net, in PyTorch☆878Oct 30, 2023Updated 2 years ago
- PyTorch + HuggingFace code for RetoMaton: "Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval" (ICML 2022), including an…☆286Oct 20, 2022Updated 3 years ago
- VaLM: Visually-augmented Language Modeling. ICLR 2023.☆56Mar 6, 2023Updated 3 years ago
- Train large COMET (T5-3B/GPT2-XL) with small memory (on 11GB memory GPUs like 1080/2080) using DeepSpeed.☆14Jan 23, 2022Updated 4 years ago
- [ACL2023] Source code for Decouple knowledge from parameters for plug-and-play language modeling☆20Sep 18, 2023Updated 2 years ago
- [EMNLP 2022] Training Language Models with Memory Augmentation https://arxiv.org/abs/2205.12674☆193Jun 14, 2023Updated 2 years ago
- Code and model release for the paper "Task-aware Retrieval with Instructions" by Asai et al.☆164Oct 4, 2023Updated 2 years ago
- PyTorch code for the RetoMaton paper: "Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval" (ICML 2022)☆76Jul 16, 2022Updated 3 years ago
- Code for RECENT☆13Dec 18, 2022Updated 3 years ago
- final-project-level3-nlp-02 created by GitHub Classroom☆11Dec 31, 2021Updated 4 years ago
- ☆12Nov 21, 2023Updated 2 years ago
- Tuning BERT☆10Jun 28, 2022Updated 3 years ago
- CraftML is a restful web service for easy pipeline creation without code.☆13Apr 18, 2021Updated 5 years ago
- ☆34Nov 18, 2025Updated 6 months ago
- Source code for paper "Learning from Noisy Labels for Entity-Centric Information Extraction", EMNLP 2021☆55Dec 11, 2021Updated 4 years ago
- Awesome Reinforcement Learning from Human Feedback, the secret behind ChatGPT XD☆23Dec 13, 2022Updated 3 years ago
- Mengzi Pretrained Models☆544Nov 29, 2022Updated 3 years ago
- Korean Benchmark for Korean Legal Language Understanding☆19Nov 16, 2024Updated last year
- [EMNLP 2021] Efficient Contrastive Learning via Novel Data Augmentation and Curriculum Learning☆17Jun 28, 2025Updated 11 months ago
- Code used to create the Linked WikiText-2 dataset☆16May 22, 2023Updated 3 years ago
- Code associated with the paper **SkipBERT: Efficient Inference with Shallow Layer Skipping**, at ACL 2022☆16Jun 22, 2022Updated 3 years ago
- SIGIR 2021: Proactive Retrieval-based Chatbots based on Relevant Knowledge and Goals☆11Jul 30, 2021Updated 4 years ago
- ☆24Jun 24, 2020Updated 5 years ago
- K-PLUG: Knowledge-injected Pre-trained Language Model for Natural Language Understanding and Generation in E-Commerce (Findings of EMNLP …☆31Jan 6, 2023Updated 3 years ago
- The implementation of the paper "Retrieval-Free Knowledge-Grounded Dialogue Response Generation with Adapters".☆17May 24, 2022Updated 4 years ago
- This repo investigates LLMs' tendency to exhibit acquiescence bias in sequential QA interactions. Includes evaluation methods, datasets, …☆39Apr 24, 2026Updated last month
- The theory of mind module for the SWE agent☆107Updated this week
- Repo for ICML23 "Why do Nearest Neighbor Language Models Work?"☆59Jan 12, 2023Updated 3 years ago
- Serving large language model with transformers☆13Oct 18, 2022Updated 3 years ago
- Pretrain and finetune ELECTRA with fastai and huggingface. (Results of the paper replicated!)☆332Jan 10, 2024Updated 2 years ago
- ☆13Oct 18, 2023Updated 2 years ago
- Embedding-based evaluation metrics for dialogue generation.☆15Jan 8, 2023Updated 3 years ago
- Understanding the underlying learning dynamics of simple tasks in Transformer networks☆18Aug 16, 2024Updated last year
- ☆22Feb 13, 2026Updated 3 months ago
- The source code of the paper 'Dynamic Knowledge Routing Network For Target-Guided Open-Domain Conversation'☆24Mar 24, 2023Updated 3 years ago
- Dataset and Baseline for SMP-MCC2020☆23Jul 6, 2023Updated 2 years ago
- A Kernel-Based View of Language Model Fine-Tuning https://arxiv.org/abs/2210.05643☆78Sep 4, 2023Updated 2 years ago
- Official code repository for the main conference paper in EMNLP 2022: SubeventWriter: Iterative Sub-event Sequence Generation with Cohere…☆11Oct 16, 2022Updated 3 years ago
- Code for the paper InstructDial: Improving Zero and Few-shot Generalization in Dialogue through Instruction Tuning☆101May 6, 2023Updated 3 years ago