This is a minimal implementation of a GPT model, but it includes some advanced features such as temperature, top-k, and top-p sampling, and a KV cache.
☆12Oct 17, 2025Updated 7 months ago
Alternatives and similar repositories for milliGPT
Users that are interested in milliGPT are comparing it to the libraries listed below.
- Japanese Spelling Correction - JSC☆14Sep 19, 2023Updated 2 years ago
- Question Answering in Vietnamese. In a nutshell, this project helps us answer a Question of a given Context in Vietnamese. [UPDATED] This…☆26Nov 17, 2022Updated 3 years ago
- ☆14Oct 27, 2023Updated 2 years ago
- PyTorch implementation of the End-to-End Memory Network with attention layer visualization support.☆12Jun 30, 2018Updated 7 years ago
- Personalized Response Generation via Generative Split Memory Network☆12Sep 6, 2021Updated 4 years ago
- OffensEval2020 Shared Task☆17Apr 5, 2021Updated 5 years ago
- A simple implementation of "Personalizing Dialogue Agents: I have a dog, do you have pets too?"☆14Nov 27, 2018Updated 7 years ago
- Using BERT for long sentence classification (more than 512 word pieces).☆17May 9, 2021Updated 5 years ago
- We attempt few-shot learning with BERT and a prototypical network for intent classification☆22Jun 27, 2020Updated 5 years ago
- BERT-based Biomedical Text Summarizer☆23Oct 2, 2019Updated 6 years ago
- Implementation, trained models and result data for the paper "Pairwise Multi-Class Document Classification for Semantic Relations between…☆31Jun 12, 2023Updated 2 years ago
- Depth Image Homography Estimation with Noise in Pytorch☆22Jan 22, 2020Updated 6 years ago
- Memory Attention Networks, in IJCAI 2018☆25Dec 5, 2018Updated 7 years ago
- ☆24Nov 27, 2020Updated 5 years ago
- Predicting Political Ideology of Twitter Users.☆24Sep 20, 2020Updated 5 years ago
- Code for the paper "Contextual and Sequential User Embeddings for Large-Scale Music Recommendation".☆39Oct 12, 2020Updated 5 years ago
- Code for the paper "Adaptive Transformers for Learning Multimodal Representations" (ACL SRW 2020)☆43Oct 20, 2022Updated 3 years ago
- A PyTorch implementation of the document classification by Hierarchical Attention Network☆32Jul 15, 2019Updated 6 years ago
- End-To-End Memory Networks in PyTorch☆38Jan 24, 2018Updated 8 years ago
- Hands-On One-shot Learning with Python, published by Packt☆56Mar 2, 2026Updated 2 months ago
- Meta-learning for NLP☆48Nov 27, 2020Updated 5 years ago
- Official PyTorch Implementation of SSMix (Findings of ACL 2021)☆63Jun 16, 2021Updated 4 years ago
- Code for ACL 2021 main conference paper "Conversations Are Not Flat: Modeling the Dynamic Information Flow across Dialogue Utterances".☆94Jun 30, 2021Updated 4 years ago
- A community-built high-quality repository of NLP corpora☆66Jan 8, 2022Updated 4 years ago
- PyTorch implementation of Recurrence over BERT (RoBERT) based on this paper https://arxiv.org/abs/1910.10781 and comparison with PyTorch …☆82Oct 26, 2022Updated 3 years ago
- Source Code for DialogBERT: Discourse-Aware Response Generation via Learning to Recover and Rank Utterances (https://arxiv.org/pdf/2012.0…☆79Jan 2, 2022Updated 4 years ago
- Code and data to facilitate BERT/ELECTRA for document ranking. For details, refer to the paper: PARADE: Passage Representation Aggregation for…☆96Mar 25, 2023Updated 3 years ago
- Implementation of End-to-End Memory Network in PyTorch☆106Aug 28, 2017Updated 8 years ago
- Code and datasets for the RecSys'20 paper "SSE-PT: Sequential Recommendation Via Personalized Transformer" and the NeurIPS'19 paper "Stochastic Shared …☆112Nov 17, 2020Updated 5 years ago
- Released code for the ACL 2021 paper 'BoB: BERT Over BERT for Training Persona-based Dialogue Models from Limited Personalized Data'☆128Sep 13, 2021Updated 4 years ago
- Code for explaining and evaluating late chunking (chunked pooling)☆516Dec 23, 2024Updated last year
- A list of recent papers about Meta / few-shot learning methods applied in NLP areas.☆231Dec 29, 2020Updated 5 years ago
- PyTorch implementation of Supporting Clustering with Contrastive Learning, NAACL 2021☆307Oct 23, 2023Updated 2 years ago
- PyTorch code for "Unifying Vision-and-Language Tasks via Text Generation" (ICML 2021)☆372Jul 29, 2023Updated 2 years ago
- [EMNLP 2022] Unifying and multi-tasking structured knowledge grounding with language models☆565Aug 22, 2023Updated 2 years ago
- Paper List for Contrastive Learning for Natural Language Processing☆573Apr 27, 2023Updated 3 years ago
- PyTorch(1.6+) implementation of https://github.com/kang205/SASRec☆595Mar 19, 2026Updated 2 months ago
- [ACL 2021] LM-BFF: Better Few-shot Fine-tuning of Language Models https://arxiv.org/abs/2012.15723☆728Aug 29, 2022Updated 3 years ago
- High-performance retrieval engine for unstructured data☆1,583Nov 10, 2025Updated 6 months ago