The simplest repository for training medium-sized BackpackLM for cs224n
☆25Aug 13, 2023Updated 2 years ago
Alternatives and similar repositories for nanoBackpackLM
Users that are interested in nanoBackpackLM are comparing it to the libraries listed below.
- The original Backpack Language Model implementation, a fork of FlashAttention☆71May 29, 2023Updated 2 years ago
- Codebase describing experiments in Truncation Sampling as Language Model Desmoothing☆13Dec 6, 2022Updated 3 years ago
- Code for the paper "Distinguishing the Knowable from the Unknowable with Language Models"☆11Apr 15, 2024Updated 2 years ago
- ☆18Oct 5, 2017Updated 8 years ago
- Code release for "TempLM: Distilling Language Models into Template-Based Generators"☆14Jul 21, 2022Updated 3 years ago
- ☆14Jul 5, 2023Updated 2 years ago
- Bias Benchmark for Natural Language Inference. Code repo for the Findings of NAACL 2022 paper "On Measuring Social Biases in Prompt-Based…☆15Apr 28, 2022Updated 3 years ago
- Repository for "Propagating Knowledge Updates to LMs Through Distillation" (NeurIPS 2023).☆26Aug 25, 2024Updated last year
- ☆23Sep 2, 2024Updated last year
- MDL Complexity computations and experiments from the paper "Revisiting complexity and the bias-variance tradeoff".☆18Jun 12, 2023Updated 2 years ago
- Code for ModularQA☆27Jun 8, 2021Updated 4 years ago
- Group-conditional DRO to alleviate spurious correlations☆15Jul 15, 2021Updated 4 years ago
- Framework for probing tasks☆31Mar 24, 2024Updated 2 years ago
- Code for EMNLP 2020 paper "Connecting the Dots: Event Graph Schema Induction with Path Language Modeling"☆23Nov 16, 2020Updated 5 years ago
- ☆15Nov 14, 2022Updated 3 years ago
- Data and code for the paper "Future is not One-dimensional: Complex Event Schema Induction via Graph Modeling".☆30Apr 24, 2021Updated 4 years ago
- ☆13Jul 8, 2020Updated 5 years ago
- Website for release of TellMeWhy dataset for why question answering☆14Nov 11, 2022Updated 3 years ago
- ☆11Apr 24, 2023Updated 2 years ago
- ☆26Oct 18, 2021Updated 4 years ago
- ☆10Jun 11, 2019Updated 6 years ago
- Example of using Epochraft to train HuggingFace transformers models with PyTorch FSDP☆11Jan 29, 2024Updated 2 years ago
- The code to reproduce CVPR 2021 paper "Towards Robust Classification Model by Counterfactual and Invariant Data Generation"☆16Jul 29, 2021Updated 4 years ago
- A Wikipedia-based summarization dataset☆14Mar 27, 2023Updated 3 years ago
- [ACL 2026] A Multi-Dimensional Constraint Framework for Evaluating and Improving Instruction Following in Large Language Models☆21Apr 6, 2026Updated last week
- Code for MERMAID: Metaphor Generation with Symbolism and Discriminative Decoding☆11May 2, 2022Updated 3 years ago
- A multi-threaded C++ implementation of Nickel & Kiela's "Poincare Embeddings" paper from NIPS 2017, following the implementation of the a…☆17Jun 6, 2018Updated 7 years ago
- ☆25Dec 20, 2023Updated 2 years ago
- [NAACL(2019)] Generating Knowledge Graph Paths from Textual Definitions using Sequence-to-Sequence Models☆11Apr 27, 2022Updated 3 years ago
- ☆12Jul 19, 2018Updated 7 years ago
- Code and data for experiments on semantic fragments☆11Jun 23, 2022Updated 3 years ago
- ☆16Jul 20, 2023Updated 2 years ago
- https://arxiv.org/abs/2404.10917☆14Mar 18, 2025Updated last year
- A crawler project that aims to collect news from most newspaper publishers☆10Jul 29, 2019Updated 6 years ago
- This is the repository for the resources in the CoNLL 2020 paper "What Are You Trying to Do? Semantic Typing of Event Processes"☆10Jan 5, 2021Updated 5 years ago
- Intrinsic Evaluation of pre-trained word embeddings, using large Word Association Dataset: SWOW (Small World of Words)☆11Feb 28, 2024Updated 2 years ago
- 💻 A command line websh client with bash-like interactive UI☆25Jul 14, 2024Updated last year
- Align, a general text alignment function☆15Dec 7, 2023Updated 2 years ago
- Implementation of the Paper "Goal-Driven Explainable Clustering via Language Descriptions"☆40May 24, 2023Updated 2 years ago