kernelmachine/cbtm

Code repository for the c-BTM paper

☆109

Alternatives and similar repositories for cbtm

Users that are interested in cbtm are comparing it to the libraries listed below.

hadasah / btm
☆79 · Apr 29, 2024 · Updated 2 years ago
kaiokendev / cutoff-len-is-context-len
Demonstration that finetuning RoPE model on larger sequences than the pre-trained model adapts the model context limit
☆62 · Jun 21, 2023 · Updated 3 years ago
emrgnt-cmplxty / SmolTrainer
☆21 · Oct 6, 2023 · Updated 2 years ago
machelreid / m2d2
M2D2: A Massively Multi-domain Language Modeling Dataset (EMNLP 2022) by Machel Reid, Victor Zhong, Suchin Gururangan, Luke Zettlemoyer
☆54 · Nov 21, 2022 · Updated 3 years ago
Yuanhy1997 / HyPe
HyPe: Better Pre-trained Language Model Fine-tuning with Hidden Representation Perturbation [ACL 2023]
☆14 · Jul 11, 2023 · Updated 3 years ago
jb-01 / LoRA-TLE
Token-level adaptation of LoRA matrices for downstream task generalization.
☆15 · Apr 14, 2024 · Updated 2 years ago
CarperAI / squeakily
A library for squeakily cleaning and filtering language datasets.
☆50 · Jul 10, 2023 · Updated 3 years ago
huu4ontocord / MDEL
Multi-Domain Expert Learning
☆66 · Jan 23, 2024 · Updated 2 years ago
iliaschalkidis / flash-roberta
Hugging Face RoBERTa with Flash Attention 2
☆24 · Sep 14, 2025 · Updated 10 months ago
MikeWangWZHL / Zemi
Repo for "Zemi: Learning Zero-Shot Semi-Parametric Language Models from Multiple Tasks" ACL 2023 Findings
☆15 · May 3, 2023 · Updated 3 years ago
belindal / TaskBench500
Suite of 500 procedurally-generated NLP tasks to study language model adaptability
☆21 · Jul 16, 2022 · Updated 4 years ago
ChrisHayduk / QLoRA-for-MLM
QLoRA for Masked Language Modeling
☆23 · Sep 11, 2023 · Updated 2 years ago
lucidrains / CoLT5-attention
Implementation of the conditionally routed attention in the CoLT5 architecture, in Pytorch
☆230 · Sep 6, 2024 · Updated last year
joeljang / ELM
[ICML 2023] Exploring the Benefits of Training Expert Language Models over Instruction Tuning
☆99 · Apr 26, 2023 · Updated 3 years ago
seonghyeonye / Flipped-Learning
[ICLR 2023] Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners
☆117 · Jun 28, 2025 · Updated last year
da03 / criticize_text_generation
A method for evaluating the high-level coherence of machine-generated texts. Identifies high-level coherence issues in transformer-based …
☆12 · Mar 18, 2023 · Updated 3 years ago
JunjieHu / amber
Explicit Alignment Objectives for Multilingual Bidirectional Encoders
☆14 · Apr 14, 2021 · Updated 5 years ago
scottlogic-alex / prm800k-denorm
Script for processing OpenAI's PRM800K process supervision dataset into an Alpaca-style instruction-response format
☆27 · Jul 12, 2023 · Updated 3 years ago
AblateIt / finetune-study
Comprehensive analysis of difference in performance of QLora, Lora, and Full Finetunes.
☆82 · Sep 10, 2023 · Updated 2 years ago
manantomar / video-occupancy-models
☆13 · Jul 16, 2024 · Updated 2 years ago
andersonbcdefg / dpo-lora
direct preference optimization with only 1 model copy :)
☆14 · Oct 2, 2023 · Updated 2 years ago
google-deepmind / hierarchical_perceiver
☆33 · Jul 10, 2026 · Updated last week
tatHi / optok
☆10 · Aug 26, 2021 · Updated 4 years ago
kernelmachine / balanced-kmeans
☆21 · Apr 16, 2024 · Updated 2 years ago
Alignment-Lab-AI / Our-Projects
A repository of projects and datasets under active development by Alignment Lab AI
☆22 · Dec 22, 2023 · Updated 2 years ago
oriram / spider
☆55 · Jan 18, 2023 · Updated 3 years ago
hydrallm / llama-moe-v1
☆95 · Jul 26, 2023 · Updated 2 years ago
vaguenebula / AlpacaDataReflect
An experiment to see if chatgpt can improve the output of the stanford alpaca dataset
☆12 · Mar 29, 2023 · Updated 3 years ago
frankxu2004 / knnlm-why
Repo for ICML23 "Why do Nearest Neighbor Language Models Work?"
☆59 · Jan 12, 2023 · Updated 3 years ago
CarperAI / decontamination
This repository contains code for cleaning your training data of benchmark data to help combat data snooping.
☆28 · Apr 21, 2023 · Updated 3 years ago
facebookresearch / NPM
The original implementation of Min et al. "Nonparametric Masked Language Modeling" (paper https://arxiv.org/abs/2212.01349)
☆159 · Jan 6, 2023 · Updated 3 years ago
microsoft / mutransformers
some common Huggingface transformers in maximal update parametrization (µP)
☆87 · Mar 14, 2022 · Updated 4 years ago
primeqa / primeqa
The prime repository for state-of-the-art Multilingual Question Answering research and development.
☆740 · Sep 18, 2025 · Updated 10 months ago
nverma1 / merging-text-transformers
Code for "Merging Text Transformers from Different Initializations"
☆20 · Feb 2, 2025 · Updated last year
Guitaricet / relora
Official code for ReLoRA from the paper Stack More Layers Differently: High-Rank Training Through Low-Rank Updates
☆474 · Apr 21, 2024 · Updated 2 years ago
codekansas / rwkv
RWKV model implementation
☆37 · Jul 15, 2023 · Updated 3 years ago
lucidrains / mixture-of-attention
Some personal experiments around routing tokens to different autoregressive attention, akin to mixture-of-experts
☆122 · Oct 17, 2024 · Updated last year
HazyResearch / TART
TART: A plug-and-play Transformer module for task-agnostic reasoning
☆201 · Jun 22, 2023 · Updated 3 years ago
eth-easl / mixtera
A lightweight, user-friendly data-plane for LLM training.
☆40 · Sep 10, 2025 · Updated 10 months ago