schwartz-lab-NLP/Tokens2Words

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/schwartz-lab-NLP/Tokens2Words)

schwartz-lab-NLP / Tokens2Words

☆16

Alternatives and similar repositories for Tokens2Words

Users that are interested in Tokens2Words are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

goombalab / Gather-and-Aggregate
View on GitHub
Experiments Notebook of "Understanding the Skill Gap in Recurrent Language Models: The Role of the Gather-and-Aggregate Mechanism"
☆16Apr 30, 2025Updated last year
SALT-NLP / multi-value
View on GitHub
Complete set of English dialect transformation rules and evaluation code
☆16Jun 7, 2024Updated 2 years ago
orevaahia / magnet-tokenization
View on GitHub
☆11Mar 17, 2026Updated 4 months ago
liujch1998 / memo-trap
View on GitHub
☆23Jan 25, 2023Updated 3 years ago
assafbk / OPRM
View on GitHub
Overflow Prevention Enhances Long-Context Recurrent LLMs (COLM 2025)
☆18Jul 8, 2025Updated last year
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
Aleph-Alpha-Research / trigrams
View on GitHub
☆60Nov 18, 2025Updated 8 months ago
rioyokotalab / Megatron-Llama2
View on GitHub
2023 ABCI Llama-2 継続学習プロジェクト
☆14Jan 22, 2024Updated 2 years ago
paul-rottger / issuebench
View on GitHub
Röttger et al. (2024): "IssueBench: Millions of Realistic Prompts for Measuring Issue Bias in LLM Writing Assistance"
☆17Mar 6, 2026Updated 4 months ago
utahnlp / DirectProbe
View on GitHub
☆21Oct 15, 2022Updated 3 years ago
eliyahabba / PromptSuite
View on GitHub
☆16Nov 24, 2025Updated 8 months ago
free-news-api / news-crawlers
View on GitHub
This project compares five open-source news crawlers: "news-please", "fundus", "news-crawler", "news-crawl" and "newspaper4k" - focusin…
☆51Oct 22, 2024Updated last year
circle-hit / Lens
View on GitHub
Code for our paper titled "Lens: Rethinking Multilingual Enhancement for Large Language Models"
☆12Oct 15, 2024Updated last year
Lizn-zn / Nesy-Programming
View on GitHub
☆10Oct 28, 2024Updated last year
cimeister / tokenizer-intrinsic-evals
View on GitHub
TokEval: intrinsic quality metrics for tokenizers across natural language, code, and math
☆46Jul 4, 2026Updated 3 weeks ago
Simple, predictable pricing with DigitalOcean hosting • Ad
Always know what you'll pay with monthly caps and flat pricing. Enterprise-grade infrastructure trusted by 600k+ customers.
allenai / fluid-benchmarking
View on GitHub
Fluid Language Model Benchmarking
☆29Sep 16, 2025Updated 10 months ago
umyomyomyon / whisper-chatgpt-voicevox
View on GitHub
音声を文字起こししてChatGPTと会話したい
☆22Mar 8, 2023Updated 3 years ago
ndl-lab / ndl-minhon-ocrdataset
View on GitHub
NDL古典籍OCR学習用データセット（みんなで翻刻加工データ）
☆22Mar 13, 2026Updated 4 months ago
edahanoam / Awesome-Summarization-Datasets
View on GitHub
Updating collection of summarization datasets in 100+ languages, based on our paper "The State and Fate of Summarization Datasets: A Surv…
☆31Apr 29, 2025Updated last year
idoatad / TensorLens
View on GitHub
Official PyTorch implementation for "TensorLens: End-to-End Transformer Analysis via High-Order Attention Tensors" [ACL 2026]
☆47Apr 14, 2026Updated 3 months ago
KhoomeiK / complexity-scaling
View on GitHub
gzip Predicts Data-dependent Scaling Laws
☆35May 28, 2024Updated 2 years ago
leia-llm / leia
View on GitHub
LEIA: Facilitating Cross-Lingual Knowledge Transfer in Language Models with Entity-based Data Augmentation
☆23Apr 24, 2024Updated 2 years ago
hyintell / LLMSymbolic
View on GitHub
☆22Feb 29, 2024Updated 2 years ago
xtinkt / editable
View on GitHub
A supplementary code for Editable Neural Networks, an ICLR 2020 submission.
☆46Jan 21, 2020Updated 6 years ago
1-Click AI Models by DigitalOcean Gradient • Ad
Deploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
jxiw / MambaByte
View on GitHub
[CoLM 24] Official Repository of MambaByte: Token-free Selective State Space Model
☆27Oct 12, 2024Updated last year
kojima-takeshi188 / lang_neuron
View on GitHub
☆21Jun 24, 2024Updated 2 years ago
Weyaxi / scrape-open-llm-leaderboard
View on GitHub
Scrape and export data from the Open LLM Leaderboard.
☆48Dec 17, 2024Updated last year
jfkback / hypencoder-paper
View on GitHub
Official Repository for "Hypencoder: Hypernetworks for Information Retrieval"
☆41Sep 20, 2025Updated 10 months ago
alessiodevoto / l2compress
View on GitHub
Code for the EMNLP24 paper "A simple and effective L2 norm based method for KV Cache compression."
☆19Dec 13, 2024Updated last year
fenglang918 / HiCache
View on GitHub
HiCache: Hermite Polynomial-based Feature Cache for diffusion inference
☆15Jan 27, 2026Updated 5 months ago
chwoong / LiRE
View on GitHub
Listwise Reward Estimation for Offline Preference-based Reinforcement Learning (ICML 2024)
☆18Jun 18, 2024Updated 2 years ago
bigcode-project / bigcode-tokenizer
View on GitHub
☆15Oct 24, 2023Updated 2 years ago
Thytu / StockLLM
View on GitHub
Elevating Chess Strategy with Fine-Tuned Large Language Model
☆19Dec 8, 2023Updated 2 years ago
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
mlabonne / chessllm
View on GitHub
☆47Jan 24, 2024Updated 2 years ago
alisawuffles / ambient
View on GitHub
Code and data associated with the AmbiEnt dataset in "We're Afraid Language Models Aren't Modeling Ambiguity" (Liu et al., 2023)
☆72Feb 5, 2024Updated 2 years ago
kennycason / cellular-automata-pokemon-types
View on GitHub
Cellular Automata - Pokemon Type Battle Simulation
☆11Oct 26, 2024Updated last year
nyonicai / nyonic-public
View on GitHub
Reference implementation of models from Nyonic Model Factory
☆12May 13, 2024Updated 2 years ago
tanganke / opcm
View on GitHub
official code repo for paper "Merging Models on the Fly Without Retraining: A Sequential Approach to Scalable Continual Model Merging"
☆25Oct 11, 2025Updated 9 months ago
KunstDerFuge / Q-notebook
View on GitHub
☆14Jul 26, 2021Updated 4 years ago
iankur / vqllm
View on GitHub
Residual vector quantization for KV cache compression in large language model
☆12Oct 22, 2024Updated last year