magicproduct / hash-hopLinks

Long context evaluation for large language models

☆224

Alternatives and similar repositories for hash-hop

Users that are interested in hash-hop are comparing it to the libraries listed below

Sorting:

HazyResearch / cartridges
Storing long contexts in tiny caches with self-study
☆218Updated last month
casper-hansen / OpenCoconut
OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding.
☆173Updated 10 months ago
LeonGuertler / UnstableBaselines
☆107Updated this week
dust-tt / llama-ssp
Experiments on speculative sampling with Llama models
☆127Updated 2 years ago
PrimeIntellect-ai / genesys
☆136Updated 8 months ago
tokenbender / avataRL
rl from zero pretrain, can it be done? yes.
☆281Updated 2 months ago
mcleish7 / arithmetic
Code to reproduce "Transformers Can Do Arithmetic with the Right Embeddings", McLeish et al (NeurIPS 2024)
☆195Updated last year
HazyResearch / lolcats
Repo for "LoLCATs: On Low-Rank Linearizing of Large Language Models"
☆249Updated 10 months ago
Zyphra / tree_attention
Tree Attention: Topology-aware Decoding for Long-Context Attention on GPU clusters
☆130Updated last year
OpenEvaByte / evabyte
EvaByte: Efficient Byte-level Language Models at Scale
☆111Updated 7 months ago
HazyResearch / based
Code for exploring Based models from "Simple linear attention language models balance the recall-throughput tradeoff"
☆243Updated 5 months ago
PrimeIntellect-ai / prime-environments
Training-Ready RL Environments + Evals
☆182Updated last week
teknium1 / LLM-Benchmark-Logs
Just a bunch of benchmark logs for different LLMs
☆119Updated last year
kernelmachine / cbtm
Code repository for the c-BTM paper
☆108Updated 2 years ago
SinatrasC / entropix-smollm
smolLM with Entropix sampler on pytorch
☆149Updated last year
jerber / lang-jepa
☆128Updated 11 months ago
IBM / ModuleFormer
ModuleFormer is a MoE-based architecture that includes two different types of experts: stick-breaking attention heads and feedforward exp…
☆226Updated 2 months ago
OpenPipe / deductive-reasoning
Train your own SOTA deductive reasoning model
☆107Updated 8 months ago
facebookresearch / memory
Memory layers use a trainable key-value lookup mechanism to add extra parameters to a model without increasing FLOPs. Conceptually, spars…
☆360Updated 11 months ago
Alex-Gurung / ReasoningNCP
Official repo for Learning to Reason for Long-Form Story Generation
☆72Updated 7 months ago
Mihaiii / backtrack_sampler
An easy-to-understand framework for LLM samplers that rewind and revise generated tokens
☆146Updated 9 months ago
HazyResearch / zoology
Understand and test language model architectures on synthetic tasks.
☆243Updated 2 months ago
OSU-NLP-Group / GrokkedTransformer
Code for NeurIPS'24 paper 'Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization'
☆234Updated 4 months ago
IST-DASLab / qmoe
Code for the paper "QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models".
☆278Updated 2 years ago
rgreenblatt / arc_draw_more_samples_pub
Draw more samples
☆196Updated last year
ScalingIntelligence / Archon
Archon provides a modular framework for combining different inference-time techniques and LMs with just a JSON config file.
☆189Updated 8 months ago
kanishkg / stream-of-search
Repository for the paper Stream of Search: Learning to Search in Language
☆151Updated 10 months ago
justinchiu / openlogprobs
Extract full next-token probabilities via language model APIs
☆248Updated last year
Zyphra / Zamba2
PyTorch implementation of models from the Zamba2 series.
☆186Updated 10 months ago
microsoft / ArchScale
Simple & Scalable Pretraining for Neural Architecture Research
☆304Updated last month