jxmorris12 / embzip

lossily compress representation vectors using product quantization

☆59

Alternatives and similar repositories for embzip

Users who are interested in embzip are comparing it to the libraries listed below.

Pleias / Quest-Best-Tokens
An introduction to LLM Sampling
☆79 Updated 11 months ago
Pleias / Pleias-RAG-Library
Python library to use Pleias-RAG models
☆67 Updated 7 months ago
xjdr-alt / llmri
look how they massacred my boy
☆63 Updated last year
xjdr-alt / muzero_sketch
☆40 Updated last year
AnswerDotAI / ModernBERT-Instruct-mini-cookbook
☆52 Updated 9 months ago
haizelabs / j1-micro
j1-micro (1.7B) & j1-nano (600M) are absurdly tiny but mighty reward models.
☆99 Updated 4 months ago
Columbia-NLP-Lab / PAPILLON
Code for our paper PAPILLON: PrivAcy Preservation from Internet-based and Local Language MOdel ENsembles
☆60 Updated 6 months ago
s-smits / grpo-optuna
Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna
☆59 Updated last month
minosvasilias / simple_grpo
Simple GRPO scripts and configurations.
☆59 Updated 9 months ago
brendanhogan / picoDeepResearch
☆68 Updated 6 months ago
HazyResearch / cartridges
Storing long contexts in tiny caches with self-study
☆218 Updated last month
joshuacnf / Ctrl-G
☆104 Updated 10 months ago
rosmineb / unit_test_rl
Project code for training LLMs to write better unit tests + code
☆21 Updated 6 months ago
Mihaiii / backtrack_sampler
An easy-to-understand framework for LLM samplers that rewind and revise generated tokens
☆146 Updated 9 months ago
dropbox / aana_sdk
Aana SDK is a powerful framework for building AI enabled multimodal applications.
☆53 Updated 3 months ago
allenai / infinigram-api
☆87 Updated this week
alexzhang13 / rlm
Super basic implementation (gist-like) of RLMs with REPL environments.
☆273 Updated last month
MinishLab / tokenlearn
Pre-train Static Word Embeddings
☆92 Updated 2 months ago
stephantul / skeletoken
Datamodels for hugging face tokenizers
☆86 Updated this week
taylorai / onnx_embedding_models
utilities for loading and running text embeddings with onnx
☆44 Updated 3 months ago
codelion / pts
Pivotal Token Search
☆131 Updated 4 months ago
facebookresearch / matrix
Matrix (Multi-Agent daTa geneRation Infra and eXperimentation framework) is a versatile engine for multi-agent conversational data genera…
☆106 Updated this week
Hannibal046 / nanoColBERT
Simple replication of [ColBERT-v1](https://arxiv.org/abs/2004.12832).
☆79 Updated last year
catid / lllm
Latent Large Language Models
☆19 Updated last year
SinatrasC / entropix-smollm
smolLM with Entropix sampler on pytorch
☆149 Updated last year
OpenPipe / deductive-reasoning
Train your own SOTA deductive reasoning model
☆107 Updated 8 months ago
google-deepmind / mishax
☆143 Updated 2 months ago
jxmorris12 / cde
code for training & evaluating Contextual Document Embedding models
☆201 Updated 6 months ago
huggingface / wikirace-llms
☆25 Updated 6 months ago
facebookresearch / ExploreToM
Code for ExploreTom
☆87 Updated 5 months ago