mixedbread-ai / ofen
WIP: Ofen is a toolkit aimed at making transformer models production-ready. API included
☆16Updated 10 months ago
Alternatives and similar repositories for ofen
Users who are interested in ofen are comparing it to the libraries listed below.
- Showcases how mxbai-embed-large-v1 can be used to produce binary embeddings. Binary embeddings enabled 32x storage savings and 40x faster r…☆18Updated last year
- Synthesizing realistic and diverse text-datasets from augmented LLMs☆13Updated 4 months ago
- ☆20Updated 4 months ago
- ☆51Updated 6 months ago
- implementation of https://arxiv.org/pdf/2312.09299☆21Updated last year
- Code for paper: Long cOntext aliGnment via efficient preference Optimization☆14Updated 5 months ago
- Crispy reranking models by Mixedbread☆34Updated 3 weeks ago
- MEXMA: Token-level objectives improve sentence representations☆41Updated 7 months ago
- The repository contains generative AI analytics platform application code.☆26Updated 3 months ago
- Repository for Sparse Finetuning of LLMs via modified version of the MosaicML llmfoundry☆42Updated last year
- Repository containing the SPIN experiments on the DIBT 10k ranked prompts☆24Updated last year
- Official implementation of the ICML 2024 paper RoSA (Robust Adaptation)☆41Updated last year
- A public implementation of the ReLoRA pretraining method, built on Lightning-AI's PyTorch Lightning suite.☆34Updated last year
- Implementation of the paper: "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" from Google in pyTO…☆56Updated 2 weeks ago
- KV Cache Steering for Inducing Reasoning in Small Language Models☆36Updated 2 weeks ago
- Code and data releases for the paper -- DelTA: An Online Document-Level Translation Agent Based on Multi-Level Memory☆45Updated 6 months ago
- ☆75Updated 3 months ago
- GoldFinch and other hybrid transformer components☆46Updated last year
- This is the official project of paper: Compress to Impress: Unleashing the Potential of Compressive Memory in Real-World Long-Term Conver…☆21Updated 8 months ago
- DPO, but faster 🚀☆44Updated 8 months ago
- Official Repository for Task-Circuit Quantization☆22Updated 2 months ago
- Code for KaLM-Embedding models☆89Updated last month
- This repository contains the code for the paper: SirLLM: Streaming Infinite Retentive LLM☆59Updated last year
- Hugging Face Inference Toolkit used to serve transformers, sentence-transformers, and diffusers models.☆83Updated last week
- A repository for research on medium sized language models.☆78Updated last year
- ☆53Updated 9 months ago
- XmodelLM☆39Updated 8 months ago
- ☆34Updated last year
- Repo hosting code and materials related to speeding up LLM inference using token merging.☆36Updated 3 weeks ago
- Truly flash implementation of DeBERTa disentangled attention mechanism.☆63Updated 2 months ago