A stable, fast and easy-to-use inference library with a focus on a sync-to-async API
☆47 · Sep 26, 2024 · Updated last year
Alternatives and similar repositories for embed
Users who are interested in embed are comparing it to the libraries listed below.
- TaCo: Enhancing Cross-Lingual Transfer for Low-Resource Languages in LLMs through Translation-Assisted Chain-of-Thought Processes ☆13 · Jul 1, 2025 · Updated 7 months ago
- "a towel is about the most massively useful thing an interstellar AI hitchhiker can have" ☆48 · Oct 9, 2024 · Updated last year
- A collection of experimental Retrieval Augmented Generation (RAG) Techniques to elevate your pipelines, all with code and intuitive expla… ☆34 · Jul 21, 2025 · Updated 7 months ago
- ☆17 · Dec 16, 2024 · Updated last year
- coded with and corrected by Google Anti-Gravity ☆13 · Nov 23, 2025 · Updated 3 months ago
- A Python-based voice assistant integrating speech-to-text (STT), text-to-speech (TTS), and powerful AI capabilities using either a local … ☆13 · Dec 8, 2025 · Updated 2 months ago
- Modified Beam Search with periodical restart ☆12 · Sep 12, 2024 · Updated last year
- REBUS: A Robust Evaluation Benchmark of Understanding Symbols ☆13 · Aug 13, 2024 · Updated last year
- Multilingual Entity Linking model by BELA model ☆12 · Jul 20, 2023 · Updated 2 years ago
- code for training and using chess embeddings models ☆13 · Jun 9, 2024 · Updated last year
- My Gen AI research ☆11 · Jun 3, 2024 · Updated last year
- Infinity is a high-throughput, low-latency serving engine for text-embeddings, reranking models, clip, clap and colpali ☆2,676 · Feb 5, 2026 · Updated 3 weeks ago
- Run multiple resource-heavy Large Models (LM) on the same machine with limited amount of VRAM/other resources by exposing them on differe… ☆88 · Feb 7, 2026 · Updated 3 weeks ago
- A public implementation of the ReLoRA pretraining method, built on Lightning-AI's Pytorch Lightning suite. ☆34 · Mar 2, 2024 · Updated last year
- One Line To Build Zero-Data Classifiers in Minutes ☆64 · Sep 25, 2024 · Updated last year
- ☆134 · Dec 11, 2025 · Updated 2 months ago
- Simple examples using Argilla tools to build AI ☆57 · Nov 18, 2024 · Updated last year
- minimal pytorch implementation of bm25 (with sparse tensors) ☆104 · Oct 28, 2025 · Updated 4 months ago
- This is the code for the paper "Self-contradictory Hallucinations of Large Language Models: Evaluation, Detection and Mitigation". ☆37 · Sep 1, 2025 · Updated 5 months ago
- Test your local LLMs on the AIME problems ☆32 · Jun 7, 2025 · Updated 8 months ago
- GPT-4 Level Conversational QA Trained In a Few Hours ☆67 · Aug 21, 2024 · Updated last year
- Using modal.com to process FineWeb-edu data ☆20 · Apr 5, 2025 · Updated 10 months ago
- A Google Chrome extension to create Markdown links for the current page ☆18 · Apr 28, 2020 · Updated 5 years ago
- A framework that uses multi-agents to enable users to perform a systematic data science pipeline with just two inputs. ☆42 · Aug 8, 2024 · Updated last year
- LlamaCards is a web application that provides a dynamic interface for interacting with LLM models in real-time. This app allows users to … ☆38 · Aug 28, 2024 · Updated last year
- ☆68 · May 26, 2024 · Updated last year
- Code for evaluating with Flow-Judge-v0.1 - an open-source, lightweight (3.8B) language model optimized for LLM system evaluations. Crafte… ☆84 · Oct 29, 2024 · Updated last year
- QLoRA with Enhanced Multi GPU Support ☆38 · Aug 8, 2023 · Updated 2 years ago
- Automated LLM novelist ☆46 · Apr 11, 2024 · Updated last year
- Exploring limitations of LLM-as-a-judge ☆20 · Aug 17, 2024 · Updated last year
- Senna is an advanced AI-powered search engine designed to provide users with immediate answers to their queries by leveraging natural lan… ☆19 · Sep 5, 2024 · Updated last year
- A combination of Oobabooga's fork and the main cuda branch of GPTQ-for-LLaMa in a package format. ☆23 · Oct 6, 2023 · Updated 2 years ago
- Simple LLM inference server ☆20 · Jun 13, 2024 · Updated last year
- JacQues is a Dash-based interactive web application that facilitates real-time chat and document management. ☆22 · Jan 5, 2026 · Updated last month
- ☆49 · May 13, 2024 · Updated last year
- Bamboo-7B Large Language Model ☆93 · Mar 28, 2024 · Updated last year
- Glyphs, acting as collaboratively defined symbols linking related concepts, add a layer of multidimensional semantic richness to user-AI … ☆56 · Feb 10, 2025 · Updated last year
- 5X faster 60% less memory QLoRA finetuning ☆21 · May 28, 2024 · Updated last year
- ☆20 · Jun 26, 2024 · Updated last year