C++ inference wrappers for running blazing-fast embedding services on your favourite serverless platform, such as AWS Lambda. By Prithivi Da, PRs welcome.
☆23 · Mar 4, 2024 · Updated 2 years ago
Alternatives and similar repositories for blitz-embed
Users that are interested in blitz-embed are comparing it to the libraries listed below.
- Analyze trends in articles published on arXiv ☆19 · Apr 13, 2023 · Updated 3 years ago
- Code for evaluating with Flow-Judge-v0.1 - an open-source, lightweight (3.8B) language model optimized for LLM system evaluations. Crafte… ☆85 · Oct 29, 2024 · Updated last year
- Headless, zero-runtime video editing using MCP and FFMPEG | Pure Bash - no Python/Node runtime needed ☆20 · Jun 5, 2025 · Updated 10 months ago
- Run embedding models using ONNX ☆36 · Jan 29, 2024 · Updated 2 years ago
- Confluent s2s Demo ☆11 · Apr 28, 2023 · Updated 2 years ago
- ☆24 · Jan 30, 2025 · Updated last year
- The web API server that runs program codes in an isolated environment using Docker. ☆18 · Jul 20, 2023 · Updated 2 years ago
- Index and search your personal data quickly and privately. ☆28 · Nov 20, 2021 · Updated 4 years ago
- ☆17 · Dec 16, 2024 · Updated last year
- A stable, fast and easy-to-use inference library with a focus on a sync-to-async API ☆48 · Sep 26, 2024 · Updated last year
- QLoRA with Enhanced Multi GPU Support ☆38 · Aug 8, 2023 · Updated 2 years ago
- ☆13 · Jul 16, 2022 · Updated 3 years ago
- text2sql with modern LLMs (duckdb-nsql, SQLCoder etc ...) ☆18 · Apr 13, 2024 · Updated 2 years ago
- Minimalist agent framework for AI engineers ☆20 · Jan 22, 2026 · Updated 2 months ago
- GGML implementation of BERT model with Python bindings and quantization. ☆57 · Feb 19, 2024 · Updated 2 years ago
- A fork of Nginx + RTMP on Windows including +PHP +video.js +dash.js ☆10 · Sep 29, 2020 · Updated 5 years ago
- ☆160 · Apr 17, 2025 · Updated 11 months ago
- ☆16 · Jun 20, 2023 · Updated 2 years ago
- A universal Qdrant table frontend based on transformers.js ☆20 · Mar 26, 2024 · Updated 2 years ago
- Tiny evaluation of leading LLMs on competitive programming problems ☆14 · Nov 28, 2024 · Updated last year
- The OpenCitations RDF Resource Browser ☆15 · Oct 29, 2025 · Updated 5 months ago
- ☆13 · Jun 29, 2024 · Updated last year
- Fine Tune Multimodal LLM "Idefics 2" using QLoRA. ☆11 · Apr 20, 2024 · Updated last year
- A Python implementation of an agent swarm system that works with local LLM servers. The system allows you to create multiple agents that … ☆13 · Nov 20, 2024 · Updated last year
- Confusion Matrix in Python: plot a pretty confusion matrix (like Matlab) in python using seaborn and matplotlib ☆19 · Nov 19, 2021 · Updated 4 years ago
- ☆15 · Aug 2, 2024 · Updated last year
- RWKV is a RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best … ☆10 · Nov 3, 2023 · Updated 2 years ago
- Token-aware HTML chunking that preserves structure and attributes, with optional cleaning and attribute length control. ☆15 · Aug 12, 2025 · Updated 8 months ago
- ☆18 · Apr 10, 2023 · Updated 3 years ago
- Kosmos-2.5 is a cutting-edge Multimodal-LLM (MLLM) specializing in image OCR. However, its stringent software requirements & Python-scrip… ☆68 · Jul 22, 2024 · Updated last year
- Linkedin CLI for automation ☆37 · Mar 9, 2026 · Updated last month
- Python Package for Named Entity Recognition (NER) - Based on Dictionary and Fuzzy Matching (Lexical Fuzzy Named Entity Recognition) ☆16 · Jul 25, 2024 · Updated last year
- AI Voice Agents: Exploring the Next Generation of Human-Machine Interaction! 🎙️🤖🎧 ☆10 · Aug 30, 2024 · Updated last year
- Check for data drift between two OpenAI multi-turn chat jsonl files. ☆39 · Apr 11, 2024 · Updated 2 years ago
- Agglomerative hierarchical clustering in JavaScript ☆19 · Dec 17, 2024 · Updated last year
- Testing speed and accuracy of RAG with, and without Cross Encoder Reranker. ☆49 · Jan 12, 2024 · Updated 2 years ago
- Load multiple LoRA modules simultaneously and automatically switch the appropriate combination of LoRA modules to generate the best answe… ☆159 · Feb 9, 2024 · Updated 2 years ago
- ☆17 · Feb 1, 2024 · Updated 2 years ago
- Exploratory Data Analysis ☆12 · Dec 12, 2017 · Updated 8 years ago