Create embeddings with Infinity as a serverless endpoint
☆42 · Nov 21, 2025 · Updated 4 months ago
Alternatives and similar repositories for worker-infinity-embedding
Users interested in worker-infinity-embedding compare it to the libraries listed below.
- Ask Poddy: Run Open Source LLMs and Embeddings as OpenAI-Compatible Serverless Endpoints (Tutorial) ☆11 · Jul 19, 2024 · Updated last year
- ☆12 · Feb 22, 2024 · Updated 2 years ago
- ☆14 · Dec 21, 2025 · Updated 3 months ago
- Apache Arrow-compatible space-efficient "tape" class in pure Rust to be used with StringZilla for GPU, NUMA, and disk transfers of variab… ☆29 · Nov 21, 2025 · Updated 4 months ago
- WIP: Ofen is a toolkit aimed at making transformer models production-ready. API included ☆17 · Oct 2, 2024 · Updated last year
- Multi-Agent Reinforcement Learning Environment for the card game SkyJo, compatible with PettingZoo and RLLIB ☆16 · Feb 21, 2026 · Updated last month
- ☆35 · Mar 5, 2026 · Updated 2 weeks ago
- Golang SDK for Truss ☆40 · Feb 26, 2026 · Updated 3 weeks ago
- Hugging Face RoBERTa with Flash Attention 2 ☆24 · Sep 14, 2025 · Updated 6 months ago
- My Gen AI research ☆11 · Jun 3, 2024 · Updated last year
- A simple website to manage your Hyper-V VMs and IIS sites ☆12 · Jan 19, 2023 · Updated 3 years ago
- ☆14 · Sep 18, 2024 · Updated last year
- A Generative Adversarial Network Model Alternative to Animal Studies for Clinical Pathology Assessment ☆15 · Jan 10, 2024 · Updated 2 years ago
- A cross platform and file system Python module for linking files. ☆14 · Aug 3, 2017 · Updated 8 years ago
- [NeurIPS 2025] GuideFlow3D: Optimization-Guided Rectified Flow For Appearance Transfer ☆24 · Dec 1, 2025 · Updated 3 months ago
- 🦾💻🌐 distributed training & serverless inference at scale on RunPod ☆19 · May 26, 2024 · Updated last year
- Finetuning a codegen model with python instruction set using QLORA technique for better efficacy ☆11 · Aug 31, 2023 · Updated 2 years ago
- A vllm proxy server to add security and multi model management for vllm servers ☆12 · May 30, 2024 · Updated last year
- An Offline and Secure Retrieval-Augmented Generation (RAG) system designed for efficient processing of diverse content types with minimal… ☆20 · Dec 29, 2024 · Updated last year
- THINK LESS, SCREAM MORE! ☆11 · Feb 17, 2016 · Updated 10 years ago
- Example of Langchain-Elasticsearch integrations & RAG. ☆12 · Sep 20, 2024 · Updated last year
- A LLaMA2-7b chatbot with memory running on CPU, and optimized using smooth quantization, 4-bit quantization or Intel® Extension For PyTor… ☆15 · Feb 27, 2024 · Updated 2 years ago
- A dual-chatbot system for learning languages based on LangChain ☆13 · Jun 25, 2023 · Updated 2 years ago
- Enabling Live Migration for Computational Notebooks. ☆14 · Mar 11, 2024 · Updated 2 years ago
- Where we keep our notes about model training runs. ☆16 · Mar 12, 2023 · Updated 3 years ago
- Create and manage isolated Git worktrees for AI coding agents. ☆27 · Mar 3, 2026 · Updated 2 weeks ago
- [UNMAINTAINED] Tessel 1's getting started page ☆32 · Oct 26, 2015 · Updated 10 years ago
- Personal collection of pipes and filters I use for open-webui ☆26 · Mar 10, 2026 · Updated last week
- Web UI for Bark by Suno.ai built with next.js ☆12 · Jun 15, 2023 · Updated 2 years ago
- ☆15 · Dec 3, 2024 · Updated last year
- Unofficial Mirror of https://github.com/aireveries/RarePlanes.git ☆13 · May 20, 2022 · Updated 3 years ago
- German Anki Decks based on KIT lectures ☆15 · Feb 7, 2023 · Updated 3 years ago
- ☆14 · Dec 3, 2025 · Updated 3 months ago
- Basic Geometry and Linear Algebra library ☆15 · Feb 14, 2023 · Updated 3 years ago
- Convert your Raspberry Pi into a DMX512 controller ☆11 · Apr 14, 2024 · Updated last year
- Title says it all, doesn't it? ☆21 · Aug 3, 2014 · Updated 11 years ago
- A minifier based on typescript compiler to safely rename all of the variables, classes, functions, properties and methods to short. ☆11 · Jan 3, 2026 · Updated 2 months ago
- ☆78 · Nov 26, 2024 · Updated last year
- Unleash the full potential of exascale LLMs on consumer-class GPUs, proven by extensive benchmarks, with no long-term adjustments and min… ☆26 · Nov 11, 2024 · Updated last year