☆22 · Oct 1, 2024 · Updated last year
Alternatives and similar repositories for LLM-inference-speed-benchmarks
Users that are interested in LLM-inference-speed-benchmarks are comparing it to the libraries listed below.
- Attend - to what matters. ☆17 · Feb 22, 2025 · Updated last year
- Modified Beam Search with periodical restart ☆12 · Sep 12, 2024 · Updated last year
- QLoRA: Efficient Finetuning of Quantized LLMs ☆11 · Jul 22, 2023 · Updated 2 years ago
- Precision Knowledge Editing (PKE): A novel method to reduce toxicity in LLMs while preserving performance, with robust evaluations and ha… ☆11 · Nov 26, 2024 · Updated last year
- An RPG Maker MZ plugin ☆12 · Nov 2, 2023 · Updated 2 years ago
- Explore semantic caching to reduce your OpenAI/LLM API bill ☆11 · Jul 21, 2023 · Updated 2 years ago
- SYN flood implementation using Boost.Asio ☆12 · Nov 20, 2014 · Updated 11 years ago
- openai-proxy-vercel ☆12 · Aug 11, 2023 · Updated 2 years ago
- Work with your business data using natural language ☆19 · Nov 20, 2024 · Updated last year
- ☆14 · Sep 24, 2024 · Updated last year
- Code repository for the paper on "Predicting the Performance of Black-Box LLMs through Self-Queries". ☆12 · Jan 9, 2025 · Updated last year
- A discord theme which makes the guild and user sidebars collapsible ☆14 · Jan 25, 2026 · Updated 3 months ago
- Simple implementation of an AABB Tree (Axis Aligned Bounding Box Tree) to optimize 3d collision detection ☆10 · Oct 22, 2024 · Updated last year
- A curated collection of OpenClaw resources: GCP installation guide, best practices, and use cases ☆20 · Feb 19, 2026 · Updated 2 months ago
- Load and run Llama from safetensors files in C ☆15 · Oct 24, 2024 · Updated last year
- A simple implementation of anti-spam bot for itmo opensource chat ☆11 · Sep 29, 2025 · Updated 7 months ago
- how to deal with multi-armed bandit problem through different approaches ☆13 · Jul 6, 2023 · Updated 2 years ago
- G'MIC-Qt is a versatile front-end to the image processing framework G'MIC. ☆17 · Mar 18, 2026 · Updated last month
- a character-ai like UI for LLM ☆10 · Dec 3, 2024 · Updated last year
- ☆22 · Mar 25, 2025 · Updated last year
- ☆14 · Nov 23, 2018 · Updated 7 years ago
- ☆11 · Aug 5, 2024 · Updated last year
- Locust on k8s example for scalable load tests ☆14 · Apr 16, 2022 · Updated 4 years ago
- Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks ☆31 · May 22, 2024 · Updated last year
- SpyGame: An interactive multi-agent framework to evaluate intelligence with large language models :D ☆15 · Nov 9, 2023 · Updated 2 years ago
- ☆16 · Mar 14, 2025 · Updated last year
- A simple implementation of Llama 1, 2. Llama Architecture built from scratch using PyTorch all the models are built from scratch that inc… ☆14 · May 6, 2024 · Updated 2 years ago
- Revision of official yolov7-pose to support custom dataset for keypoint detection ☆11 · Nov 12, 2023 · Updated 2 years ago
- ☆14 · Aug 22, 2024 · Updated last year
- Example code for the book "AI Programming with GPT-4, ChatGPT, LlamaIndex, and LangChain" (《GPT-4, ChatGPT, 라마인덱스, 랭체인을 활용한 인공지능 프로그래밍》) ☆10 · Jan 16, 2024 · Updated 2 years ago
- CLI tool for setting G400 mouse settings (sampling rate and DPI) w/libusb ☆26 · Nov 6, 2020 · Updated 5 years ago
- ☆14 · Mar 18, 2025 · Updated last year
- ☆13 · Mar 6, 2024 · Updated 2 years ago
- Implementation of the Hierarchical and Interpretable Skill Acquisition in Multi-task Reinforcement Learning by Tianmin Shu, Caiming Xiong… ☆11 · Jun 18, 2018 · Updated 7 years ago
- ☆11 · May 20, 2022 · Updated 3 years ago
- KMM: Key Frame Mask Mamba for Extended Motion Generation ☆19 · Sep 22, 2025 · Updated 7 months ago
- Script that converts 7.1 surround sound files to virtual surround stereo using HeSuVi. For Dolby Atmos, check out: https://github.com/Thr… ☆17 · Dec 26, 2020 · Updated 5 years ago
- Build your own offline AI from any documents. Free. No coding. LoRA fine-tuning + RAG + GGUF export. ☆91 · Mar 21, 2026 · Updated last month
- Inference Llama/Llama2/Llama3 Modes in NumPy ☆20 · Nov 22, 2023 · Updated 2 years ago