replicate / cog-triton
A cog implementation of Nvidia's Triton server
☆17 · Updated last year
Alternatives and similar repositories for cog-triton
Users who are interested in cog-triton are comparing it to the libraries listed below.
- Contains the model patches and the eval logs from the passing swe-bench-lite run. ☆10 · Updated last year
- Chat AI (↓↓Scroll to see more↓↓) ☆27 · Updated last year
- ☆11 · Updated last year
- Unleash the full potential of exascale LLMs on consumer-class GPUs, proven by extensive benchmarks, with no long-term adjustments and min… ☆26 · Updated last year
- ☆49 · Updated 10 months ago
- Website with current metrics on the fastest AI models. ☆43 · Updated last year
- Generate visual podcasts about novels using open source models ☆25 · Updated 2 years ago
- Using modal.com to process FineWeb-edu data ☆20 · Updated 10 months ago
- A library to convert Pydantic models to TypedDict ☆37 · Updated last year
- A mono-repo to house the various supported Transport options to be used with Pipecat's client-js package ☆30 · Updated last week
- Apps that run on modal.com ☆12 · Updated 4 months ago
- A clone of OpenAI's Tokenizer page for HuggingFace Models ☆46 · Updated 2 years ago
- Proof of concept for running moshi/hibiki using webrtc ☆20 · Updated 11 months ago
- Tool4AI: A model agnostic, LLM friendly router for tool/function call ☆19 · Updated last year
- A Next.js chatbot app demonstrating seamless integration with window.ai. ☆15 · Updated 2 years ago
- Run LLMs on Replicate with vLLM ☆26 · Updated 6 months ago
- Deploy your autonomous agents to production grade environments with 99% Uptime Guarantee, Infinite Scalability, and self-healing. ☆50 · Updated 3 months ago
- ☆19 · Updated last year
- An app for generating prompts ☆28 · Updated 5 months ago
- ☆33 · Updated 2 years ago
- Create embeddings with infinity as serverless endpoint ☆42 · Updated 2 months ago
- ☆15 · Updated 2 years ago
- Pipeline is an open source python SDK for building AI/ML workflows ☆138 · Updated last year
- Cog wrapper for collabora/WhisperSpeech ☆25 · Updated last year
- ☆40 · Updated 8 months ago
- A function to do all ☆34 · Updated last year
- A multimodal RAG application that enables semantic search on multimedia sources like audio, video and images ☆41 · Updated 2 years ago
- GPU accelerated client-side embeddings for vector search, RAG etc. ☆65 · Updated 2 years ago
- Langchain Agent utilizing OpenAI Function Calls to execute Git commands using Natural Language ☆44 · Updated 2 years ago
- ☆34 · Updated last year