replicate / cog-triton
A cog implementation of Nvidia's Triton server
☆17 · Updated last year
Alternatives and similar repositories for cog-triton
Users who are interested in cog-triton are comparing it to the repositories listed below.
- Unleash the full potential of exascale LLMs on consumer-class GPUs, proven by extensive benchmarks, with no long-term adjustments and min… ☆26 · Updated last year
- Contains the model patches and the eval logs from the passing swe-bench-lite run. ☆10 · Updated last year
- ☆11 · Updated last year
- Chat AI (↓↓Scroll to see more↓↓) ☆27 · Updated last year
- Simple script to quiz LLMs ☆29 · Updated 2 years ago
- A mono-repo to house the various supported Transport options to be used with Pipecat's client-js package ☆30 · Updated last week
- Apps that run on modal.com ☆12 · Updated 4 months ago
- Chrome Extension for exploring Hugging Face datasets 🔎 ☆48 · Updated last year
- ☆15 · Updated 2 years ago
- A clone of OpenAI's Tokenizer page for HuggingFace Models ☆46 · Updated 2 years ago
- Using modal.com to process FineWeb-edu data ☆20 · Updated 10 months ago
- Proof of concept for running moshi/hibiki using webrtc ☆20 · Updated 11 months ago
- A library to convert Pydantic models to TypedDict ☆37 · Updated last year
- Run LLMs on Replicate with vLLM ☆26 · Updated 6 months ago
- An app for generating prompts ☆28 · Updated 5 months ago
- Tool4AI: A model agnostic, LLM friendly router for tool/function call ☆19 · Updated last year
- ☆33 · Updated 2 years ago
- Embedding models from Jina AI ☆65 · Updated 2 years ago
- Generate visual podcasts about novels using open source models ☆25 · Updated 2 years ago
- ☆34 · Updated last year
- 👩🤝🤖 A curated list of datasets for large language models (LLMs), RLHF and related resources (continually updated) ☆24 · Updated 2 years ago
- Code Interpreter Replica ☆26 · Updated 2 years ago
- Convert an audio file to a waveform video ☆11 · Updated 2 years ago
- A modular framework for building massively parallel agentic systems ☆29 · Updated 5 months ago
- ☆19 · Updated last year
- ☆40 · Updated 8 months ago
- This repository is designed for deploying and managing server processes that handle embeddings using the Infinity Embedding model or Larg… ☆26 · Updated 11 months ago
- Langchain Agent utilizing OpenAI Function Calls to execute Git commands using Natural Language ☆44 · Updated 2 years ago
- GPU accelerated client-side embeddings for vector search, RAG etc. ☆65 · Updated 2 years ago
- ☆15 · Updated 2 years ago