plaggy / fast-whisper-server
ASR + diarization model server with speculative decoding
☆44 · Updated 3 months ago
Related projects:
- Scripts to create your own MoE models using MLX ☆86 · Updated 6 months ago
- ☆81 · Updated 2 weeks ago
- GPT-4 Level Conversational QA Trained In a Few Hours ☆52 · Updated 3 weeks ago
- C++ inference wrappers for running blazing fast embedding services on your favourite serverless like AWS Lambda. By Prithivi Da, PRs welc… ☆21 · Updated 6 months ago
- Using multiple LLMs for ensemble forecasting ☆17 · Updated 8 months ago
- Transcribe and summarize videos using Whisper and LLMs on the Apple MLX framework ☆68 · Updated 7 months ago
- OpenMindedChatbot is a Proof Of Concept that leverages the power of Open source Large Language Models (LLM) with Function Calling capabil… ☆26 · Updated 9 months ago
- ☆20 · Updated 6 months ago
- High level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ⚡ ☆58 · Updated 2 weeks ago
- ☆101 · Updated 5 months ago
- A framework for evaluating function calls made by LLMs ☆34 · Updated last month
- Automatic fine-tuning of models with synthetic data ☆71 · Updated 7 months ago
- A pipeline parallel training script for LLMs. ☆79 · Updated 3 weeks ago
- Dataset Viber is your chill repo for data collection, annotation and vibe checks. ☆38 · Updated 2 weeks ago
- Beating the GAIA benchmark with Transformers Agents. 🚀 ☆56 · Updated 2 weeks ago
- Llama3.1 learns to Listen ☆134 · Updated this week
- The Benefits of a Concise Chain of Thought on Problem Solving in Large Language Models ☆20 · Updated 7 months ago
- Vision Document Retrieval (ViDoRe): Benchmark. Evaluation code for the ColPali paper. ☆101 · Updated last week
- Open source and AI-powered web search engine: local, private, dockerized and supported by a fluffy llama 🦙 ☆49 · Updated last month
- Writing Blog Posts with Generative Feedback Loops! ☆41 · Updated 6 months ago
- RAG example using DSPy, Gradio, FastAPI ☆57 · Updated 5 months ago
- A collection of notebooks for the Hugging Face blog series (https://huggingface.co/blog). ☆41 · Updated last month
- Fast approximate inference on a single GPU with sparsity aware offloading ☆39 · Updated 8 months ago
- Hugging Face chat-ui integration with mlx-lm server ☆59 · Updated 7 months ago
- All the world is a play, we are but actors in it. ☆46 · Updated 2 months ago
- ReDel is a toolkit for researchers and developers to build, iterate on, and analyze recursive multi-agent systems. ☆48 · Updated 3 weeks ago
- ☆59 · Updated last week
- A Streamlit app for visualizing LLM evals. ☆38 · Updated 8 months ago
- StructuredRAG Benchmarker ☆85 · Updated this week
- Leveraging DSPy for AI-driven task understanding and solution generation, the Self-Discover Framework automates problem-solving through r… ☆53 · Updated 2 months ago