replicate / cog
Containers for machine learning
★8,090 · Updated this week
Related projects
Alternatives and complementary repositories for cog
- 🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading ★9,248 · Updated 2 months ago
- Weaviate is an open-source vector database that stores both objects and vectors, allowing for the combination of vector search with struc… ★11,533 · Updated this week
- the AI-native open-source embedding database ★15,448 · Updated this week
- Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Ad… ★5,994 · Updated 2 months ago
- Large Language Model Text Generation Inference ★9,122 · Updated this week
- tiktoken is a fast BPE tokeniser for use with OpenAI's models. ★12,427 · Updated last month
- Qdrant - High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI. Also available in the cl… ★20,632 · Updated this week
- AITemplate is a Python framework which renders neural network into high performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (N… ★4,561 · Updated 3 weeks ago
- All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows ★9,375 · Updated this week
- Hackable and optimized Transformers building blocks, supporting a composable construction. ★8,660 · Updated this week
- Postgres with GPUs for ML/AI apps. ★6,038 · Updated last week
- Train transformer language models with reinforcement learning. ★10,086 · Updated this week
- QLoRA: Efficient Finetuning of Quantized LLMs ★10,059 · Updated 5 months ago
- SkyPilot: Run AI and batch jobs on any infra (Kubernetes or 12+ clouds). Get unified execution, cost savings, and high GPU availability v… ★6,814 · Updated this week
- Fast and memory-efficient exact attention ★14,279 · Updated this week
- Tensor library for machine learning ★11,233 · Updated this week
- A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training ★20,199 · Updated 3 months ago
- Accessible large language models via k-bit quantization for PyTorch. ★6,299 · Updated this week
- AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file convert… ★17,780 · Updated this week
- A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (i… ★7,958 · Updated this week
- Structured Text Generation ★9,487 · Updated this week
- A collection of libraries to optimise AI model performances ★8,375 · Updated 3 months ago
- A high-throughput and memory-efficient inference and serving engine for LLMs ★30,423 · Updated this week
- Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models" ★10,776 · Updated 3 months ago
- Running large language models on a single GPU for throughput-oriented scenarios. ★9,198 · Updated 3 weeks ago
- OpenLLaMA, a permissively licensed open source reproduction of Meta AI's LLaMA 7B trained on the RedPajama dataset ★7,384 · Updated last year
- 🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning. ★16,471 · Updated this week
- Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks. ★15,049 · Updated last month
- An open source implementation of CLIP. ★10,344 · Updated last week