☆14Apr 26, 2024Updated 2 years ago
Alternatives and similar repositories for llama2-from-scratch
Users that are interested in llama2-from-scratch are comparing it to the libraries listed below.
- Code for the paper "Getting the most out of your tokenizer for pre-training and domain adaptation"☆22Feb 14, 2024Updated 2 years ago
- A tool allowing students of Coursera's Heterogeneous Parallel Programming to work on homework using a machine without a CUDA GPU.☆11Mar 11, 2015Updated 11 years ago
- A SyCL 2020 course from Teacher Xiaopeng (小彭老师) (under construction; to be released later in live streams)☆15Sep 3, 2023Updated 2 years ago
- ☆15Feb 1, 2016Updated 10 years ago
- ☆10Nov 14, 2023Updated 2 years ago
- Asynchronous Rust bindings for SPDK.☆18Nov 1, 2022Updated 3 years ago
- Example code for the book "CUDA C Programming: The Authoritative Guide" (CUDA_C编程权威指南)☆13Mar 22, 2023Updated 3 years ago
- [TOG 2024] BlockFusion: Expandable 3D Scene Generation using Latent Tri-plane Extrapolation☆16Jun 14, 2024Updated 2 years ago
- ☆14Jun 23, 2025Updated last year
- An implementation of parallelism techniques such as amp, ddp, pp, and tp, for learning purposes☆14Nov 18, 2023Updated 2 years ago
- Mini CCL - A lightweight collective communication library☆34Jan 2, 2026Updated 5 months ago
- A NodeJS application to upload, watch and stream live videos.☆12Jan 24, 2023Updated 3 years ago
- Rust FTL + WebRTC live streaming software.☆13Mar 12, 2022Updated 4 years ago
- Reverse engineered Twitter's API☆12Nov 28, 2023Updated 2 years ago
- Brax + Pufferlib + CARBS for gpu-accelerated robotics RL☆12Jun 12, 2025Updated last year
- SMASH is a hardware-software cooperative mechanism that enables highly-efficient indexing and storage of sparse matrices. The key idea of…☆19May 17, 2020Updated 6 years ago
- Lab 5 project of MIT-6.5940, deploying LLaMA2-7B-chat on one's laptop with TinyChatEngine.☆18Dec 1, 2023Updated 2 years ago
- Optimizing diffusion for production-ready speeds☆40Jan 10, 2026Updated 5 months ago
- Expand -> Retrieve -> Rerank - simple method with strong results on BRIGHT benchmark☆22Aug 22, 2025Updated 10 months ago
- A universal thread-safe memory pool.☆26Jul 20, 2018Updated 7 years ago
- ☆19May 17, 2016Updated 10 years ago
- DGEMM on KNL, achieving 75% of MKL☆19May 19, 2022Updated 4 years ago
- SYCL materials for ENCCS workshop☆25Apr 25, 2023Updated 3 years ago
- An MLIR frontend for tensor expressions☆24Sep 5, 2020Updated 5 years ago
- Efficient implementations of Merge Sort and Bitonic Sort algorithms using CUDA for GPU parallel processing, resulting in accelerated sort…☆22Jul 27, 2023Updated 2 years ago
- Developing a high-precision legal expert LLM application called Contract Advisor RAG. The project's goal is to create a Retrieval Augment…☆16Apr 10, 2024Updated 2 years ago
- Multimodal RAG using LlamaIndex, Qdrant, and llama.cpp for document QA with local vision LLM and embedding models☆20Nov 8, 2024Updated last year
- C++20 Memory Allocator library☆37Apr 30, 2025Updated last year
- Hacker News Search and RAG built using Rust actix-web, minijinja, SolidJS, Vite, and Redis queues☆31Dec 11, 2024Updated last year
- Open-source Search GPT engine☆21Nov 4, 2024Updated last year
- Rust futures on spdk☆32Oct 6, 2019Updated 6 years ago
- Document summarization app using a large language model (LLM) and the Langchain framework. Used a pre-trained T5 model and its tokenizer from H…☆15Oct 5, 2023Updated 2 years ago
- Text summarization using Python, deep learning, machine learning, transformers, Hugging Face, OpenAI, and Langchain☆13Nov 26, 2024Updated last year
- Robot Learning Algorithms☆25Aug 19, 2024Updated last year
- AI Search Engine in 70 lines☆12Feb 23, 2025Updated last year
- React-based reader and editor for creating notes and flashcards directly from PDF documents.☆16Apr 23, 2024Updated 2 years ago
- UniRL is a framework for unified multimodal model reinforcement learning☆707Updated this week
- Various versions (CPU, CUDA_NAIVE, CUDA_TILED, GEMM) of convolutional neural network implementations by Heechul Lim☆30Feb 21, 2018Updated 8 years ago
- Image retrieval/tagging with CLIP☆13Jul 13, 2024Updated last year