poudels14 / llama2_rs_old
☆23 · Updated 2 years ago
Alternatives and similar repositories for llama2_rs_old
Users interested in llama2_rs_old are comparing it to the libraries listed below.
- ☆133 · Updated last year
- Rust Implementation of micrograd ☆53 · Updated last year
- ☆138 · Updated last year
- implement llava using candle ☆15 · Updated last year
- Inference Llama 2 in one file of zero-dependency, zero-unsafe Rust ☆39 · Updated 2 years ago
- This repository has code for fine-tuning LLMs with GRPO specifically for Rust Programming using cargo as feedback ☆107 · Updated 7 months ago
- Inference Llama 2 in one file of pure Rust 🦀 ☆233 · Updated 2 years ago
- A single-binary, GPU-accelerated LLM server (HTTP and WebSocket API) written in Rust ☆79 · Updated last year
- "PyTorch in Rust" ☆16 · Updated last year
- inference code for mixtral-8x7b-32kseqlen ☆101 · Updated last year
- Full finetuning of large language models without large memory requirements ☆93 · Updated 2 weeks ago
- Simplex Random Feature attention, in PyTorch ☆73 · Updated 2 years ago
- ☆61 · Updated last year
- ☆88 · Updated last year
- ☆46 · Updated last year
- ☆144 · Updated 2 years ago
- ☆157 · Updated 2 years ago
- A sketch of a Transformer in Rust for a blog post ☆34 · Updated 3 years ago
- Port of Andrej Karpathy's nanoGPT to Apple MLX framework. ☆112 · Updated last year
- Make triton easier ☆47 · Updated last year
- This code repository contains the code used for my "Optimizing Memory Usage for Training LLMs and Vision Transformers in PyTorch" blog po… ☆92 · Updated 2 years ago
- A really tiny autograd engine ☆95 · Updated 4 months ago
- Mixtral finetuning ☆19 · Updated last year
- Notus is a collection of fine-tuned LLMs using SFT, DPO, SFT+DPO, and/or any other RLHF techniques, while always keeping a data-first app… ☆169 · Updated last year
- gzip Predicts Data-dependent Scaling Laws ☆34 · Updated last year
- ☆54 · Updated last year
- Functional local implementations of main model parallelism approaches ☆96 · Updated 2 years ago
- ☆12 · Updated 8 months ago
- ☆28 · Updated last year
- ModuleFormer is a MoE-based architecture that includes two different types of experts: stick-breaking attention heads and feedforward exp… ☆223 · Updated 3 weeks ago