xjdr-alt / mla_blog_translation
☆13 · Updated 10 months ago
Alternatives and similar repositories for mla_blog_translation:
Users who are interested in mla_blog_translation are comparing it to the repositories listed below.
- NanoGPT (124M) quality in 2.67B tokens ☆28 · Updated last week
- ☆71 · Updated this week
- RWKV-7: Surpassing GPT ☆83 · Updated 5 months ago
- An open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere) ☆96 · Updated last month
- Lego for GRPO ☆27 · Updated 3 weeks ago
- Collection of autoregressive model implementations ☆85 · Updated 2 months ago
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna ☆39 · Updated 2 months ago
- train with kittens! ☆57 · Updated 6 months ago
- Cerule - A Tiny Mighty Vision Model ☆67 · Updated 7 months ago
- ☆49 · Updated last year
- Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks ☆31 · Updated 11 months ago
- look how they massacred my boy ☆63 · Updated 6 months ago
- NanoGPT-speedrunning for the poor T4 enjoyers ☆62 · Updated this week
- Tree Attention: Topology-aware Decoding for Long-Context Attention on GPU clusters ☆126 · Updated 4 months ago
- ☆27 · Updated 9 months ago
- https://x.com/BlinkDL_AI/status/1884768989743882276 ☆27 · Updated 2 months ago
- A tree-based prefix cache library that allows rapid creation of looms: hierarchical branching pathways of LLM generations. ☆68 · Updated 2 months ago
- Repo hosting code and materials related to speeding up LLM inference using token merging. ☆36 · Updated 11 months ago
- Because it's there. ☆16 · Updated 7 months ago
- ☆25 · Updated 3 months ago
- Zeus LLM Trainer is a rewrite of Stanford Alpaca aiming to be the trainer for all Large Language Models ☆69 · Updated last year
- Make triton easier ☆47 · Updated 10 months ago
- A repository of prompts and Python scripts for intelligent transformation of raw text into diverse formats. ☆30 · Updated last year
- Modeling code for a BitNet b1.58 Llama-style model. ☆24 · Updated 11 months ago
- ☆53 · Updated last month
- ☆66 · Updated 11 months ago
- Simplex Random Feature attention, in PyTorch ☆74 · Updated last year
- train entropix like a champ! ☆20 · Updated 6 months ago
- Simple GRPO scripts and configurations. ☆58 · Updated 2 months ago
- QuIP quantization ☆51 · Updated last year