pharaouk / dharma
☆13 Updated 11 months ago
Alternatives and similar repositories for dharma:
Users who are interested in dharma are comparing it to the repositories listed below.
- ☆112 Updated 4 months ago
- look how they massacred my boy ☆63 Updated 6 months ago
- Full finetuning of large language models without large memory requirements ☆94 Updated last year
- Comprehensive analysis of difference in performance of QLora, Lora, and Full Finetunes. ☆82 Updated last year
- KMD is a collection of conversational exchanges between patients and doctors on various medical topics. It aims to capture the intricaci… ☆24 Updated last year
- ☆48 Updated last year
- smolLM with Entropix sampler on pytorch ☆151 Updated 5 months ago
- Entropy Based Sampling and Parallel CoT Decoding ☆17 Updated 6 months ago
- 🦾💻🌐 distributed training & serverless inference at scale on RunPod ☆17 Updated 10 months ago
- MLX port for xjdr's entropix sampler (mimics jax implementation) ☆63 Updated 5 months ago
- Modify Entropy Based Sampling to work with Mac Silicon via MLX ☆50 Updated 5 months ago
- Just a bunch of benchmark logs for different LLMs ☆119 Updated 8 months ago
- ☆94 Updated last year
- ☆49 Updated last year
- an open source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere) ☆95 Updated last month
- ☆66 Updated 10 months ago
- an implementation of Self-Extend, to expand the context window via grouped attention ☆119 Updated last year
- ☆129 Updated 8 months ago
- ☆28 Updated last year
- ☆38 Updated 8 months ago
- ☆73 Updated last year
- Cerule - A Tiny Mighty Vision Model ☆67 Updated 7 months ago
- Lego for GRPO ☆26 Updated 2 weeks ago
- ☆48 Updated 5 months ago
- Simple GRPO scripts and configurations. ☆58 Updated 2 months ago
- entropix style sampling + GUI ☆25 Updated 5 months ago
- autologic is a Python package that implements the SELF-DISCOVER framework proposed in the paper SELF-DISCOVER: Large Language Models Self… ☆57 Updated last year
- LLM-Training-API: Including Embeddings & ReRankers, mergekit, LaserRMT ☆27 Updated last year
- Synthetic data derived by templating, few shot prompting, transformations on public domain corpora, and monte carlo tree search. ☆31 Updated last month
- inference code for mixtral-8x7b-32kseqlen ☆99 Updated last year