Zyphra / zcookbook
Training hybrid models for dummies.
☆15Updated 2 weeks ago
Related projects
Alternatives and complementary repositories for zcookbook
- An open source replication of the strawberry method that leverages Monte Carlo Search with PPO and/or DPO☆20Updated last week
- A pipeline for using API calls to agnostically convert unstructured data into structured training data☆28Updated last month
- Latent Large Language Models☆16Updated 2 months ago
- See https://github.com/cuda-mode/triton-index/ instead!☆11Updated 6 months ago
- ☆36Updated 3 months ago
- ☆38Updated this week
- [WIP] Transformer to embed Danbooru labelsets☆13Updated 7 months ago
- ☆22Updated last year
- Implementation of Spectral State Space Models☆17Updated 8 months ago
- Alternative way of calculating self-attention☆18Updated 5 months ago
- implementation of https://arxiv.org/pdf/2312.09299☆19Updated 4 months ago
- A place to store reusable transformer components of my own creation or found on the interwebs☆43Updated this week
- ☆11Updated 3 weeks ago
- A public implementation of the ReLoRA pretraining method, built on Lightning-AI's PyTorch Lightning suite.☆33Updated 8 months ago
- BH hackathon☆14Updated 7 months ago
- Alpha-Zero Connect Four NN trained via self play☆13Updated last month
- Implementation of SelfExtend from the paper "LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning" from Pytorch and Zeta☆13Updated this week
- ☆12Updated 3 weeks ago
- NeurIPS 2023 - Cappy: Outperforming and Boosting Large Multi-Task LMs with a Small Scorer☆36Updated 7 months ago
- Rust bindings for CTranslate2☆13Updated last year
- Repository containing the SPIN experiments on the DIBT 10k ranked prompts☆22Updated 8 months ago
- Unleash the full potential of exascale LLMs on consumer-class GPUs, proven by extensive benchmarks, with no long-term adjustments and min…☆23Updated last week
- An example implementation of RLHF (or, more accurately, RLAIF) built on MLX and HuggingFace.☆21Updated 4 months ago
- GoldFinch and other hybrid transformer components☆39Updated 3 months ago
- Exploration using DSPy to optimize modules to maximize performance on the OpenToM dataset☆13Updated 8 months ago
- ☆26Updated 4 months ago
- ☆28Updated 2 weeks ago
- Collection of autoregressive model implementations☆66Updated last week
- ☆35Updated 2 weeks ago