HackerCupAI / starter-kits
☆64 Updated 7 months ago
Alternatives and similar repositories for starter-kits
Users who are interested in starter-kits are comparing it to the libraries listed below
- A competition to get you started on the NeurIPS AI Hackercup ☆28 Updated 7 months ago
- ☆23 Updated 7 months ago
- Arrakis is a library to conduct, track and visualize mechanistic interpretability experiments. ☆29 Updated 3 weeks ago
- Building GPT ... ☆17 Updated 5 months ago
- ☆90 Updated 2 weeks ago
- ML/DL Math and Method notes ☆60 Updated last year
- Collection of autoregressive model implementations ☆85 Updated 3 weeks ago
- A repository to unravel the language of GPUs, making their kernel conversations easy to understand ☆180 Updated this week
- ☆40 Updated last year
- Notebooks for fine-tuning PaliGemma ☆102 Updated last month
- Code to reproduce "Transformers Can Do Arithmetic with the Right Embeddings", McLeish et al. (NeurIPS 2024) ☆189 Updated 11 months ago
- A set of scripts and notebooks on LLM fine-tuning and dataset creation ☆109 Updated 7 months ago
- ☆74 Updated 3 weeks ago
- ☆19 Updated 10 months ago
- ☆129 Updated last month
- Code for NeurIPS LLM Efficiency Challenge ☆57 Updated last year
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆49 Updated 10 months ago
- Large-scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still work in progress)* ☆82 Updated last year
- Just a bunch of benchmark logs for different LLMs ☆119 Updated 9 months ago
- Set of scripts to finetune LLMs ☆37 Updated last year
- A puzzle to learn about prompting ☆128 Updated 2 years ago
- An open-source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere) ☆98 Updated 2 months ago
- Open source interpretability artefacts for R1. ☆109 Updated 3 weeks ago
- ☆54 Updated 3 months ago
- LLM attention pattern visualizer ☆10 Updated last year
- Compiling useful links, papers, benchmarks, ideas, etc. ☆46 Updated last month
- ☆123 Updated 6 months ago
- Write a fast kernel and run it on Discord. See how you compare against the best! ☆44 Updated last week
- Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models. TMLR 2025. 🔗 https://arxiv.org/abs/2411.049… ☆31 Updated this week
- Simple repository for training small reasoning models ☆27 Updated 3 months ago