apple / ml-hypercloning
☆34 Updated last week
Related projects
Alternatives and complementary repositories for ml-hypercloning
- ☆43 Updated 2 months ago
- ☆76 Updated 6 months ago
- ☆44 Updated 2 months ago
- MEXMA: Token-level objectives improve sentence representations ☆32 Updated this week
- ☆40 Updated this week
- Code for the NeurIPS LLM Efficiency Challenge ☆53 Updated 7 months ago
- Hugging Face Inference Toolkit used to serve transformers, sentence-transformers, and diffusers models. ☆49 Updated last week
- QAmeleon introduces synthetic multilingual QA data using PaLM, a 540B large language model. This dataset was generated by prompt tuning P… ☆34 Updated last year
- ☆31 Updated 10 months ago
- Code and files for the paper "Are Emergent Abilities in Large Language Models just In-Context Learning?" ☆34 Updated 7 months ago
- Explorations into the proposal from the paper "Grokfast, Accelerated Grokking by Amplifying Slow Gradients" ☆84 Updated 2 months ago
- ☆39 Updated 9 months ago
- Collection of autoregressive model implementations ☆66 Updated last week
- Some common Hugging Face transformers in maximal update parametrization (µP) ☆76 Updated 2 years ago
- Supercharge Hugging Face transformers with model parallelism. ☆74 Updated last month
- ☆40 Updated 6 months ago
- My explorations into editing the knowledge and memories of an attention network ☆34 Updated last year
- A place to store reusable transformer components of my own creation or found on the interwebs ☆43 Updated this week
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆46 Updated 2 months ago
- Code for training & evaluating Contextual Document Embedding models ☆93 Updated this week
- Training hybrid models for dummies. ☆15 Updated 2 weeks ago
- ☆37 Updated last year
- Index of URLs to PDF files all over the internet and scripts ☆21 Updated last year
- Repository containing awesome resources regarding Hugging Face tooling. ☆43 Updated 10 months ago
- ☆26 Updated 4 months ago
- Transformer with Mu-Parameterization, implemented in Jax/Flax. Supports FSDP on TPU pods. ☆29 Updated last week
- See https://github.com/cuda-mode/triton-index/ instead! ☆11 Updated 6 months ago
- Repository containing the SPIN experiments on the DIBT 10k ranked prompts ☆22 Updated 8 months ago