ajeetkharel / gpt2-from-scratch
Build and Train a GPT-2 from scratch using PyTorch
☆14Updated 6 months ago
Alternatives and similar repositories for gpt2-from-scratch:
Users who are interested in gpt2-from-scratch are comparing it to the repositories listed below
- Delta-CoMe can achieve near-lossless 1-bit compression and has been accepted by NeurIPS 2024☆52Updated 2 months ago
- ☆10Updated 3 months ago
- ☆18Updated 4 months ago
- CausalMatch is a Bytedance research project aimed at integrating cutting-edge machine learning and econometrics methods to bring about au…☆45Updated this week
- ☆96Updated 4 months ago
- ☆16Updated 2 weeks ago
- ☆39Updated last month
- Multi-Layer Key-Value sharing experiments on Pythia models☆32Updated 7 months ago
- A REST API for vLLM, production ready☆17Updated this week
- Self-host LLMs with vLLM and BentoML☆79Updated last week
- A library integrating embedding and reranker models from OpenAI, SentenceTransformers, etc. for semantic search in vector databases.☆29Updated this week
- Repository containing awesome resources regarding Hugging Face tooling.☆46Updated last year
- Accelerate Model Training with PyTorch 2.X, published by Packt☆35Updated 7 months ago
- 🧰 The AutoTokenizer that TikToken always needed -- Load any tokenizer with TikToken now! ✨☆33Updated 2 weeks ago
- ☆20Updated 7 months ago
- ☆12Updated last week
- Code for KaLM-Embedding models☆64Updated last week
- DST (Dialogue State Tracker) for LLM (Large Language Model)☆22Updated last year
- ☆26Updated 5 months ago
- World's Smallest Vision-Language Model☆24Updated 9 months ago
- Deploy a light and full OpenAI API for production with vLLM to support /v1/embeddings with all embedding models.☆39Updated 6 months ago
- Inference for Llama/Llama2/Llama3 models in NumPy☆20Updated last year
- ☆40Updated 9 months ago
- ☆27Updated 2 months ago
- Notebook and Scripts that showcase running quantized diffusion models on consumer GPUs☆37Updated 2 months ago
- Repo designed to help learn the Hugging Face ecosystem (transformers, datasets, accelerate + more).☆50Updated 3 months ago
- ☆29Updated 7 months ago
- A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers.☆34Updated last month
- POS for African languages☆17Updated 11 months ago
- ☆63Updated last month