eniompw / nanoGPTshakespeare
Fine-tuning karpathy/nanoGPT on Shakespeare
☆16 · Updated last year
Alternatives and similar repositories for nanoGPTshakespeare:
Users who are interested in nanoGPTshakespeare are comparing it to the libraries listed below.
- A library for simplifying fine-tuning with multi-GPU setups in the Huggingface ecosystem. ☆16 · Updated 2 months ago
- Testing KAN-based text generation GPT models ☆15 · Updated 8 months ago
- Unleash the full potential of exascale LLMs on consumer-class GPUs, proven by extensive benchmarks, with no long-term adjustments and min… ☆23 · Updated 2 months ago
- Fine-tuning Mistral 7B using Huggingface, Weights and Biases, Choline, and Vast AI ☆38 · Updated last year
- A public implementation of the ReLoRA pretraining method, built on Lightning-AI's Pytorch Lightning suite. ☆33 · Updated 10 months ago
- Microsoft Phi 2 Streamlit App, deployed on HuggingFace Spaces, is based on the Microsoft Phi 2 small language model (SLM) for text generat… ☆14 · Updated 8 months ago
- Trying to deconstruct RWKV in understandable terms ☆14 · Updated last year
- ☆60 · Updated last year
- Machine Learning Serving focused on GenAI with simplicity as the top priority. ☆58 · Updated last week
- A repository of Python scripts to scrape code contents of the public repositories of `huggingface`. ☆46 · Updated 10 months ago
- Implementation of SelfExtend from the paper "LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning" from Pytorch and Zeta ☆13 · Updated 2 months ago
- ☆45 · Updated 2 weeks ago
- Effort to open-source 10.5 trillion parameter Gemini model. ☆17 · Updated last year
- Conversational agents for engineering simulations with minimal human input using Microsoft AutoGen & GPT-4o. ☆27 · Updated 5 months ago
- Zeus LLM Trainer is a rewrite of Stanford Alpaca aiming to be the trainer for all Large Language Models ☆70 · Updated last year
- Chat with Qwen2-VL. Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud. ☆10 · Updated 4 months ago
- ☆47 · Updated last week
- Notebooks using the Neural Magic libraries 📓 ☆41 · Updated 5 months ago
- Retrieval-Augmented Generation (RAG) over a Large Language Model (LLM) for PDF data extraction ☆15 · Updated 11 months ago
- It is almost the best 3B model in the current open source industry, surpassing Dolly v2-3b, open lama-3b, and even outperforming the Eleu… ☆13 · Updated last year
- Tools for merging pretrained large language models. ☆19 · Updated 7 months ago
- A langchain agent that retries ☆48 · Updated last year
- Using langchain, deeplake and openai to create a Q&A on the Mojo lang programming manual ☆22 · Updated last year
- Run Llama 2 using MLX on macOS ☆32 · Updated last year
- ☆50 · Updated last month
- Alternative way of calculating self-attention ☆18 · Updated 7 months ago
- Score LLM pretraining data with classifiers ☆55 · Updated last year
- A tutorial for building autonomous agents: with LangChain and from scratch ☆22 · Updated last year
- Very minimal (and stateless) agent framework ☆41 · Updated last week