uygarkurt / BERT-PyTorch
☆17Updated 3 months ago
Alternatives and similar repositories for BERT-PyTorch:
Users that are interested in BERT-PyTorch are comparing it to the libraries listed below
- LORA: Low-Rank Adaptation of Large Language Models implemented using PyTorch☆101Updated last year
- Playground for Transformers☆49Updated last year
- BERT explained from scratch☆12Updated last year
- Set of scripts to finetune LLMs☆37Updated last year
- A single repo with all scripts and utils to train / fine-tune the Mamba model with or without FIM☆54Updated last year
- LLM_library is a comprehensive repository that serves as a one-stop resource for hands-on code and insightful summaries.☆69Updated last year
- Quantization of LLMs and benchmarking.☆10Updated last year
- Combining ViT and GPT-2 for image captioning. Trained on MS-COCO. The model was implemented mostly from scratch.☆43Updated last year
- The simplest, fastest repository for training/finetuning medium-sized xLSTMs.☆42Updated 11 months ago
- Experimenting with small language models☆65Updated last year
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets.☆74Updated 6 months ago
- Notebook and Scripts that showcase running quantized diffusion models on consumer GPUs☆38Updated 5 months ago
- RL significantly improves the reasoning capability of Qwen2.5-1.5B-Instruct☆28Updated 2 months ago
- Complete implementation of Llama2 with/without KV cache & inference 🚀☆47Updated 11 months ago
- A chatbot UI for RAG, multimodal, text completion. (support Transformers, llama.cpp, MLX, vLLM)☆19Updated last year
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute…☆49Updated 9 months ago
- This is the code that went into our practical dive into using Mamba for information extraction☆54Updated last year
- Distributed training (multi-node) of a Transformer model☆64Updated last year
- Notes on Direct Preference Optimization☆19Updated last year
- Reweight GPT - a simple neural network using transformer architecture for next character prediction☆53Updated last year
- A blueprint for creating Pretraining and Fine-Tuning datasets for Indic languages☆106Updated 6 months ago
- minimal LLM scripts for 24GB VRAM GPUs. training, inference, whatever☆38Updated last month
- MathPrompter Implementation: This repository hosts an implementation based on the 'MathPrompter: Mathematical Reasoning Using Large Langu…☆13Updated 2 weeks ago
- nanogpt turned into a chat model☆68Updated last year
- Tutorial for how to build BERT from scratch☆92Updated 11 months ago
- Building a 2.3M-parameter LLM from scratch with LLaMA 1 architecture.☆158Updated 11 months ago
- Fine-tuning large language models (LLMs) is crucial for enhancing performance across domain-specific task applications. This comprehensiv…☆12Updated 7 months ago
- Implementation of the Mamba SSM with hf_integration.☆56Updated 7 months ago
- An overview of GRPO & DeepSeek-R1 Training with Open Source GRPO Model Fine Tuning☆31Updated 2 months ago