LLaMa Tuning with Stanford Alpaca Dataset using Deepspeed and Transformers
☆49Mar 15, 2023Updated 3 years ago
Alternatives and similar repositories for llama-tune
Users that are interested in llama-tune are comparing it to the libraries listed below.
- Train LLaMA with LoRA on a single 4090 and merge the LoRA weights to work as Stanford Alpaca.☆52Jun 16, 2023Updated 2 years ago
- ☆10Jun 1, 2024Updated last year
- Train large COMET (T5-3B/GPT2-XL) with small memory (on 11GB memory GPUs like 1080/2080) using DeepSpeed.☆14Jan 23, 2022Updated 4 years ago
- train llama on a single A100 80G node using 🤗 transformers and 🚀 Deepspeed Pipeline Parallelism☆223Nov 21, 2023Updated 2 years ago
- This project uses BERT and other pretrained models to implement multiple-choice machine reading comprehension (Multiple Choice MRC)☆16Jun 20, 2021Updated 4 years ago
- From Symbolic Logic Reasoning to Soft Reasoning: A Neural-Symbolic Paradigm☆12Jul 18, 2022Updated 3 years ago
- Fast inference of Instruct tuned LLaMa on your personal devices.☆23Mar 16, 2023Updated 3 years ago
- KuaiSearch PERKS☆12Nov 16, 2021Updated 4 years ago
- Alpaca-lora for huggingface implementation using Deepspeed and FullyShardedDataParallel☆24Apr 3, 2023Updated 3 years ago
- Implements reinforcement learning training for GPT-2, LLaMA, BLOOM, and other LLMs☆27Sep 19, 2023Updated 2 years ago
- Collection of scripts to pretrain T5 in unsupervised text, using PyTorch Lightning. CORD-19 pretraining provided as example.☆32Apr 26, 2021Updated 4 years ago
- AdaLoGN: Adaptive Logic Graph Network for Reasoning-Based Machine Reading Comprehension (ACL 2022)☆27May 20, 2022Updated 3 years ago
- A Python implementation of Toolformer using Huggingface Transformers☆14Mar 20, 2023Updated 3 years ago
- [Findings of ACL 2022] Meta-Path Guided Contrastive Learning for Logical Reasoning of Text☆28Apr 8, 2026Updated last week
- PyTorch implementation of StableMask (ICML'24)☆15Jun 27, 2024Updated last year
- A Keras implementation of the AAAI21 paper "a lightweight neural model for biomedical entity linking"☆53Jul 24, 2022Updated 3 years ago
- Train 🤗transformers with DeepSpeed: ZeRO-2, ZeRO-3☆23May 20, 2021Updated 4 years ago
- NanoDet for Jetson Nano☆11Sep 30, 2023Updated 2 years ago
- Example of Alpaca-LoRA with llama index.☆31Mar 30, 2023Updated 3 years ago
- [EMNLP 2020] PyTorch code of PRover: Proof Generation for Interpretable Reasoning over Rules☆19Jul 6, 2023Updated 2 years ago
- Author implementation of the paper "Don’t paraphrase, detect! Rapid and Effective Data Collection for Semantic Parsing"☆20Oct 5, 2020Updated 5 years ago
- [ECCV] HYDRA: A Hyper Agent for Dynamic Compositional Visual Reasoning☆26Sep 6, 2025Updated 7 months ago
- NumGLUE: A Suite of Fundamental yet Challenging Mathematical Reasoning Tasks☆20May 10, 2022Updated 3 years ago
- Code for Incorporating Relevance Feedback for Information-Seeking Retrieval using Few-Shot Document Re-Ranking, EMNLP 2022, https://aclan…☆14Mar 30, 2026Updated 2 weeks ago
- Transferability of Natural Language Inference to Biomedical Question Answering☆12Mar 25, 2021Updated 5 years ago
- Balanced K-means in Pytorch with strong GPU acceleration☆12Apr 30, 2020Updated 5 years ago
- Submission archive for the MS MARCO passage ranking leaderboard☆13Apr 21, 2023Updated 2 years ago
- The aim of this repository is to utilize LLaMA to reproduce and enhance the Stanford Alpaca☆98Apr 5, 2023Updated 3 years ago
- [EACL 2023] CoTEVer: Chain of Thought Prompting Annotation Toolkit for Explanation Verification☆42Apr 29, 2023Updated 2 years ago
- An implementation of an autoregressive language model using an improved Transformer and DeepSpeed pipeline parallelism.☆30Jan 12, 2026Updated 3 months ago
- Dockerized Expo application☆12Apr 4, 2023Updated 3 years ago
- a tiny, exploitable chatbot that can use tools☆32Apr 5, 2023Updated 3 years ago
- Rust crate for submitting inference requests to machine learning models☆15May 24, 2024Updated last year
- Deepspeed, LLM, Medical_Dialogue, medical large language model, pretraining, fine-tuning☆298Jun 7, 2024Updated last year
- A retrieval-based dialogue system solution built on vector recall, dense retrieval, FAQ…☆35Nov 10, 2021Updated 4 years ago
- ☆10Aug 11, 2019Updated 6 years ago
- ☆10Apr 4, 2023Updated 3 years ago
- Generic build server☆65May 25, 2014Updated 11 years ago
- Annotating Columns with Pre-trained Language Models☆34Jun 10, 2022Updated 3 years ago