samlhuillier / spider-sql-finetune
☆17 Updated last year
Alternatives and similar repositories for spider-sql-finetune:
Users who are interested in spider-sql-finetune are comparing it to the libraries listed below.
- NeurIPS 2023 - Cappy: Outperforming and Boosting Large Multi-Task LMs with a Small Scorer ☆40 Updated 10 months ago
- Fast LLM training codebase with dynamic strategy choosing [Deepspeed+Megatron+FlashAttention+CudaFusionKernel+Compiler] ☆36 Updated last year
- ☆20 Updated 8 months ago
- Data preparation code for CrystalCoder 7B LLM ☆44 Updated 9 months ago
- PyTorch Implementation of the paper "MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training" ☆23 Updated last week
- ☆23 Updated 5 months ago
- Understanding the correlation between different LLM benchmarks ☆29 Updated last year
- ☆20 Updated 3 months ago
- A simple GPT-based evaluation tool for multi-aspect, interpretable assessment of LLMs. ☆83 Updated last year
- Self-Controlled Memory System for LLMs ☆45 Updated 9 months ago
- Simple Implementation of TinyGPTV in super simple Zeta lego blocks ☆15 Updated 3 months ago
- Official repository for the paper "Approximating Two-Layer Feedforward Networks for Efficient Transformers" ☆36 Updated last year
- Code for the examples presented in the talk "Training a Llama in your backyard: fine-tuning very large models on consumer hardware" given… ☆14 Updated last year
- ☆31 Updated 8 months ago
- ☆60 Updated last week
- ☆18 Updated last year
- Demonstration that finetuning a RoPE model on longer sequences than it was pre-trained on adapts the model's context limit ☆63 Updated last year
- Finetune any model on HF in less than 30 seconds ☆58 Updated 3 weeks ago
- [ACL 2024] RelayAttention for Efficient Large Language Model Serving with Long System Prompts ☆38 Updated 11 months ago
- ☆26 Updated last year
- LLMs as Collaboratively Edited Knowledge Bases ☆44 Updated last year
- A public implementation of the ReLoRA pretraining method, built on Lightning-AI's PyTorch Lightning suite. ☆33 Updated 11 months ago
- Tutorial for LLM developers about engine design, service deployment, evaluation/benchmark, etc. Provide a C/S style optimized LLM inferen… ☆19 Updated last year
- Contextual Position Encoding but with some custom CUDA kernels (https://arxiv.org/abs/2405.18719) ☆22 Updated 8 months ago
- Multi-Layer Key-Value sharing experiments on Pythia models ☆32 Updated 8 months ago
- Reasoning by Communicating with Agents ☆24 Updated 4 months ago
- Microsoft Phi 2 Streamlit App, deployed on HuggingFace Spaces, is based on the Microsoft Phi 2 small language model (SLM) for text generat… ☆14 Updated 9 months ago
- Q-Probe: A Lightweight Approach to Reward Maximization for Language Models ☆40 Updated 8 months ago
- LMTuner: Make the LLM Better for Everyone ☆33 Updated last year