noanabeshima / wikipedia-downloader
Downloads 2020 English Wikipedia articles as plaintext
☆21 · Updated last year
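Converting Wikipedia articles to plaintext generally means stripping wiki markup. As an illustration only (this is not this repo's code), here is a minimal sketch of a hypothetical helper that removes a few common wikitext constructs with regular expressions:

```python
import re

def wikitext_to_plaintext(text: str) -> str:
    """Strip a few common wikitext constructs (simplified sketch)."""
    # [[target|label]] -> label, [[target]] -> target
    text = re.sub(r"\[\[(?:[^\]|]*\|)?([^\]|]*)\]\]", r"\1", text)
    # '''bold''' and ''italic'' quote markers
    text = re.sub(r"'{2,}", "", text)
    # {{templates}} (non-nested only)
    text = re.sub(r"\{\{[^{}]*\}\}", "", text)
    # <ref>...</ref> citations
    text = re.sub(r"<ref[^>]*>.*?</ref>", "", text, flags=re.DOTALL)
    return text

print(wikitext_to_plaintext(
    "'''Python''' is a [[programming language|language]].{{citation needed}}"
))  # → Python is a language.
```

A real pipeline would use a proper wikitext parser (nested templates, tables, and refs defeat simple regexes); this only sketches the idea.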
Related projects
Alternatives and complementary repositories for wikipedia-downloader
- ☆86 · Updated 2 years ago
- Script for downloading GitHub. ☆88 · Updated 4 months ago
- ☆76 · Updated 11 months ago
- Python tools for processing the Stack Exchange data dumps into a text dataset for language models ☆76 · Updated 11 months ago
- Demonstration that fine-tuning a RoPE model on sequences longer than its pre-training length extends the model's context limit ☆63 · Updated last year
- Lightweight demos for fine-tuning LLMs. Powered by 🤗 Transformers and open-source datasets. ☆64 · Updated 3 weeks ago
- Repository for analysis and experiments in the BigCode project. ☆115 · Updated 7 months ago
- SparseGPT + GPTQ compression of LLMs such as LLaMA, OPT, and Pythia ☆41 · Updated last year
- Distill ChatGPT's coding ability into a small (1B) model ☆24 · Updated last year
- A library for squeakily cleaning and filtering language datasets. ☆45 · Updated last year
- ☆64 · Updated 2 years ago
- Techniques used to run BLOOM inference in parallel ☆37 · Updated 2 years ago
- Reward-model framework for LLM RLHF ☆58 · Updated last year
- The data processing pipeline for the Koala chatbot language model ☆117 · Updated last year
- ☆75 · Updated last year
- Experiments on speculative sampling with Llama models ☆117 · Updated last year
- Open Instruction Generalist, an assistant trained on massive synthetic instructions to perform millions of tasks ☆206 · Updated 10 months ago
- Unofficial implementation of AlpaGasus ☆84 · Updated last year
- ☆83 · Updated last year
- QLoRA with enhanced multi-GPU support ☆36 · Updated last year
- Fine-tuning 6-billion-parameter GPT-J (and other models) with LoRA and 8-bit compression ☆65 · Updated 2 years ago
- Code repository for the c-BTM paper ☆105 · Updated last year
- A new metric for evaluating the faithfulness of text generated by LLMs. The work behind this repository can be found he… ☆31 · Updated last year
- Official repo for the NAACL 2024 Findings paper "LeTI: Learning to Generate from Textual Interactions" ☆62 · Updated last year
- [ICLR 2023] Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners ☆111 · Updated 2 months ago
- Pre-training code for the CrystalCoder 7B LLM ☆53 · Updated 6 months ago
- ☆22 · Updated last year
- Large-scale distributed model training strategy with Colossal-AI and Lightning AI ☆58 · Updated last year
- QuIP quantization ☆46 · Updated 7 months ago