noanabeshima / wikipedia-downloader
Downloads 2020 English Wikipedia articles as plaintext
☆24 · Updated 2 years ago
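The listing carries only the one-line description, not the repository's own interface or pipeline. As a loose illustration of the task it describes, obtaining English Wikipedia articles as plaintext, here is a minimal sketch using the public MediaWiki TextExtracts API; this is not the repository's dump-based approach, and the function name and article titles below are purely illustrative (assumes the `requests` package is installed).

```python
# Illustrative sketch only (not the repository's own code): fetch plaintext
# extracts of English Wikipedia articles via the public MediaWiki API.
import requests

API_URL = "https://en.wikipedia.org/w/api.php"

def fetch_plaintext(title: str) -> str:
    """Return the plaintext extract of one article (the TextExtracts API
    serves full extracts for only one page per request)."""
    params = {
        "action": "query",
        "prop": "extracts",
        "explaintext": 1,
        "format": "json",
        "formatversion": 2,
        "titles": title,
    }
    resp = requests.get(API_URL, params=params, timeout=30)
    resp.raise_for_status()
    pages = resp.json()["query"]["pages"]
    return pages[0].get("extract", "")

if __name__ == "__main__":
    for title in ["Wikipedia", "Python (programming language)"]:
        text = fetch_plaintext(title)
        print(title, len(text), "characters")
```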
Alternatives and similar repositories for wikipedia-downloader
Users that are interested in wikipedia-downloader are comparing it to the libraries listed below
- ☆91 · Updated 3 years ago
- Python tools for processing the stackexchange data dumps into a text dataset for Language Models · ☆83 · Updated last year
- The data processing pipeline for the Koala chatbot language model · ☆118 · Updated 2 years ago
- A new metric for evaluating the faithfulness of text generated by LLMs. The work behind this repository can be found here… · ☆31 · Updated 2 years ago
- Script for downloading GitHub. · ☆97 · Updated last year
- Pre-training code for CrystalCoder 7B LLM · ☆55 · Updated last year
- Safety Score for Pre-Trained Language Models · ☆96 · Updated 2 years ago
- OLMost every training recipe you need to perform data interventions with the OLMo family of models. · ☆50 · Updated last week
- A library for squeakily cleaning and filtering language datasets. · ☆47 · Updated 2 years ago
- Plug-and-play Search Interfaces with Pyserini and Hugging Face · ☆32 · Updated 2 years ago
- A GPT-based generative LM for combined text and math formulas, leveraging tree-based formula encoding. Published as "Tree-Based Represent…" · ☆40 · Updated 2 years ago
- ☆16 · Updated 6 months ago
- ☆79 · Updated last year
- Developing tools to automatically analyze datasets · ☆75 · Updated 11 months ago
- An Implementation of "Orca: Progressive Learning from Complex Explanation Traces of GPT-4" · ☆42 · Updated last year
- Reward Model framework for LLM RLHF · ☆61 · Updated 2 years ago
- Repository for analysis and experiments in the BigCode project. · ☆124 · Updated last year
- Repository containing the SPIN experiments on the DIBT 10k ranked prompts · ☆24 · Updated last year
- ☆33 · Updated 2 years ago
- Finetune Falcon, LLaMA, MPT, and RedPajama on consumer hardware using PEFT LoRA · ☆103 · Updated 5 months ago
- ☆57 · Updated last year
- DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective. · ☆169 · Updated 3 weeks ago
- Evaluating LLMs with CommonGen-Lite · ☆91 · Updated last year
- ☆26 · Updated 2 months ago
- Distill ChatGPT's coding ability into a small model (1B) · ☆30 · Updated 2 years ago
- Small and Efficient Mathematical Reasoning LLMs · ☆72 · Updated last year
- Open Implementations of LLM Analyses · ☆107 · Updated last year
- 🤗Transformers: State-of-the-art Natural Language Processing for Pytorch and TensorFlow 2.0. · ☆55 · Updated 3 years ago
- [EMNLP 2023 Industry Track] A simple prompting approach that enables the LLMs to run inference in batches. · ☆76 · Updated last year
- ☆158 · Updated 4 years ago