bigcode-project / bigcode-analysis
Repository for analysis and experiments in the BigCode project.
☆113 · Updated 5 months ago
Related projects:
- Code for the curation of The Stack v2 and StarCoder2 training data ☆85 · Updated 5 months ago
- This project studies the performance and robustness of language models and task-adaptation methods. ☆141 · Updated 4 months ago
- A framework for few-shot evaluation of autoregressive language models. ☆98 · Updated last year
- Positional Skip-wise Training for Efficient Context Window Extension of LLMs to Extremely Length (ICLR 2024) ☆195 · Updated 3 months ago
- What's In My Big Data (WIMBD) - a toolkit for analyzing large text datasets ☆174 · Updated last week
- Open Instruction Generalist is an assistant trained on massive synthetic instructions to perform many millions of tasks ☆204 · Updated 8 months ago
- Tk-Instruct is a Transformer model that is tuned to solve many NLP tasks by following instructions. ☆177 · Updated last year
- Chain-of-Hindsight, A Scalable RLHF Method ☆213 · Updated 11 months ago
- An experimental implementation of the retrieval-enhanced language model ☆75 · Updated last year
- [EMNLP 2023] The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning ☆201 · Updated 10 months ago
- Scaling Data-Constrained Language Models ☆310 · Updated this week
- evol augment any dataset online ☆55 · Updated last year
- Official implementation for 'Extending LLMs’ Context Window with 100 Samples' ☆72 · Updated 8 months ago
- The dataset and code for paper: TheoremQA: A Theorem-driven Question Answering dataset ☆153 · Updated 4 months ago
- DSIR large-scale data selection framework for language model training ☆221 · Updated 5 months ago
- Accepted by Transactions on Machine Learning Research (TMLR) ☆115 · Updated 8 months ago
- Python tools for processing the stackexchange data dumps into a text dataset for Language Models ☆74 · Updated 9 months ago
- CRUXEval: Code Reasoning, Understanding, and Execution Evaluation ☆99 · Updated last month
- The data processing pipeline for the Koala chatbot language model ☆115 · Updated last year
- Multipack distributed sampler for fast padding-free training of LLMs ☆170 · Updated last month
- distill chatGPT coding ability into small model (1b) ☆24 · Updated last year
- Code for the paper "Efficient Training of Language Models to Fill in the Middle" ☆162 · Updated last year