evol augment any dataset online
☆61Aug 3, 2023Updated 2 years ago
Alternatives and similar repositories for evol-dataset
Users that are interested in evol-dataset are comparing it to the libraries listed below
Sorting:
- Open Source WizardCoder Dataset☆164Jul 12, 2023Updated 2 years ago
- Utilities for efficient fine-tuning, inference and evaluation of code generation models☆21Oct 3, 2023Updated 2 years ago
- This repository contains code for cleaning your training data of benchmark data to help combat data snooping.☆27Apr 21, 2023Updated 2 years ago
- A distributed, extensible, secure solution for evaluating machine generated code with unit tests in multiple programming languages.☆62Oct 21, 2024Updated last year
- Official Code and Data repository of our ACL 2021 paper X-FACT: A New Benchmark Dataset for Multilingual Fact Checking.☆27Oct 4, 2024Updated last year
- Replication package for ISSTA2023 paper - Towards Efficient Fine-tuning of Pre-trained Code Models: An Experimental Study and Beyond☆23Apr 9, 2023Updated 2 years ago
- LeXFiles and LegalLAMA: Facilitating English Multinational Legal Language Model Development☆21Jul 24, 2023Updated 2 years ago
- A bagel, with everything.☆326Apr 11, 2024Updated last year
- Generate the WizardCoder Instruct from the CodeAlpaca☆21Jun 27, 2023Updated 2 years ago
- ☆24Sep 2, 2024Updated last year
- Generate textbook-quality synthetic LLM pretraining data☆509Oct 19, 2023Updated 2 years ago
- CrossCodeEval: A Diverse and Multilingual Benchmark for Cross-File Code Completion (NeurIPS 2023)☆175Aug 15, 2025Updated 6 months ago
- Zero-shot evaluation on LEXGLUE tasks with GTP3.5☆29Mar 11, 2023Updated 2 years ago
- Official repository of NEFTune: Noisy Embeddings Improves Instruction Finetuning☆409May 17, 2024Updated last year
- ☆33Updated this week
- ☆27Aug 30, 2023Updated 2 years ago
- Artifact associated with the paper "Zero-Shot Transfer Learning with Synthesized Data for Multi-Domain Dialogue State Tracking"☆25May 4, 2020Updated 5 years ago
- Seq2seq Type Inference using Static Analysis and CodeT5☆32Jul 9, 2023Updated 2 years ago
- Official codes for the paper "Learning Hierarchical Discrete Linguistic Units from Visually-Grounded Speech"☆28Feb 22, 2022Updated 4 years ago
- CodeUltraFeedback: aligning large language models to coding preferences (TOSEM 2025)☆73Jun 25, 2024Updated last year
- Official code for the ICLR 2020 paper 'ARE PPE-TRAINED LANGUAGE MODELS AWARE OF PHRASES? SIMPLE BUT STRONG BASELINES FOR GRAMMAR INDCUTIO…☆30Jun 12, 2023Updated 2 years ago
- ☆23Apr 8, 2024Updated last year
- ☆72Apr 2, 2024Updated last year
- ☆11Jun 22, 2016Updated 9 years ago
- ☆282Apr 25, 2023Updated 2 years ago
- InsTag: A Tool for Data Analysis in LLM Supervised Fine-tuning☆285Aug 20, 2023Updated 2 years ago
- The Multitask Long Document Benchmark☆42Nov 2, 2022Updated 3 years ago
- This repository contains code to quantitatively evaluate instruction-tuned models such as Alpaca and Flan-T5 on held-out tasks.☆552Mar 10, 2024Updated 2 years ago
- Fully fine-tune large models like Mistral, Llama-2-13B, or Qwen-14B completely for free☆233Oct 31, 2024Updated last year
- The official implement of paper 《DaMo: Data Mixing Optimizer in Fine-tuning Multimodal LLMs for Mobile Phone Agents》☆29Oct 23, 2025Updated 4 months ago
- A Deepfake detector based on hybrid EfficientNet CNN and Vision Transformer archietcture. The model is explainable by rendering a heatma…☆15Mar 16, 2022Updated 3 years ago
- Continual Resilient (CoRe) Optimizer for PyTorch☆11Jun 10, 2024Updated last year
- Official code for the paper "CodeChain: Towards Modular Code Generation Through Chain of Self-revisions with Representative Sub-modules"☆49Nov 10, 2025Updated 3 months ago
- IntructIR, a novel benchmark specifically designed to evaluate the instruction following ability in information retrieval models. Our foc…☆32Jun 13, 2024Updated last year
- This is our own implementation of 'Layer Selective Rank Reduction'☆240May 26, 2024Updated last year
- Implementation of a modular, high-performance, and simplistic mamba for high-speed applications☆40Nov 11, 2024Updated last year
- A framework for the evaluation of autoregressive code generation language models.☆1,020Jul 22, 2025Updated 7 months ago
- ☆12Aug 15, 2023Updated 2 years ago
- Multi Stopwatch for Python☆12Sep 28, 2019Updated 6 years ago