evol augment any dataset online
☆61Aug 3, 2023Updated 2 years ago
Alternatives and similar repositories for evol-dataset
Users that are interested in evol-dataset are comparing it to the libraries listed below.
- Open Source WizardCoder Dataset☆166Jul 12, 2023Updated 2 years ago
- A repository to perform self-instruct with a model on HF Hub☆32Sep 29, 2023Updated 2 years ago
- Utilities for efficient fine-tuning, inference and evaluation of code generation models☆21Oct 3, 2023Updated 2 years ago
- Distill ChatGPT's coding ability into a small model (1B)☆31Sep 7, 2023Updated 2 years ago
- Accepted by Transactions on Machine Learning Research (TMLR)☆135Oct 5, 2024Updated last year
- A distributed, extensible, secure solution for evaluating machine generated code with unit tests in multiple programming languages.☆65Oct 21, 2024Updated last year
- Generate the WizardCoder Instruct from the CodeAlpaca☆21Jun 27, 2023Updated 2 years ago
- This repository contains code for cleaning your training data of benchmark data to help combat data snooping.☆28Apr 21, 2023Updated 3 years ago
- A bagel, with everything.☆326Apr 11, 2024Updated 2 years ago
- ☆34Mar 21, 2026Updated 2 months ago
- ☆86May 15, 2026Updated 2 weeks ago
- An original implementation of the paper "CREPE: Open-Domain Question Answering with False Presuppositions"☆16Nov 5, 2024Updated last year
- Generate textbook-quality synthetic LLM pretraining data☆508Oct 19, 2023Updated 2 years ago
- ☆285Apr 25, 2023Updated 3 years ago
- LLM training in simple, raw C/CUDA☆18May 6, 2024Updated 2 years ago
- ☆28Aug 30, 2023Updated 2 years ago
- [EMNLP 2024] Multi-modal reasoning problems via code generation.☆28Apr 14, 2026Updated last month
- Experiments related to LLMs for Vietnamese (inspired by the Physics of LLMs series)☆11Oct 21, 2024Updated last year
- Official Code and Data repository of our ACL 2021 paper X-FACT: A New Benchmark Dataset for Multilingual Fact Checking.☆27Oct 4, 2024Updated last year
- The NEKO Project is an open source effort to build a model of equivalent scale and capability as that reported in DeepMind’s 2022 Paper, …☆10Sep 2, 2023Updated 2 years ago
- ☆10Nov 30, 2022Updated 3 years ago
- Code for the examples presented in the talk "Training a Llama in your backyard: fine-tuning very large models on consumer hardware" given…☆15Oct 16, 2023Updated 2 years ago
- CrossCodeEval: A Diverse and Multilingual Benchmark for Cross-File Code Completion (NeurIPS 2023)☆177Aug 15, 2025Updated 9 months ago
- Provides a minimal implementation to extract FLAN datasets for further processing☆11Feb 1, 2023Updated 3 years ago
- The official implementation of the paper "DaMo: Data Mixing Optimizer in Fine-tuning Multimodal LLMs for Mobile Phone Agents"☆31Oct 23, 2025Updated 7 months ago
- Official repository of NEFTune: Noisy Embeddings Improves Instruction Finetuning☆411May 17, 2024Updated 2 years ago
- Run evaluation on LLMs using human-eval benchmark☆430Sep 12, 2023Updated 2 years ago
- MAP: Low-compute Model Merging with Amortized Pareto Fronts via Quadratic Approximation☆17Sep 2, 2024Updated last year
- Source Code Data Augmentation for Deep Learning: A Survey.☆66Jun 15, 2024Updated last year
- Fully fine-tune large models like Mistral, Llama-2-13B, or Qwen-14B completely for free☆234Oct 31, 2024Updated last year
- Open-Source Implementation of WizardLM to turn documents into Q:A pairs for LLM fine-tuning☆307Oct 24, 2024Updated last year
- Seq2seq Type Inference using Static Analysis and CodeT5☆32Jul 9, 2023Updated 2 years ago
- The Official Repo for Paper: Aligning Clinical Needs and AI Capabilities: A Survey on LLMs for Medical Reasoning☆23Apr 7, 2026Updated last month
- Fully open reproduction of DeepSeek-R1☆11Mar 24, 2025Updated last year
- QLoRA with Enhanced Multi GPU Support☆38Aug 8, 2023Updated 2 years ago
- Official code for the paper "CodeChain: Towards Modular Code Generation Through Chain of Self-revisions with Representative Sub-modules"☆48Nov 10, 2025Updated 6 months ago
- ☆74Apr 2, 2024Updated 2 years ago
- ☆17Jan 30, 2023Updated 3 years ago
- ☆45Jun 19, 2024Updated last year