theblackcat102/evol-dataset

theblackcat102 / evol-dataset

evol augment any dataset online

☆61

Alternatives and similar repositories for evol-dataset

Users who are interested in evol-dataset are comparing it to the repositories listed below.

nickrosh / evol-teacher
Open Source WizardCoder Dataset
☆166 · Jul 12, 2023 · Updated 3 years ago
Naman-ntc / FastCode
Utilities for efficient fine-tuning, inference and evaluation of code generation models
☆21 · Oct 3, 2023 · Updated 2 years ago
jina-ai / textbook
Distill ChatGPT's coding ability into a small (1B) model
☆31 · Sep 7, 2023 · Updated 2 years ago
ntunlp / ExecEval
A distributed, extensible, secure solution for evaluating machine generated code with unit tests in multiple programming languages.
☆64 · Oct 21, 2024 · Updated last year
CarperAI / decontamination
This repository contains code for cleaning your training data of benchmark data to help combat data snooping.
☆28 · Apr 21, 2023 · Updated 3 years ago
swtheing / WizardCoder_Instruct_Generator
Generate the WizardCoder Instruct from the CodeAlpaca
☆21 · Jun 27, 2023 · Updated 3 years ago
ManifoldRG / NEKO_Archive
The NEKO Project is an open source effort to build a model of equivalent scale and capability as that reported in DeepMind’s 2022 Paper, …
☆10 · Sep 2, 2023 · Updated 2 years ago
jondurbin / bagel
A bagel, with everything.
☆326 · Apr 11, 2024 · Updated 2 years ago
all-the-noises / eval-arena
☆34 · Mar 21, 2026 · Updated 4 months ago
my-other-github-account / llm-humaneval-benchmarks
☆86 · May 15, 2026 · Updated 2 months ago
VikParuchuri / textbook_quality
Generate textbook-quality synthetic LLM pretraining data
☆508 · Oct 19, 2023 · Updated 2 years ago
Chillee / llm.c
LLM training in simple, raw C/CUDA
☆18 · May 6, 2024 · Updated 2 years ago
nlpxucan / evol-instruct
☆287 · Apr 25, 2023 · Updated 3 years ago
zarakiquemparte / zaraki-tools
☆28 · Aug 30, 2023 · Updated 2 years ago
Hanchen-Wang / GoGNN
☆10 · Nov 30, 2022 · Updated 3 years ago
mettamind-ai / physics_of_llms
Experiments related to LLMs for Vietnamese (inspired by the Physics of LLMs series)
☆11 · Oct 21, 2024 · Updated last year
huggingface / peft-pytorch-conference
Code for the examples presented in the talk "Training a Llama in your backyard: fine-tuning very large models on consumer hardware" given…
☆15 · Oct 16, 2023 · Updated 2 years ago
dmis-lab / ArkDTA
☆11 · Apr 11, 2023 · Updated 3 years ago
DeepSoftwareAnalytics / Telly
Replication package for ISSTA2023 paper - Towards Efficient Fine-tuning of Pre-trained Code Models: An Experimental Study and Beyond
☆23 · Apr 9, 2023 · Updated 3 years ago
amazon-science / cceval
CrossCodeEval: A Diverse and Multilingual Benchmark for Cross-File Code Completion (NeurIPS 2023)
☆182 · Aug 15, 2025 · Updated 11 months ago
rootker / chatgpt-cli
ChatGPT written in C++
☆14 · Jan 5, 2023 · Updated 3 years ago
neelsjain / NEFTune
Official repository of NEFTune: Noisy Embeddings Improves Instruction Finetuning
☆412 · May 17, 2024 · Updated 2 years ago
abacaj / code-eval
Run evaluation on LLMs using human-eval benchmark
☆431 · Sep 12, 2023 · Updated 2 years ago
JuneTse / ReInceptionE
☆13 · Mar 16, 2021 · Updated 5 years ago
terryyz / DataAug4Code
Source Code Data Augmentation for Deep Learning: A Survey.
☆67 · Jun 15, 2024 · Updated 2 years ago
ve3wwg / teensy3_qemu
Changes to QEMU to accommodate the teensy3.x ARM platform (Cortex-M4)
☆16 · Oct 13, 2019 · Updated 6 years ago
Locutusque / TPU-Alignment
Fully fine-tune large models like Mistral, Llama-2-13B, or Qwen-14B completely for free
☆234 · Oct 31, 2024 · Updated last year
coastalcph / lexlms
LeXFiles and LegalLAMA: Facilitating English Multinational Legal Language Model Development
☆23 · Jul 24, 2023 · Updated 3 years ago
jplhughes / dotfiles
Easily deploy my zsh and tmux configuration on new machines. Includes local and remote aliases to improve workflow.
☆15 · Apr 23, 2026 · Updated 3 months ago
tpoisonooo / open-r1
Fully open reproduction of DeepSeek-R1
☆11 · Mar 24, 2025 · Updated last year
ChrisHayduk / qlora-multi-gpu
QLoRA with Enhanced Multi GPU Support
☆38 · Aug 8, 2023 · Updated 2 years ago
Shen-Lab / CPAC
[Bioinformatics 2022] Cross-Modality and Self-Supervised Protein Embedding for Compound-Protein Affinity and Contact Prediction
☆16 · Jun 6, 2024 · Updated 2 years ago
white127 / SQUAD-2.0-bidaf
☆11 · Aug 8, 2018 · Updated 7 years ago
LGH1gh / PromptProtein
UPDATE: All future changes will be pushed to https://github.com/HICAI-ZJU/PromptProtein
☆15 · Apr 23, 2023 · Updated 3 years ago
FreedomIntelligence / OVM
☆74 · Apr 2, 2024 · Updated 2 years ago
Zyphra / Zyda_processing
☆44 · Jun 19, 2024 · Updated 2 years ago
wanteatfruit / WebGraph
Chrome Extension for visualizing browsing history
☆11 · Sep 6, 2023 · Updated 2 years ago
nuprl / MultiPL-E
A multi-programming language benchmark for LLMs
☆314 · Apr 12, 2026 · Updated 3 months ago
agungdwiprasetyo / gojek-parking-lot
GO-JEK Challenge
☆11 · Jul 9, 2018 · Updated 8 years ago