evol augment any dataset online
☆61 · Aug 3, 2023 · Updated 2 years ago
Alternatives and similar repositories for evol-dataset
Users that are interested in evol-dataset are comparing it to the libraries listed below.
- Open Source WizardCoder Dataset ☆166 · Jul 12, 2023 · Updated 2 years ago
- A repository to perform self-instruct with a model on HF Hub ☆32 · Sep 29, 2023 · Updated 2 years ago
- Utilities for efficient fine-tuning, inference and evaluation of code generation models ☆21 · Oct 3, 2023 · Updated 2 years ago
- Distill ChatGPT coding ability into a small model (1B) ☆30 · Sep 7, 2023 · Updated 2 years ago
- Accepted by Transactions on Machine Learning Research (TMLR) ☆135 · Oct 5, 2024 · Updated last year
- A distributed, extensible, secure solution for evaluating machine generated code with unit tests in multiple programming languages. ☆62 · Oct 21, 2024 · Updated last year
- Generate the WizardCoder Instruct from the CodeAlpaca ☆21 · Jun 27, 2023 · Updated 2 years ago
- This repository contains code for cleaning your training data of benchmark data to help combat data snooping. ☆28 · Apr 21, 2023 · Updated 2 years ago
- A bagel, with everything. ☆326 · Apr 11, 2024 · Updated 2 years ago
- ☆34 · Mar 21, 2026 · Updated 3 weeks ago
- Python module that creates a context map for AI code generation ☆29 · Aug 14, 2024 · Updated last year
- Generate textbook-quality synthetic LLM pretraining data ☆509 · Oct 19, 2023 · Updated 2 years ago
- Code related to the ELM neuron. ☆14 · Feb 27, 2024 · Updated 2 years ago
- ☆282 · Apr 25, 2023 · Updated 2 years ago
- LLM training in simple, raw C/CUDA ☆18 · May 6, 2024 · Updated last year
- ☆28 · Aug 30, 2023 · Updated 2 years ago
- [EMNLP 2024] Multi-modal reasoning problems via code generation. ☆28 · Updated this week
- Official Code and Data repository of our ACL 2021 paper X-FACT: A New Benchmark Dataset for Multilingual Fact Checking. ☆27 · Oct 4, 2024 · Updated last year
- Code for the examples presented in the talk "Training a Llama in your backyard: fine-tuning very large models on consumer hardware" given… ☆15 · Oct 16, 2023 · Updated 2 years ago
- ☆11 · Apr 11, 2023 · Updated 3 years ago
- CrossCodeEval: A Diverse and Multilingual Benchmark for Cross-File Code Completion (NeurIPS 2023) ☆177 · Aug 15, 2025 · Updated 8 months ago
- ☆12 · Aug 15, 2023 · Updated 2 years ago
- Provides a minimal implementation to extract FLAN datasets for further processing ☆11 · Feb 1, 2023 · Updated 3 years ago
- The official implementation of the paper "DaMo: Data Mixing Optimizer in Fine-tuning Multimodal LLMs for Mobile Phone Agents" ☆30 · Oct 23, 2025 · Updated 5 months ago
- Official repository of NEFTune: Noisy Embeddings Improves Instruction Finetuning ☆413 · May 17, 2024 · Updated last year
- Run evaluation on LLMs using human-eval benchmark ☆430 · Sep 12, 2023 · Updated 2 years ago
- MAP: Low-compute Model Merging with Amortized Pareto Fronts via Quadratic Approximation ☆16 · Sep 2, 2024 · Updated last year
- Fully fine-tune large models like Mistral, Llama-2-13B, or Qwen-14B completely for free ☆234 · Oct 31, 2024 · Updated last year
- Open-Source Implementation of WizardLM to turn documents into Q:A pairs for LLM fine-tuning ☆308 · Oct 24, 2024 · Updated last year
- Changes to QEMU to accommodate the Teensy 3.x ARM platform (Cortex-M4) ☆16 · Oct 13, 2019 · Updated 6 years ago
- Seq2seq Type Inference using Static Analysis and CodeT5 ☆32 · Jul 9, 2023 · Updated 2 years ago
- The Official Repo for Paper: Aligning Clinical Needs and AI Capabilities: A Survey on LLMs for Medical Reasoning ☆22 · Apr 7, 2026 · Updated last week
- Fully open reproduction of DeepSeek-R1 ☆11 · Mar 24, 2025 · Updated last year
- Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor… ☆20 · Feb 27, 2024 · Updated 2 years ago
- QLoRA with Enhanced Multi GPU Support ☆38 · Aug 8, 2023 · Updated 2 years ago
- Official code for the paper "CodeChain: Towards Modular Code Generation Through Chain of Self-revisions with Representative Sub-modules" ☆48 · Nov 10, 2025 · Updated 5 months ago
- ☆11 · Aug 8, 2018 · Updated 7 years ago
- [Bioinformatics 2022] Cross-Modality and Self-Supervised Protein Embedding for Compound-Protein Affinity and Contact Prediction ☆16 · Jun 6, 2024 · Updated last year
- UPDATE: All future changes will be pushed to https://github.com/HICAI-ZJU/PromptProtein ☆15 · Apr 23, 2023 · Updated 2 years ago