☆78 · Dec 26, 2023 · Updated 2 years ago
Alternatives and similar repositories for detect-pretrain-code-contamination
Users that are interested in detect-pretrain-code-contamination are comparing it to the libraries listed below.
- Sakura-SOLAR-DPO: Merge, SFT, and DPO · ☆116 · Dec 30, 2023 · Updated 2 years ago
- This is our own implementation of 'Layer Selective Rank Reduction' · ☆240 · May 26, 2024 · Updated last year
- ☆68 · May 26, 2024 · Updated last year
- ☆27 · Mar 13, 2024 · Updated 2 years ago
- Paper, dataset and code list for multimodal dialogue. · ☆22 · Jan 2, 2025 · Updated last year
- ☆20 · Jul 24, 2024 · Updated last year
- ☆10 · Jan 20, 2024 · Updated 2 years ago
- 🕸 GlotCC Dataset and Pipeline -- NeurIPS 2024 · ☆20 · Apr 6, 2025 · Updated 11 months ago
- Official Code For Dual Grained Quantization: Efficient Fine-Grained Quantization for LLM · ☆14 · Dec 27, 2023 · Updated 2 years ago
- Tools for merging pretrained large language models. · ☆6,895 · Mar 15, 2026 · Updated last week
- Difference-based Contrastive Learning for Korean Sentence Embeddings · ☆23 · Mar 11, 2026 · Updated 2 weeks ago
- Index of URLs to pdf files all over the internet and scripts · ☆25 · May 2, 2023 · Updated 2 years ago
- Code for the paper "Rethinking Benchmark and Contamination for Language Models with Rephrased Samples" · ☆320 · Dec 20, 2023 · Updated 2 years ago
- Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks · ☆31 · May 22, 2024 · Updated last year
- ☆15 · Sep 6, 2024 · Updated last year
- ☆17 · Apr 11, 2024 · Updated last year
- This is the repository for our paper "INTERS: Unlocking the Power of Large Language Models in Search with Instruction Tuning" · ☆207 · Feb 18, 2026 · Updated last month
- Extract a single expert from a Mixture Of Experts model using slerp interpolation. · ☆19 · May 26, 2024 · Updated last year
- ☆143 · Aug 20, 2025 · Updated 7 months ago
- Modular task agnostic training pipeline using LFM2 from Liquid AI with unsloth. · ☆16 · Sep 13, 2025 · Updated 6 months ago
- Code for Blog Post: Can Better Cold-Start Strategies Improve RL Training for LLMs? · ☆20 · Mar 9, 2025 · Updated last year
- Automatically evaluate your LLMs in Google Colab · ☆687 · May 7, 2024 · Updated last year
- Merge Transformers language models by use of gradient parameters. · ☆214 · Aug 8, 2024 · Updated last year
- Simple Model Similarities Analysis · ☆21 · Feb 3, 2024 · Updated 2 years ago
- Natural Language Processing Tasks and Examples. · ☆61 · Aug 17, 2022 · Updated 3 years ago
- ☆10 · Dec 19, 2023 · Updated 2 years ago
- A tool for cross-checking Verilog compilers · ☆14 · Apr 16, 2025 · Updated 11 months ago
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna · ☆59 · Oct 18, 2025 · Updated 5 months ago
- Automated Identification of Redundant Layer Blocks for Pruning in Large Language Models · ☆263 · Apr 23, 2024 · Updated last year
- QLoRA: Efficient Finetuning of Quantized LLMs · ☆79 · Apr 10, 2024 · Updated last year
- All the world is a play, we are but actors in it. · ☆50 · Jul 21, 2025 · Updated 8 months ago
- EvolKit is an innovative framework designed to automatically enhance the complexity of instructions used for fine-tuning Large Language M… · ☆253 · Oct 30, 2024 · Updated last year
- Download, parse, and filter data from Phil Papers. Data-ready for The-Pile. · ☆19 · Aug 28, 2023 · Updated 2 years ago
- prototype of plant-disease-detector · ☆10 · Apr 21, 2021 · Updated 4 years ago
- Source code for Truth-Aware Context Selection: Mitigating the Hallucinations of Large Language Models Being Misled by Untruthful Contexts · ☆17 · Sep 2, 2024 · Updated last year
- Luber : A ridesharing App · ☆14 · Dec 13, 2017 · Updated 8 years ago
- Evaluating LLMs with Dynamic Data · ☆113 · Feb 11, 2026 · Updated last month
- Easily convert HuggingFace models to GGUF-format for llama.cpp · ☆23 · Jul 27, 2024 · Updated last year
- 🥤🧑🏻🚀Code and dataset for our EMNLP 2023 paper - "SODA: Million-scale Dialogue Distillation with Social Commonsense Contextualization… · ☆240 · Jan 23, 2026 · Updated 2 months ago