UCSC-REAL/TokenCleaning

[ICML 2025] Official implementation of paper "Token Cleaning: Fine-Grained Data Selection for LLM Supervised Fine-Tuning"

☆53

Alternatives and similar repositories for TokenCleaning

Users who are interested in TokenCleaning are comparing it to the libraries listed below.

UCSC-REAL / DS2
[ICLR 2025] Official implementation of paper "Improving Data Efficiency via Curating LLM-Driven Rating Systems"
☆100 · Mar 24, 2025 · Updated last year
yixinzhang98 / otc_med_chat_agent
An AI-powered conversational agent for recommending over-the-counter medications based on user symptoms and needs. Built with Python and …
☆198 · Jul 29, 2025 · Updated last year
wenhaoli-xmu / seco
☆163 · Nov 16, 2025 · Updated 8 months ago
GenerTeam / GENERanno
GENERanno: A Genomic Foundation Model for Metagenomic Annotation
☆314 · Jun 15, 2026 · Updated last month
WYKwong / LoLTrackGuard
☆149 · Apr 2, 2026 · Updated 3 months ago
shapsider / modalnext
Repository of "Modal-NexT: toward unified heterogeneous cellular data integration"
☆84 · Jun 16, 2025 · Updated last year
hehefan / Translution
☆141 · Feb 7, 2026 · Updated 5 months ago
Irreel / AnyActions
☆132 · Feb 15, 2025 · Updated last year
yixinzhang98 / causal_inference_uplift_toolkits
☆155 · Nov 14, 2025 · Updated 8 months ago
lyanlin96 / Application-Security-Ingress-Controller
☆277 · Apr 29, 2025 · Updated last year
rainbowyuyu / manim_extend_rainbow
Improvements to animations based on Manim, designed to facilitate the demonstration of algorithms in data structures, operating systems, …
☆206 · Dec 15, 2025 · Updated 7 months ago
suimuc / VIRES
☆342 · Jul 4, 2025 · Updated last year
CoderLineChan / SwiftlyUI
UIKit Plus: Infusing SwiftUI-like Development Efficiency. Revolutionizing UIKit development through chain syntax, resultBuilder, and mode…
☆261 · Updated this week
HiGoalV / HiGoalVita
HiGoalVita is a modular, layered, production ready AI RAG suite.
☆252 · May 22, 2025 · Updated last year
GaohaoZhou-ops / JetsonYoloROS
This repository implements Yolo functionality using TensorRT and CUDA acceleration on Nvidia Jetson devices and the ROS framework.
☆205 · Aug 14, 2025 · Updated 11 months ago
JyAether / Aether
☆389 · May 5, 2025 · Updated last year
LeeJarvis996 / edsr_project
☆50 · Dec 12, 2023 · Updated 2 years ago
shalfun / DriVerse
[ACMMM 2025] Official implementation of the paper "DriVerse: Navigation World Model for Driving Simulation via Multimodal Trajectory Prompti…
☆221 · May 7, 2025 · Updated last year
shenshanf / mmdepth
MMDepth: Comprehensive MMEngine-based Framework for Monocular, Stereo & Multi-view Depth Estimation
☆98 · Mar 4, 2025 · Updated last year
hzlab / Brain-Harmony
Official codebase for "Brain Harmony: A Multimodal Foundation Model Unifying Morphology and Function into 1D Tokens" (NeurIPS 2025).
☆243 · Oct 26, 2025 · Updated 9 months ago
ByteDance-Seed / EvaLearn
EvaLearn is a pioneering benchmark designed to evaluate large language models (LLMs) on their learning capability and efficiency in chall…
☆431 · May 12, 2026 · Updated 2 months ago
JIA-Lab-research / Logits-Based-Finetuning
Official Code of Logits-Based-Finetuning
☆90 · Jun 14, 2025 · Updated last year
WYKwong / Circlify_UI_Library
☆116 · Sep 2, 2025 · Updated 10 months ago
ximinng / SVGDreamerV2
[T-PAMI 2025] Official implementation for "SVGDreamer++: Advancing Editability and Diversity in Text-Guided SVG Generation" https://arxiv…
☆451 · Dec 13, 2024 · Updated last year
360CVGroup / WISA
World Simulator Assistant for Physics-Aware Text-to-Video Generation
☆278 · Sep 22, 2025 · Updated 10 months ago
Echo-Nie / KaggleForFun
This repository documents my learning journey on the Kaggle platform, including model study notes and implementations of models actually …
☆32 · Dec 9, 2025 · Updated 7 months ago
nonamev-ls / SCIE_MCE
Major Color Extract using SWASA and S-CIELAB
☆230 · Jun 7, 2025 · Updated last year
ZinYY / TreeLoRA
[ICML 2025] A pytorch implementation of the paper "TreeLoRA: Efficient Continual Learning via Layer-Wise LoRAs Guided by a Hierarchical G…
☆350 · Dec 15, 2025 · Updated 7 months ago
YesuLabs / contracts
☆98 · Mar 8, 2025 · Updated last year
s3ndd / sen-graphql-go
☆80 · Jun 8, 2025 · Updated last year
xxiaouw / SteamSmartBuy
An intelligent Steam deal analytics dashboard leveraging Python, MySQL, and Power BI to surface the most worthwhile discounts.
☆156 · Jul 8, 2025 · Updated last year
kelvinfkr / adaptive-strategies-for-climate-change-adaptation-An-application-for-flood-risk-management
data and codes for adaptive strategies for climate change adaptation: An application for flood risk management
☆134 · Feb 13, 2025 · Updated last year
pentilm / torch_quant
A PyTorch quantization tool for machine learning models
☆78 · Mar 1, 2025 · Updated last year
Jiapeng-Pei / LLMSensitiveDataGoverance
☆286 · Feb 21, 2026 · Updated 5 months ago
wenlongliaoEE / ETDToolbox
☆175 · Feb 21, 2025 · Updated last year
GabePersson / EmoVision
☆590 · Oct 11, 2025 · Updated 9 months ago
OpenDCAI / RARE
Official repository of RARE: Retrieval-Augmented Reasoning Modeling [KDD 2026 Research Track]
☆183 · May 20, 2026 · Updated 2 months ago
JIA-Lab-research / UnityVideo
[CVPR 2026] UnityVideo: Unified Multi-Modal Multi-Task Learning for Enhancing World-Aware Video Generation
☆317 · Jul 14, 2026 · Updated 2 weeks ago
renxh4 / CompressPng
☆405 · Aug 31, 2022 · Updated 3 years ago