CodeCreator/datatools

Common tools for data processing

☆22

Alternatives and similar repositories for datatools

Users who are interested in datatools are comparing it to the libraries listed below.

tiremoscode / dw-grupo58
☆20 · Nov 28, 2024 · Updated last year
buyi-Yang / getQzonehistory
☆12 · Nov 13, 2024 · Updated last year
abduvalimurodullayev1 / boilerplate_Drf
This is a boilerplate for a Django project with many settings configurations.
☆10 · Nov 7, 2025 · Updated 8 months ago
GovardhaneNitin / smart-inventory
A smart inventory management system that includes real-time stock tracking, supplier management, predictive analytics for inventory forec…
☆16 · Apr 22, 2025 · Updated last year
EvanZhuang / AgenticLU
Official implementation of Self-Taught Agentic Long Context Understanding (ACL 2025).
☆13 · Sep 22, 2025 · Updated 9 months ago
megagonlabs / holobench
Code for Holistic Reasoning with Long-Context LMs: A Benchmark for Database Operations on Massive Textual Data (Maekawa*, Iso* et al.…
☆12 · Feb 25, 2025 · Updated last year
hgabor / nestjs-keret-2024
NestJS project template, configured with prisma and ejs
☆12 · Dec 1, 2024 · Updated last year
xiamengzhou / DataAugForLRL
Generalized Data Augmentation for Low-Resource Translation
☆12 · Jul 30, 2019 · Updated 6 years ago
princeton-nlp / HELMET
The HELMET Benchmark
☆220 · Apr 17, 2026 · Updated 3 months ago
Levcqhh / Apple-Unlocker
This script automates the process of unlocking Apple ID accounts by solving captcha challenges, verifying account details, and resetting …
☆15 · Jan 24, 2026 · Updated 5 months ago
xiamengzhou / NLPerf
Performance Prediction for NLP Tasks
☆17 · May 5, 2020 · Updated 6 years ago
Harry-Chan / seq2seqlm-on-qg
☆13 · Feb 9, 2022 · Updated 4 years ago
princeton-nlp / ProLong
Homepage for ProLong (Princeton long-context language models) and paper "How to Train Long-Context Language Models (Effectively)"
☆260 · Sep 12, 2025 · Updated 10 months ago
princeton-pli / QRHead
QRHead: Query-Focused Retrieval Heads Improve Long-Context Reasoning and Re-ranking
☆40 · Jan 20, 2026 · Updated 6 months ago
AndreaGrandieri / ing-sw-2024-codex-naturalis
Project for the final exam of Software Engineering 2023-2024 at Politecnico di Milano
☆10 · Oct 19, 2024 · Updated last year
dragonjsq / -VPN
Free proxy, free VPN, a truly free VPN, shadowsocks, v2rey, official website www.dragonvpn.cc
☆13 · Sep 4, 2024 · Updated last year
xiamengzhou / training_trajectory_analysis
[ACL 2023]: Training Trajectories of Language Models Across Scales https://arxiv.org/pdf/2212.09803.pdf
☆25 · Nov 14, 2023 · Updated 2 years ago
FreedomIntelligence / SepsisAgent
Agentifying Patient Dynamics within LLMs through Interacting with Clinical World Model
☆30 · May 15, 2026 · Updated 2 months ago
bzhangGo / st_from_scratch
Revisiting End-to-End Speech-to-Text Translation From Scratch
☆13 · Feb 21, 2023 · Updated 3 years ago
FreedomIntelligence / TinyDeepSeek
Reproduction of the complete process of DeepSeek-R1 on small-scale models, including Pre-training, SFT, and RL.
☆30 · Mar 11, 2025 · Updated last year
cjyaras / monarch-attention
MonarchAttention: Zero-Shot Conversion to Fast, Hardware-Aware Structured Attention (NeurIPS'25 Spotlight)
☆26 · Feb 22, 2026 · Updated 4 months ago
MileBench / MileBench
This repo contains evaluation code for the paper "MileBench: Benchmarking MLLMs in Long Context"
☆38 · Jul 11, 2024 · Updated 2 years ago
EliasEsperanza / UES-API
Mapping API for the University of El Salvador (UES), developed by students of the Facultad Multidisciplinaria Oriental. …
☆16 · Oct 3, 2025 · Updated 9 months ago
moayedellah / Network-Security
A curated collection of courses, videos, and resources to master network security from the ground up.
☆11 · Jan 6, 2025 · Updated last year
facebookresearch / evaluation-of-nmt-bt
This repository contains additional reference translations for the WMT'14 En-De (newstest2014) and WMT'19 En-Ru (newstest2019) test sets …
☆15 · Aug 31, 2021 · Updated 4 years ago
ShabanMughal / Robot-Ai
☆22 · Jan 1, 2026 · Updated 6 months ago
princeton-pli / AggAgent
☆28 · Apr 29, 2026 · Updated 2 months ago
xalanq / PiLibrary
Online book-borrowing system - 2017 THU OOP course project
☆14 · Jul 1, 2018 · Updated 8 years ago
da03 / criticize_text_generation
A method for evaluating the high-level coherence of machine-generated texts. Identifies high-level coherence issues in transformer-based …
☆12 · Mar 18, 2023 · Updated 3 years ago
AgustinCoding / identity-alchemist
Identity Alchemist: A powerful Python-based tool for generating and managing synthetic identities. Features machine learning integration,…
☆12 · Feb 12, 2025 · Updated last year
princeton-pli / LongProc
LongProc: Benchmarking Long-Context Language Models on Long Procedural Generation
☆36 · Feb 26, 2026 · Updated 4 months ago
Lilytreasure / MultiplatformPdfGenerator
Compose Multiplatform pdf generator for Android/iOS
☆14 · Jan 9, 2025 · Updated last year
tsinghua-fib-lab / UGI
Urban Generative Intelligence (UGI): A Foundational Platform for Embodied Agent and Future City
☆12 · Dec 17, 2023 · Updated 2 years ago
teslamotors / LVCS
LVCS@Tesla.com
☆13 · May 20, 2026 · Updated 2 months ago
THUNLP-MT / DirectQuote
A Dataset for Direct Quotation Extraction and Attribution in News Articles.
☆14 · Sep 28, 2021 · Updated 4 years ago
NEUIR / ConAE
[EMNLP 2022] This is the code repo for our EMNLP'22 paper "Dimension Reduction for Efficient Dense Retrieval via Conditional Autoencoder"…
☆13 · Oct 20, 2022 · Updated 3 years ago
echohive42 / video-scene-splitter
Splits videos into scenes with gpt-4o-mini and saves them separately
☆12 · Dec 19, 2024 · Updated last year
ytyz1307zzh / PLUG
Code for the ACL 2024 paper "PLUG: Leveraging Pivot Language in Cross-Lingual Instruction Tuning"
☆13 · Aug 13, 2025 · Updated 11 months ago
santiagorugnitz / SDUI-Demo-KMP
☆15 · Sep 10, 2024 · Updated last year