yigitkonur / llm-dataset-prep
Python toolkit for preparing LLM fine-tuning datasets. Features category weighting, reservoir sampling, JSONL processing, and statistical analysis.
☆15Updated 2 years ago
Alternatives and similar repositories for llm-dataset-prep
Users who are interested in llm-dataset-prep are comparing it to the libraries listed below.
- Density-based clustering for vector embeddings using HDBSCAN and cosine similarity. Features automatic parameter search, PCA, and quality…☆17Updated 2 months ago
- Serverless API Gateway☆70Updated this week
- deduplication☆15Updated 2 years ago
- Summarize webpages from specified URLs using the LangChain framework and the ChatOllama model☆122Updated last year
- High-performance Rust CLI and library achieving 10K+ req/s for LLM APIs. Features weighted load-balancing, HTTP/2 pooling, and real-time …☆17Updated 2 months ago
- Subtitle translation API using GPT. Dynamic context windows preserve conversational flow. Features auto-fallback to DeepL, concurrent pro…☆32Updated 2 months ago
- Türkiye Teknoloji Takımı Vakfı (Turkey Technology Team Foundation) - Artificial Intelligence Master Training Series - Regression and Classification in Machine Learning☆17Updated 5 years ago
- Bulk call automation tool using Telnyx and LLM-based transcription. Dials, plays audio, records, and transcribes hundreds of concurrent c…☆139Updated 2 months ago
- Gen AI based travel assistant for Turkish Airlines customers☆11Updated last year
- Detection of Derogatory Speech Using Natural Language Processing☆22Updated 2 years ago
- ☆12Updated last year
- ☆177Updated 6 months ago
- Minimalist search engine for job applications (CVs)☆61Updated last year
- Server load testing CLI tool 🏋️☆11Updated 2 years ago
- A lightweight and simple CLI package☆12Updated 4 years ago
- Muninn is a fast and flexible HTML parsing tool that simplifies the process of extracting data from HTMLs.☆146Updated 11 months ago
- Kubernetes logs to MongoDB☆16Updated 4 years ago
- A Command Line Interface that is designed for technical SEOs☆12Updated 4 years ago
- Turkish-Reading-Comprehension-Question-Answering-Dataset☆84Updated 3 years ago
- Tiny AI is a platform to create/modify AI powered chatbots. This repository contains ChatGPT plugin and API for talk, create and modify T…☆36Updated last year
- Turkish LM Tuner☆86Updated last year
- From the Açık Seminer (https://www.acikseminer.com/) series organized by the Türkiye Açık Kaynak Platformu (Turkey Open Source Platform): natural language processing …☆21Updated 5 years ago
- Binalyze logger is an easily customizable wrapper for logrus with log rotation☆28Updated 4 years ago
- A Turkish Text-to-SQL Dataset☆14Updated last year
- 🌊 plugin.t4y.ai☆34Updated 2 years ago
- Jumpstart Your Cursor AI Projects☆178Updated 11 months ago
- Few-Shot Prompting - Chain-of-Thought (CoT) Prompting - Hallucinations - Self-Consistency - Generated Knowledge Prompting - Tree of …☆29Updated 2 years ago
- unofficial Tureng.com API☆81Updated 3 years ago
- A Python toolkit for image clustering using deep learning, PCA, and K-means, with support for GPU and CPU processing. Simplify your image…☆38Updated last year
- MCP Server for Searching Turkish Legislation☆121Updated 3 months ago