yigitkonur / llm-dataset-prep
Python toolkit for preparing LLM fine-tuning datasets. Features category weighting, reservoir sampling, JSONL processing, and statistical analysis.
☆15Updated 2 years ago
Alternatives and similar repositories for llm-dataset-prep
Users who are interested in llm-dataset-prep are comparing it to the libraries listed below:
- Density-based clustering for vector embeddings using HDBSCAN and cosine similarity. Features automatic parameter search, PCA, and quality…☆17Updated last month
- Serverless API Gateway☆70Updated 3 weeks ago
- Minimalist search engine for job applications (CVs)☆61Updated last year
- High-performance Rust CLI and library achieving 10K+ req/s for LLM APIs. Features weighted load-balancing, HTTP/2 pooling, and real-time …☆17Updated last month
- Summarize webpages from specified URLs using the LangChain framework and the ChatOllama model☆123Updated last year
- Türkiye Teknoloji Takımı Vakfı (Turkey Technology Team Foundation) - Artificial Intelligence Expert Training Series - Regression and Classification in Machine Learning☆17Updated 5 years ago
- A curated list of awesome Turkish language processing libraries, models, resources and datasets. The main focus is on open source tools, …☆41Updated 5 years ago
- Muninn is a fast and flexible HTML parsing tool that simplifies the process of extracting data from HTMLs.☆146Updated 10 months ago
- Subtitle translation API using GPT. Dynamic context windows preserve conversational flow. Features auto-fallback to DeepL, concurrent pro…☆32Updated last month
- Bulk call automation tool using Telnyx and LLM-based transcription. Dials, plays audio, records, and transcribes hundreds of concurrent c…☆139Updated last month
- deduplication☆15Updated 2 years ago
- Few-Shot Prompting - Chain-of-Thought (CoT) Prompting - Hallucinations - Self-Consistency - Generated Knowledge Prompting - Tree of …☆29Updated 2 years ago
- From the Açık Seminer (https://www.acikseminer.com/) series organized by Türkiye Açık Kaynak Platformu (Turkey Open Source Platform), on natural language processing…☆21Updated 5 years ago
- ☆220Updated 4 months ago
- Server load testing CLI tool 🏋️☆11Updated 2 years ago
- Gen AI based travel assistant for Turkish Airlines customers☆11Updated last year
- Turkish-Reading-Comprehension-Question-Answering-Dataset☆84Updated 3 years ago
- Turkish LM Tuner☆87Updated last year
- A lightweight and simple CLI package☆12Updated 4 years ago
- Tiny AI is a platform to create/modify AI powered chatbots. This repository contains ChatGPT plugin and API for talk, create and modify T…☆36Updated last year
- ☆177Updated 5 months ago
- A Turkish Text-to-SQL Dataset☆12Updated 11 months ago
- TRScraper, developed for use in natural language processing applications, text mining on large platforms with Turkish-language content…☆75Updated 4 years ago
- MCP Server for Searching Turkish Legislation☆119Updated 2 months ago
- A powerful TypeScript framework for building non-deterministic AI agents with advanced cognitive capabilities like reasoning, decision-ma…☆106Updated 3 months ago
- Kubernetes logs to MongoDB☆16Updated 4 years ago
- Detection of Derogatory Speech with Natural Language Processing☆22Updated 2 years ago
- Machine Learning Notes in Turkish☆23Updated 2 years ago
- Jumpstart Your Cursor AI Projects☆179Updated 10 months ago
- Repository for "Turkish Wikipedia Based Knowledge Graph (Vikipedi Tabanlı Türkçe Bilgi Çizgesi)" of inzva AI Projects #6☆27Updated 4 years ago