A Python library to chunk/group your texts based on semantic similarity.
☆105 · Jun 12, 2026 · Updated this week
Alternatives and similar repositories for semantic-split
Users that are interested in semantic-split are comparing it to the libraries listed below.
- Exploration of semantic chunking and chunk classification ☆19 · Sep 16, 2024 · Updated last year
- Pairwise Controlled Manifold Approximation (PaCMAP) for dimensionality reduction ☆21 · Feb 3, 2026 · Updated 4 months ago
- This is the source code of the IJCNN 2023 paper TieFake: Title-Text Similarity and Emotion-Aware Fake News Detection (TieFake). ☆16 · Dec 21, 2023 · Updated 2 years ago
- Cocktail: A Comprehensive Information Retrieval Benchmark with LLM-Generated Documents Integration ☆15 · Jun 4, 2024 · Updated 2 years ago
- Split text into semantic chunks, up to a desired chunk size. Supports calculating length by characters and tokens, and is callable from R… ☆612 · Updated this week
- A quick Crew AI tutorial ☆23 · May 9, 2024 · Updated 2 years ago
- Inprocess : Installable version of FreeCAD OpenSCAD workbench ☆16 · Feb 24, 2026 · Updated 3 months ago
- An implementation of LLMzip using GPT-2 ☆14 · Aug 7, 2023 · Updated 2 years ago
- ☆39 · Apr 17, 2024 · Updated 2 years ago
- The code for paper: Hierarchical Document Refinement for Long-context Retrieval-augmented Generation [ACL2025 Oral] ☆46 · Aug 25, 2025 · Updated 9 months ago
- CFT-RAG: An Entity Tree Based Retrieval Augmented Generation Algorithm With Cuckoo Filter ☆25 · May 28, 2025 · Updated last year
- A package to process electrochemical results from atomistic simulations. ☆17 · Jun 5, 2026 · Updated last week
- The predecessor of CiteLab. ☆18 · Feb 3, 2026 · Updated 4 months ago
- Code and data for GMT-KBQA ☆17 · Jan 5, 2023 · Updated 3 years ago
- ☆41 · May 26, 2026 · Updated 2 weeks ago
- Speaker Role Contextual Model for Dialogues ☆15 · Sep 30, 2017 · Updated 8 years ago
- A simple web application for searching Word2Vec embeddings derived from approximately 2,000 law reports published by The Incorporated… ☆27 · Oct 4, 2022 · Updated 3 years ago
- ☆20 · Apr 8, 2025 · Updated last year
- Goval is a zero-reflection, strong-type, and customizable validator for Go. It also supports error translation ☆13 · Sep 27, 2023 · Updated 2 years ago
- basically youtube in your terminal ☆14 · Apr 28, 2022 · Updated 4 years ago
- This repo explores how to use AMR to address tasks that are difficult for LLMs ☆13 · Jan 15, 2024 · Updated 2 years ago
- Agent tasks designed with langchain, including planning of conversation-scenario resources, subtask construction, and a task executor that includes MCTS ☆33 · Nov 10, 2025 · Updated 7 months ago
- BPE modification that removes intermediate tokens during tokenizer training. ☆27 · Nov 25, 2024 · Updated last year
- My Gen AI research ☆11 · Jun 3, 2024 · Updated 2 years ago
- ReMe: A Personalized Cognitive Training Framework Based on an LLM Voice Chatbot for Research ☆17 · Jul 3, 2025 · Updated 11 months ago
- Data mapping framework for rust stuff ☆54 · Mar 25, 2026 · Updated 2 months ago
- ☆1,468 · Jun 18, 2024 · Updated last year
- Typesafe browser extension messaging created with TypeScript ☆12 · Sep 2, 2023 · Updated 2 years ago
- Vector Search Benchmarking suite ☆16 · May 4, 2026 · Updated last month
- A node CLI for querying the Twitter spaces API. ☆14 · Sep 6, 2021 · Updated 4 years ago
- ☆12 · Jan 25, 2025 · Updated last year
- A git repo showcasing RAG Techniques for building Naive to Advanced RAG solutions ☆13 · Feb 16, 2025 · Updated last year
- A set of containerized Google Cloud Platform emulators used for development and testing purposes. ☆13 · Nov 25, 2022 · Updated 3 years ago
- Graphing Scrubbing Calculator ☆11 · Nov 27, 2017 · Updated 8 years ago
- How to set up various vtuber productions ☆12 · Mar 13, 2022 · Updated 4 years ago
- ☆56 · Updated this week
- arXiv-Chat: An AI research assistant and Discord bot ☆13 · Jul 16, 2023 · Updated 2 years ago
- tiny spreadsheet language with ambiguous values ☆11 · Mar 29, 2024 · Updated 2 years ago
- An automated data pipeline scaling RL to pretraining levels ☆77 · Jun 2, 2026 · Updated last week