ruanchaves / hashformers
Accurate word segmentation for hashtags and text, powered by Transformers and Beam Search. A scalable alternative to heuristic splitters and massive LLMs.
☆76Jan 8, 2026Updated last month
Alternatives and similar repositories for hashformers
Users who are interested in hashformers are comparing it to the libraries listed below.
- This project shows how to build a simple handwriting recognizer in Keras with the IAM dataset.☆13Aug 15, 2021Updated 4 years ago
- HashtagMaster: Segmentation tool for hashtags☆12Oct 27, 2020Updated 5 years ago
- [LREC 2022] An off-the-shelf pre-trained Tweet NLP Toolkit (NER, tokenization, lemmatization, POS tagging, dependency parsing) + Tweeban…☆105Jan 24, 2024Updated 2 years ago
- 🛠️ Tools for Transformers compression using PyTorch Lightning ⚡☆85Feb 1, 2026Updated last week
- 🚂 Fine-tune OpenAI models for text classification, question answering, and more☆17May 1, 2023Updated 2 years ago
- The Natural Portuguese Language Benchmark (Napolab). Stay up to date with the latest advancements in Portuguese language models and their…☆72Jul 28, 2025Updated 6 months ago
- Presents an optimized Apache Beam pipeline for generating sentence embeddings (runnable on Cloud Dataflow).☆20Mar 7, 2022Updated 3 years ago
- This repository holds files and scripts for incorporating simple CI/CD practices for model training in ML.☆21Oct 26, 2021Updated 4 years ago
- Concept Modeling: Topic Modeling on Images and Text☆217Nov 4, 2024Updated last year
- Few-shot Named Entity Recognition☆121Mar 30, 2022Updated 3 years ago
- ☆69May 1, 2025Updated 9 months ago
- A UI automation engine☆11Aug 14, 2025Updated 6 months ago
- 🤝 Trade any tensors over the network☆31Sep 27, 2023Updated 2 years ago
- A Python library for calculating a large variety of metrics from text☆359Jan 30, 2026Updated 2 weeks ago
- A framework for evaluating semantic search across custom datasets, metrics, and embedding backends.☆38May 26, 2025Updated 8 months ago
- LinkedIn Web Scraper☆10Mar 3, 2021Updated 4 years ago
- Transformer based Trigram Blocking implementation in Tensorflow☆11Feb 26, 2020Updated 5 years ago
- Implements RNNPool and SoftPool for CNNs.☆14Jan 29, 2021Updated 5 years ago
- YASEM - Yet Another Splade|Sparse Embedder - A simple and efficient library for SPLADE embeddings☆13May 22, 2025Updated 8 months ago
- A Simulator for Traffic Intersection based on Crossroads technique☆10Dec 4, 2019Updated 6 years ago
- This repository contains an easy and intuitive approach to few-shot NER using most similar expansion over spaCy embeddings. Now with enti…☆244Jun 19, 2023Updated 2 years ago
- SpanMarker for Named Entity Recognition☆465Jan 8, 2025Updated last year
- Edo Liberty's class notes from the course Algorithms in Data Mining given in Tel Aviv University in academic years 2011-2013☆26May 20, 2022Updated 3 years ago
- A Python library aimed at dissecting and augmenting NER training data.☆60May 11, 2023Updated 2 years ago
- Slides and notebook used for my talk on vaex at the Pandas summit 2019 @ London☆11Jun 13, 2019Updated 6 years ago
- Includes additional materials for the following keras.io blog post.☆12Jun 23, 2021Updated 4 years ago
- Companion Repo for the Vision Language Modelling YouTube series - https://bit.ly/3PsbsC2 - by Prithivi Da. Open to PRs and collaborations☆14Aug 16, 2022Updated 3 years ago
- Repository of data and code to use the models described in the paper "Citation Needed: A Taxonomy and Algorithmic Assessment of Wikipedia…☆11Nov 21, 2022Updated 3 years ago
- This is the python program which performs text summarization with pronoun replacement method. This method initially identifies pronouns i…☆10Dec 5, 2018Updated 7 years ago
- DL4CV book☆10Sep 18, 2018Updated 7 years ago
- Universal Python binding for the LMDB 'Lightning' Database☆13Nov 7, 2017Updated 8 years ago
- [NAACL 2021] This is the code for our paper `Fine-Tuning Pre-trained Language Model with Weak Supervision: A Contrastive-Regularized Self…☆206Aug 17, 2022Updated 3 years ago
- ☆75Jul 2, 2021Updated 4 years ago
- A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers.☆32Sep 19, 2025Updated 4 months ago
- Compares the DistilBERT and MobileBERT architectures for mobile deployments.☆33Oct 15, 2020Updated 5 years ago
- A Rideshare Simulation built in C++, using OpenStreetMap data☆14Oct 24, 2021Updated 4 years ago
- This project takes the arXiv dataset and builds an automatic tag classifier from the arXiv article/paper titles☆13Aug 18, 2021Updated 4 years ago
- OptimSeed - Seed Word Selection for Weakly-Supervised Text Classification [NAACL SRW 2021]☆14Mar 29, 2021Updated 4 years ago
- Shows how to create basic image adversaries, and train adversarially robust image classifiers (to some extent).☆13Oct 14, 2020Updated 5 years ago