A very simple news crawler with a funny name
☆437 · Feb 25, 2026 · Updated last week
Alternatives and similar repositories for fundus
Users who are interested in fundus compare it to the libraries listed below.
- Efficiently find the best-suited language model (LM) for your NLP task · ☆135 · Jul 26, 2025 · Updated 7 months ago
- [EMNLP 2023 Demo] fabricator - annotating and generating datasets with large language models. · ☆111 · May 16, 2024 · Updated last year
- Evaluate language models using multiple choice items · ☆13 · Updated this week
- Small python package to measure OCR quality and other related metrics. · ☆27 · Feb 19, 2024 · Updated 2 years ago
- SpanMarker for Named Entity Recognition · ☆464 · Jan 8, 2025 · Updated last year
- German Language Understanding Evaluation Benchmark @NAACL24 · ☆22 · Dec 11, 2025 · Updated 2 months ago
- Training and evaluation code for the paper "Headless Language Models: Learning without Predicting with Contrastive Weight Tying" (https:/… · ☆28 · Apr 17, 2024 · Updated last year
- A RAG that can scale 🧑🏻💻 · ☆11 · May 28, 2024 · Updated last year
- ☆210 · Jun 26, 2025 · Updated 8 months ago
- Efficient few-shot learning with Sentence Transformers · ☆2,688 · Dec 11, 2025 · Updated 2 months ago
- news-please - an integrated web crawler and information extractor for news that just works · ☆2,391 · Sep 21, 2025 · Updated 5 months ago
- A comprehensive benchmark for entity disambiguation · ☆28 · Jun 29, 2023 · Updated 2 years ago
- Argilla is a collaboration tool for AI engineers and domain experts to build high-quality datasets · ☆4,884 · Updated this week
- Repository containing the SPIN experiments on the DIBT 10k ranked prompts · ☆23 · Mar 12, 2024 · Updated last year
- ☆17 · Feb 16, 2024 · Updated 2 years ago
- Zero and Few shot named entity & relationships recognition · ☆401 · Sep 17, 2025 · Updated 5 months ago
- 🚀🤗 A collection of templates for Hugging Face Spaces · ☆35 · Oct 9, 2023 · Updated 2 years ago
- Active Learning for Text Classification in Python · ☆639 · Feb 1, 2026 · Updated last month
- Easily embed, cluster and semantically label text datasets · ☆596 · Mar 28, 2024 · Updated last year
- Corresponding code repo for the paper at COLING 2020 - ARGMIN 2020: "DebateSum: A large-scale argument mining and summarization dataset" · ☆55 · Dec 2, 2021 · Updated 4 years ago
- A Python library for calculating a large variety of metrics from text · ☆360 · Jan 30, 2026 · Updated last month
- Generalist and Lightweight Model for Named Entity Recognition (Extract any entity types from texts) @ NAACL 2024 · ☆2,865 · Updated this week
- Highly concurrent and fast content processing for Mighty Inference Server · ☆10 · Feb 6, 2023 · Updated 3 years ago
- Label shift estimation for transfer difficulty with Familiarity. · ☆10 · Feb 4, 2025 · Updated last year
- Temporary remove unused tokens during training to save ram and speed. · ☆23 · Jun 15, 2025 · Updated 8 months ago
- Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verifi… · ☆3,108 · Feb 23, 2026 · Updated last week
- Visualize expert firing frequencies across sentences in the Mixtral MoE model · ☆18 · Dec 22, 2023 · Updated 2 years ago
- Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XM… · ☆5,402 · Sep 12, 2025 · Updated 5 months ago
- The Batched API provides a flexible and efficient way to process multiple requests in a batch, with a primary focus on dynamic batching o… · ☆159 · Jul 14, 2025 · Updated 7 months ago
- My NER Experiments with ModernBERT and Ettin · ☆26 · Jul 17, 2025 · Updated 7 months ago
- Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data, but it should work with any hugging face text dataset. · ☆96 · Feb 9, 2023 · Updated 3 years ago
- Simply, faster, sentence-transformers · ☆144 · Aug 27, 2024 · Updated last year
- The NLP Bias Identification Toolkit · ☆39 · Sep 8, 2023 · Updated 2 years ago
- Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks. · ☆2,915 · Updated this week
- Label data using HuggingFace's transformers and automatically get a prediction service · ☆193 · Jun 20, 2023 · Updated 2 years ago
- A BERT-based application for reusable text classification at scale · ☆38 · Jul 23, 2023 · Updated 2 years ago
- Late Interaction Models Training & Retrieval · ☆732 · Updated this week
- skweak: A software toolkit for weak supervision applied to NLP tasks · ☆926 · Sep 2, 2024 · Updated last year
- PathPiece tokenizer · ☆13 · Nov 10, 2024 · Updated last year