Webhose / free-news-datasets
Weekly free datasets from global news sites
☆23 Updated last week
Alternatives and similar repositories for free-news-datasets
Users who are interested in free-news-datasets are comparing it to the repositories listed below.
- TextGraphs + LLMs + graph ML for entity extraction, linking, ranking, and constructing a lemma graph ☆24 Updated last year
- Automated Qualitative Analysis of LLMs (ICLR 2025) ☆41 Updated last week
- Common crawl extractor ☆77 Updated last year
- Code and data for "StructLM: Towards Building Generalist Models for Structured Knowledge Grounding" (COLM 2024) ☆75 Updated 8 months ago
- The Official NewsCatcher News API V2 SDK for Python ☆20 Updated 9 months ago
- Data and code related to the report "Truth, Lies, and Automation: How Language Models Could Change Disinformation" ☆27 Updated 4 years ago
- Microsoft Phi 2 Streamlit App, deployed on HuggingFace Spaces is based on the Microsoft Phi 2 small language model (SLM) for text generat… ☆14 Updated last year
- ☆9 Updated last year
- Accompanying code and SEP dataset for the "Can LLMs Separate Instructions From Data? And What Do We Even Mean By That?" paper. ☆54 Updated 4 months ago
- Tools to construct and process Common Crawl webgraphs ☆92 Updated 2 weeks ago
- PyTorch implementation of a BiLSTM model for the Wikification project. ☆19 Updated 5 years ago
- A pipeline using LLMs for Knowledge Engineering, combining knowledge probing and Wikidata entity mapping. ☆37 Updated 6 months ago
- ☆40 Updated 7 months ago
- 🛤️ Pathik - High-Performance Web Crawler ⚡ ☆26 Updated 3 months ago
- This repository is designed for deploying and managing server processes that handle embeddings using the Infinity Embedding model or Larg… ☆23 Updated 4 months ago
- 🔔🧠 Easily experiment with popular language agents across diverse reasoning/decision-making benchmarks! ☆52 Updated last week
- Email Datasets can be found here ☆66 Updated 5 years ago
- A framework for converting natural language text inputs to corresponding Pandas, MongoDB, Kusto and Neo4j (Cypher) queries. ☆83 Updated last year
- LLM plugin for clustering embeddings ☆77 Updated last year
- spaCy entry points for Curated Transformers ☆31 Updated last month
- [SIGIR 2024 (Demo)] CoSearchAgent: A Lightweight Collaborative Search Agent with Large Language Models ☆27 Updated last year
- Professional Wargaming LLM Toolbox ☆14 Updated last week
- ☆25 Updated 3 months ago
- Tree-based indexes for neural-search ☆32 Updated last year
- Python package for extractive NLP using the OpenAI API ☆17 Updated 10 months ago
- DSBench: How Far are Data Science Agents from Becoming Data Science Experts? ☆58 Updated 4 months ago
- Quick Notebook Tutorials ☆32 Updated 5 months ago
- Python library to use Pleias-RAG models ☆58 Updated 2 months ago
- ☆54 Updated last year
- Curated list of awesome software and resources for Senzing, The First Real-Time AI for Entity Resolution. ☆59 Updated this week