simonw / nicar-2025-scraping
Cutting-edge web scraping techniques workshop at NICAR 2025
☆370 Updated 10 months ago
Alternatives and similar repositories for nicar-2025-scraping
Users who are interested in nicar-2025-scraping are comparing it to the repositories listed below
- Template repository for setting up a new git scraper ☆121 Updated 2 months ago
- Free travel times between U.S. Census geographies ☆162 Updated 9 months ago
- CLI tool for stripping tags from HTML ☆352 Updated 10 months ago
- Examples and guides for using the VLM Run API ☆303 Updated last week
- Integrate LLM in any pipeline - fit/predict pattern, JSON-driven flows, and built-in concurrency support. ☆606 Updated 10 months ago
- Mapping the French Culinary Universe ☆50 Updated 10 months ago
- Tools for LIL's data preservation project ☆125 Updated 3 months ago
- Data from the Bloomberg News analysis on streamers and podcasters on YouTube ☆25 Updated 11 months ago
- CleverBee - The Open Source Deep Researcher Tool ☆309 Updated 7 months ago
- An SDK for working with LLMs and AI Agents from Apache Airflow, based on Pydantic AI ☆512 Updated 3 months ago
- Tools to build your own "taskmaster" ☆162 Updated 4 months ago
- AI Dataset Generator – Create realistic datasets for demos, learning, and dashboards ☆746 Updated 3 months ago
- https://verdad.app ☆83 Updated last week
- Multimodal RAG to search and interact locally with technical documents of any kind ☆285 Updated 2 months ago
- WARC + AI - Experimental Retrieval Augmented Generation Pipeline for Web Archive Collections. ☆267 Updated 11 months ago
- Turn docstrings into LLM-functions ☆514 Updated 2 weeks ago
- Spegel - Reflect the web through AI ☆331 Updated 5 months ago
- LLM plugin providing access to models running on an Ollama server ☆346 Updated 2 weeks ago
- This is a framework that implements various parallel reasoning strategies from the literature ☆274 Updated 3 weeks ago
- UV kernel for Jupyter ☆461 Updated 7 months ago
- This project collects GPU benchmarks from various cloud providers and compares them to fixed per token costs. Use our tool for efficient … ☆222 Updated last year
- Browser-LLM Auto-Scaling Technology ☆768 Updated this week
- LLM plugin to access Google's Gemini family of models ☆419 Updated 2 weeks ago
- Clean & curate your data with LLMs. ☆489 Updated last year
- A Twitter, Mastodon, and BlueSky bot that shares new interactive, graphic, and data vis stories from newsrooms around the world ☆58 Updated this week
- Fast Diversification for Search & Retrieval ☆461 Updated last month
- OpenAI's Structured Outputs with Logprobs ☆200 Updated 7 months ago
- A scientific instrument for investigating latent spaces ☆745 Updated last month
- Extract Stats Q/A from Tables With Provenance ☆25 Updated 2 weeks ago
- A terminal based book tracking tool ☆212 Updated last week