simonw / nicar-2025-scraping
Cutting-edge web scraping techniques workshop at NICAR 2025
☆366Updated 7 months ago
Alternatives and similar repositories for nicar-2025-scraping
Users that are interested in nicar-2025-scraping are comparing it to the libraries listed below
- Template repository for setting up a new git scraper☆118Updated 7 months ago
- CLI tool for stripping tags from HTML☆343Updated 7 months ago
- Tools to build your own "taskmaster"☆157Updated last month
- Integrate LLM in any pipeline - fit/predict pattern, JSON driven flows, and built-in concurrency support.☆606Updated 7 months ago
- Tools for LIL's data preservation project☆124Updated 3 weeks ago
- Free travel times between U.S. Census geographies☆159Updated 6 months ago
- Examples and guides for using the VLM Run API☆292Updated this week
- Data from the Bloomberg News analysis on streamers and podcasters on YouTube☆23Updated 8 months ago
- AI Dataset Generator – Create realistic datasets for demos, learning, and dashboards☆716Updated last week
- CleverBee - The Open Source Deep Researcher Tool☆307Updated 4 months ago
- https://verdad.app☆83Updated last week
- Mapping the French Culinary Universe☆48Updated 7 months ago
- An SDK for working with LLMs and AI Agents from Apache Airflow, based on Pydantic AI☆478Updated 2 weeks ago
- Turn docstrings into LLM-functions☆503Updated 6 months ago
- Spegel - Reflect the web through AI☆316Updated 2 months ago
- Import unstructured data (text and images) into structured tables☆154Updated 5 months ago
- A scientific instrument for investigating latent spaces☆729Updated 4 months ago
- OpenAI's Structured Outputs with Logprobs☆187Updated 4 months ago
- UV kernel for Jupyter☆452Updated 4 months ago
- Browser-LLM Auto-Scaling Technology☆546Updated last week
- ☆21Updated 3 months ago
- A playbook for effectively prompting post-trained LLMs☆895Updated 8 months ago
- A hub for various industry-specific schemas to be used with VLMs.☆535Updated 4 months ago
- A Twitter, Mastodon, and BlueSky bot that shares new interactive, graphic, and data vis stories from newsrooms around the world☆58Updated this week
- This project collects GPU benchmarks from various cloud providers and compares them to fixed per token costs. Use our tool for efficient …☆219Updated 9 months ago
- WARC + AI - Experimental Retrieval Augmented Generation Pipeline for Web Archive Collections.☆259Updated 8 months ago
- clean & curate your data with LLMs.☆490Updated last year
- Parallel Reasoning: llm-consortium orchestrates multiple LLMs, iteratively refines & achieves consensus.☆367Updated this week
- Ask questions of your data with LLM assistance☆66Updated 10 months ago
- Easiest way to give context to LLMs; Attachments has the ambition to be the general funnel for any files to be transformed into images+te…☆306Updated 3 weeks ago