simonw / nicar-2025-scraping
Cutting-edge web scraping techniques workshop at NICAR 2025
☆368 Updated 8 months ago
Alternatives and similar repositories for nicar-2025-scraping
Users that are interested in nicar-2025-scraping are comparing it to the libraries listed below
- Template repository for setting up a new git scraper ☆119 Updated 3 weeks ago
- Data from the Bloomberg News analysis on streamers and podcasters on YouTube ☆25 Updated 9 months ago
- CLI tool for stripping tags from HTML ☆348 Updated 8 months ago
- AI Dataset Generator – Create realistic datasets for demos, learning, and dashboards ☆732 Updated last month
- Tools for LIL's data preservation project ☆125 Updated 2 months ago
- Mapping the French Culinary Universe ☆49 Updated 8 months ago
- WARC + AI - Experimental Retrieval Augmented Generation Pipeline for Web Archive Collections. ☆262 Updated 9 months ago
- A simple Python 3.13 dev container ☆38 Updated 6 months ago
- https://verdad.app ☆83 Updated last week
- Tools to build your own "taskmaster" ☆159 Updated 2 months ago
- CleverBee - The Open Source Deep Researcher Tool ☆305 Updated 5 months ago
- An SDK for working with LLMs and AI Agents from Apache Airflow, based on Pydantic AI ☆496 Updated last month
- UV kernel for Jupyter ☆458 Updated 6 months ago
- Examples and guides for using the VLM Run API ☆297 Updated last month
- Integrate LLM in any pipeline - fit/predict pattern, JSON driven flows, and built-in concurrency support. ☆605 Updated 8 months ago
- Spegel - Reflect the web through AI ☆316 Updated 4 months ago
- Visualise your CSV files in seconds without sending your data anywhere ☆516 Updated last month
- clean & curate your data with LLMs. ☆490 Updated last year
- OpenAI's Structured Outputs with Logprobs ☆198 Updated 5 months ago
- A playbook for effectively prompting post-trained LLMs ☆893 Updated 10 months ago
- Import unstructured data (text and images) into structured tables ☆159 Updated last week
- Data and Codes for GroceryDB ☆151 Updated 10 months ago
- Turn docstrings into LLM-functions ☆508 Updated 2 weeks ago
- This project collects GPU benchmarks from various cloud providers and compares them to fixed per token costs. Use our tool for efficient … ☆222 Updated 11 months ago
- ☆21 Updated 4 months ago
- Browser-LLM Auto-Scaling Technology ☆755 Updated 2 weeks ago
- Fully neural approach for text chunking ☆393 Updated last month
- Save OpenAI API results to a SQLite database ☆235 Updated last year
- A terminal based book tracking tool ☆211 Updated 2 months ago
- Build, Improve Performance, and Productionize your LLM Application with an Integrated Framework ☆341 Updated 11 months ago