noanabeshima / github-downloader
Script for downloading GitHub.
☆13Updated 5 years ago
Alternatives and similar repositories for github-downloader
Users who are interested in github-downloader are comparing it to the libraries listed below.
- Hugging Face and Pyserini interoperability☆19Updated 2 years ago
- Script for downloading GitHub.☆97Updated last year
- This repository contains all the code for collecting large scale amounts of code from GitHub.☆110Updated 2 years ago
- ☆22Updated 9 months ago
- Stuff related to scraping the Code Review StackExchange☆12Updated 2 years ago
- A client library for LAION's effort to filter CommonCrawl with CLIP, building a large scale image-text dataset.☆31Updated 2 years ago
- GenieNLP: A versatile codebase for any NLP task☆88Updated last year
- This is a new metric that can be used to evaluate faithfulness of text generated by LLMs. The work behind this repository can be found he…☆31Updated 2 years ago
- examples and guides to using Nomic Atlas☆38Updated 7 months ago
- 🤗 Disaggregators: Curated data labelers for in-depth analysis.☆67Updated 2 years ago
- The data and implementation for the experiments in the paper "Flows: Building Blocks of Reasoning and Collaborating AI".☆31Updated last year
- ☆26Updated last year
- Downloads 2020 English Wikipedia articles as plaintext☆24Updated 2 years ago
- URL downloader supporting checkpointing and continuous checksumming.☆19Updated last year
- Python tools for processing the stackexchange data dumps into a text dataset for Language Models☆83Updated last year
- Everything for the Paper: 'Evoke: Evoking Critical Thinking Abilities in LLMs via Reviewer-Author Prompt Editing'☆17Updated last year
- ☆44Updated last year
- ☆26Updated this week
- Fault-aware neural code rankers☆29Updated 2 years ago
- Code for constructing TLDR corpus from Reddit dataset☆27Updated 3 years ago
- Download, parse, and filter data from Court Listener, part of the FreeLaw projects. Data-ready for The-Pile.☆14Updated 2 years ago
- GPU Environment Management for Visual Studio Code☆39Updated 2 years ago
- ☆92Updated 3 years ago
- [EMNLP'23] Execution-Based Evaluation for Open Domain Code Generation☆49Updated last year
- Neural search engine for discovering semantically similar Python repositories on GitHub☆26Updated last year
- Analyzing and scoring reasoning traces of LLMs☆46Updated last year
- FactNews is the first dataset to predict sentence-level factuality of news reporting. Furthermore, we provide baseline results for sentenc…☆10Updated 5 months ago
- One stop shop for all things carp☆59Updated 3 years ago
- ☆44Updated 2 years ago
- A swarm of LLM agents that will help you test, document, and productionize your code!☆17Updated last week