CarperAI / CodeReviewSE
Stuff related to scraping the Code Review StackExchange
☆11 · Updated last year
Related projects
Alternatives and complementary repositories for CodeReviewSE
- Ludwig benchmark ☆19 · Updated 2 years ago
- [WIP] Behold, semantic-search, built over sentence-transformers to make it easy for search engineers to evaluate, optimise and deploy mod… ☆15 · Updated last year
- SMASHED is a toolkit designed to apply transformations to samples in datasets, such as fields extraction, tokenization, prompting, batchi… ☆31 · Updated 5 months ago
- NLG Best Practices for Data-Efficient Modeling How to Train Production-Ready Models with Little Data ☆11 · Updated 3 years ago
- Code for "Incorporating Relevance Feedback for Information-Seeking Retrieval using Few-Shot Document Re-Ranking" (https://arxiv.org/abs/2… ☆12 · Updated last year
- Code and data for Teddy https://arxiv.org/abs/2001.05171. ☆15 · Updated 2 years ago
- A library for squeakily cleaning and filtering language datasets. ☆45 · Updated last year
- Ranking of fine-tuned HF models as base models. ☆35 · Updated last year
- Minimum Description Length probing for neural network representations ☆16 · Updated last week
- Documentation effort for the BookCorpus dataset ☆31 · Updated 3 years ago
- StAtutory Reasoning Assessment ☆11 · Updated last year
- This repo contains all of the code for my Youtube series on how to create a VSCode extension for autocompleting code using Deep Learning! ☆15 · Updated 3 years ago
- Embroid: Unsupervised Prediction Smoothing Can Improve Few-Shot Classification ☆11 · Updated last year
- This repository implements DSPy programs to tasks in Indian Languages ☆11 · Updated 9 months ago
- One stop shop for all things carp ☆58 · Updated 2 years ago
- Training and Inference Notebooks for the RedPajama (OpenLlama) models ☆18 · Updated last year
- A sample pattern for running CI tests on Modal ☆13 · Updated last month
- This project develops compact transformer models tailored for clinical text analysis, balancing efficiency and performance for healthcare… ☆18 · Updated 7 months ago
- Training a model without a dataset for natural language inference (NLI) ☆25 · Updated 4 years ago
- Using short models to classify long texts ☆20 · Updated last year
- Companion Repo for the Vision Language Modelling YouTube series - https://bit.ly/3PsbsC2 - by Prithivi Da. Open to PRs and collaborations ☆14 · Updated 2 years ago
- PyTorch implementation for MRL ☆18 · Updated 8 months ago
- Search through Facebook Research's PyTorch BigGraph Wikidata-dataset with the Weaviate vector search engine ☆31 · Updated 2 years ago
- ☆19 · Updated last year
- Tokenization across languages. Useful as preprocessing for subword tokenization. ☆22 · Updated last year
- Source codes for the paper "Bounding the Capabilities of Large Language Models in Open Text Generation with Prompt Constraints" ☆27 · Updated last year
- ☆19 · Updated 2 years ago