HazyResearch / fm_data_tasks
Foundation Models for Data Tasks
☆100Updated last year
Related projects
Alternatives and complementary repositories for fm_data_tasks
- Code for extracting, parsing and annotating tables from GitTables (https://gittables.github.io).☆41Updated 2 years ago
- Code for Relevance-guided Supervision for OpenQA with ColBERT (TACL'21)☆40Updated 3 years ago
- The Official Repository for "Bring Your Own Data! Self-Supervised Evaluation for Large Language Models"☆109Updated last year
- Code for paper "LEVER: Learning to Verify Language-to-Code Generation with Execution" (ICML'23)☆79Updated last year
- Code for paper 'Data-Efficient FineTuning'☆29Updated last year
- Code for the paper "Rotom: A Meta-Learned Data Augmentation Framework for Entity Matching, Data Cleaning, Text Classification, and Beyond…☆21Updated 2 years ago
- Official repository for "Scaling Retrieval-Based Language Models with a Trillion-Token Datastore".☆130Updated last week
- Retrieval as Attention☆83Updated last year
- A framework for few-shot evaluation of autoregressive language models.☆102Updated last year
- [EMNLP'23] Execution-Based Evaluation for Open Domain Code Generation☆44Updated 11 months ago
- Finding semantically meaningful and accurate prompts.☆46Updated last year
- AuditNLG: Auditing Generative AI Language Modeling for Trustworthiness☆97Updated last year
- The LM Contamination Index is a manually created database of contamination evidence for LMs.☆76Updated 7 months ago
- This project studies the performance and robustness of language models and task-adaptation methods.☆141Updated 6 months ago
- Resources for PVLDB 2023 submission☆23Updated 2 months ago
- Official repository for MATES: Model-Aware Data Selection for Efficient Pretraining with Data Influence Models [NeurIPS 2024]☆49Updated last week
- Scalable Meta-Evaluation of LLMs as Evaluators☆41Updated 9 months ago
- Interpretable and efficient predictors using pre-trained language models. Scikit-learn compatible.☆38Updated 7 months ago
- Characterization of relational table embeddings (VLDB 2024).☆25Updated 4 months ago
- PASTA: Post-hoc Attention Steering for LLMs☆108Updated 2 months ago
- Baleen: Robust Multi-Hop Reasoning at Scale via Condensed Retrieval (NeurIPS'21)☆43Updated 2 years ago
- Code for paper Chain-of-Table: Evolving Tables in the Reasoning Chain for Table Understanding☆47Updated 5 months ago
- Skill-It! A Data-Driven Skills Framework for Understanding and Training Language Models☆41Updated last year
- BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval☆57Updated last month
- The codebase for our ACL2023 paper: Did You Read the Instructions? Rethinking the Effectiveness of Task Definitions in Instruction Learni…☆27Updated last year
- Code of ICLR paper: https://openreview.net/forum?id=-cqvvvb-NkI☆91Updated last year