guenthermi / table-embeddings
Tools for training schema-aware Web table embeddings for unsupervised and supervised machine learning on tabular data
☆14Updated 5 months ago
Related projects:
- [NAACL'24] Dataset, code and models for "TableLlama: Towards Open Large Generalist Models for Tables".☆104Updated 4 months ago
- Resources for PVLDB 2023 submission☆18Updated 3 weeks ago
- A collection of datasets for language model pretraining, including scripts for downloading, preprocessing, and sampling.☆51Updated last month
- The data and the PyTorch implementation for the models and experiments in the paper "Exploiting Asymmetry for Synthetic Training Data Gen…☆56Updated last year
- ☆82Updated 3 weeks ago
- Code and data for "TURL: Table Understanding through Representation Learning"☆115Updated 2 years ago
- ☆78Updated 4 months ago
- CLIR version of ColBERT☆62Updated 3 months ago
- ☆45Updated 2 years ago
- Structured Prediction for Entity Linking☆25Updated last month
- Code for the paper "Rotom: A Meta-Learned Data Augmentation Framework for Entity Matching, Data Cleaning, Text Classification, and Beyond…☆21Updated 2 years ago
- A Human-LLM Collaborative Dataset for Generative Information-seeking with Attribution☆30Updated last year
- Dense X Retrieval: What Retrieval Granularity Should We Use?☆120Updated 8 months ago
- Neighborhood Contrastive Learning for Scientific Document Representations with Citation Embeddings (EMNLP 2022 paper)☆63Updated last year
- An easy-to-use Python toolkit for flexibly adapting various neural ranking models to any target domain.☆55Updated last year
- This repository provides scripts for evaluating NLP models on the LEXTREME benchmark, a set of diverse multilingual tasks in legal NLP☆19Updated 8 months ago
- Implementation of the paper "HLATR: Enhance Multi-stage Text Retrieval with Hybrid List Aware Transformer Reranking"☆65Updated last year
- MTab: Entity Search and Table Annotation with Wikidata, Wikipedia, and DBpedia☆29Updated 2 years ago
- [Data + code] ExpertQA: Expert-Curated Questions and Attributed Answers☆118Updated 6 months ago
- An Open-Source Package for Information Retrieval☆145Updated last month
- AIR-Bench: Automated Heterogeneous Information Retrieval Benchmark☆98Updated 2 weeks ago
- Benchmarking library for RAG☆87Updated this week
- Retrieval-Augmented Generation-based Relation Extraction☆28Updated last month
- Characterization of relational table embeddings (VLDB 2024).☆22Updated 2 months ago
- ☆27Updated 9 months ago
- Provides a common interface to many IR measure tools☆75Updated 3 weeks ago
- ☆22Updated 2 months ago
- GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embeddings☆29Updated 6 months ago
- Code for extracting, parsing and annotating tables from GitTables (https://gittables.github.io).☆40Updated 2 years ago
- Dense hybrid representations for text retrieval☆60Updated last year