ilyalasy / DOM-LM
Unofficial PyTorch implementation of the DOM-LM paper.
☆32 · Updated last year
Related projects
Alternatives and complementary repositories for DOM-LM
- SIGIR-2022 Webformer: Pre-training with Web Pages for Information Retrieval ☆47 · Updated 2 years ago
- Simplified DOM Trees for Transferable Attribute Extraction from the Web ☆37 · Updated last month
- Completion After Prompt Probability. Make your LLM make a choice ☆69 · Updated 2 weeks ago
- Dense X Retrieval: What Retrieval Granularity Should We Use? ☆134 · Updated 10 months ago
- Evaluating tool-augmented LLMs in conversation settings ☆72 · Updated 5 months ago
- Code and data for "StructLM: Towards Building Generalist Models for Structured Knowledge Grounding" (COLM 2024) ☆68 · Updated last month
- 📝 Reference-Free automatic summarization evaluation with potential hallucination detection ☆98 · Updated 10 months ago
- Efficient few-shot learning with cross-encoders. ☆40 · Updated 9 months ago
- Baguetter is a flexible, efficient, and hackable search engine library implemented in Python. It's designed for quickly benchmarking, imp… ☆162 · Updated 2 months ago
- Python API for https://vespa.ai, the open big data serving engine ☆105 · Updated this week
- A fork of Dragnet that also extracts author, headline, date, keywords from context, as well as built-in metadata extraction all in one pac… ☆239 · Updated 10 months ago
- ReLM is a Regular Expression engine for Language Models ☆104 · Updated last year
- This repo is for handling Question Answering, especially for Multi-hop Question Answering ☆64 · Updated 11 months ago
- 🚀 Scale your RAG pipeline using Ragswift: A scalable centralized embeddings management platform ☆36 · Updated 9 months ago
- Simple replication of [ColBERT-v1](https://arxiv.org/abs/2004.12832). ☆77 · Updated 8 months ago
- A Context-aware Visual Attention-based training pipeline for Object Detection from a Webpage screenshot! ☆91 · Updated last year
- ☆22 · Updated 5 months ago
- The official code repo for "Sub-Sentence Encoder: Contrastive Learning of Propositional Semantic Representations". ☆75 · Updated 10 months ago
- ☆48 · Updated 2 weeks ago
- [EMNLP 2023 Demo] fabricator - annotating and generating datasets with large language models. ☆103 · Updated 6 months ago
- Vespa application making an index of the CORD-19 dataset. ☆39 · Updated this week
- XTR: Rethinking the Role of Token Retrieval in Multi-Vector Retrieval ☆37 · Updated 5 months ago
- The Screen Annotation dataset consists of pairs of mobile screenshots and their annotations. The annotations are in text format, and desc… ☆49 · Updated 8 months ago
- It includes two datasets that are used in the downstream tasks for evaluating UIBert: App Similar Element Retrieval data and Visual Item … ☆41 · Updated 3 years ago
- Vector Database with support for late interaction and token level embeddings. ☆54 · Updated last month
- Client Code Examples, Use Cases and Benchmarks for Enterprise h2oGPTe RAG-Based GenAI Platform ☆81 · Updated this week
- an implementation of Self-Extend, to expand the context window via grouped attention ☆118 · Updated 10 months ago
- Official repo for NAACL 2024 Findings paper "LeTI: Learning to Generate from Textual Interactions." ☆63 · Updated last year
- Build reliable, secure, and production-ready AI apps easily. ☆46 · Updated 2 weeks ago
- LLM prompt language based on Jinja. Banks provides tools and functions to build prompts text and chat messages from generic blueprints. I… ☆66 · Updated last week