DS4SD / SemTabNet
Repository for ACL paper: "Statements: Universal Information Extraction from Tables with Large Language Models for ESG KPIs"
☆16Updated last year
Alternatives and similar repositories for SemTabNet
Users who are interested in SemTabNet are comparing it to the libraries listed below.
- High level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ⚡☆68Updated last month
- Universal text classifier for generative models☆25Updated last year
- Create fast graph language models from converted PDF documents for knowledge extraction and Q&A.☆57Updated 10 months ago
- [TACL, EMNLP 2025 Oral] Code, datasets, and checkpoints for the paper "CRAFT Your Dataset: Task-Specific Synthetic Dataset Generation Thr…☆32Updated 3 weeks ago
- This project is a collection of fine-tuning scripts to help researchers fine-tune Qwen 2 VL on HuggingFace datasets.☆77Updated 5 months ago
- A Python library to chunk/group your texts based on semantic similarity.☆101Updated last year
- Lightweight continuous batching with OpenAI compatibility using HuggingFace Transformers, including T5 and Whisper.☆29Updated 9 months ago
- Dataset Viber is your chill repo for data collection, annotation and vibe checks.☆45Updated last year
- AnyModal is a Flexible Multimodal Language Model Framework for PyTorch☆103Updated last year
- Using open source LLMs to build synthetic datasets for direct preference optimization☆71Updated last year
- Easy-to-use, high-performance knowledge distillation for LLMs☆97Updated 7 months ago
- ☆53Updated 10 months ago
- 🦄 Unitxt is a Python library for enterprise-grade evaluation of AI performance, offering the world's largest catalog of tools and data …☆212Updated last week
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute…☆51Updated last year
- The Benefits of a Concise Chain of Thought on Problem Solving in Large Language Models☆22Updated last year
- Code for evaluating with Flow-Judge-v0.1 - an open-source, lightweight (3.8B) language model optimized for LLM system evaluations. Crafte…☆78Updated last year
- ☆62Updated last year
- GraphER: A Structure-aware Text-to-Graph Model for Entity and Relation Extraction☆82Updated last year
- A Python library to define and validate data types in Docling.☆219Updated last week
- Code for the EMNLP'24 paper "Learning to Extract Structured Entities Using Language Models"☆48Updated 8 months ago
- Efficient few-shot learning with cross-encoders.☆60Updated last year
- ☆68Updated last year
- Python library to use Pleias-RAG models☆67Updated 7 months ago
- A novel multimodal (vision) RAG architecture☆33Updated last year
- A public implementation of the ReLoRA pretraining method, built on Lightning-AI's PyTorch Lightning suite.☆35Updated last year
- ☆52Updated last year
- Evaluation framework for document processing models and services.☆59Updated this week
- Code and data for "StructLM: Towards Building Generalist Models for Structured Knowledge Grounding" (COLM 2024)☆75Updated last year
- Doc2Graph transforms documents into graphs and exploits a GNN to solve several tasks.☆135Updated 2 months ago
- ☆70Updated last year