DataScienceUIBK / ArabicaQA
ArabicaQA: Comprehensive Dataset for Arabic Question Answering accepted at SIGIR 2024
☆17 Updated last year
Alternatives and similar repositories for ArabicaQA
Users who are interested in ArabicaQA are comparing it to the repositories listed below.
- ☆125 Updated last year
- This is the official repository for Peacock: A Family of Arabic Multimodal Large Language Models and Benchmarks. ☆26 Updated 9 months ago
- Generalist and Lightweight Model for Text Classification ☆157 Updated 3 months ago
- This repo is the central repo for all the RAG Evaluation reference material and partner workshop ☆75 Updated 4 months ago
- ☆124 Updated 6 months ago
- DocLLM: A layout-aware generative language model for multimodal document understanding ☆129 Updated last year
- Vision Document Retrieval (ViDoRe): Benchmark. Evaluation code for the ColPali paper. ☆236 Updated last month
- Data extraction with LLM on CPU ☆68 Updated last year
- Setu is a comprehensive pipeline designed to clean, filter, and deduplicate diverse data sources including Web, PDF, and Speech data. Bui… ☆16 Updated last year
- ARAGOG- Advanced RAG Output Grading. Exploring and comparing various Retrieval-Augmented Generation (RAG) techniques on AI research paper… ☆109 Updated last year
- LitePali is a minimal, efficient implementation of ColPali for image retrieval and indexing, optimized for cloud deployment. ☆65 Updated 11 months ago
- Recipes for learning, fine-tuning, and adapting ColPali to your multimodal RAG use cases. 👨🏻‍🍳 ☆332 Updated 3 months ago
- Code and data for "StructLM: Towards Building Generalist Models for Structured Knowledge Grounding" (COLM 2024) ☆75 Updated 11 months ago
- Benchmark various LLM Structured Output frameworks: Instructor, Mirascope, Langchain, LlamaIndex, Fructose, Marvin, Outlines, etc on task… ☆178 Updated 11 months ago
- High level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ⚡ ☆67 Updated 10 months ago
- Data extraction with LLM on CPU ☆112 Updated last year
- Simple UI for debugging correlations of text embeddings ☆291 Updated 3 months ago
- Chunk your text using gpt4o-mini more accurately ☆44 Updated last year
- This repo is for handling Question Answering, especially for Multi-hop Question Answering ☆67 Updated last year
- Lightweight Non-Parametric Embedding Fine-Tuning ☆36 Updated this week
- ☆49 Updated 7 months ago
- Baguetter is a flexible, efficient, and hackable search engine library implemented in Python. It's designed for quickly benchmarking, imp… ☆189 Updated last year
- A tool that facilitates easy, efficient and high-quality fine-tuning of Cohere's models ☆74 Updated 6 months ago
- Python interface for evaluation on ChatGPT models ☆19 Updated last year
- Aranizer: A Custom Tokenizer based on SentencePiece and BPE tailored for Arabic Language Modeling ☆20 Updated last year
- ☆95 Updated 5 months ago
- Using open source LLMs to build synthetic datasets for direct preference optimization ☆65 Updated last year
- Docutron Toolkit: detection and segmentation analysis for legal data extraction over documents. ☆26 Updated last year
- 💙 Unstructured Data Connectors for Haystack 2.0 ☆17 Updated last year
- Doing simple retrieval from LLM models at various context lengths to measure accuracy ☆103 Updated last year