DataScienceUIBK / ArabicaQA
ArabicaQA: Comprehensive Dataset for Arabic Question Answering, accepted at SIGIR 2024
☆18 Updated last year
Alternatives and similar repositories for ArabicaQA
Users who are interested in ArabicaQA are comparing it to the repositories listed below.
- This is the official repository for Peacock: A Family of Arabic Multimodal Large Language Models and Benchmarks. ☆26 Updated last year
- Generalist and Lightweight Model for Text Classification ☆166 Updated last month
- ☆125 Updated 10 months ago
- Code for evaluating with Flow-Judge-v0.1 - an open-source, lightweight (3.8B) language model optimized for LLM system evaluations. Crafte… ☆81 Updated last year
- High level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ⚡ ☆69 Updated last month
- This repo is the central repo for all the RAG Evaluation reference material and partner workshop ☆78 Updated 8 months ago
- ☆104 Updated 9 months ago
- ☆127 Updated last year
- Solving data for LLMs - Create quality synthetic datasets! ☆151 Updated 11 months ago
- Setu is a comprehensive pipeline designed to clean, filter, and deduplicate diverse data sources including Web, PDF, and Speech data. Bui… ☆15 Updated last year
- Aranizer: A Custom Tokenizer based on SentencePiece and BPE tailored for Arabic Language Modeling ☆21 Updated last year
- Simple UI for debugging correlations of text embeddings ☆306 Updated 7 months ago
- Docutron Toolkit: detection and segmentation analysis for legal data extraction over documents. ☆26 Updated 2 years ago
- ☆17 Updated last year
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆51 Updated last year
- Benchmark various LLM Structured Output frameworks: Instructor, Mirascope, Langchain, LlamaIndex, Fructose, Marvin, Outlines, etc on task… ☆183 Updated last year
- Recipes for learning, fine-tuning, and adapting ColPali to your multimodal RAG use cases. 👨🏻🍳 ☆351 Updated 7 months ago
- ☆176 Updated last month
- ☆53 Updated 11 months ago
- ARAGOG- Advanced RAG Output Grading. Exploring and comparing various Retrieval-Augmented Generation (RAG) techniques on AI research paper… ☆113 Updated last year
- Baguetter is a flexible, efficient, and hackable search engine library implemented in Python. It's designed for quickly benchmarking, imp… ☆201 Updated last year
- Experimental tl;dr summaries for datasets on the Hugging Face Hub! ☆10 Updated last year
- Data extraction with LLM on CPU ☆112 Updated 2 years ago
- Using open source LLMs to build synthetic datasets for direct preference optimization ☆72 Updated last year
- Agentic RAG to help you build a startup 🚀 ☆55 Updated 9 months ago
- Testing and evaluation framework for voice agents ☆161 Updated 7 months ago
- AnyModal is a Flexible Multimodal Language Model Framework for PyTorch ☆103 Updated last year
- Instruction dataset for Arabic with 10,000 instruction and output pairs. CIDAR can be used to fine-tune LLMs to follow instructions. ☆43 Updated 9 months ago
- LangChain, Llama2-Chat, and zero- and few-shot prompting are used to generate synthetic datasets for IR and RAG system evaluation ☆38 Updated 2 years ago
- A streaming whisper server for on-prem transcription ☆22 Updated last year