Lyon-NLP / mteb-french
MTEB: Massive Text Embedding Benchmark French extended
☆19Updated 8 months ago
Alternatives and similar repositories for mteb-french:
Users who are interested in mteb-french are comparing it to the libraries listed below
- CLIR version of ColBERT☆67Updated 4 months ago
- The repository contains generative AI analytics platform application code.☆23Updated 3 months ago
- GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embeddings☆37Updated 11 months ago
- Two approaches for robust TableQA: 1) ITR is a general-purpose retrieval-based approach for handling long tables in TableQA transformer m…☆38Updated last year
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets.☆67Updated 4 months ago
- GraphER: A Structure-aware Text-to-Graph Model for Entity and Relation Extraction☆66Updated 6 months ago
- Ready-to-go containerized RAG service. Implemented with text-embedding-inference + Qdrant/LanceDB.☆57Updated last month
- A collection of datasets for language model pretraining including scripts for downloading, preprocessing, and sampling.☆55Updated 6 months ago
- ☆22Updated 7 months ago
- XTR: Rethinking the Role of Token Retrieval in Multi-Vector Retrieval☆45Updated 8 months ago
- FastFit ⚡ When LLMs are Unfit Use FastFit ⚡ Fast and Effective Text Classification with Many Classes☆185Updated 4 months ago
- ☆62Updated 7 months ago
- This is the reproduction repository for my 🤗 Hugging Face blog post on synthetic data☆63Updated last year
- Pre-train Static Word Embeddings☆47Updated 3 weeks ago
- RAGElo is a set of tools that helps you select the best RAG-based LLM agents by using an Elo ranker☆106Updated last week
- Benchmark various LLM Structured Output frameworks: Instructor, Mirascope, Langchain, LlamaIndex, Fructose, Marvin, Outlines, etc on task…☆147Updated 4 months ago
- This is the repository for our paper "INTERS: Unlocking the Power of Large Language Models in Search with Instruction Tuning"☆201Updated 2 months ago
- Bi-encoder entity linking architecture☆44Updated 5 months ago
- Code and data for "StructLM: Towards Building Generalist Models for Structured Knowledge Grounding" (COLM 2024)☆76Updated 4 months ago
- ☆41Updated 3 weeks ago
- Generalist and Lightweight Model for Text Classification☆65Updated this week
- SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la…☆46Updated last year
- Notebooks for ThirdAI demos☆74Updated 4 months ago
- 🌏 Modular retrievers for zero-shot multilingual IR.☆27Updated 11 months ago
- ☆30Updated last year
- ☆84Updated 5 months ago
- ☆26Updated 8 months ago
- Efficient few-shot learning with cross-encoders.☆48Updated last year
- Code for the EMNLP'24 paper "Learning to Extract Structured Entities Using Language Models"☆25Updated 2 weeks ago
- Using open source LLMs to build synthetic datasets for direct preference optimization☆57Updated 11 months ago