trustbit / enterprise-rag-challenge
Enterprise RAG Challenge to test accuracy of different LLM-driven assistants
☆24 Updated 2 weeks ago
Related projects:
- 2D Positional Embeddings for Webpage Structural Understanding 🦙👀 ☆93 Updated 2 weeks ago
- ☆26 Updated last month
- MERA (Multimodal Evaluation for Russian-language Architectures) is a new open benchmark for the Russian language for evaluating fundament… ☆57 Updated this week
- ☆51 Updated last week
- A benchmark comparing Russian analogues of ChatGPT: Saiga, YandexGPT, Gigachat ☆55 Updated 11 months ago
- ExplainitAll is a library for interpretable AI, designed for interpreting generative models (GPT-like) and vectorizer… ☆14 Updated last month
- ☆53 Updated this week
- Using multiple LLMs for ensemble Forecasting ☆17 Updated 8 months ago
- Modified Arena-Hard-Auto LLM evaluation toolkit with an emphasis on Russian language ☆15 Updated 2 weeks ago
- Source code of the paper: RetrievalQA: Assessing Adaptive Retrieval-Augmented Generation for Short-form Open-Domain Question Answering [F… ☆57 Updated 3 months ago
- RuBLiMP: Russian Benchmark of Linguistic Minimal Pairs ☆13 Updated 2 months ago
- Thin wrapper around OpenAI Whisper API with streaming support ☆85 Updated 3 months ago
- ☆59 Updated last week
- Universal LLM Telegram chatbot in Python ☆13 Updated last month
- ☆18 Updated 2 years ago
- Bunch of notebooks for pre-training custom Saiga-like LLM ☆13 Updated 7 months ago
- LangChain-compatible integrations with YandexGPT and YandexGPT Embeddings ☆31 Updated last week
- Framework for processing and filtering datasets ☆25 Updated last month
- High level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ⚡ ☆58 Updated 2 weeks ago
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆48 Updated 2 months ago
- ☆15 Updated 2 weeks ago
- This project is concerned with my participating in the RuNNE competition https://github.com/dialogue-evaluation/RuNNE ☆10 Updated last year
- ReDel is a toolkit for researchers and developers to build, iterate on, and analyze recursive multi-agent systems. ☆48 Updated 3 weeks ago
- Notus is a collection of fine-tuned LLMs using SFT, DPO, SFT+DPO, and/or any other RLHF techniques, while always keeping a data-first app… ☆161 Updated 8 months ago
- ☆21 Updated 11 months ago
- ☆24 Updated 2 months ago
- ☆18 Updated 3 months ago
- Estimate Your LLM's Token Toll Across Various Platforms and Configurations ☆28 Updated last month
- Official homepage for "Self-Harmonized Chain of Thought" ☆45 Updated this week
- Using transformers to generate Russian poetry ☆35 Updated last year