telekom / wikipedia-22-12-de-dpr
German dataset for DPR model training
☆18Updated 9 months ago
Alternatives and similar repositories for wikipedia-22-12-de-dpr:
Users who are interested in wikipedia-22-12-de-dpr are comparing it to the libraries listed below.
- Chunk your text using gpt4o-mini more accurately☆44Updated 8 months ago
- A Python library aimed at dissecting and augmenting NER training data.☆58Updated last year
- Generalist and Lightweight Model for Text Classification☆121Updated 2 weeks ago
- 🤗 Disaggregators: Curated data labelers for in-depth analysis.☆65Updated 2 years ago
- Pre-train Static Word Embeddings☆56Updated 2 weeks ago
- Truly flash implementation of DeBERTa disentangled attention mechanism.☆45Updated 2 weeks ago
- The Batched API provides a flexible and efficient way to process multiple requests in a batch, with a primary focus on dynamic batching o…☆131Updated 4 months ago
- High level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ⚡☆66Updated 5 months ago
- German Alpaca Dataset (Cleaned + Translated)☆24Updated 2 years ago
- A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers.☆34Updated 4 months ago
- Using short models to classify long texts☆21Updated 2 years ago
- 📝 Reference-Free automatic summarization evaluation with potential hallucination detection☆100Updated last year
- Efficiently find the best-suited language model (LM) for your NLP task☆120Updated this week
- Minimal PyTorch implementation of BM25 (with sparse tensors)☆101Updated last year
- Fact checking baseline combining dense retrieval and textual entailment☆28Updated 3 months ago
- A collection of datasets for language model pretraining including scripts for downloading, preprocessing, and sampling.☆58Updated 8 months ago
- Analysis on the cost of encoder based models☆11Updated 2 months ago
- A RAG that can scale 🧑🏻‍💻☆11Updated 10 months ago
- Using open source LLMs to build synthetic datasets for direct preference optimization☆61Updated last year
- Low latency, High Accuracy, Custom Query routers for Humans and Agents. Built by Prithivi Da☆102Updated 3 weeks ago
- NLP with Rust for Python 🦀🐍☆62Updated 10 months ago
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute…☆49Updated 9 months ago
- 💬 Language Identification with Support for More Than 2000 Labels -- EMNLP 2023☆127Updated 4 months ago
- [EMNLP 2023 Demo] fabricator - annotating and generating datasets with large language models.☆108Updated 11 months ago
- Embedding Recycling for Language models☆38Updated last year
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets.☆76Updated 6 months ago