latam-gpt / llm-data-eval
LLM-aided data filtering
☆13Updated last year
Alternatives and similar repositories for llm-data-eval
Users who are interested in llm-data-eval are comparing it to the libraries listed below
- ☆43Updated 7 months ago
- Benchmarks for Evaluating Spanish Language Models☆11Updated 2 years ago
- Efficiently find the best-suited language model (LM) for your NLP task☆132Updated 4 months ago
- General repository for Data Science Bootcamps at Coding Dojo☆11Updated last month
- Chunk your text using gpt4o-mini more accurately☆44Updated last year
- Crowd-sourced lists of urls to help Common Crawl crawl under-resourced languages. See https://github.com/commoncrawl/web-languages-code/ …☆66Updated 2 weeks ago
- Generalist and Lightweight Model for Text Classification☆167Updated 2 weeks ago
- ☆11Updated 4 months ago
- Low latency, High Accuracy, Custom Query routers for Humans and Agents. Built by Prithivi Da☆119Updated 8 months ago
- Pre-train Static Word Embeddings☆93Updated 3 months ago
- A CLI for generating synthetic data☆42Updated 7 months ago
- ☆12Updated last year
- Using open source LLMs to build synthetic datasets for direct preference optimization☆71Updated last year
- Datamodels for hugging face tokenizers☆86Updated 3 weeks ago
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute…☆51Updated last year
- ☆80Updated last year
- A library for working with prompt templates locally or on the Hugging Face Hub.☆51Updated 9 months ago
- ☆53Updated 10 months ago
- Truly flash implementation of DeBERTa disentangled attention mechanism.☆67Updated 2 months ago
- A curated list of materials on AI guardrails☆43Updated 6 months ago
- Python library to use Pleias-RAG models☆67Updated 7 months ago
- A full fledged mistral+wandb☆13Updated last year
- A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers.☆32Updated 3 months ago
- A RAG that can scale 🧑🏻‍💻☆11Updated last year
- SynthGenAI - Package for Generating Synthetic Datasets using LLMs.☆54Updated 3 weeks ago
- Dataset Viber is your chill repo for data collection, annotation and vibe checks.☆45Updated last year
- Lightweight wrapper for the independent implementation of SPLADE++ models for search & retrieval pipelines. Models and Library created by…☆33Updated last year
- Official Implementation of the 'When XGBoost Outperforms GPT-4 on Text Classification: A Case Study' NAACL-W 2024 paper☆16Updated last year
- ☆37Updated last year
- Optimus is a flexible and scalable framework built to train language models efficiently across diverse hardware configurations, including…☆67Updated 2 weeks ago