latam-gpt / llm-data-eval
LLM-aided data filtering
☆12Updated 10 months ago
Alternatives and similar repositories for llm-data-eval
Users who are interested in llm-data-eval are comparing it to the libraries listed below
- ☆41Updated 5 months ago
- Benchmarks for Evaluating Spanish Language Models☆11Updated 2 years ago
- ☆11Updated last month
- General repository for Data Science Bootcamps at Coding Dojo☆11Updated last week
- Chunk your text using gpt4o-mini more accurately☆44Updated last year
- Official Implementation of the 'When XGBoost Outperforms GPT-4 on Text Classification: A Case Study' NAACL-W 2024 paper☆16Updated 10 months ago
- Efficiently find the best-suited language model (LM) for your NLP task☆127Updated 2 months ago
- Pre-train Static Word Embeddings☆87Updated last month
- Generalist and Lightweight Model for Text Classification☆163Updated 4 months ago
- A CLI for generating synthetic data☆42Updated 5 months ago
- Fine-tune ModernBERT on a large Dataset with Custom Tokenizer Training☆68Updated this week
- Datamodels for Hugging Face tokenizers☆77Updated 3 weeks ago
- ☆12Updated last year
- Crowd-sourced lists of urls to help Common Crawl crawl under-resourced languages. See https://github.com/commoncrawl/web-languages-code/ …☆58Updated last week
- ALBETO and DistilBETO are versions of ALBERT and DistilBERT pre-trained exclusively on Spanish corpora.☆37Updated 2 years ago
- Notebooks for training universal 0-shot classifiers on many different tasks☆136Updated 9 months ago
- Low latency, High Accuracy, Custom Query routers for Humans and Agents. Built by Prithivi Da☆116Updated 6 months ago
- A tool that facilitates easy, efficient and high-quality fine-tuning of Cohere's models☆73Updated 7 months ago
- Simple UI for debugging correlations of text embeddings☆295Updated 4 months ago
- ☆35Updated 10 months ago
- This is the reproduction repository for my 🤗 Hugging Face blog post on synthetic data☆68Updated last year
- DashAI: an interactive platform for training, evaluating and deploying AI models☆65Updated this week
- A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers.☆33Updated last month
- Unannotated Spanish 3 Billion Words Corpora☆105Updated 3 years ago
- A framework for evaluating semantic search across custom datasets, metrics, and embedding backends.☆35Updated 4 months ago
- A library for working with prompt templates locally or on the Hugging Face Hub.☆50Updated 7 months ago
- Truly flash implementation of DeBERTa disentangled attention mechanism.☆66Updated 3 weeks ago
- A spaCy wrapper for GliNER☆122Updated 8 months ago
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute…☆50Updated last year
- A repository containing general tutorials I'd like to share with the world.☆46Updated 3 months ago