latam-gpt / llm-data-eval
LLM-aided data filtering
☆13Updated 11 months ago
Alternatives and similar repositories for llm-data-eval
Users who are interested in llm-data-eval are comparing it to the libraries listed below
- ☆42Updated 7 months ago
- Benchmarks for Evaluating Spanish Language Models☆11Updated 2 years ago
- Efficiently find the best-suited language model (LM) for your NLP task☆132Updated 4 months ago
- ☆11Updated 3 months ago
- General repository for Data Science Bootcamps at Coding Dojo☆11Updated 2 weeks ago
- Chunk your text more accurately using gpt4o-mini☆44Updated last year
- Generalist and Lightweight Model for Text Classification☆165Updated this week
- ALBETO and DistilBETO are versions of ALBERT and DistilBERT pre-trained exclusively on Spanish corpora.☆39Updated 2 years ago
- Pre-train Static Word Embeddings☆91Updated 2 months ago
- ☆12Updated last year
- A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers.☆32Updated 2 months ago
- Crowd-sourced lists of urls to help Common Crawl crawl under-resourced languages. See https://github.com/commoncrawl/web-languages-code/ …☆65Updated this week
- ☆124Updated last year
- Official Implementation of the 'When XGBoost Outperforms GPT-4 on Text Classification: A Case Study' NAACL-W 2024 paper☆16Updated 11 months ago
- DashAI: an interactive platform for training, evaluating and deploying AI models☆68Updated last week
- A CLI for generating synthetic data☆42Updated 6 months ago
- A framework for evaluating semantic search across custom datasets, metrics, and embedding backends.☆35Updated 6 months ago
- Datamodels for hugging face tokenizers☆86Updated this week
- Optimus is a flexible and scalable framework built to train language models efficiently across diverse hardware configurations, including…☆67Updated 4 months ago
- A list of awesome open source projects in the machine learning field whose developers are mainly based in Germany☆49Updated last year
- ☆37Updated last year
- Let's build better datasets, together!☆265Updated 11 months ago
- Truly flash implementation of DeBERTa disentangled attention mechanism.☆67Updated 2 months ago
- Simple UI for debugging correlations of text embeddings☆301Updated 6 months ago
- synthetic data for ml☆25Updated 10 months ago
- Unified Schema-Based Information Extraction☆223Updated 3 weeks ago
- Command Line Interface for Hugging Face Inference Endpoints☆66Updated last year
- Plug-and-play, zero-shot document AI pipelines.☆117Updated last week
- Fine-tune ModernBERT on a large Dataset with Custom Tokenizer Training☆72Updated last month
- Low latency, High Accuracy, Custom Query routers for Humans and Agents. Built by Prithivi Da☆117Updated 8 months ago