latam-gpt / llm-data-eval
LLM-aided data filtering
☆14 Updated last year
Alternatives and similar repositories for llm-data-eval
Users who are interested in llm-data-eval are comparing it to the libraries listed below.
- Benchmarks for Evaluating Spanish Language Models ☆11 Updated 2 years ago
- ☆43 Updated 8 months ago
- General repository for Data Science Bootcamps at Coding Dojo ☆11 Updated last month
- Official Implementation of the 'When XGBoost Outperforms GPT-4 on Text Classification: A Case Study' NAACL-W 2024 paper ☆16 Updated last year
- ☆11 Updated 4 months ago
- A CLI for generating synthetic data ☆42 Updated 7 months ago
- ALBETO and DistilBETO are versions of ALBERT and DistilBERT pre-trained exclusively on Spanish corpora. ☆40 Updated 2 years ago
- Chunk your text using gpt4o-mini more accurately ☆44 Updated last year
- Datamodels for hugging face tokenizers ☆86 Updated last week
- ☆12 Updated last year
- Efficiently find the best-suited language model (LM) for your NLP task ☆132 Updated 5 months ago
- Generalist and Lightweight Model for Text Classification ☆166 Updated last month
- A curated list of materials on AI guardrails ☆43 Updated 7 months ago
- Simple UI for debugging correlations of text embeddings ☆306 Updated 7 months ago
- A public repo that contains integrations for Argilla and LlamaIndex. ☆17 Updated last year
- Pre-train Static Word Embeddings ☆94 Updated 4 months ago
- ☆125 Updated last year
- Using open source LLMs to build synthetic datasets for direct preference optimization ☆72 Updated last year
- ☆53 Updated 11 months ago
- ☆30 Updated 8 months ago
- Crowd-sourced lists of urls to help Common Crawl crawl under-resourced languages. See https://github.com/commoncrawl/web-languages-code/ … ☆68 Updated this week
- Collection of resources for RL and Reasoning ☆27 Updated 11 months ago
- A blueprint for AI development, focusing on applied examples of RAG, information extraction, analysis and fine-tuning in the age of LLMs … ☆61 Updated 11 months ago
- A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers. ☆32 Updated 3 months ago
- Truly flash implementation of DeBERTa disentangled attention mechanism. ☆67 Updated 3 months ago
- Train LLM on Hugging Face infra ☆67 Updated last month
- Let's build better datasets, together! ☆267 Updated last year
- Hugging Face Inference Toolkit used to serve transformers, sentence-transformers, and diffusers models. ☆89 Updated last month
- 🤗 HuggingFace Inference Toolkit for Google Cloud Vertex AI (similar to SageMaker's Inference Toolkit, but for Vertex AI and unofficial) ☆17 Updated last year
- synthetic data for ml ☆25 Updated 11 months ago