Finetuning InstructLLaMA with Portuguese data
☆561 · Jun 6, 2023 · Updated 2 years ago
Alternatives and similar repositories for cabrita
Users who are interested in cabrita are comparing it to the repositories listed below.
- List of resources and tools developed with a focus on Portuguese. ☆311 · Jun 26, 2025 · Updated 8 months ago
- Finetuning Stanford Alpaca (LLaMA) with Brazilian Portuguese data ☆39 · Apr 10, 2023 · Updated 2 years ago
- Extractor of entities mentioned in media news ☆15 · May 25, 2021 · Updated 4 years ago
- Portuguese Named Entity Recognition ☆61 · Sep 27, 2023 · Updated 2 years ago
- ☆12 · Nov 10, 2024 · Updated last year
- Portuguese pre-trained BERT models ☆861 · Jun 17, 2024 · Updated last year
- Natively pre-trained open-source Portuguese language models. ☆79 · Feb 24, 2026 · Updated last week
- Code and documentation for the MariTalk API ☆308 · Feb 20, 2026 · Updated 2 weeks ago
- Fine-tuning OpenLlama-Instruct with Portuguese data, for commercial use. ☆20 · Aug 8, 2023 · Updated 2 years ago
- The Natural Portuguese Language Benchmark (Napolab). Stay up to date with the latest advancements in Portuguese language models and their… ☆72 · Jul 28, 2025 · Updated 7 months ago
- ☆16 · May 27, 2020 · Updated 5 years ago
- The evaluation suite for the 🚀 Open Portuguese LLM Leaderboard ☆26 · Aug 31, 2025 · Updated 6 months ago
- ☆20 · Dec 22, 2023 · Updated 2 years ago
- LegalNLP - Natural Language Processing Methods for the Brazilian Legal Language ☆179 · Jun 12, 2023 · Updated 2 years ago
- Portuguese Word Embeddings: Evaluating on Word Analogies and Natural Language Tasks ☆252 · Oct 12, 2025 · Updated 4 months ago
- Repository to manage document-based translation with OmegaT ☆18 · Nov 1, 2024 · Updated last year
- Instruct-tune LLaMA on consumer hardware ☆18,964 · Jul 29, 2024 · Updated last year
- SemClinBr - a multi-institutional and multi-specialty semantically annotated corpus for Portuguese clinical NLP tasks ☆31 · Mar 12, 2024 · Updated last year
- Brazilian Tertiary Care Dataset ☆14 · Dec 14, 2022 · Updated 3 years ago
- ☆40 · Mar 25, 2023 · Updated 2 years ago
- A Japanese finetuned instruction LLaMA ☆128 · Mar 20, 2023 · Updated 2 years ago
- Alpaca dataset from Stanford, cleaned and curated ☆1,582 · Apr 14, 2023 · Updated 2 years ago
- Stylometric framework in Python ☆17 · Apr 9, 2015 · Updated 10 years ago
- Code and documentation to train Stanford's Alpaca models, and generate the data. ☆30,267 · Jul 17, 2024 · Updated last year
- ☆23 · Feb 11, 2026 · Updated 3 weeks ago
- Related resources to the paper RoBERTaLexPT: A Legal RoBERTa Model pretrained with deduplication for Portuguese. ☆20 · Mar 14, 2024 · Updated last year
- ☆20 · Jul 5, 2013 · Updated 12 years ago
- An experimental desktop client for using Claude Desktop's MCP with Novelcrafter codices. ☆10 · Dec 3, 2024 · Updated last year
- Consider is a parser for the ThinkGear protocol used by NeuroSky devices (MindSet, BrainBand and others). ☆16 · Apr 3, 2012 · Updated 13 years ago
- ☆14 · Apr 26, 2025 · Updated 10 months ago
- Code for training and evaluating T5 on Portuguese data. ☆90 · Dec 8, 2022 · Updated 3 years ago
- A POC Neo4j Application about the News ☆16 · Aug 15, 2013 · Updated 12 years ago
- A simple node.js wrapper for Stanford CoreNLP. ☆10 · Aug 7, 2014 · Updated 11 years ago
- ☆16 · Oct 16, 2013 · Updated 12 years ago
- CROMER (CROss-document Main Events and entities Recognition) is a tool for cross-document coreference ☆12 · Jan 14, 2015 · Updated 11 years ago
- ☆11 · Dec 10, 2022 · Updated 3 years ago
- Sharedb PubSub module based on ws-bus (websocket bus) ☆10 · Jan 12, 2017 · Updated 9 years ago
- A proselint linter for use with Phabricator's arc command line tool. ☆17 · Jun 17, 2016 · Updated 9 years ago
- Supporting code for the paper "Portuguese Language Models and Word Embeddings: Evaluating on Semantic Similarity Tasks". ☆11 · Dec 8, 2022 · Updated 3 years ago