Nkluge-correa / TeenyTinyLlama
A pair of tiny foundational models trained in Brazilian Portuguese.
★35 Updated 4 months ago
Alternatives and similar repositories for TeenyTinyLlama
Users who are interested in TeenyTinyLlama are comparing it to the repositories listed below.
- ★47 Updated last year
- Code and data to evaluate LLMs on the ENEM, the main standardized Brazilian university admission exams. ★47 Updated 5 months ago
- A Natural Portuguese Language Benchmark (Napolab) for the evaluation of language models. ★67 Updated 2 months ago
- ★29 Updated last year
- FaQuAD reading comprehension dataset and related code to reproduce experiments from Sayama et al. (BRACIS 2019). ★8 Updated 2 years ago
- ★11 Updated 3 years ago
- Use Hugging Face text and token classification pipelines directly in spaCy ★63 Updated last year
- ★40 Updated 2 years ago
- Code for training and evaluating T5 on Portuguese data. ★86 Updated 2 years ago
- Universal text classifier for generative models ★24 Updated 9 months ago
- ToK aka Tree of Knowledge for Large Language Models LLM. It's a novel dataset that inspires knowledge symbolic correlation in simple inpu… ★52 Updated last year
- Finetuning Stanford Alpaca (LLaMA) with Brazilian Portuguese data ★39 Updated 2 years ago
- Unofficial python bindings for the rust llm library. ★75 Updated last year
- Transformer model for Portuguese language (Brazil pt_BR) ★16 Updated last year
- Optimus is a flexible and scalable framework built to train language models efficiently across diverse hardware configurations, including… ★54 Updated last month
- ★17 Updated last year
- A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers. ★34 Updated last week
- Code and data for "StructLM: Towards Building Generalist Models for Structured Knowledge Grounding" (COLM 2024) ★76 Updated 6 months ago
- ★43 Updated 3 months ago
- A collection of datasets for language model pretraining including scripts for downloading, preprocessing, and sampling. ★59 Updated 9 months ago
- ★35 Updated 2 years ago
- Code for our paper Resources and Evaluations for Multi-Distribution Dense Information Retrieval ★14 Updated last year
- ★22 Updated last month
- Data and evaluation code for the paper WikiNEuRal: Combined Neural and Knowledge-based Silver Data Creation for Multilingual NER (EMNLP 2… ★67 Updated 2 years ago
- Pre-train Static Word Embeddings ★60 Updated last month
- StAtutory Reasoning Assessment ★13 Updated 2 years ago
- HuggingFace Inference Toolkit for Google Cloud Vertex AI (similar to SageMaker's Inference Toolkit, but for Vertex AI and unofficial) ★17 Updated last year
- This repository contains an easy and intuitive approach to use SetFit in combination with spaCy. ★79 Updated last year
- Efficient few-shot learning with cross-encoders. ★52 Updated last year
- Some simple scripts that I use day-to-day when working with LLMs and Huggingface Hub ★162 Updated last year