Nkluge-correa / TeenyTinyLlama
A pair of tiny foundational models trained in Brazilian Portuguese. 🦙🦙
★26 · Updated last month
Related projects
Alternatives and complementary repositories for TeenyTinyLlama
- ★46 · Updated 9 months ago
- Efficient few-shot learning with cross-encoders. ★40 · Updated 9 months ago
- Code and data for "StructLM: Towards Building Generalist Models for Structured Knowledge Grounding" (COLM 2024) ★68 · Updated last month
- A Natural Portuguese Language Benchmark (Napolab) for the evaluation of language models. ★64 · Updated 2 months ago
- Using short models to classify long texts ★20 · Updated last year
- A BERT-based application for reusable text classification at scale ★37 · Updated last year
- A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers. ★32 · Updated this week
- High level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ⚡ ★62 · Updated 2 weeks ago
- 🤗 Disaggregators: Curated data labelers for in-depth analysis. ★65 · Updated last year
- Using open source LLMs to build synthetic datasets for direct preference optimization ★40 · Updated 8 months ago
- Search through Facebook Research's PyTorch BigGraph Wikidata-dataset with the Weaviate vector search engine ★31 · Updated 2 years ago
- GLiNER model in a FastAPI microservice. ★30 · Updated 3 weeks ago
- ★13 · Updated 10 months ago
- Unofficial python bindings for the rust llm library. 🐍❤️🦀 ★73 · Updated last year
- Versatile framework designed to streamline the integration of your models, as well as those sourced from Hugging Face, into complex progr… ★23 · Updated 3 months ago
- Repository containing the SPIN experiments on the DIBT 10k ranked prompts ★23 · Updated 8 months ago
- 🔥 Use Hugging Face text and token classification pipelines directly in spaCy ★62 · Updated 8 months ago
- Code and data to evaluate LLMs on the ENEM, the main standardized Brazilian university admission exams. ★36 · Updated 11 months ago
- 🤗 Trade any tensors over the network ★30 · Updated last year
- utilities for loading and running text embeddings with onnx ★39 · Updated 3 months ago
- ★11 · Updated last year
- Data and evaluation code for the paper WikiNEuRal: Combined Neural and Knowledge-based Silver Data Creation for Multilingual NER (EMNLP 2… ★66 · Updated last year
- Repository for deepdoctection tutorial notebooks ★39 · Updated 4 months ago
- Chunk your text using gpt4o-mini more accurately ★42 · Updated 3 months ago
- 💫 SpaCy wrapper for ConceptNet 💫 ★88 · Updated last year
- ★68 · Updated 8 months ago
- FaQuAD reading comprehension dataset and related code to reproduce experiments from Sayama et al. (BRACIS 2019). ★8 · Updated 2 years ago
- Generalist and Lightweight Model for Text Classification ★51 · Updated last week
- This repository contains an easy and intuitive approach to use SetFit in combination with spaCy. ★72 · Updated last year
- Scripts to convert datasets from various sources to Hugging Face Datasets. ★58 · Updated 2 years ago