OpenLLM-France / Lit-Claire
Continual pretraining of foundation LLMs using ⚡ Lightning Fabric
☆36Updated 7 months ago
Alternatives and similar repositories for Lit-Claire
Users who are interested in Lit-Claire compare it to the libraries listed below
- Blindly query two conversational language models on tasks expressed in French and compare the results.☆37Updated this week
- Scripts for training Kaldi for German speech recognition (ASR).☆24Updated 4 years ago
- ☆76Updated 3 years ago
- A PyTorch Lightning Callback for pushing models to the Hugging Face Hub 🤗⚡️☆36Updated 3 years ago
- Softcatalà neural translation models☆18Updated 5 months ago
- Coqui Inference Engine☆40Updated 3 years ago
- AI Energy Score: Initiative to establish comparable energy efficiency ratings for AI models.☆31Updated 2 months ago
- Crowd-sourced lists of urls to help Common Crawl crawl under-resourced languages. See https://github.com/commoncrawl/web-languages-code/ …☆45Updated this week
- Tokenization across languages. Useful as preprocessing for subword tokenization.☆22Updated 2 years ago
- Tracking instruction-tuned LLM openness. Paper: Liesenfeld, Andreas, Alianda Lopez, and Mark Dingemanse. 2023. “Opening up ChatGPT: Track…☆118Updated 3 months ago
- Scripts to convert datasets from various sources to Hugging Face Datasets.☆57Updated 2 years ago
- Scansion tool for Spanish texts☆12Updated last year
- Local emulator for Hugging Face Inference Endpoints custom handlers☆26Updated last year
- **ARCHIVED** Filesystem interface to 🤗 Hub☆58Updated 2 years ago
- A small rust-based data loader☆29Updated 2 weeks ago
- Execute arbitrary SQL queries on 🤗 Datasets☆32Updated last year
- MediaWiki Categories Model☆13Updated last year
- A collection of building blocks for building fine-tunable metric learning models☆32Updated 2 months ago
- docker for HF wav2vec2-sprint☆13Updated 4 years ago
- Documentation effort for the BookCorpus dataset☆34Updated 4 years ago
- ☆56Updated 2 years ago
- Library for fast text representation and classification.☆30Updated last year
- Seed Machine Translation Data☆32Updated 7 months ago
- Extracts plain text, language identification and more metadata from WARC records☆22Updated 3 months ago
- image-to-text model for PDF.js☆41Updated 3 months ago
- Zero-shot Audio Classification using Whisper☆79Updated 2 years ago
- Automatic Test Generator☆12Updated 3 months ago
- Data pipeline that converts raw audio data to speech, which serves as the input dataset for a speech-to-text pipeline☆32Updated 2 years ago
- Hassle-free ML Pipelines on Kubernetes☆39Updated 2 years ago
- Experiments with Hugging Face 🔬 🤗☆44Updated 10 months ago