getalp / Flaubert
Unsupervised Language Model Pre-training for French
☆248 Updated last year
Alternatives and similar repositories for Flaubert:
Users who are interested in Flaubert are comparing it to the repositories listed below:
- spaCy + UDPipe ☆160 Updated 2 years ago
- NLP French language model implementing ULMFiT ☆87 Updated 5 years ago
- Custom French POS and lemmatizer based on Lefff for spaCy ☆66 Updated last year
- How good is BERT? Comparing BERT to other state-of-the-art approaches on a French sentiment analysis dataset ☆156 Updated 2 years ago
- ✒️ Cedille is a large French language model (6B), released under an open-source license ☆202 Updated 3 years ago
- A French sequence-to-sequence pretrained model ☆57 Updated 2 years ago
- 💥 Use the latest Stanza (StanfordNLP) research models directly in spaCy ☆730 Updated 6 months ago
- DBMDZ BERT, DistilBERT, ELECTRA, GPT-2 and ConvBERT models ☆156 Updated 2 years ago
- Anonymization of legal cases (Fr) based on Flair embeddings ☆88 Updated 4 years ago
- spaCy pipeline object for negating concepts in text ☆279 Updated 8 months ago
- A spaCy wrapper of Entity-Fishing (component) for named entity disambiguation and linking on Wikidata ☆157 Updated 2 years ago
- A single model that parses Universal Dependencies across 75 languages. Given a sentence, jointly predicts part-of-speech tags, morphology… ☆221 Updated 2 years ago
- Text tokenization and sentence segmentation (segtok v2) ☆201 Updated 2 years ago
- 🧪 Cutting-edge experimental spaCy components and features ☆96 Updated 9 months ago
- ✔️ Contextual word checker for better suggestions (not actively maintained) ☆413 Updated 3 weeks ago
- 🌸 fastText + Bloom embeddings for compact, full-coverage vectors with spaCy ☆309 Updated last year
- BERTje is a Dutch pre-trained BERT model developed at the University of Groningen. (EMNLP Findings 2020) "What’s so special about BERT’s … ☆136 Updated 2 years ago
- 🍳 Recipes for Prodigy, our fully scriptable annotation tool ☆488 Updated 6 months ago
- PYthon Automated Term Extraction ☆309 Updated 2 years ago
- SEM, a free NLP tool relying on machine learning technologies, especially CRFs. ☆24 Updated 3 years ago
- Information extraction from English and German texts based on predicate logic ☆388 Updated 2 years ago
- Question answering annotation platform ☆90 Updated last month
- Named Entity Recognition data for Europeana Newspapers ☆171 Updated last year
- Bicleaner is a parallel corpus classifier/cleaner that aims at detecting noisy sentence pairs in a parallel corpus. ☆154 Updated 8 months ago
- Fuzzy matching and more functionality for spaCy. ☆254 Updated 7 months ago
- UDPipe: Trainable pipeline for tokenizing, tagging, lemmatizing and parsing Universal Treebanks and other CoNLL-U files ☆375 Updated 2 months ago
- SpikeX - spaCy Pipes for Knowledge Extraction ☆397 Updated 3 years ago
- LASER multilingual sentence embeddings as a pip package ☆224 Updated last year
- Obtain Word Alignments using Pretrained Language Models (e.g., mBERT) ☆357 Updated last year
- Communication about the pseudonymization engine of the Cour de Cassation ☆18 Updated 2 years ago