Unsupervised Language Model Pre-training for French
☆247Apr 11, 2023Updated 2 years ago
Alternatives and similar repositories for Flaubert
Users who are interested in Flaubert are comparing it to the libraries listed below
- Data from the Sequoia treebank.☆11Feb 19, 2026Updated 2 weeks ago
- NLP French language model implementing ULMFiT☆87Mar 18, 2019Updated 6 years ago
- Communication about the pseudonymization engine of the Cour de Cassation☆18Feb 14, 2023Updated 3 years ago
- McKernel: A Library for Approximate Kernel Expansions in Log-linear Time.☆13Sep 3, 2022Updated 3 years ago
- A French sequence-to-sequence pretrained model☆63Aug 27, 2022Updated 3 years ago
- UFSAC is a resource containing all WordNet Sense Annotated Corpora, and a Java library for manipulating them☆38May 17, 2022Updated 3 years ago
- French Machine Reading for Question Answering☆18Sep 21, 2022Updated 3 years ago
- SEM, a free NLP tool relying on machine learning technologies, especially CRFs.☆23Dec 1, 2021Updated 4 years ago
- A collection of over 1.5 million tweets translated to French, with their sentiment.☆35May 18, 2017Updated 8 years ago
- Weighted multiple-instance learning algorithm based on stochastic gradient descent☆12Feb 22, 2019Updated 7 years ago
- 🇧🇪 BelGPT-2: the 1st GPT model pretrained in French.☆34Feb 24, 2021Updated 5 years ago
- Factorization of the neural parameter space for zero-shot multi-lingual and multi-task transfer☆39Sep 22, 2020Updated 5 years ago
- Multilingual speech translation☆41Apr 15, 2021Updated 4 years ago
- 📧 Melusine: Use Python to automate your email processing workflow☆363Feb 26, 2026Updated last week
- ✒️ Cedille is a large French language model (6B), released under an open-source license☆204Feb 9, 2022Updated 4 years ago
- Parsed income tax calculator☆15Feb 19, 2020Updated 6 years ago
- A simple frontend for https://github.com/etalab/csvapi☆37Feb 13, 2026Updated 3 weeks ago
- Anonymization of legal cases (Fr) based on Flair embeddings☆88Dec 9, 2020Updated 5 years ago
- R package for Byte Pair Encoding based on YouTokenToMe☆16Sep 5, 2025Updated 6 months ago
- Project dense-vector text representations onto a 2D plane☆16Mar 7, 2019Updated 7 years ago
- Small examples showing how to use Odin for various IE tasks☆16Jun 1, 2017Updated 8 years ago
- Streamlit apps on Cloud Run with Identity-Aware Proxy (IAP).☆10Mar 5, 2022Updated 4 years ago
- Twitter Discovery: Search articles referenced in your tweets, retweets, and favorites☆16Jun 16, 2020Updated 5 years ago
- ☆25May 11, 2024Updated last year
- Disambiguate is a tool for training and using state-of-the-art neural WSD models☆60Jul 12, 2025Updated 7 months ago
- A word2vec negative sampling implementation with correct CBOW update.☆261Nov 8, 2021Updated 4 years ago
- ☆43Jan 3, 2022Updated 4 years ago
- Efficient learning of word representations☆22Feb 15, 2021Updated 5 years ago
- Fast IdEntification of State-of-The-Art models using adaptive bandit algorithms☆14Jul 15, 2022Updated 3 years ago
- Extract structured data online☆12Jan 19, 2026Updated last month
- Docker image: DNS over HTTPS proxy☆11Jun 26, 2020Updated 5 years ago
- A tokenizer for French☆14Apr 18, 2013Updated 12 years ago
- Hierarchical State Machine for Unity☆14Sep 11, 2021Updated 4 years ago
- Code and experiments for the COLING2020 paper "Conception: Multilingually-Enhanced, Human-Readable Concept Vector Representations".☆11Dec 9, 2020Updated 5 years ago
- A more generic version of https://github.com/dataarts/armsglobe for visualizing source/destination data☆10Jul 5, 2016Updated 9 years ago
- Pixano website☆10Apr 7, 2022Updated 3 years ago
- "Data science" course at Mines ParisTech (2019-2020)☆23Jul 1, 2020Updated 5 years ago
- Parse and convert numbers written in French, English, Spanish, Portuguese, German and Catalan into their digit representation.☆113Jan 12, 2026Updated last month
- ☆56Feb 23, 2024Updated 2 years ago