Maximax67/Words-CEFR-Dataset

A dataset mapping English words to CEFR levels based on the CEFR-J dataset, word lemmas, stems, parts of speech (POS), and frequency data from the N-Gram Google dataset. Ideal for NLP tasks, language proficiency assessment, and linguistic research.

☆85

Alternatives and similar repositories for Words-CEFR-Dataset

Users who are interested in Words-CEFR-Dataset are comparing it to the repositories listed below.

kanishkg / boxing-gym
☆12 · Jul 30, 2025 · Updated 11 months ago
NLP-Knowledge-Graph / NLP-KG-WebApp
The official repository for the NLP-KG web application [ACL 2024 Demo].
☆14 · Oct 16, 2025 · Updated 9 months ago
hljodbokasafnid / Ascanius
Automates the creation of full-text (sound and text) ebooks in epub/epub3/daisy format, the webserver/client creates smil files to sync a…
☆10 · Nov 12, 2021 · Updated 4 years ago
llt22 / coca-vocabulary-20000
coca-vocabulary-20000
☆335 · Mar 7, 2025 · Updated last year
namtuanly / WikiTableSet
WikiTableSet: A largest publicly available image-based table recognition dataset in three languages built from Wikipedia
☆32 · Jun 12, 2025 · Updated last year
robbiemu / llama-gguf-optimize
Scripts and tools for optimizing quantizations in llama.cpp with GGUF imatrices.
☆19 · Jan 10, 2025 · Updated last year
WSE-research / LinguaF
Python package for calculating famous measures in computational linguistics
☆15 · Jun 29, 2026 · Updated 3 weeks ago
akazwz / mlx-live
A local real-time voice assistant example based on MLX.
☆34 · Apr 8, 2026 · Updated 3 months ago
KYLN24 / CritiQ
Repository of the paper "CritiQ: Mining Data Quality Criteria from Human Preferences". Code for CritiQ Flow & Training CritiQ Scorer.
☆22 · Dec 11, 2025 · Updated 7 months ago
YuyaoZhangQAQ / QCompiler
This repository contains the code for the paper “Neuro-Symbolic Query Compiler”, accepted to the Findings of ACL 2025.
☆17 · Oct 20, 2025 · Updated 9 months ago
manhdh32 / 1st_kalapa_ocr
☆11 · Jan 1, 2024 · Updated 2 years ago
tonghejiao / canvas-mindmap-keyboard
MindMap for Obsidian Canvas: Tab/Enter/Arrow driven, auto-layout and auto-size.
☆16 · Oct 23, 2025 · Updated 9 months ago
ishanShahzad / Google-meet-bot-record-audio-and-transcription
Google Meet bot deployed on DigitalOcean that joins meetings from Google Calendar and records audio and transcription.
☆10 · Aug 4, 2021 · Updated 4 years ago
tud-hri / joan
JOAN is a software package that allows to perform human-in-the-loop experiments in the open source driving simulator CARLA. JOAN facilit…
☆19 · Feb 3, 2025 · Updated last year
jpmcair / tweetfinsent
TweetFinSent: A Dataset of Stock Sentiments on Twitter
☆13 · Jul 7, 2022 · Updated 4 years ago
302ai / 302_vector_graphics_generation
🖼️🤖 302 Vector Graphics Generation! 🚀✨
☆17 · Aug 26, 2025 · Updated 10 months ago
kadirnar / fast-dacvae
☆20 · Mar 17, 2026 · Updated 4 months ago
alby13 / NVIDIA-Nemo-Parakeet-TDT-0-6B-V2-Audio-to-Text
NVIDIA Nemo Parakeet TDT 0.6B V2 Audio to Text Python Script
☆20 · May 8, 2025 · Updated last year
Vexa-ai / n8n
☆17 · May 10, 2025 · Updated last year
starrYYxuan / LeCo
This is the implementation of LeCo
☆33 · Jan 20, 2025 · Updated last year
dacsang97 / aigc
AIGC - AI-powered Git Commit Message Generator
☆29 · Dec 30, 2024 · Updated last year
dmisol / flexatar-virtual-webcam
Personalized Virtual Webcam for WebRTC
☆19 · Apr 20, 2026 · Updated 3 months ago
ducnt18121997 / Viet-Text-Normalization
A Python library for text normalization, specifically designed for Vietnamese and English text processing. This library provides comprehe…
☆14 · Mar 30, 2025 · Updated last year
gitkaz / mlx_gguf_server
This is a FastAPI based LLM server. Load multiple LLM models (MLX or llama.cpp) simultaneously using multiprocessing.
☆18 · Apr 8, 2026 · Updated 3 months ago
fishiatee / Tumera
Yet another frontend for LLM, written using .NET and WinUI 3
☆11 · Sep 14, 2025 · Updated 10 months ago
KevinAHM / soprano-web-onnx
☆16 · Jan 10, 2026 · Updated 6 months ago
AndreaBasile97 / Scholarpy
An ideal companion for PhD students! This tool is crafted to streamline academic research by wrapping around Semantic Scholar APIs. 🎓❤️
☆22 · Jun 17, 2024 · Updated 2 years ago
Xiaochr / LLM-AES
[LAK25] Human-AI Collaborative Essay Scoring: A Dual-Process Framework with LLMs
☆34 · Feb 22, 2025 · Updated last year
CodeByPinar / YouTube-Data-Analysis-Insights
🚀 Welcome to the YouTube Data Analysis and Insights project! 📊
☆18 · Sep 21, 2023 · Updated 2 years ago
wass08 / chatbot-kit-lite
☆16 · Dec 19, 2025 · Updated 7 months ago
supermemoryai / pipecat-memory
Add persistent memory to Pipecat voice AI agents
☆20 · Jan 23, 2026 · Updated 6 months ago
archiephan78 / ssi-stock-mcp-server
VN Stock intraday data MCP server (using SSI FastConnect API)
☆17 · Aug 6, 2025 · Updated 11 months ago
KyleBing / english-vocabulary
English words and vocabulary lists: CET-4 and CET-6, postgraduate entrance exam, and SAT words, as txt and json files
☆1,778 · Dec 31, 2025 · Updated 6 months ago
kamjin3086 / kokoro-onnx-fastapi
High-performance local speech synthesis API service built on kokoro-onnx, supporting Chinese and multiple languages, with a FastAPI interface and Docker deployment for one-click setup of a private TTS service.
☆16 · Jan 10, 2026 · Updated 6 months ago
thuy-le-ep / Vietnamese-data
Includes Vietnamese stop words, Vietnamese person names, Vietnam GIS (Geographic Information System) data, Vietnamese Dictionary ...
☆14 · Oct 18, 2017 · Updated 8 years ago
neosun100 / supertonic-tts-enhanced
Enhanced Supertonic TTS with Docker, FastAPI, Web UI, and comprehensive API documentation
☆21 · Dec 7, 2025 · Updated 7 months ago
sagemathinc / cocalc-desktop
This is the CoCalc Electron desktop application.
☆19 · Sep 30, 2022 · Updated 3 years ago
KanishqGandharv219 / N8N-Automation-Suite
Complete N8N workflow collection for automated lead generation, LinkedIn scraping, and AI-powered qualification. Includes ready-to-use wo…
☆17 · May 13, 2025 · Updated last year
kyr0 / fast-qwen-asr-inference-vllm
FastAPI to serve Qwen-ASR with streaming support. Tested. Benchmarked. Flash Attention 2. Fast & Stable.
☆15 · Jun 24, 2026 · Updated last month