MohamedAliRashad / youtube-audio-collector
Simple script to collect code-switching audio and caption data
☆8Updated 9 months ago
Alternatives and similar repositories for youtube-audio-collector
Users who are interested in youtube-audio-collector are comparing it to the repositories listed below
- This is the official repository for Peacock: A Family of Arabic Multimodal Large Language Models and Benchmarks.☆25Updated 5 months ago
- Code-Switched translations with Large Language models☆17Updated 5 months ago
- Python interface for evaluation on ChatGPT models☆19Updated last year
- The official implementation of CATT Arabic diacritization models.☆44Updated 4 months ago
- ☆9Updated 4 months ago
- Aranizer: A Custom Tokenizer based on SentencePiece and BPE tailored for Arabic Language Modeling☆20Updated 9 months ago
- ☆41Updated 3 weeks ago
- Notebooks and scripts that showcase running quantized diffusion models on consumer GPUs☆38Updated 6 months ago
- Instruction dataset for Arabic with 10,000 instruction and output pairs. CIDAR can be used to fine-tune LLMs to follow instructions.☆40Updated last month
- Code, models, and data for "Advancements in Arabic Grammatical Error Detection and Correction: An Empirical Investigation". EMNLP 2023.☆16Updated 8 months ago
- ☆25Updated 3 months ago
- 🚀 Framework for seamless fine-tuning of Whisper model on a multi-lingual dataset and deployment to prod.☆27Updated 2 months ago
- Fine-tune Gemma 3 on an object detection task☆20Updated this week
- A framework for Arabic spelling correction using different seq2seq model architectures such as transformers and RNNs☆21Updated 9 months ago
- ☆42Updated 9 months ago
- Quantization of LLMs and benchmarking.☆10Updated last year
- A streaming whisper server for on-prem transcription☆20Updated 9 months ago
- Which model is the best at object detection? Which is best for small or large objects? We compare the results in a handy leaderboard.☆70Updated this week
- This repository demonstrates various examples using YOLO☆13Updated last year
- ☆12Updated 7 months ago
- Whisper finetuned on VinBigdata-VLSP2020-100h + KenLM☆39Updated last year
- This repo is a semantic search app for searching over Quran tafsir books☆24Updated 10 months ago
- ☆15Updated 2 months ago
- A collection of notebooks for the Hugging Face blog series (https://huggingface.co/blog).☆45Updated 9 months ago
- Provide Gradio custom components to make the diarization-based audio labeling process easier and faster.☆62Updated last month
- ☆112Updated 5 months ago
- Arabic deep-learning based diacritization models (Shakkala, Shakkelha) ported to PyTorch☆14Updated last year
- ☆30Updated last week
- A Python package made to generate sequences (greedy and beam-search) from PyTorch (not necessarily HF transformers) models.☆17Updated this week
- Composition of Multimodal Language Models From Scratch☆14Updated 9 months ago