srvk/how2-dataset

This repository contains code and metadata for the How2 dataset.

☆192

Alternatives and similar repositories for how2-dataset

Users that are interested in how2-dataset are comparing it to the libraries listed below.

ImperialNLP / MMT-Delib
☆10 · Dec 21, 2022 · Updated 3 years ago
dan-wells / kiss-aligner
Simple Kaldi recipe for forced alignment
☆11 · Jul 16, 2023 · Updated 3 years ago
HLTCHKUST / VG-GPLMs
The code repository for EMNLP 2021 paper "Vision Guided Generative Pre-trained Language Models for Multimodal Abstractive Summarization".
☆57 · Jan 14, 2022 · Updated 4 years ago
ZNLP / ZNLP-Dataset
☆31 · Jul 23, 2025 · Updated last year
isl-mt / SLT.KIT
Spoken Language Translation System
☆20 · Jul 26, 2021 · Updated 4 years ago
how2sign / how2sign-data
Scripts to download and explore the How2Sign dataset. If you have any questions, please contact: amanda.duarte@upc.edu
☆27 · Jan 25, 2023 · Updated 3 years ago
jayleicn / TVCaption
[ECCV 2020] PyTorch code of MMT (a multimodal transformer captioning model) on TVCaption dataset
☆91 · Sep 6, 2023 · Updated 2 years ago
ustctf-zz / delibnet
☆14 · Nov 16, 2022 · Updated 3 years ago
forkarinda / MFN
Multistage Fusion with Forget Gate for Multimodal Summarization in Open-Domain Videos
☆12 · Oct 8, 2020 · Updated 5 years ago
ufal / MLASK
EACL 2023 paper "MLASK: Multimodal Summarization of Video-based News Articles"
☆11 · Nov 7, 2023 · Updated 2 years ago
RuABraun / texterrors
☆37 · Jun 9, 2026 · Updated last month
alumae / streaming-punctuator
☆17 · Apr 14, 2023 · Updated 3 years ago
georgesterpu / Taris
Transformer-based online speech recognition system with TensorFlow 2
☆26 · Jan 22, 2021 · Updated 5 years ago
CoEDL / kaldi_helpers
A set of scripts to use in preparing a corpus for speech-to-text processing with the Kaldi Automatic Speech Recognition Library.
☆15 · May 19, 2020 · Updated 6 years ago
XL2248 / VHM
Code for the ACL2022 main conference paper "A Variational Hierarchical Model for Neural Cross-Lingual Summarization"
☆18 · Sep 5, 2022 · Updated 3 years ago
darthgera123 / Multimodal-Summarization
Summarization of Multimodal articles
☆10 · Oct 14, 2022 · Updated 3 years ago
iriscxy / VMSMO
Official code and dataset link for "VMSMO: Learning to Generate Multimodal Summary for Video-based News Articles"
☆36 · Jul 30, 2021 · Updated 4 years ago
isl-mt / fluent-fisher
☆15 · Jun 17, 2019 · Updated 7 years ago
amankhullar / mast
Code for the paper Multimodal Abstractive Summarization with Trimodal Hierarchical Attention
☆20 · Jan 25, 2022 · Updated 4 years ago
sign-language-processing / datasets
TFDS data loaders for sign language datasets.
☆108 · Feb 9, 2026 · Updated 5 months ago
kate-egorova / ASR-hybrid-decoding
This is an extension of kaldi speech recognition software which allows to perform decoding of speech with hybrid word and phoneme graphs.…
☆11 · Feb 4, 2020 · Updated 6 years ago
lium-lst / nmtpytorch
Sequence-to-Sequence Framework in PyTorch
☆392 · Jan 5, 2023 · Updated 3 years ago
motazsaad / ara-pronunciation-tool
A python tool that converts Arabic diacritised text to a sequence of phonemes and creates a pronunciation dictionary. This code is based …
☆15 · Sep 5, 2017 · Updated 8 years ago
alicank / Translation-Augmented-LibriSpeech-Corpus
Large scale (>200h) and publicly available read audio book corpus. This corpus is an augmentation of LibriSpeech ASR Corpus (1000h) and c…
☆44 · Jul 9, 2022 · Updated 4 years ago
eric-xw / Video-guided-Machine-Translation
Starter code for the VMT task and challenge
☆51 · Jul 29, 2020 · Updated 5 years ago
h-munakata / Lighthouse-Wrapper-for-Audio-Moment-Retrieval
☆13 · Mar 23, 2026 · Updated 4 months ago
markusdr / transducersaurus
Automatically exported from code.google.com/p/transducersaurus
☆11 · Apr 1, 2015 · Updated 11 years ago
bzhangGo / zero
Zero -- A neural machine translation system
☆152 · May 8, 2023 · Updated 3 years ago
slSeanWU / beats-conformer-bart-audio-captioner
PyTorch implementation of the ICASSP-24 paper: "Improving Audio Captioning Models with Fine-grained Audio Features, Text Embedding Superv…
☆41 · Jan 6, 2024 · Updated 2 years ago
idiap / inv-tn
A bunch of scripts exploiting several tools to perform inverse text normalization (ITN)
☆21 · Sep 27, 2017 · Updated 8 years ago
berniebear / Multi-HT100M
☆53 · Dec 6, 2021 · Updated 4 years ago
NTRLab / MediaSpeech
☆22 · Jul 22, 2022 · Updated 4 years ago
jefflai108 / Semi-Supervsied-Spoken-Language-Understanding-PyTorch
Semi-supervised spoken language understanding (SLU) via self-supervised speech and language model pretraining
☆12 · Mar 23, 2021 · Updated 5 years ago
xinjli / asr2k
asr2k
☆51 · Jun 2, 2024 · Updated 2 years ago
LuoweiZhou / YouCook2-Leaderboard
A one-stop shop for YouCook2 info such as leaderboard and recent advances on (cooking) video retrieval and captioning.
☆41 · Jun 29, 2022 · Updated 4 years ago
mzboito / IWSLT2022_Tamasheq_data
Repository for sharing the data in the Tamasheq language, one of the target languages for the low-resource speech translation track at IW…
☆18 · Nov 30, 2022 · Updated 3 years ago
burrmill / burrmill
BurrMill core
☆22 · Nov 2, 2021 · Updated 4 years ago
igormq / speech2text
☆12 · Feb 9, 2021 · Updated 5 years ago
Dod-o / VT-SSum
☆23 · Jul 13, 2021 · Updated 5 years ago