smallflyingpig/speech-to-image-translation-without-text

Code for paper "direct speech-to-image translation"

☆26

Alternatives and similar repositories for speech-to-image-translation-without-text

Users interested in speech-to-image-translation-without-text are comparing it to the repositories listed below.

smallflyingpig / learning-to-fool-the-speaker-recognition
code for paper "learning to fool the speaker recognition"
☆10 · Jun 12, 2020 · Updated 6 years ago
yuqing-liu-dut / ISRN
Iterative Network for Image Super-Resolution (TMM)
☆20 · Nov 26, 2021 · Updated 4 years ago
duyichao / E2E-ST-TDA
Official implementation of AAAI'2022 paper "Regularizing End-to-End Speech Translation with Triangular Decomposition Agreement"
☆17 · Dec 23, 2021 · Updated 4 years ago
NovelAI / k-diffusion-multigen
Karras et al. (2022) diffusion models for PyTorch
☆18 · Oct 5, 2023 · Updated 2 years ago
smallflyingpig / universal_adversarial_perturbation_generative_network_for_speaker_recognition
code for paper "Universal Adversarial Perturbations Generative Network for Speaker Recognition"
☆23 · Nov 23, 2020 · Updated 5 years ago
cloneofsimo / project_RF
☆24 · Jun 4, 2024 · Updated 2 years ago
itmo-mbss-lab / sr_labs_book
The project is related to the development of labs for the ITMO Speaker Recognition Course.
☆16 · Jul 3, 2026 · Updated 3 weeks ago
aws-samples / content-based-item-recommender
☆10 · Apr 2, 2024 · Updated 2 years ago
sooftware / speech-paper-review
Review of papers I read
☆14 · Dec 11, 2020 · Updated 5 years ago
Sreyan88 / Disfluency-Detection-with-Span-Classification
This repository contains the implementation of the paper: "Span Classification with Structured Information for Disfluency Detection in Sp…
☆14 · Jun 6, 2023 · Updated 3 years ago
bzhangGo / st_from_scratch
Revisiting End-to-End Speech-to-Text Translation From Scratch
☆13 · Feb 21, 2023 · Updated 3 years ago
jefflai108 / Semi-Supervsied-Spoken-Language-Understanding-PyTorch
Semi-supervised spoken language understanding (SLU) via self-supervised speech and language model pretraining
☆12 · Mar 23, 2021 · Updated 5 years ago
giakoumoglou / rrd
PyTorch implementation of RRD: https://arxiv.org/abs/2407.12073
☆15 · Dec 2, 2025 · Updated 7 months ago
deepconsc / SplitSR
SplitSR: An End-to-End Approach to Super-Resolution on Mobile Devices (Unofficial Implementation)
☆28 · Jan 24, 2021 · Updated 5 years ago
arijitx / Amazon-Satelite-Image-Labeling
This is my CS 763 Computer Vision course project. Here we try to label Amazon satellite images. Here we try to implement the Show and Tel…
☆12 · May 10, 2018 · Updated 8 years ago
speech-paper-reading / speech-paper-reading
Repository for speech paper reading
☆33 · Aug 19, 2021 · Updated 4 years ago
tunib-ai / transformers
🚀 Implementation of easy-to-use 3D parallelism based on Huggingface Transformers & Microsoft DeepSpeed
☆31 · Feb 5, 2022 · Updated 4 years ago
Shiyang-Yan / Discrete-continous-PG-for-Retrieval
☆13 · Feb 1, 2022 · Updated 4 years ago
Ha0Tang / ASGAN
[FG 2019 Oral] Attribute-Guided Sketch Generation
☆10 · Jul 25, 2021 · Updated 5 years ago
tashapiro / predicting-song-music-genre
What part of a song is better at determining its music genre: the music (audio features) or the lyrics (NLP)?
☆14 · Jan 2, 2023 · Updated 3 years ago
ZurichNLP / domain-robustness
☆13 · Dec 11, 2020 · Updated 5 years ago
qcri / Arabic_speech_code_switching
The first Dialectal Arabic Code Switching - DACS corpus from broadcast speech. Annotated at the token-level, considering both the linguis…
☆15 · Apr 3, 2022 · Updated 4 years ago
wagenaartje / neuraldino
Neuraldino is a neural AI that learns to play Google's offline T-rex game from any player in the world.
☆14 · Jun 12, 2017 · Updated 9 years ago
LeadingIndiaAI / -IMAGE-TO-SPEECH-CONVERTOR-
The aim of the project was to convert an image to speech. An image is processed and segmented to identify the text in the image. Then the…
☆13 · Sep 12, 2018 · Updated 7 years ago
SeongokRyu / my-study-materials
☆13 · Jul 4, 2020 · Updated 6 years ago
yenchenlin / evf-public
Experience-embedded Visual Foresight, CoRL 2019
☆14 · Nov 13, 2019 · Updated 6 years ago
CAU-ISS-Lab / AIGT-Detection-Evade-Detection
☆13 · Sep 1, 2025 · Updated 10 months ago
EnkrateiaLucca / audio_transcription_app_version_2
A simple audio transcription app with Gradio and Whisper
☆14 · Apr 23, 2023 · Updated 3 years ago
OmarMedhat22 / Sound-Classification-Short-Time-Fourier-Transform-STFT
☆15 · May 28, 2020 · Updated 6 years ago
CAU-ISS-Lab / Cross-Modal-Steganography
☆12 · Mar 3, 2025 · Updated last year
BAI-Yeqi / SF2F_PyTorch
☆16 · Apr 27, 2025 · Updated last year
ctrlflowjs / ctrlflow
An app dev framework for no-code user automations
☆12 · Jan 29, 2023 · Updated 3 years ago
Mrunal-G / Casual-turn-taking-and-backchannel-prediction
☆16 · Jun 25, 2024 · Updated 2 years ago
CAU-ISS-Lab / Text-steganalysis
☆15 · Jul 3, 2026 · Updated 3 weeks ago
flyywh / VCM_resources
☆20 · Nov 14, 2023 · Updated 2 years ago
LVYUERLVR / OutboundEval-Xbench
OutboundEval, a comprehensive benchmark for evaluating large language models (LLMs) in expert-level intelligent outbound calling scenario…
☆17 · Oct 28, 2025 · Updated 9 months ago
msalhab96 / RNN-Transducer
PyTorch implementation of Sequence Transduction with Recurrent Neural Networks (RNN-T) speech recognition paper
☆16 · Mar 4, 2022 · Updated 4 years ago
warnikchow / kosp2e
Korean Speech to English Translation Corpus
☆45 · Sep 3, 2021 · Updated 4 years ago
felixSchober / Defect-Prediction
Defect prediction of java projects using neural networks.
☆15 · Jun 28, 2017 · Updated 9 years ago