Weird autoencoder experiments
☆24Apr 24, 2026Updated this week
Alternatives and similar repositories for owl-vaes
Users that are interested in owl-vaes are comparing it to the libraries listed below.
- Basic world models☆32Oct 30, 2025Updated 5 months ago
- Audio Entailment: Deductive Reasoning for Audio Understanding☆17Dec 10, 2024Updated last year
- Official PyTorch implementation of "MM-PoisonRAG: Disrupting Multimodal RAG with Local and Global Poisoning Attacks"☆13Dec 4, 2025Updated 4 months ago
- Arduino library for the Maxim DS1337 I2C RTC.☆11Aug 20, 2014Updated 11 years ago
- Code for ICLR 2024 Paper: CompA: Addressing the Gap in Compositional Reasoning in Audio-Language Models☆22Jul 10, 2024Updated last year
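- A small project that wraps Doubao ASR capabilities in an OpenAI-compatible interface, supports launching via Docker, and provides a reference correction prompt for use with Spokenly, achieving speech correction similar to Typeless.☆32Feb 28, 2026Updated 2 months ago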
- KUDA: Keypoints to Unify Dynamics Learning and Visual Prompting for Open-Vocabulary Robotic Manipulation☆22Apr 23, 2025Updated last year
- GStar Bootcamp - Assignment 1☆16Sep 8, 2025Updated 7 months ago
- Dataset, code and results repository for SBA-Net.☆14Sep 23, 2022Updated 3 years ago
- ☆42Jun 6, 2025Updated 10 months ago
- Code release for 'Struct2D: A Perception-Guided Framework for Spatial Reasoning in MLLMs' (NeurIPS 2025)☆30Oct 28, 2025Updated 6 months ago
- Searching for Music Mixing Graphs: A Pruning Approach☆25Feb 13, 2025Updated last year
- 5Hz Deep-Compression Speech VAE for AR-Diffusion and CALMs☆57Nov 19, 2025Updated 5 months ago
- poorman's ar-dit tts☆45Dec 31, 2025Updated 4 months ago
- Sample relational database load scripts and SQL queries for processing SNOMED CT-AU RF2 release files.☆18Jul 17, 2024Updated last year
- ☆45Updated this week
- ☆23Apr 30, 2025Updated last year
- Semantic Map Learning of Traffic Light to Lane Assignment based on Motion Data☆11Mar 30, 2024Updated 2 years ago
- The official implementation of AAAI2024 paper of "Scribble Hides Class: Promoting Scribble-based Semantic Segmentation with its Class Lab…☆17Oct 10, 2024Updated last year
- FlowMirror-HydraVox — A natively accelerated multi-head autoregressive TTS system derived from CosyVoice 3.0. It predicts multiple tokens…☆50Feb 17, 2026Updated 2 months ago
- Official implementation of 'A Large-Scale Exploration of mu-Transfer'☆32Jun 5, 2025Updated 10 months ago
- Notebooks for the EPFL class "Computers and Music".☆25Aug 20, 2021Updated 4 years ago
- Understanding and Tackling Hallucinations in Large Audio-Language Models | ICASSP 2025, Interspeech 2024☆34Mar 14, 2025Updated last year
- ☆35Nov 26, 2025Updated 5 months ago
- This repository collects papers related to Speech Tokenizer.☆18Oct 16, 2024Updated last year
- The official github repo for "Diffusion Language Models are Super Data Learners".☆228Nov 6, 2025Updated 5 months ago
- Code for the paper: Separate but together: Unsupervised Federated Learning for Speech Enhancement from non-iid data☆41Nov 1, 2021Updated 4 years ago
- A pytorch implementation of FFTNet.☆37Aug 31, 2018Updated 7 years ago
- ☆20Oct 14, 2024Updated last year
- Chinese version of VITS, an end-to-end TTS speech synthesis model (VITS Mandarin)☆15Sep 17, 2022Updated 3 years ago
- 📜 [CVPRw] SAM-PM: Enhancing Video Camouflaged Object Detection using Spatio-Temporal Attention, Muhammad Nawfal Meeran, Gokul Adethya T,…☆20Jun 17, 2024Updated last year
- The source code for target sound detection☆15Feb 26, 2022Updated 4 years ago
- Official Implementation for "ESCAPE: Encoding Super-keypoints for Category-Agnostic Pose Estimation", CVPR 2024.☆10Jun 17, 2024Updated last year
- ☆42Dec 20, 2025Updated 4 months ago
- Uses ctypes and libespeak-ng to transform text into IPA phonemes☆26Sep 20, 2023Updated 2 years ago
- A web based platform for collecting human actions in reinforcement learning environments☆31Sep 10, 2025Updated 7 months ago
- Repository of Near-field Acoustic Holography analysis with Convolutional Neural Networks☆11Dec 16, 2020Updated 5 years ago
- Implementation of SoundStream, an end-to-end neural audio codec☆32Jun 11, 2023Updated 2 years ago
- Official repo for: Epipolar Geometry Improves Video Generation Models☆90Oct 28, 2025Updated 6 months ago