JusperLee / Dolphin
☆164 Updated 4 months ago
Alternatives and similar repositories for Dolphin
Users who are interested in Dolphin are comparing it to the repositories listed below
- PyTorch Implementation of VersBand (EMNLP 2025): Versatile Framework for Song Generation with Prompt-based Control ☆224 Updated 5 months ago
- Code for paper "Self-Taught Recognizer: Toward Unsupervised Adaptation for Speech Foundation Models" ☆242 Updated last year
- [DCASE 2023] Official Implementation for "Low-Complexity Acoustic Scene Classification Using Deep Space Separable Distillation And Mutil-… ☆25 Updated last year
- LLaQo, a Large Language Query-based Coach in the domain of expressive performance ☆111 Updated 3 weeks ago
- Dataset and evaluation code of ISDrama (ACM-MM 2025): Immersive Spatial Drama Generation through Multimodal Prompting ☆236 Updated 5 months ago
- This is the code for Visual Reasoning Sequential Attack, which is a method to jailbreak Multimodal Large Language Models Based on their v… ☆64 Updated last week
- ☆14 Updated 8 months ago
- From Audio Encoders to Piano Judges: Benchmarking Performance Understanding for Solo Piano ☆76 Updated last year
- DExter: Learning and Controlling Performance Expression through Diffusion models ☆114 Updated last year
- Butter is a novel 2D object detection framework designed to enhance hierarchical feature representations for improved detection robustnes… ☆85 Updated 5 months ago
- Official code of ICML 2025 paper "NTPP: Generative Speech Language Modeling for Dual-Channel Spoken Dialogue via Next-Token-Pair Predicti… ☆135 Updated 3 months ago
- ☆45 Updated 3 months ago
- ☆279 Updated 9 months ago
- Official repo for the paper "Multimodal Phased Transformer for Sentiment Analysis". ☆183 Updated 4 months ago
- A reading list for trustworthy audio large language models. ☆114 Updated this week
- ☆50 Updated 10 months ago
- Open-source framework for automatic video annotation. ☆106 Updated 9 months ago
- Official code of the paper "MIND: Multi-rationale INtegrated Discriminative Reasoning Framework for Multi-modal Large Models" ☆35 Updated 2 months ago
- A PyTorch implementation of the paper "Provably Efficient Online RLHF with One-Pass Reward Modeling". This repository provides a flexible… ☆88 Updated 2 months ago
- RKAN: Residual Kolmogorov-Arnold Network is designed to enhance the performance of deep learning models. ☆274 Updated 3 months ago
- ☆391 Updated 9 months ago
- The code for TPAMI paper "Text-Guided Human Image Manipulation via Image-Text Shared Space" ☆86 Updated 3 years ago
- ☆344 Updated 7 months ago
- Large-Scale Selfie Video Dataset (L-SVD): A Benchmark for Emotion Recognition ☆306 Updated last year
- Cascade is a production-ready, high-performance, and low-latency audio stream processing library designed for Voice Activity Detection (V… ☆84 Updated last month
- Official implementation of Text2VectorSQL: Towards a Unified Interface for Vector Search and SQL Queries ☆52 Updated 2 months ago
- Improvements to animations based on Manim, designed to facilitate the demonstration of algorithms in data structures, operating systems, … ☆207 Updated last month
- [SIGGRAPH 2025 (TOG)] MyTimeMachine: Personalized Facial Age Transformation ☆50 Updated last week
- This is the project for the paper at ICCV 2025 ☆85 Updated 3 months ago
- MTLA: Multi-head Temporal Latent Attention ☆760 Updated 4 months ago