wntg/LLaMA-Omni

Reproduction of the LLaMA-Omni training code

☆72

Alternatives and similar repositories for LLaMA-Omni

Users who are interested in LLaMA-Omni compare it to the repositories listed below.

OmniMMI / OpenOmniNexus
a fully open-source implementation of a GPT-4o-like speech-to-speech video understanding model.
☆38 · Apr 7, 2025 · Updated last year
anxiangsir / Video_Benchmark_Suite
Video Benchmark Suite: Rapid Evaluation of Video Foundation Models
☆17 · Jan 10, 2025 · Updated last year
RainBowLuoCS / OpenOmni
(NIPS 2025) OpenOmni: Official implementation of Advancing Open-Source Omnimodal Large Language Models with Progressive Multimodal Align…
☆142 · May 9, 2026 · Updated 2 months ago
VITA-MLLM / Freeze-Omni
✨✨Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM
☆388 · May 27, 2025 · Updated last year
xiaoxing2001 / DeGLA
[ACM MM25] Official Pytorch implementation of [Decoupled Global-Local Alignment for Improving Compositional Understanding]
☆16 · Jul 15, 2025 · Updated last year
deepglint / Victor
ViCToR: Improving Visual Comprehension via Token Reconstruction for Pretraining LMMs
☆29 · Aug 15, 2025 · Updated 11 months ago
anxiangsir / V-SWIFT
V-SWIFT: Training a Small VideoMAE Model on a Single Machine in a Day
☆30 · Feb 5, 2025 · Updated last year
X-LANCE / SLAM-LLM
A Framework for Speech, Language, Audio, Music Processing with Large Language Model
☆1,050 · Jan 15, 2026 · Updated 6 months ago
yangdongchao / RSTnet
Real-time Speech-Text Foundation Model Toolkit (wip)
☆255 · Mar 26, 2025 · Updated last year
ictnlp / LLaMA-Omni2
☆278 · May 19, 2025 · Updated last year
linshuqing / NoteRepo-remote-github
☆25 · Oct 15, 2025 · Updated 9 months ago
BayLing-Models / BayLing-Speech
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve spee…
☆3,146 · May 19, 2025 · Updated last year
deepglint / UniDoc-RL
UniDoc-RL: Unified Document Understanding with Reinforcement Learning
☆16 · May 21, 2026 · Updated 2 months ago
SJTU-OmniAgent / VocalNet
☆123 · May 18, 2026 · Updated 2 months ago
NKU-HLT / AudioEditor
☆47 · Apr 2, 2025 · Updated last year
RWKV-Wiki / rwkv-wiki.github.io
RWKV Wiki website (archived, please visit official wiki)
☆10 · Mar 26, 2023 · Updated 3 years ago
y-ren16 / TiCodec
☆81 · Aug 11, 2025 · Updated 11 months ago
ADDchallenge / CFAD
CFAD: A Chinese Dataset for Fake Audio Detection
☆24 · Jul 3, 2023 · Updated 3 years ago
WWWWxp / Speech-Tokenizer-Papers
This repository collects papers related to Speech Tokenizer.
☆18 · Oct 16, 2024 · Updated last year
CLAD23 / CLAD
☆21 · Apr 23, 2024 · Updated 2 years ago
DDATT / Vits2-onnx-cpp
Simple inference for Vits2 TTS Using ONNXRUNTIME and espeak-ng on C++
☆19 · Apr 17, 2024 · Updated 2 years ago
whn09 / VITA
✨✨VITA: Towards Open-Source Interactive Omni Multimodal LLM
☆11 · Jun 16, 2025 · Updated last year
MuSAELab / AUDDT
A toolkit for benchmarking on a wide variety of audio deepfake datasets.
☆36 · May 22, 2026 · Updated 2 months ago
gpt-omni / mini-omni
open-source multimodal large language model that can hear, talk while thinking. Featuring real-time end-to-end speech input and streaming…
☆3,565 · Nov 5, 2024 · Updated last year
Prunoideae / rwkv-contrib
☆10 · Aug 18, 2023 · Updated 2 years ago
y-ren16 / OV-InstructTTS
☆22 · Jan 27, 2026 · Updated 6 months ago
seanghay / vits.cpp
VITS Inference using ONNX Runtime on C++
☆13 · Dec 25, 2023 · Updated 2 years ago
futuredialchallenge / 2024-RAG
A Challenge on Dialog Systems with Retrieval Augmented Generation (FutureDial-RAG), Co-located with SLT2024 FutureDial-RAG Challenge
☆11 · Aug 10, 2024 · Updated last year
LCF2764 / autoKWS2021_1st_solution
Auto-KWS 2021 Challenge 1st place solution.
☆11 · Jul 20, 2021 · Updated 5 years ago
MikaStars39 / StableMask
PyTorch implementation of StableMask (ICML'24)
☆15 · Jun 27, 2024 · Updated 2 years ago
xuchennlp / S2T
The project for speech translation
☆12 · Sep 28, 2023 · Updated 2 years ago
chenpk00 / IS2024_stream_decoder_only_asr
☆16 · Mar 12, 2024 · Updated 2 years ago
Ruiqi-Yan / URO-Bench
Towards Comprehensive Evaluation for End-to-End Spoken Dialogue Models
☆55 · Sep 2, 2025 · Updated 10 months ago
hammaad2002 / ASRAdversarialAttacks
An ASR (Automatic Speech Recognition) adversarial attack repository.
☆44 · Nov 7, 2023 · Updated 2 years ago
kyutai-labs / moshi-finetune
☆475 · Oct 3, 2025 · Updated 9 months ago
MatthewCYM / VoiceBench
[TACL'26] VoiceBench: Benchmarking LLM-Based Voice Assistants
☆378 · Jun 11, 2026 · Updated last month
xieyuankun / Codecfake
This is the official repo of our work titled "The Codecfake Dataset and Countermeasures for the Universally Detection of Deepfake Audio".
☆76 · Dec 13, 2024 · Updated last year
hwang-cs-ime / ATSS
This is the official code for "ATSS: Detecting AI-Generated Videos via Anomalous Temporal Self-Similarity"
☆17 · Apr 7, 2026 · Updated 3 months ago
AdvSV / AdvSV.github.io
AdvSV stands as the first dataset developed specifically for evaluating Speaker Verification (SV) systems against adversarial attacks. I…
☆11 · Nov 21, 2023 · Updated 2 years ago