google-research-datasets/LLAMA1-Test-Set

Readme badge preview -

If you own this repo, copy the snippet below and add it to your README.md

[![RelatedRepos](https://img.shields.io/badge/related-repos-yellow)](https://relatedrepos.com/gh/google-research-datasets/LLAMA1-Test-Set)

google-research-datasets / LLAMA1-Test-Set

We introduce the LLAMA1 Test Set, a comprehensive open-domain world knowledge QA dataset for evaluating question-answering systems. We prompted the open-source LLama-7B model for questions and short answers on various topics. We gathered 300 questions (with Google Cloud TTS service, voice en-US-Neural2-C), and generally verified the answers.

☆23

Alternatives and similar repositories for LLAMA1-Test-Set

Users that are interested in LLAMA1-Test-Set are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.

Sorting:

Mddct / transformer-vocos
View on GitHub
☆35Sep 6, 2025Updated 10 months ago
slp-rl / SpokenStoryCloze
View on GitHub
A spoken version of the textual story cloze benchmark
☆22Aug 6, 2023Updated 2 years ago
Mddct / usm-tokenizer
View on GitHub
semantic tokenizer for speech and music
☆20Jul 6, 2025Updated last year
3loi / NaturalVoices
View on GitHub
☆61Oct 22, 2025Updated 9 months ago
RanaCM / DSU-AVO
View on GitHub
Source code and speech samples for the DSU-AVO paper accepted to INTERSPEECH 2023
☆12May 13, 2024Updated 2 years ago
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
slp-rl / SLM-Discrete-Representations
View on GitHub
This repo contains the official PyTorch implementation of "Analyzing Discrete Self Supervised Speech Representation For Spoken Language M…
☆20Jan 3, 2023Updated 3 years ago
AI-Hypercomputer / ray-tpu
View on GitHub
☆15May 11, 2025Updated last year
Mddct / cosyvoice2-flow-optimized
View on GitHub
faster inference
☆27Jan 20, 2025Updated last year
yoongi43 / VRVQ
View on GitHub
Implementation of the paper "Variable Bitrate Residual Vector Quantization for Audio Coding"
☆11Apr 10, 2025Updated last year
york135 / CTC_CE_for_AST
View on GitHub
The official repo/implementation of the paper "Training a Singing Transcription Model Using Connectionist Temporal Classification Loss an…
☆12Mar 25, 2025Updated last year
mushanshanshan / ESLTTS
View on GitHub
ESLTTS dataset
☆16Feb 6, 2025Updated last year
thunlp / duplex-model
View on GitHub
☆48Aug 17, 2024Updated last year
0nutation / SLMTokBench
View on GitHub
SLMTokBench for paper "SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models"
☆37Aug 29, 2023Updated 2 years ago
pengzhendong / wavesurfer
View on GitHub
For audio visualization and playback in Jupyter notebooks.
☆18Nov 25, 2025Updated 7 months ago
Managed hosting for WordPress and PHP on Cloudways • Ad
Managed hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
google-research / last
View on GitHub
A JAX library for building lattice-based speech transducer models
☆48Jul 2, 2026Updated 3 weeks ago
pengzhendong / ngram-punctuator
View on GitHub
An N-gram punctuator for Chinese and English.
☆18Oct 14, 2025Updated 9 months ago
lifeiteng / NotebookTTS
View on GitHub
Text-To-Speech for NotebookLM
☆39Jul 20, 2025Updated last year
kamilakesbi / DiarizersLM
View on GitHub
☆15Jul 16, 2024Updated 2 years ago
ArenAcikgoz / Whisper-Alignment
View on GitHub
Forced alignment decoder for Whisper.
☆16Mar 13, 2024Updated 2 years ago
pengzhendong / audiolab
View on GitHub
A streaming audio reader, processor, and writer built on top of soundfile, and PyAV (bindings for FFmpeg)
☆39Mar 31, 2026Updated 3 months ago
ddlBoJack / Omni-Captioner
View on GitHub
[ICLR 2026] Data Pipeline, Models, and Benchmark for Omni-Captioner.
☆142Apr 7, 2026Updated 3 months ago
pengzhendong / streaming-tts-webui
View on GitHub
Streaming Text to Speech Web UI
☆22May 6, 2024Updated 2 years ago
primepake / dac_vae
View on GitHub
Descript Audio Codec - VAE Variant (.dac-vae): High-Fidelity Audio Compression with Variational Autoencoder
☆38Aug 30, 2025Updated 10 months ago
Deploy on Railway without the complexity - Free Credits Offer • Ad
Connect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
0nutation / SpeechGPT2.github.io
View on GitHub
☆12Jul 23, 2024Updated 2 years ago
ndkgit339 / spe-dss
View on GitHub
Speech Parameter Estimation Using Differentiable Speech Synthesizer
☆43May 9, 2023Updated 3 years ago
mct10 / RepCodec
View on GitHub
Models and code for RepCodec: A Speech Representation Codec for Speech Tokenization
☆196Jul 12, 2024Updated 2 years ago
ddlBoJack / MMAR
View on GitHub
[NeurIPS 2025] Benchmark data and code for MMAR: A Challenging Benchmark for Deep Reasoning in Speech, Audio, Music, and Their Mix
☆214Feb 25, 2026Updated 4 months ago
ictnlp / SLED-TTS
View on GitHub
Streamable Text-to-Speech model using a language modeling approach, without vector quantization
☆108May 20, 2025Updated last year
kyutai-labs / kaudio
View on GitHub
Rust crate for some audio utilities
☆32Jun 17, 2026Updated last month
zexupan / USEV
View on GitHub
☆14Jul 1, 2024Updated 2 years ago
Watchfulio / dataset-generator
View on GitHub
A new way to generate large quantities of high quality synthetic data (on par with GPT-4), with better controllability, at a fraction of …
☆23Oct 1, 2024Updated last year
MoonshotAI / Kimi-Audio-Evalkit
View on GitHub
☆169Nov 20, 2025Updated 8 months ago
Managed Kubernetes at scale on DigitalOcean • Ad
DigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
cogmhear / Intelligibility-Oriented-Audio-Visual-Speech-Enhancement
View on GitHub
Towards Intelligibility-Oriented Audio-Visual Speech Enhancement
☆15Sep 6, 2024Updated last year
P2Oileen / CitationHelper
View on GitHub
Google Scholar自搜小脚本，每次开启命令行即显示当前citation。Small Script displaying current citation count each time the shell is opened.
☆21Mar 3, 2025Updated last year
sungnyun / ARMHuBERT
View on GitHub
(Interspeech 2023 & ICASSP 2024) Official repository for ARMHuBERT and STaRHuBERT
☆41Aug 29, 2024Updated last year
lifeiteng / Aligner-SUPERB
View on GitHub
Speech-To-Text forced-alignment Speech processing Universal PERformance Benchmark
☆39May 7, 2025Updated last year
google-deepmind / librispeech-long
View on GitHub
LibriSpeech-Long is a benchmark dataset for long-form speech generation and processing. Released as part of "Long-Form Speech Generation …
☆99Dec 28, 2024Updated last year
ddlBoJack / MT4SSL
View on GitHub
[INTERSPEECH 2023 Best Paper Shortlist] Official implementation for MT4SSL: Boosting Self-Supervised Speech Representation Learning by In…
☆45Mar 25, 2024Updated 2 years ago
yl4579 / SLMGAN
View on GitHub
SLMGAN: Exploiting Speech Language Model Representations for Unsupervised Zero-Shot Voice Conversion in GANs
☆16Jul 19, 2023Updated 3 years ago