google-research-datasets/presto

A Multilingual Dataset for Parsing Realistic Task-Oriented Dialogs

☆116

Alternatives and similar repositories for presto

Users who are interested in presto are comparing it to the repositories listed below.

eagle705 / awesome-nlp-note
A curated list of resources dedicated to NLP (papers, blogs, notes, etc.)
☆13 · Nov 30, 2019 · Updated 6 years ago
mrcolo / longboii
☆18 · May 6, 2023 · Updated 3 years ago
tatHi / optok
☆10 · Aug 26, 2021 · Updated 4 years ago
soumik12345 / nerf.jax
A minimal TPU compatible Jax implementation of NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis.
☆13 · Apr 21, 2022 · Updated 4 years ago
dig-team / hanna-benchmark-asg
HANNA, a large annotated dataset of Human-ANnotated NArratives for ASG evaluation.
☆38 · Oct 15, 2024 · Updated last year
iwiwi / epochraft-hf-fsdp
Example of using Epochraft to train HuggingFace transformers models with PyTorch FSDP
☆11 · Jan 29, 2024 · Updated 2 years ago
MicrosoftTranslator / NTREX
NTREX -- News Test References for MT Evaluation
☆87 · Jun 5, 2024 · Updated 2 years ago
geov-ai / geov
The GeoV model is a large language model designed by Georges Harik and uses Rotary Positional Embeddings with Relative distances (RoPER).…
☆122 · Apr 29, 2023 · Updated 3 years ago
nyu-mll / ILF-for-code-generation
☆81 · Mar 24, 2025 · Updated last year
oriram / spider
☆55 · Jan 18, 2023 · Updated 3 years ago
SkunkworksAI / CodeFusion
☆14 · Oct 31, 2023 · Updated 2 years ago
luohongyin / EntST
Entailment self-training
☆27 · May 30, 2023 · Updated 3 years ago
Birch-san / booru-embed
[WIP] Transformer to embed Danbooru labelsets
☆13 · Mar 31, 2024 · Updated 2 years ago
huggingface / olm-training
Repo for training MLMs, CLMs, or T5-type models on the OLM pretraining data, but it should work with any hugging face text dataset.
☆98 · Feb 9, 2023 · Updated 3 years ago
cambridgeltl / ACL2022_tutorial_multilingual_dialogue
Materials for "Natural Language Processing for Multilingual Task-Oriented Dialogue" Tutorial at ACL 2022
☆14 · May 21, 2022 · Updated 4 years ago
LAION-AI / Anh
Anh - LAION's multilingual assistant datasets and models
☆28 · Apr 5, 2023 · Updated 3 years ago
official-elinas / zeus-llm-trainer
Zeus LLM Trainer is a rewrite of Stanford Alpaca aiming to be the trainer for all Large Language Models
☆69 · Aug 27, 2023 · Updated 2 years ago
davidsvaughn / prompt-loss-weight
code for Towards Data Science article on prompt-loss-weight
☆11 · Jun 4, 2025 · Updated last year
IBM / zero-shot-classification-boost-with-self-training
code for the paper "Zero-Shot Text Classification with Self-Training" for EMNLP 2022
☆50 · Sep 17, 2025 · Updated 10 months ago
acosharma / elita-transformer
Official Repository for Efficient Linear-Time Attention Transformers.
☆17 · Jun 2, 2024 · Updated 2 years ago
clab / knowledge
☆10 · Oct 6, 2015 · Updated 10 years ago
mobarski / alpaca-libre
Reimplementation of the task generation part from the Alpaca paper
☆118 · Apr 4, 2023 · Updated 3 years ago
apartresearch / specificityplus
👩‍💻 Code for the ACL paper "Detecting Edit Failures in LLMs: An Improved Specificity Benchmark"
☆20 · Jan 19, 2024 · Updated 2 years ago
ShorensteinCenter / Benchmarks-Program
Free, open source data science metrics for MailChimp email lists, delivered via an email report
☆21 · Dec 8, 2022 · Updated 3 years ago
swj0419 / kNN_prompt
TBC
☆28 · Nov 2, 2022 · Updated 3 years ago
kaiokendev / cutoff-len-is-context-len
Demonstration that finetuning RoPE model on larger sequences than the pre-trained model adapts the model context limit
☆62 · Jun 21, 2023 · Updated 3 years ago
salesforce / AuditNLG
AuditNLG: Auditing Generative AI Language Modeling for Trustworthiness
☆103 · Jun 2, 2026 · Updated last month
dmis-lab / TouR
Findings of ACL'2023: Optimizing Test-Time Query Representations for Dense Retrieval
☆30 · Oct 24, 2023 · Updated 2 years ago
john-hewitt / backpacks-flash-attn
The original Backpack Language Model implementation, a fork of FlashAttention
☆71 · May 29, 2023 · Updated 3 years ago
alirezamshi / small100
Implementation of "SMaLL-100: Introducing Shallow Multilingual Machine Translation Model for Low-Resource Languages" paper, accepted to E…
☆30 · Feb 8, 2023 · Updated 3 years ago
osekilab / JCoLA
☆19 · Apr 21, 2026 · Updated 3 months ago
HojiChar / HojiChar
The robust text processing pipeline framework enabling customizable, efficient, and metric-logged text preprocessing.
☆128 · Jul 17, 2026 · Updated last week
nateraw / modal-examples
Apps that run on modal.com
☆13 · Sep 14, 2025 · Updated 10 months ago
JoJo0217 / rlhf_korean_dataset
For the rlhf learning environment of Koreans
☆25 · Sep 25, 2023 · Updated 2 years ago
CarperAI / decontamination
This repository contains code for cleaning your training data of benchmark data to help combat data snooping.
☆28 · Apr 21, 2023 · Updated 3 years ago
vaguenebula / AlpacaDataReflect
An experiment to see if chatgpt can improve the output of the stanford alpaca dataset
☆12 · Mar 29, 2023 · Updated 3 years ago
Patil-Onkar / Remove-silence-from-an-audio
☆10 · Jun 30, 2022 · Updated 4 years ago
dhansmair / flamingo-mini
Implementation of the deepmind Flamingo vision-language model, based on Hugging Face language models and ready for training
☆171 · Apr 27, 2023 · Updated 3 years ago
epfml / dynamic-sparse-flash-attention
☆152 · Jun 2, 2023 · Updated 3 years ago