LAION-AI/interesting-text-datasets


LAION-AI / interesting-text-datasets

☆47

Alternatives and similar repositories for interesting-text-datasets

Users who are interested in interesting-text-datasets are comparing it to the repositories listed below.


oaimli / PeerSum
The dataset and code for PeerSum at EMNLP'23.
☆16 · Oct 20, 2025 · Updated 9 months ago
cat-state / clip_benchmark
CLIP retrieval benchmark
☆17 · May 4, 2022 · Updated 4 years ago
rom1504 / any2dataset
Turn any collection of files into a dataset
☆45 · Mar 10, 2023 · Updated 3 years ago
ricardokleinklein / deepMultiSpeech
Deep Multi-Speech model
☆11 · Jul 25, 2018 · Updated 7 years ago
tfernd / sd-fused
A re-implementation of Stable-Diffusion using better code practices, with faster and lower-memory usage.
☆45 · Feb 8, 2023 · Updated 3 years ago
allenai / mslr-shared-task
Multidocument Summarization for Literature Review Shared Task 2022
☆30 · Oct 16, 2022 · Updated 3 years ago
technobird22 / NeoGen
A tool for generating awesome AI art
☆17 · Jul 29, 2022 · Updated 3 years ago
kaiokendev / cutoff-len-is-context-len
Demonstration that fine-tuning a RoPE model on larger sequences than the pre-trained model adapts the model context limit
☆62 · Jun 21, 2023 · Updated 3 years ago
wbrown / gpt_bpe
GPT2 Byte Pair Encoding implementation in Golang
☆25 · Jul 9, 2025 · Updated last year
crowsonkb / cloob-training
CLOOB training (JAX) and inference (JAX and PyTorch)
☆76 · May 16, 2022 · Updated 4 years ago
shinhyeokoh / rwen
☆14 · Jun 16, 2023 · Updated 3 years ago
fywalter / simptc
Code and datasets for EMNLP 2022 paper: Beyond prompting: Making Pre-trained Language Models Better Zero-shot Learners by Clustering Repr…
☆19 · Jan 1, 2024 · Updated 2 years ago
seqml / VerbalTS
☆15 · Sep 30, 2025 · Updated 9 months ago
kyleliang919 / Long-context-transformers
Exploring fine-tuning public checkpoints on filtered 8K sequences on Pile
☆116 · Mar 22, 2023 · Updated 3 years ago
shwinshaker / LipGrow
An adaptive training algorithm for residual networks
☆17 · Aug 22, 2020 · Updated 5 years ago
bprabhakar / upside-down-reinforcement-learning
PyTorch-based implementation of Upside Down Reinforcement Learning (UDRL) by J. Schmidhuber et al.
☆12 · May 1, 2020 · Updated 6 years ago
DaisukeDaisuke / AndroidPHP_old
PHP binaries for PocketMine-MP, archived
☆13 · Jan 13, 2022 · Updated 4 years ago
uvadlc / uvadlc_practicals_2022
Repository for the code assignment of the Deep Learning 1 course, Fall 2022 edition
☆20 · Dec 9, 2022 · Updated 3 years ago
cloneofsimo / efae
☆24 · Jun 18, 2024 · Updated 2 years ago
yuyay / chainer_nic
Neural Image Caption (NIC) on Chainer, with pretrained models on English and Japanese image caption datasets.
☆17 · Dec 14, 2018 · Updated 7 years ago
lucidrains / memory-editable-transformer
My explorations into editing the knowledge and memories of an attention network
☆35 · Dec 8, 2022 · Updated 3 years ago
uclnlp / APE
Adaptive Passage Encoder for Open-domain Question Answering
☆15 · Jun 1, 2021 · Updated 5 years ago
SDLAML / disco
☆16 · Dec 11, 2025 · Updated 7 months ago
jquesnelle / literAI
Generate visual podcasts about novels using open source models
☆33 · Feb 15, 2023 · Updated 3 years ago
jfma-USTC / HRDoc
Dataset and scripts for HRDoc
☆42 · Jun 21, 2023 · Updated 3 years ago
minrq / CGAN_Text2Video
Code for our IJCAI 2019 paper entitled "Conditional GAN with Discriminative Filter Generation for Text-to-Video Synthesis"
☆14 · Mar 29, 2022 · Updated 4 years ago
christophschuhmann / 4MC-4M-Image-Text-Pairs-with-CLIP-embeddings
I have created a dataset of Image-Text-Pairs by using the cosine similarity of the CLIP embeddings of the image & its caption derived f…
☆17 · Apr 22, 2021 · Updated 5 years ago
uds-lsv / anea
☆19 · Apr 28, 2021 · Updated 5 years ago
Raincleared-Song / DejaVu_predictor
Code for training a sparsity predictor on LLaMA.
☆18 · May 12, 2024 · Updated 2 years ago
apple / ml-spin
This repository contains the official implementation for the ECCV'22 paper, "SPIN: An Empirical Evaluation on Sharing Parameters of Isotr…
☆20 · Sep 9, 2023 · Updated 2 years ago
hyharryhuang / TwitterBot
Twitter Auto-reply bot
☆13 · Dec 10, 2014 · Updated 11 years ago
LAION-AI / General-GPT
☆65 · Oct 4, 2023 · Updated 2 years ago
dzryk / clip-grams
☆30 · Nov 25, 2021 · Updated 4 years ago
microsoft / un-knowledge-extraction
The goal is to pilot Microsoft Cognitive Services to unlock the strategic value of UN unstructured content by building on AI and semantic…
☆16 · Jul 6, 2023 · Updated 3 years ago
felixbur / Emofilt
Emofilt is a program to simulate emotional arousal with speech synthesis based on the free-for-non-commercial-use MBROLA synthesis engine…
☆14 · Mar 17, 2022 · Updated 4 years ago
LAION-AI / LAION-PEOPLE
This project provides a data set with bounding boxes, body poses, 3D face meshes & captions of people from our LAION-2.2B. Additionally i…
☆14 · Jan 2, 2022 · Updated 4 years ago
moirage / alignment-research-dataset
A dataset of alignment research and code to reproduce it
☆80 · Jun 22, 2023 · Updated 3 years ago
LAION-AI / medical
This repository will be a summary and outlook on all our open, medical, AI advancements.
☆30 · Feb 24, 2023 · Updated 3 years ago
wassname / rl_2d_walker.js
Teaching a humanoid to walk(ish), then displaying it in your browser (using TensorFlow.js and reinforcement learning)
☆10 · Sep 7, 2020 · Updated 5 years ago