DiffusionDB: A Large-Scale Text-to-Image Prompt Gallery Dataset

Category: Other

License: MIT (code); the dataset itself is released under CC0 1.0

Model Type: Image Generation

DiffusionDB is the first large-scale text-to-image prompt dataset. It contains 14 million images generated by Stable Diffusion from prompts and hyperparameters specified by real users. Because each image is stored alongside the prompt and settings that produced it, the dataset supports research on how textual prompts shape generated images, including prompt engineering, model interpretability, and human-AI interaction.

Key Features

Extensive Dataset: Contains 14 million images and 1.8 million unique prompts.
Rich Metadata: Includes hyperparameters such as seed, CFG scale, steps, sampler, width, and height.
Two Subsets: Offers DiffusionDB 2M (2 million images) and DiffusionDB Large (14 million images) to cater to different research needs.
Flexible Access: Provides multiple methods to access the dataset, including Hugging Face Datasets, a custom downloader script, and metadata files.
Anonymized Data: Ensures privacy by anonymizing user information and removing personal identifiers.
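
As a rough illustration of how the per-image hyperparameters listed above can be used, the sketch below models one metadata row and filters rows by steps, sampler, and image size. The class name ImageRecord, the field names, the helper filter_records, and the sample rows are all hypothetical and chosen for this example; the actual column names and values in the DiffusionDB metadata files may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageRecord:
    """One image's prompt and generation settings (field names are illustrative)."""
    prompt: str
    seed: int
    cfg_scale: float
    steps: int
    sampler: str
    width: int
    height: int


def filter_records(records, *, min_steps=0, sampler=None, size=None):
    """Return the records that satisfy every constraint that was supplied."""
    selected = []
    for record in records:
        if record.steps < min_steps:
            continue
        if sampler is not None and record.sampler != sampler:
            continue
        if size is not None and (record.width, record.height) != size:
            continue
        selected.append(record)
    return selected


# Made-up sample rows for demonstration; not taken from DiffusionDB.
SAMPLE_RECORDS = [
    ImageRecord("a watercolor fox in a forest", 1234, 7.0, 50, "k_lms", 512, 512),
    ImageRecord("a city skyline at dusk", 42, 12.5, 30, "k_euler", 512, 768),
    ImageRecord("a watercolor fox in a forest", 99, 7.0, 80, "k_lms", 512, 512),
]

if __name__ == "__main__":
    # Keep only images generated with at least 40 steps and the "k_lms" sampler.
    for record in filter_records(SAMPLE_RECORDS, min_steps=40, sampler="k_lms"):
        print(record.seed, record.prompt)
```

The same prompt appears twice in the sample rows with different seeds and step counts, which mirrors why the dataset has far more images than unique prompts. With the real metadata files, the same filtering would more likely be done on a table loaded from disk than on hand-written records.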

Similar Projects

Chat‑With‑PDF

Stable Diffusion Prompt Reader: Standalone Viewer for Image Prompts

Text-to-Music Generation App

ChatGPT Text-to-Speech Application

PaLM‑PDFChat – GUI for Chatting with Your PDFs via PaLM

LocalAI: Open-Source, Self-Hosted AI Inference Platform