1 of 11

Using 🤗 Diffusers for Generating Images from Text

Suvaditya Mukherjee

(he/him)

USC School of Cinematic Arts

November 14th, 2024 (Thursday)

suvadity@usc.edu

2 of 11

Contents

  • Text-to-Image Generation
    • Problem Statement
    • Hugging Face?
    • 🤗 Diffusers
  • Hands-on Code Example
    • Using 🤗 Diffusers to generate images
    • Next steps

3 of 11

Text-to-Image Generation

An astronaut sitting on top of a horse while seeing Saturn's rings

An unicorn with a golden horn running on the Golden Gate Bridge, third-person PoV from on top of the unicorn looking at the arches of the bridge, realistic, 4k, full-sized road

A trojan warrior standing on the USC Campus holding a designer board with the text "Welcome to USC School of Cinematic Arts", realistic

4 of 11

Hugging Face

Hugging Face allows you to host models, datasets, ML apps, and more, all in one place.

Go to hf.co and create an account now.

Navigate to ‘Access Tokens’, create a new token, and save it somewhere safe.

5 of 11

🤗 diffusers

🤗 Diffusers is the go-to library for state-of-the-art pretrained diffusion models for generating images, audio, and even 3D structures of molecules.

Our library is designed with a focus on usability over performance, simple over easy, and customizability over abstractions.

6 of 11

🤗 diffusers: Getting Started

!pip install diffusers -q

import os

import torch
from diffusers import StableDiffusion3Pipeline

# Fill in your own token here. Navigate to https://hf.co, create an account, go to
# the 'Access Tokens' section, and create one there.
os.environ["HF_TOKEN"] = ""
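
Hard-coding a token in a notebook makes it easy to leak when the notebook is shared. A minimal sketch of an alternative, assuming the token was already stored in the `HF_TOKEN` environment variable before the notebook started; the helper name `require_hf_token` is hypothetical and not part of any library:

```python
import os


def require_hf_token(env_var: str = "HF_TOKEN") -> str:
    """Return the token stored in `env_var`, or raise if it is missing or empty."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(
            f"{env_var} is not set. Create a token under 'Access Tokens' on hf.co "
            "and export it before running the notebook."
        )
    return token
```

Failing early with a clear message is usually easier to debug than an authentication error in the middle of a model download.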

7 of 11

🤗 diffusers: Getting Started

pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers",
    torch_dtype=torch.float16
)

# Disregard the next 2 lines, they're meant to just help speed up the
# computation and allow the large model we're using to fit on the smaller
# GPU we have in this free environment
pipe.vae.fuse_qkv_projections()
pipe.enable_sequential_cpu_offload()
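
To see why `torch_dtype=torch.float16` helps a large model fit on a small GPU, here is a rough back-of-the-envelope sketch. It counts only weight storage (2 bytes per parameter in float16 versus 4 in float32) and ignores activations and other overhead; the 2-billion parameter count is an illustrative assumption, not a figure from these slides.

```python
def weight_memory_gib(num_params: int, bytes_per_param: int) -> float:
    """Approximate memory needed just to hold the weights, in GiB."""
    return num_params * bytes_per_param / 2**30


# Hypothetical model size, for illustration only.
NUM_PARAMS = 2_000_000_000

fp32 = weight_memory_gib(NUM_PARAMS, 4)  # float32: 4 bytes per parameter
fp16 = weight_memory_gib(NUM_PARAMS, 2)  # float16: 2 bytes per parameter
print(f"float32: {fp32:.2f} GiB, float16: {fp16:.2f} GiB")
```

Halving the bytes per parameter halves the weight footprint, which is often the difference between fitting and not fitting on a free-tier GPU.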

8 of 11

🤗 diffusers: Getting Started

prompt = "A cat holding a sign that says hello world"

image = pipe(
    prompt,
    height=512,
    width=512,
    guidance_scale=3.5,
    max_sequence_length=512,
    generator=torch.Generator("cpu").manual_seed(0)
).images[0]
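
Because the `generator` is seeded, rerunning the call above with the same seed and settings should reproduce the same image on the same setup, while a different seed gives a different variation. A minimal sketch of sweeping over seeds; `sweep_seeds` and the `generate` callable are hypothetical names, and the comment shows one way `generate` might wrap the `pipe` call from this slide:

```python
from typing import Callable, Dict, Iterable, TypeVar

T = TypeVar("T")


def sweep_seeds(
    generate: Callable[[str, int], T], prompt: str, seeds: Iterable[int]
) -> Dict[int, T]:
    """Call `generate(prompt, seed)` once per seed and collect results by seed."""
    return {seed: generate(prompt, seed) for seed in seeds}


# In the notebook, `generate` could wrap the pipeline
# (assumes `pipe` and `torch` are already defined as in the slides):
#
# def generate(prompt, seed):
#     generator = torch.Generator("cpu").manual_seed(seed)
#     return pipe(prompt, height=512, width=512, generator=generator).images[0]
#
# images = sweep_seeds(generate, "A cat holding a sign that says hello world", [0, 1, 2])
```

Keeping the results keyed by seed makes it easy to regenerate a favourite variation later.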

9 of 11

🤗 diffusers: Getting Started

image
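
In a notebook, a bare `image` on the last line of a cell displays the result inline. To keep the file, the image can also be written to disk. A minimal sketch, assuming `image` is a PIL image (the default output type of Diffusers pipelines) or any object with a compatible `save(path)` method; `save_result` and its filename scheme are hypothetical:

```python
import re
from pathlib import Path


def save_result(image, prompt: str, seed: int, out_dir: str = "outputs") -> Path:
    """Save `image` as a PNG named after the prompt and seed; return the path."""
    # Turn the prompt into a short, filesystem-safe slug.
    slug = re.sub(r"[^a-z0-9]+", "-", prompt.lower()).strip("-")[:40]
    path = Path(out_dir) / f"{slug}-seed{seed}.png"
    path.parent.mkdir(parents=True, exist_ok=True)
    image.save(path)  # PIL infers the PNG format from the file extension
    return path
```

For example, `save_result(image, prompt, 0)` would write the cat image from the previous slide into an `outputs` folder.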

10 of 11

Next Steps?

  • Possible applications?
    • Asset Generation
    • Artwork Generation
    • Bringing concepts/designs to life quickly
    • Personal “Van Gogh”
  • Optimizations?

11 of 11

Thank you!

bit.ly/sca-diffusers-slides

bit.ly/sca-diffusers-python