1 of 217

Generating Images & Videos with ML

Start at 9:15pm

Lia Coleman

AI Artathon 2021 - Framed AI Art

2 of 217

Art, AI, education.

LIA COLEMAN

3 of 217

4 of 217

5 of 217

6 of 217

I do / have done work for:

Rhode Island School of Design (RISD)

RunwayML

NeurIPS Workshop on Creativity & Design

Partnership on AI

ML effects for Polae, a 2021 Tribeca Film Fest official selection

School for Poetic Computation, Babycastles

MIT

7 of 217

THIS IS AN

INTERACTIVE CLASS.

8 of 217

WHAT I NEED FROM YOU:

  • ACTIVE PARTICIPATION!

We all come from different levels, and that’s awesome!

9 of 217

TODAY

Poll: Familiarity with Code, ML?

Quick Intro & Inspiration

  1. RunwayML
  2. Google Colab: StyleGAN
  3. Google Colab: VQGAN + CLIP

10 of 217

INTRO TO FRAMED AI ART

ML BASICS & INSPIRATION

11 of 217

AI ART

Art that is made

using machine learning / AI

images, video, music, poetry, performance

12 of 217

13 of 217

The dataset

14 of 217

Posthuman Mobility

with Anastasiia Raina

and my RISD students

15 of 217

16 of 217

17 of 217

Video as dataset

18 of 217

Kishi Yuma

Interview with Kishi Yuma

19 of 217

PROJECT INSPIRATION

20 of 217

FILM

DESIGN / FASHION

BOOKS

21 of 217

FILM

Welcome to Chechnya

Deepfakes for privacy protection in a documentary

22 of 217

23 of 217

24 of 217

FASHION

Robbie Barrat x Acne Studios

Fall/Winter 2020

25 of 217

26 of 217

MACHINE LEARNING 101

27 of 217

28 of 217

GENERAL A.I.

Robots, Supercomputers, Fiction.

29 of 217

NARROW A.I.

Code that does one thing really well.

30 of 217

31 of 217

ML PROCESS

DATASETS

TRAINING

TESTING

32 of 217

DATASETS

33 of 217

DATASETS

34 of 217

DATASETS

35 of 217

TRAINING

36 of 217

TRAINING

37 of 217

TRAINING

38 of 217

39 of 217

TESTING

40 of 217

TESTING

41 of 217

TESTING

42 of 217

Questions?

43 of 217

  1. Intro to RunwayML

44 of 217

45 of 217

RUNWAY:

•“Photoshop for ML”

•GUI, no code

•Removes huge hurdles

•Interface improves functionality

46 of 217

RUNWAY ML

DATASETS

TRAINING

TESTING

47 of 217

RUNWAY:

•Pre-trained models

•Not every model

•Limited training

•Expensive to use

48 of 217

RUNWAY: ($ = USD)

•$.05 per min. testing

•$.005 per step*

•*training requires subscription ($15/m, $144/year)

49 of 217

Demo: Exploring Runway

50 of 217

GROUP WORK

Explore the pre-trained models in Runway

Find at least 3 models to explore.

Write down:

      • What model you used
      • What the inputs and outputs were
      • What questions you have about the model

Copy a slide below & show us what you made!

51 of 217

Share Your Result Here!

Sabrina Kaune

52 of 217

eddy

Image analysis

53 of 217

eddy

Image analysis

54 of 217

Picatso

Kevin

55 of 217

Share Your Result Here!

lia

56 of 217

Share Your Result Here!

Mohammed Alali

57 of 217

Share Your Result Here!

YOUR NAME!

58 of 217

BREAK

59 of 217

2) Google Colab

with StyleGAN

60 of 217

What is Colab?

  • Free*
  • Text + Code
  • Dependencies incl.
  • Web URLs
  • Time Limits
  • Google

61 of 217

Colab Pro*

$10/month

• need a billing address from this list of countries

• 20 hours (vs. 10)

• Higher chance of a “good” GPU (P100)

62 of 217

Colab vs RunwayML

  • Free*
  • Can use audio/video
  • Often more control
  • No GUI*
  • More error-prone / more need for debugging

63 of 217

* Music: Magenta, Jukebox / Text: GPT-3, BERT, RNN

ALL ABOUT GANS

WITH A FOCUS ON STYLEGAN

64 of 217

WHAT IS A GAN?

Generative

Adversarial

Network

65 of 217

GENERATOR

DISCRIMINATOR

66 of 217

GENERATOR

DISCRIMINATOR

HERE’S A REAL IMAGE, I SWEAR

YEAH, THAT’S NOT A REAL IMAGE

67 of 217

GENERATOR

DISCRIMINATOR

OK HERE’S A REAL IMAGE

WAIT, MAYBE THIS ONE IS A REAL IMAGE?
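The generator/discriminator tug-of-war above can be written down as two tiny loss functions. This is a hedged illustration of the standard GAN objective in NumPy, not code from any particular notebook; `d_real` and `d_fake` stand for the discriminator's scores (between 0 and 1) on real and generated images:

```python
import numpy as np

def discriminator_loss(d_real, d_fake):
    # D wants real images scored near 1 and fakes near 0
    return float(-(np.log(d_real) + np.log(1.0 - d_fake)).mean())

def generator_loss(d_fake):
    # G wants its fakes scored near 1 ("that's a real image, I swear")
    return float(-np.log(d_fake).mean())
```

Training alternates: a step that lowers the discriminator's loss, then a step that lowers the generator's. When the generator fools the discriminator completely (`d_fake` near 1), its loss drops to zero.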

68 of 217

69 of 217

70 of 217

71 of 217

72 of 217

History of GANs

73 of 217

What is a Latent Space?

74 of 217

LATENT SPACE

75 of 217

LATENT SPACE

76 of 217

LATENT SPACE

77 of 217

LATENT SPACE

78 of 217

LATENT SPACE

79 of 217

LATENT SPACE

StyleGAN uses

512 dimensions

(that’s like three dimensions, but way more)
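Concretely, a "point" in this latent space is just a vector of 512 numbers, usually sampled from a normal distribution. A minimal sketch (array shapes only; no real generator is involved here):

```python
import numpy as np

rng = np.random.default_rng(42)
z = rng.standard_normal((1, 512))                       # one point in the 512-D latent space
z_neighbor = z + 0.05 * rng.standard_normal((1, 512))   # a nearby point
# Feeding z and z_neighbor to a generator would yield two similar images;
# a latent walk is just a path of such points, rendered frame by frame.
```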

80 of 217

INFERENCE

81 of 217

  • Starts in a random place, moves using Simplex noise, ends up back where it began.
  • Very fluid, seamless transitions
  • Limited control: “diameter”, number of frames

NOISE LOOP INTERPOLATION
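A loop like this can be sketched in a few lines of NumPy. This is a simplified stand-in, not the notebook's actual code: it traces a circle in a random 2-D plane of latent space (the real notebooks wander with Simplex noise), but it has the same controls, a "diameter" and a frame count, and it also ends up back where it began:

```python
import numpy as np

def latent_loop(n_frames=60, dim=512, diameter=2.0, seed=0):
    """Closed loop through latent space: a circle in a random 2-D plane."""
    rng = np.random.default_rng(seed)
    center = rng.standard_normal(dim)
    # pick a random 2-D plane through latent space (orthonormal u, v)
    u = rng.standard_normal(dim)
    u /= np.linalg.norm(u)
    v = rng.standard_normal(dim)
    v -= (v @ u) * u
    v /= np.linalg.norm(v)
    t = np.linspace(0.0, 2.0 * np.pi, n_frames, endpoint=False)
    circle = np.cos(t)[:, None] * u + np.sin(t)[:, None] * v
    return center + (diameter / 2.0) * circle  # shape: (n_frames, dim)
```

Feeding each row of the result to a generator, frame by frame, yields a seamless looping video.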

82 of 217

GOOGLE COLAB + STYLEGAN

NOISE LOOPS

  1. Open the StyleGAN Noise Loop Colab Notebook. Make a COPY for yourself.
  2. Run the cells as-is.*
  3. Wait 4-5 minutes. :)
  4. Upload it to the Class Drive, then share it in the slides!

15 mins

groups

83 of 217

Beetle boys

eddy

84 of 217

Share Your Result Here!

https://drive.google.com/drive/my-drive

YOUR NAME(S)!

85 of 217

Share Your Result Here!

https://drive.google.com/drive/my-drive

YOUR NAME(S)!

86 of 217

Share Your Result Here!

https://drive.google.com/drive/my-drive

YOUR NAME(S)!

87 of 217

Share Your Result Here!

https://drive.google.com/drive/my-drive

YOUR NAME(S)!

88 of 217

What did you notice?

What was surprising or difficult?

Questions?

89 of 217

-> STYLEGAN MASTER COLAB NB <-

  • *Generate Images
  • Animations:
    • *Noise Loop Interpolation
    • Linear Interpolation
    • *Flesh Digressions
    • Projection

* = easy!

90 of 217

Where to find pretrained StyleGAN models? Justin Pinkney’s Awesome Pretrained StyleGAN2

91 of 217

  • Starts in a random place, moves using Simplex noise, ends up back where it began.
  • Very fluid, seamless transitions
  • Limited control: “diameter”, number of frames

NOISE LOOP INTERPOLATION

92 of 217

LINEAR INTERPOLATION

  • Moves from seed to seed, equal number of frames between each seed
  • Control of tween points
  • If first seed = last seed, it loops
  • Speed can feel uneven due to distance between points

93 of 217

LINEAR INTERPOLATION

94 of 217

LINEAR INTERPOLATION

  • Moves from seed to seed, equal number of frames between each seed.
  • Specify the keyframes with seeds
  • If loop=true, it loops.
  • Speed can feel uneven due to distance between points.
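The keyframe logic above can be sketched as a short NumPy function. `linear_walk` is a hypothetical helper, not the notebook's actual function; it takes one latent vector per seed:

```python
import numpy as np

def linear_walk(keyframes, frames_per_step=30, loop=True):
    """Interpolate seed-to-seed, with an equal number of frames per segment."""
    keys = [np.asarray(k, dtype=float) for k in keyframes]
    if loop:
        keys = keys + [keys[0]]  # first seed = last seed -> the video loops
    frames = []
    for a, b in zip(keys[:-1], keys[1:]):
        for t in np.linspace(0.0, 1.0, frames_per_step, endpoint=False):
            frames.append((1.0 - t) * a + t * b)
    return np.stack(frames)
```

Because every segment gets the same number of frames no matter how far apart the keyframes are, segments between distant seeds play back faster, which is the uneven-speed effect noted above.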

95 of 217

96 of 217

FLESH DIGRESSIONS

  • A “GAN surgery” technique from aydao
  • Simultaneous circular interpolations in the latent layer and the constant layer

97 of 217

98 of 217

PROJECTION

  • Take an image from outside your dataset and find the closest approximation in your model

EXAMPLE: Projecting my face into the FFHQ faces pretrained model
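Under the hood, projection is an optimization: start from some latent vector and nudge it until the generated image matches the target. A toy sketch with a stand-in linear "generator" (real StyleGAN projection backpropagates through the full network and typically adds a perceptual loss):

```python
import numpy as np

rng = np.random.default_rng(0)
dim, pix = 16, 64
W = rng.standard_normal((pix, dim)) / np.sqrt(dim)  # stand-in linear "generator"
G = lambda z: W @ z

target = G(rng.standard_normal(dim))  # an "image" we know the model can make
z = np.zeros(dim)
for _ in range(500):
    residual = G(z) - target
    z -= 0.1 * (W.T @ residual)  # gradient step on 0.5 * ||G(z) - target||^2
# after optimization, G(z) should closely approximate the target
```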

99 of 217

100 of 217

PROJECTING IN LATENT SPACE

101 of 217

PROJECT INTO LATENT SPACE

102 of 217

Share Your Result Here!

YOUR NAME!

103 of 217

(TIME PERMITTING)

GROUP WORK

  1. Make some images & animations with pretrained models in Colab with the master StyleGAN colab notebook on the next slide!

104 of 217

-> STYLEGAN MASTER COLAB NB <-

  • *Generate Images
  • Animations:
    • *Noise Loop Interpolation
    • Linear Interpolation
    • *Flesh Digressions
    • Projection

* = easy!

105 of 217

Share Your Result Here!

YOUR NAME!

106 of 217

Share Your Result Here!

YOUR NAME!

107 of 217

Share Your Result Here!

YOUR NAME!

108 of 217

3) Google Colab

with VQGAN + CLIP

109 of 217

VQGAN + CLIP

“demons are the powers and principalities of the air”

110 of 217

111 of 217

Ryan Murdock @advadnoun

“mind on fire”

“overshadowed”

112 of 217

VQGAN + CLIP

  • VQGAN generates the images

VQGAN’s latent space

113 of 217

VQGAN + CLIP

  • VQGAN generates the images
  • CLIP guides the path to an image that fits the text prompt

VQGAN’s latent space

114 of 217

VQGAN + CLIP

  • VQGAN generates the images
  • CLIP guides the path to an image that fits the text prompt

VQGAN’s latent space
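That division of labor can be sketched as an optimization loop. Everything here is a stand-in: `W` plays the VQGAN decoder, `E` plays the CLIP image encoder, and the "prompt" is a random vector (the real system uses the pretrained networks, backprop, and CLIP's cosine similarity rather than an L2 distance):

```python
import numpy as np

rng = np.random.default_rng(1)
z_dim, img_dim, emb_dim = 32, 128, 16
W = rng.standard_normal((img_dim, z_dim)) / np.sqrt(z_dim)      # stand-in "VQGAN" decoder
E = rng.standard_normal((emb_dim, img_dim)) / np.sqrt(img_dim)  # stand-in "CLIP" image encoder
text_emb = rng.standard_normal(emb_dim)  # pretend embedding of the text prompt

A = E @ W            # latent -> image -> embedding, composed
z = np.zeros(z_dim)  # start somewhere in the latent space
for _ in range(2000):
    err = A @ z - text_emb
    z -= 0.2 * (A.T @ err)  # nudge z so the image's embedding matches the prompt's
```

The real loop has the same shape: decode the latent, embed the image, measure how well it matches the text embedding, and step the latent in the direction that improves the match.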

115 of 217

VQGAN + CLIP IN GOOGLE COLAB

For step-by-step instructions, read this guide by @images_ai

116 of 217

“This was the text prompt”

YOUR NAME(S)!

117 of 217

“This was the text prompt”

YOUR NAME(S)!

118 of 217

“This was the text prompt”

YOUR NAME(S)!

119 of 217

What did you notice?

What was surprising or difficult?

Questions?

120 of 217

ONGOING RESOURCES

121 of 217

FUN AI GAMES

        • Play Which Face Is Real
        • Play AI Dungeon

122 of 217

HAPPY AI-ART MAKING!

123 of 217

NEXT CLASS

Make a linear interpolation latent walk video.

Upload the video to the class shared Drive,

then share it in a slide.

15 mins

groups

124 of 217

LINEAR INTERPOLATION

  • Moves from seed to seed, equal number of frames between each seed
  • Control of tween points
  • If first seed = last seed, it loops
  • Speed can feel uneven due to distance between points

125 of 217

LINEAR INTERPOLATION

126 of 217

HOW TO TRAIN YOUR OWN STYLEGAN MODEL

Lecture

127 of 217

DATASETS

Lecture

128 of 217

Why Datasets?

Datasets are the creativity in Machine Learning Art.

Datasets are the hardest part in Machine Learning Art.

129 of 217

Why Datasets?

What data do you uniquely have access to?

What skill sets do you have?

130 of 217

Pre-trained models are limiting

131 of 217

Pre-trained models only cover a small set of use cases.

You’re now a data scientist and ML researcher. Congrats :)

132 of 217

DATASET DIVERSITY

133 of 217

StyleGAN

134 of 217

What makes a good dataset?

135 of 217

What makes a good dataset?

136 of 217

WAYS TO MAKE DATASETS

  A. Make your own! Use your own illustrations, photography, text, or video.

B. Scrape existing media.

C. Use existing ready-made datasets.

137 of 217

  A. Make your own!

  • Safest approach
  • It’s all yours! :)
  • Labor-intensive

138 of 217

B. Scraping existing media

Least-risky: Use media that is in the creative commons or public domain.

Ex: NASA*, Biodiversity Heritage Library

*for non-commercial purposes

139 of 217

B. Scraping existing media

= using someone else’s data

“To work with generative systems is to be a curator...to curate a corpus is to value the contributions inside of it...they are still artworks individually, and the people who make them are still artists.

Curating your own corpora is to be able to deal with the original creators of your corpora as humans and as collaborators, not as datapoints.”

Everest Pipkin

140 of 217

Say thank you! :)

B. Scraping existing media

= using someone else’s data

141 of 217

B. Scraping from the web

  • Download a video from Youtube, and split it into frames.
  • Instagram accounts or hashtags
  • Flickr search terms or groups
  • If you find other data you want to scrape, let me know and I’ll help!

142 of 217

C. Using an existing dataset

Mega list of Public Datasets

Think about:

  • Who made this dataset? Why?
  • Was the data ethically sourced?

Example: MegaFace Dataset

  • Public FaceRec dataset of 4.7M faces
  • Faces scraped by UW from Flickr users’ photo albums w/o permission
  • Used by Amazon, Google, Tencent, etc. for FaceRec tech

143 of 217

BEST PRACTICES

  • Least risky: Use your own illustrations, photography, text, or video.
  • If you are scraping work, prioritize work that is in the public domain, or directly ask for permission from those whose identity and/or work is represented in the dataset.
  • Blending the work of multiple artists is preferable to scraping one person’s entire body of work.
  • Credit the work of others whenever possible. If you are posting online, tag people and thank them!

144 of 217

HOMEWORK

Think about what your dataset could be. Post in Slack with your idea! Some approaches:

  1. If you’re scraping images, bring some of the following for next class:
    • IG accounts or hashtags
    • Flickr search terms or groups

  2. Bring a video that you could split into frames as your dataset. This could be from YouTube, or a video that you shot/rendered yourself.

  3. Start photographing / scanning!

145 of 217

DATASET COLLECTION DETAILS:

THINK: What data can you find in large enough quantity that is interesting to you?

  1. Start assembling your StyleGAN dataset of 1000+ images. They will be cropped to 1024px x 1024px later. A few methods:
    • Collect your own images or video.
      • Photograph or scan real things. (tree leaves from the park, your comic book collection, etc.)
      • Algorithmically generate / draw 1000 images in Illustrator or with code.
    • Scrape images from an existing source:
      • instagram, flickr.
  2. If you are using video as a dataset source, whether it's a video you've taken yourself or one from the internet, you will need to split the video into frames. You can use your video editor of choice, or use FFmpeg, a command-line tool, to split it into PNGs. (I find JPEGs coming from FFmpeg to be low quality.)

  3. For more resources: Dataset Demo Youtube Playlist
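For the FFmpeg route, a small helper that assembles the splitting command can be handy. `ffmpeg_split_cmd` is a hypothetical name, but the command it builds (`ffmpeg -i video.mp4 frames/frame_%05d.png`, optionally with an `fps` filter) is standard FFmpeg usage:

```python
from pathlib import Path

def ffmpeg_split_cmd(video, out_dir, fps=None):
    """Build an FFmpeg command that splits a video into numbered PNG frames."""
    cmd = ["ffmpeg", "-i", str(video)]
    if fps is not None:
        cmd += ["-vf", f"fps={fps}"]  # optional: keep only N frames per second
    cmd.append(str(Path(out_dir) / "frame_%05d.png"))
    return cmd

# e.g. run with: subprocess.run(ffmpeg_split_cmd("clip.mp4", "frames", fps=4))
```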

146 of 217

FUN MEDIA

  1. Listen to a Grammy-nominated AI-generated album. If you’re curious, you can watch the talk on how it was made (1hr)
  2. If you’re interested in AI music, play around with some Magenta demos!

147 of 217

AI makers to follow

Stay up to date- AI Artwork mailing list

148 of 217

BREAKOUT

Generate Images & Video from a StyleGAN model

149 of 217

BREAKOUT: GENERATE IMAGES & VIDEO FROM A STYLEGAN MODEL.

  • Generate new images.
  • From the generated images, pick out 4-5 seeds that you like. Also order them in a way that you like.
  • Generate a Latent Walk video of those seeds in that order. It will be saved out to a folder called ‘walk-w’. Remember to refresh the files if you don’t see it there!
  • Upload your result to Drive, duplicate a slide, and share your results!

150 of 217

Share Your Result Here!

https://drive.google.com/drive/my-drive

YOUR NAME(S)!

151 of 217

Share Your Result Here!

https://drive.google.com/drive/my-drive

YOUR NAME(S)!

152 of 217

Share Your Result Here!

https://drive.google.com/drive/my-drive

YOUR NAME(S)!

153 of 217

Share Your Result Here!

https://drive.google.com/drive/my-drive

YOUR NAME(S)!

154 of 217

Image Segmentation

155 of 217

Image Segmentation

156 of 217

Image Segmentation

157 of 217

Image Segmentation

158 of 217

SPADE COCO:

Household objects

159 of 217

160 of 217

161 of 217

162 of 217

163 of 217

SPADE Landscapes:

natural landscapes

164 of 217

HANDS-ON:

Play with SPADE!

15 mins

165 of 217

Share Your Result Here!

YOUR NAMES!

166 of 217

Share Your Result Here!

YOUR NAMES!

167 of 217

Share Your Result Here!

YOUR NAMES!

168 of 217

Vector Input with StyleGAN

169 of 217

Latent Space

170 of 217

Latent Space

171 of 217

Latent Space

172 of 217

Latent Space

173 of 217

Latent Space

174 of 217

Latent Space

175 of 217

Latent Space

StyleGAN uses

512 dimensions

(that’s like three dimensions, but way more)

176 of 217

Latent Space

Runway tries to visualize this high-dimensional space with a 2D image grid using the vector input.

177 of 217

LATENT SPACE

StyleGAN uses

512 dimensions

(that’s like three dimensions, but way more)

178 of 217

IN GROUPS:

Generate images with a StyleGAN model.

15 mins

groups

179 of 217

Share Your Results Here!

YOUR NAMES!

180 of 217

Share Your Results Here!

YOUR NAMES!

181 of 217

Share Your Results Here!

YOUR NAMES!

182 of 217

Share Your Results Here!

YOUR NAMES!

183 of 217

Share Your Results Here!

YOUR NAMES!

184 of 217

INTERPOLATION/

LATENT WALKS

185 of 217

LINEAR INTERPOLATION

186 of 217

LINEAR INTERPOLATION

187 of 217

LINEAR INTERPOLATION

  • Moves from seed to seed, equal number of frames between each seed.
  • Control of tween points.
  • If loop=true, it loops.
  • Speed can feel uneven due to distance between points.

188 of 217

IN GROUPS:

Make a latent walk video.

Upload the video to the class shared Drive,

then share it in a slide.

15 mins

groups

189 of 217

Share Your Result Here!

YOUR NAMES!

190 of 217

Share Your Result Here!

YOUR NAMES!

191 of 217

Share Your Result Here!

YOUR NAMES!

192 of 217

Share Your Result Here!

YOUR NAMES!

193 of 217

Share Your Result Here!

YOUR NAMES!

194 of 217

Share Your Result Here!

YOUR NAMES!

195 of 217

EXPLORE RUNWAY

20 mins

groups

196 of 217

BREAK!

197 of 217

198 of 217

Hold up! Why am I even making this?

Be honest with yourself about your goals for using AI, even if you're just looking to learn and play!

  • Objectives?
  • Pros & cons of using AI?
  • Making “Computer Critical Computer Art”? (Sarah Groff Hennigh-Palermo)
  • Checking self for “Creative Savior Complex”? (Omayeli Arenyeka)

199 of 217

Checkpoint 1: Dataset

Where does the training data come from?

  • History and social context of data
  • Why dataset was created

How diverse is the dataset?

  • What is/isn’t shown? How might it be skewed?
  • Will it create near-copies of original works?

Am I respecting data creators and subjects?

  • Can I get their consent? Collaboration?

200 of 217

201 of 217

DISCUSSION

5 mins

groups

202 of 217

Checkpoint 2: Model Code

Whose code are you depending on for your work?

  • Relationship to creators of the tools/libraries I’m using?
  • How was this codebase developed and labelled, and by whom?

Am I respecting the people who contributed to the model code?

  • What people and labor went into the code?

203 of 217

204 of 217

DISCUSSION

5 mins

groups

205 of 217

Checkpoint 3: Training Resources

What are the environmental costs of my training?

  • What is min/max time to train?
  • How energy efficient are my model and configurations?
  • What are predicted emissions? (Machine Learning Emissions Calculator)
  • Transfer learning vs. training from scratch?
  • Can I use a pre-trained model?

206 of 217

207 of 217

208 of 217

209 of 217

Checkpoint 4: Publishing

Who might benefit from this work?

  • Will I get $ or publicity from this?
  • How am I crediting people involved in the model/code/dataset?

What are unintended consequences of releasing my model/code/dataset?

  • Who might misuse my work, how, and why?
  • Where/how long will the model/code/dataset be stored?

How might I make my work accessible to others?

  • Have I documented my work thoroughly?
  • Are outputs accessible to people with varying access needs using video and image descriptions?

210 of 217

211 of 217

DISCUSSION

5 mins

groups

212 of 217

Best Practices

213 of 217

In Runway, other models to check out:

214 of 217

Automatic Sketch Colorization

Style2Paints

215 of 217

Rotoscoping Green Screen

216 of 217

10 mins

Breakout rooms

217 of 217

IN GROUPS:

READ & DISCUSS

Article on Kishi Yuma

15 mins

groups