1 of 217

Generating Images & Videos with ML

Start at 9:15pm

Lia Coleman

AI Artathon 2021 - Framed AI Art

2 of 217

Art, AI, education.

LIA COLEMAN

AI Artwork mailing list

Twitter: @lialialiacole

IG: @liacole7

Credit to Derrick Schultz!

3 of 217

4 of 217

5 of 217

MAKING

AI ART RESPONSIBLY

A FIELD GUIDE

6 of 217

I do / have done work for:

Rhode Island School of Design (RISD)

RunwayML

NeurIPS Workshop on Creativity & Design

Partnership on AI

ML effects for Polae, a 2021 Tribeca Film Fest official selection

School for Poetic Computation, Babycastles

MIT

7 of 217

THIS IS AN

INTERACTIVE CLASS.

8 of 217

WHAT I NEED FROM YOU:

ACTIVE PARTICIPATION!

We all come from different levels, and that’s awesome!

9 of 217

TODAY

Poll: Familiarity with Code, ML?

Quick Intro & Inspiration

RunwayML
Google Colab: StyleGAN
Google Colab: VQGAN + CLIP

10 of 217

INTRO TO FRAMED AI ART

ML BASICS & INSPIRATION

11 of 217

AI ART

Art that is made

using machine learning / AI

images, video, music, poetry, performance

12 of 217

13 of 217

The dataset

14 of 217

Posthuman Mobility

with Anastasiia Raina

and my RISD students

15 of 217

Derrick

Schultz

16 of 217

Esteban

Salgado

@salyaku_ai

17 of 217

Video as dataset

18 of 217

Kishi Yuma

Interview with Kishi Yuma

19 of 217

PROJECT INSPIRATION

20 of 217

FILM

DESIGN

FASHION

BOOKS

21 of 217

FILM

Welcome to Chechnya

Deepfakes for privacy protection in a documentary

22 of 217

DESIGN

Adam Pickard

23 of 217

FASHION

Robbie Barrat x Acne Studios

24 of 217

FASHION

Robbie Barrat x Acne Studios

Fall/Winter 2020

25 of 217

26 of 217

MACHINE LEARNING 101

27 of 217

28 of 217

GENERAL A.I.

Robots, Supercomputers, Fiction.

29 of 217

NARROW A.I.

Code that does one thing really well.

30 of 217

31 of 217

ML PROCESS

DATASETS

TRAINING

TESTING

32 of 217

DATASETS

33 of 217

DATASETS

34 of 217

DATASETS

35 of 217

TRAINING

36 of 217

TRAINING

37 of 217

TRAINING

38 of 217

TRAINING

3Blue1Brown Series

39 of 217

TESTING

40 of 217

TESTING

41 of 217

TESTING

42 of 217

Questions?

43 of 217

Intro to RunwayML

44 of 217

45 of 217

RUNWAY:

•“Photoshop for ML”

•GUI, no code

•Removes huge hurdles

•Interface improves functionality

46 of 217

RUNWAY ML

DATASETS

TRAINING

TESTING

47 of 217

RUNWAY:

•Pre-trained models

•Not every model

•Limited training

•Expensive to use

48 of 217

RUNWAY: ($ = USD)

•$.05 per min. testing

•$.005 per step*

•*training requires subscription ($15/m, $144/year)
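For budgeting, the rates above can be turned into a quick estimate. This is only a sketch: the function name and the example session (30 minutes of testing, 3,000 training steps) are made up for illustration, and the rates are the ones quoted on this slide, which may have changed since.

```python
# Rates quoted on the slide (USD). Treat them as a snapshot, not current pricing.
TESTING_RATE_PER_MIN = 0.05
TRAINING_RATE_PER_STEP = 0.005


def estimate_runway_cost(testing_minutes=0, training_steps=0):
    """Return the estimated usage cost in USD, excluding the subscription fee."""
    return (testing_minutes * TESTING_RATE_PER_MIN
            + training_steps * TRAINING_RATE_PER_STEP)


# Hypothetical session: 30 minutes of testing plus 3,000 training steps.
cost = estimate_runway_cost(testing_minutes=30, training_steps=3000)
print(f"Estimated usage cost: ${cost:.2f}")
```

Training also needs the subscription, so add that on top of the usage estimate.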

49 of 217

Demo: Exploring Runway

50 of 217

GROUP WORK

Explore the pre-trained models in Runway

Find at least 3 models to explore.

Write down:

What model you used
What the inputs and outputs were
What questions you have about the model

Copy a slide below & show us what you made!

51 of 217

Share Your Result Here!

Sabrina Kaune

52 of 217

eddy

Image analysis

53 of 217

eddy

Image analysis

54 of 217

Picatso

Kevin

55 of 217

Share Your Result Here!

lia

56 of 217

Share Your Result Here!

Mohammed Alali

57 of 217

Share Your Result Here!

YOUR NAME!

58 of 217

BREAK

59 of 217

2) Google Colab

with StyleGAN

60 of 217

What is Colab?

Free*
Text + Code
Dependencies incl.
Web URLs
Time Limits
Google

61 of 217

Colab Pro*

• $10/month

• need a billing address from this list of countries

• 20 hours (vs. 10)

• Higher chance of “good” GPU (P100)

62 of 217

Colab vs RunwayML

Free*
Can use audio/video
Often more control
No GUI*
More error prone/ need for debugging

63 of 217

*

Music: Magenta, Jukebox

Text: GPT-3, BERT, RNN

ALL ABOUT GANS

WITH A FOCUS ON STYLEGAN

64 of 217

WHAT IS A GAN?

Generative

Adversarial

Network

Derrick Schultz

classes

65 of 217

GENERATOR

DISCRIMINATOR

66 of 217

GENERATOR

DISCRIMINATOR

HERE’S A REAL IMAGE, I SWEAR

YEAH, THAT’S NOT A REAL IMAGE

67 of 217

GENERATOR

DISCRIMINATOR

OK HERE’S A REAL IMAGE

WAIT, MAYBE THIS ONE IS A REAL IMAGE?

68 of 217

69 of 217

70 of 217

71 of 217

72 of 217

Gene Kogan

History of GANs

73 of 217

What is a Latent Space?

74 of 217

LATENT SPACE

75 of 217

LATENT SPACE

76 of 217

LATENT SPACE

77 of 217

LATENT SPACE

78 of 217

LATENT SPACE

79 of 217

LATENT SPACE

StyleGAN uses

512 dimensions

(that’s like three dimensions, but way more)
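One way to picture a point in latent space is as a list of 512 numbers generated from a seed. The sketch below uses only Python's standard library, and the helper name `latent_from_seed` is made up for illustration. The Colab notebooks use their own random number generator, so the actual values for a given seed will differ.

```python
import random

LATENT_DIM = 512  # StyleGAN's latent size, per the slide


def latent_from_seed(seed, dim=LATENT_DIM):
    """Draw one latent vector of `dim` Gaussian values, reproducible from `seed`."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


z = latent_from_seed(42)
print(len(z))                     # 512
print(latent_from_seed(42) == z)  # True: same seed, same point in latent space
```

This is why a seed you like can be written down and reused: it always leads back to the same point, and so to the same image from a given model.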

80 of 217

INFERENCE

81 of 217

Starts in a random place, moves using Simplex noise, ends up back where it began.
Very fluid, seamless transitions
Limited control: “diameter”, number of frames

NOISE LOOP INTERPOLATION
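To see why a loop comes out seamless, here is a simplified stand-in for the idea. It does not use Simplex noise. Instead it traces a plain circle through latent space, which shares the key property of ending where it began. The function name and parameters are illustrative assumptions, not the notebook's code, and `diameter` here only scales the size of the loop.

```python
import math
import random


def circular_loop(seed, num_frames, diameter, dim=512):
    """Return `num_frames` latent vectors tracing a closed circle.

    Stand-in for a Simplex-noise loop: a circle in a random 2D plane
    around a random starting region of latent space.
    """
    rng = random.Random(seed)
    center = [rng.gauss(0, 1) for _ in range(dim)]
    axis_a = [rng.gauss(0, 1) for _ in range(dim)]
    axis_b = [rng.gauss(0, 1) for _ in range(dim)]
    radius = diameter / 2
    frames = []
    for i in range(num_frames):
        # The angle never reaches a full turn, so the frame after the last
        # one would be frame 0 again and the video loops without a jump.
        angle = 2 * math.pi * i / num_frames
        frames.append([
            c + radius * (math.cos(angle) * a + math.sin(angle) * b)
            for c, a, b in zip(center, axis_a, axis_b)
        ])
    return frames


loop = circular_loop(seed=7, num_frames=120, diameter=2.0)
print(len(loop), len(loop[0]))
```

Each of the returned vectors would be fed to the generator to render one video frame.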

82 of 217

GOOGLE COLAB + STYLEGAN

NOISE LOOPS

Open the StyleGAN Noise Loop Colab Notebook. Make a COPY for yourself.
Run the cells as-is.*
Wait 4-5 minutes. :)
Upload it to the Class Drive, then share it in the slides!

15 mins

groups

83 of 217

Beetle boys

eddy

84 of 217

Share Your Result Here!

https://drive.google.com/drive/my-drive

YOUR NAME(S)!

88 of 217

What did you notice?

What was surprising or difficult?

Questions?

89 of 217

-> STYLEGAN MASTER COLAB NB <-

*Generate Images
Animations:

*Noise Loop Interpolation
Linear Interpolation
*Flesh Digressions
Projection

* = easy!

90 of 217

Where to find pretrained StyleGAN models? Justin Pinkney’s Awesome Pretrained StyleGAN2

92 of 217

LINEAR INTERPOLATION

Moves from seed to seed, equal number of frames between each seed
Control of tween points
If first seed = last seed, it loops
Speed can feel uneven due to distance between points

93 of 217

LINEAR INTERPOLATION

94 of 217

LINEAR INTERPOLATION

Moves from seed to seed, equal number of frames between each seed.
Specify the keyframes with seeds
If loop=true, it loops.
Speed can feel uneven due to distance between points.
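The behavior described above can be sketched in a few lines. This is a minimal illustration, not the notebook's implementation: `latent_from_seed` and `linear_walk` are made-up names, and the latent vectors come from Python's standard library rather than from StyleGAN's own code.

```python
import random


def latent_from_seed(seed, dim=512):
    """Turn a seed into a reproducible latent vector (illustrative helper)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def linear_walk(seeds, frames_per_segment, loop=False):
    """Interpolate in straight lines between consecutive seeds' latent vectors."""
    keyframes = [latent_from_seed(s) for s in seeds]
    if loop:
        keyframes.append(keyframes[0])  # head back to the first seed
    frames = []
    for start, end in zip(keyframes, keyframes[1:]):
        for i in range(frames_per_segment):
            t = i / frames_per_segment  # 0.0 at `start`, approaching 1.0 at `end`
            frames.append([(1 - t) * a + t * b for a, b in zip(start, end)])
    if not loop:
        frames.append(keyframes[-1])  # finish exactly on the last keyframe
    return frames


walk = linear_walk([3, 14, 15], frames_per_segment=30, loop=True)
print(len(walk))  # 90
```

Every segment gets the same number of frames no matter how far apart its two seeds are in latent space. That is why the speed can feel uneven: distant seeds are crossed faster than nearby ones.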

95 of 217

96 of 217

FLESH DIGRESSIONS

A “GAN surgery” technique from aydao
Simultaneous circular interpolations in the latent layer and the constant layer

97 of 217

98 of 217

PROJECTION

Take an image from outside your dataset and find the closest approximation in your model

EXAMPLE: Projecting my face into the FFHQ faces pretrained model
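Projection can be thought of as a search: start from some latent vector and keep adjusting it until the generated image is as close as possible to the target. The toy sketch below shows only that search loop. The "generator" is a hypothetical three-pixel stand-in, not StyleGAN, and real projection code compares images in more sophisticated ways than plain squared pixel error.

```python
# Hypothetical stand-in for a trained generator: 2 latent values -> 3 "pixels".
WEIGHTS = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]


def toy_generator(z):
    """Map a 2-value latent vector to a 3-value 'image'."""
    return [sum(w * v for w, v in zip(row, z)) for row in WEIGHTS]


def project(target, steps=500, learning_rate=0.1):
    """Gradient descent on the latent vector to minimize squared pixel error."""
    z = [0.0, 0.0]  # arbitrary starting point in latent space
    for _ in range(steps):
        error = [g - t for g, t in zip(toy_generator(z), target)]
        # Gradient of the summed squared error with respect to each latent value.
        grad = [2 * sum(e * row[j] for e, row in zip(error, WEIGHTS))
                for j in range(len(z))]
        z = [v - learning_rate * g for v, g in zip(z, grad)]
    return z


target_image = toy_generator([0.3, -0.7])  # an image the toy model can make
recovered = project(target_image)
print([round(v, 3) for v in recovered])
```

If the target is something the model cannot produce exactly, the same loop settles on the nearest image it can make, which is the "closest approximation" the slide describes.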

99 of 217

100 of 217

PROJECTING IN LATENT SPACE

101 of 217

PROJECT INTO LATENT SPACE

102 of 217

Share Your Result Here!

YOUR NAME!

103 of 217

(TIME PERMITTING)

GROUP WORK

Make some images & animations with pretrained models in Colab with the master StyleGAN colab notebook on the next slide!

104 of 217

-> STYLEGAN MASTER COLAB NB <-

*Generate Images
Animations:

*Noise Loop Interpolation
Linear Interpolation
*Flesh Digressions
Projection

* = easy!

105 of 217

Share Your Result Here!

YOUR NAME!

108 of 217

3) Google Colab

with VQGAN + CLIP

109 of 217

VQGAN + CLIP

“demons are the powers and principalities of the air”

@rivershavewings

110 of 217

Katherine

Crowson

@rivershavewings

111 of 217

Ryan Murdock @advadnoun

“mind on fire”

“overshadowed”

112 of 217

VQGAN + CLIP

VQGAN generates the images

VQGAN’s latent space

113 of 217

VQGAN + CLIP

VQGAN generates the images
CLIP guides the path to an image that fits the text prompt

VQGAN’s latent space
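The division of labor above can be sketched as a loop: generate, score against the text, nudge the latent, repeat. Everything in this toy version is a stand-in. `toy_image_embedding` is a hypothetical replacement for "VQGAN decodes the latent, then CLIP embeds the image", the text embedding is a made-up vector, and gradients are estimated numerically, whereas real notebooks use a deep learning framework for that step.

```python
import math


def cosine_similarity(a, b):
    """Score how closely two embeddings point in the same direction (-1 to 1)."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def toy_image_embedding(z):
    """Hypothetical stand-in for: VQGAN decodes z, then CLIP embeds the image."""
    return [z[0] + z[1], z[1] - z[2], z[2]]


def guide(text_embedding, z, steps=200, learning_rate=0.1, eps=1e-5):
    """Nudge z uphill on the similarity score using finite-difference gradients."""
    for _ in range(steps):
        base = cosine_similarity(toy_image_embedding(z), text_embedding)
        grad = []
        for i in range(len(z)):
            bumped = list(z)
            bumped[i] += eps
            score = cosine_similarity(toy_image_embedding(bumped), text_embedding)
            grad.append((score - base) / eps)
        z = [v + learning_rate * g for v, g in zip(z, grad)]
    return z


text_embedding = [0.0, 1.0, 1.0]  # made-up stand-in for an embedded text prompt
start = [1.0, 0.0, 0.0]
before = cosine_similarity(toy_image_embedding(start), text_embedding)
after = cosine_similarity(toy_image_embedding(guide(text_embedding, start)), text_embedding)
print(round(before, 3), round(after, 3))
```

The score after guiding should be higher than before. In the real notebook, each step of this path is an image you can watch evolve toward the prompt.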

114 of 217

VQGAN + CLIP

VQGAN generates the images
CLIP guides the path to an image that fits the text prompt

VQGAN’s latent space

115 of 217

VQGAN + CLIP IN GOOGLE COLAB

Open the VQGAN + CLIP notebook. Make a COPY for yourself.
Upload your work to the Class Drive, then share it in the slides!

For step-by-step instructions, read this guide by @images_ai

116 of 217

“This was the text prompt”

YOUR NAME(S)!

119 of 217

What did you notice?

What was surprising or difficult?

Questions?

120 of 217

ONGOING RESOURCES

121 of 217

FUN AI GAMES

Play Which Face Is Real
Play AI Dungeon

122 of 217

HAPPY AI-ART MAKING!

LIA COLEMAN

AI Artwork mailing list

Twitter: @lialialiacole

IG: @liacole7

123 of 217

NEXT CLASS

Make a linear interpolation latent walk video.

Upload the video to the class shared Drive,

then share it in a slide.

15 mins

groups

124 of 217

LINEAR INTERPOLATION

Moves from seed to seed, equal number of frames between each seed
Control of tween points
If first seed = last seed, it loops
Speed can feel uneven due to distance between points

125 of 217

LINEAR INTERPOLATION

126 of 217

HOW TO TRAIN YOUR OWN STYLEGAN MODEL

Lecture

127 of 217

DATASETS

Lecture

128 of 217

Why Datasets?

Datasets are the creativity in Machine Learning Art.

Datasets are the hardest part in Machine Learning Art.

129 of 217

Why Datasets?

What data do you uniquely have access to?

What skill sets do you have?

130 of 217

Pre-trained models are limiting

131 of 217

Pre-trained models only cover a small set of use cases.

You’re now a data scientist and ML researcher. Congrats :)

132 of 217

DATASET DIVERSITY

133 of 217

StyleGAN

134 of 217

What makes a good dataset?

135 of 217

What makes a good dataset?

136 of 217

WAYS TO MAKE DATASETS

A. Make your own! Use your own illustrations, photography, text, or video.

B. Scrape existing media.

C. Use existing ready-made datasets.

137 of 217

A. Make your own!

Safest approach
It’s all yours! :)
Labor-intensive

Esteban Salgado

@salyaku_ai

138 of 217

B. Scraping existing media

Least risky: Use media that is Creative Commons licensed or in the public domain.

Ex: NASA*, Biodiversity Heritage Library

*for non-commercial purposes

139 of 217

B. Scraping existing media

= using someone else’s data

“To work with generative systems is to be a curator...to curate a corpus is to value the contributions inside of it...they are still artworks individually, and the people who make them are still artists.

Curating your own corpora is to be able to deal with the original creators of your corpora as humans and as collaborators, not as datapoints.”

Everest Pipkin

Corpora as medium: on the work of curating a poetic textual dataset

140 of 217

Say thank you! :)

Everest Pipkin, i've never picked a protected flower (concrete unicode poems)

B. Scraping existing media

= using someone else’s data

141 of 217

B. Scraping from the web

Download a video from YouTube, and split it into frames.
Instagram accounts or hashtags
Flickr search terms or groups
If you find other data you want to scrape, let me know and I’ll help!

142 of 217

C. Using an existing dataset

Mega list of Public Datasets

Think about:

Who made this dataset? Why?
Was the data ethically sourced?

Example: MegaFace Dataset

Public FaceRec dataset of 4.7M faces
Faces scraped by UW from Flickr users’ photo albums w/o permission
Used by Amazon, Google, Tencent, etc. for FaceRec tech

143 of 217

BEST PRACTICES

Least risky: Use your own illustrations, photography, text, or video.
If you are scraping work, prioritize work that is in the public domain, or directly ask for permission from those whose identity and/or work is represented in the dataset.
Blending the work of multiple individual artists is preferable to scraping one person’s entire body of work.
Credit the work of others whenever possible. If you are posting online, tag people and thank them!

144 of 217

HOMEWORK

Think about what your dataset could be. Post in Slack with your idea! Some approaches:

If you’re scraping images, bring some of these for next class:
IG accounts or hashtags
Flickr search terms or groups

Bring a video that you could split into frames as your dataset. This could be from YouTube, or a video that you shot/rendered yourself.

Start photographing / scanning!

145 of 217

DATASET COLLECTION DETAILS:

THINK: What data can you find in large enough quantity that is interesting to you?

Start assembling your StyleGAN dataset of 1000+ images. They will be cropped to 1024px x 1024px later. A few methods:

Collect your own images or video.

Photograph or scan real things. (tree leaves from the park, your comic book collection, etc.)
Algorithmically generate / draw 1000 images in Illustrator or with code.

Scrape images from an existing source:

Instagram, Flickr.

If you are using video as a dataset source, whether it's a video you've taken yourself or one from the internet, you will need to split the video into frames. You can use your video editor of choice, or use FFmpeg, a command-line tool, to split it into PNGs. (I find JPGs coming from FFmpeg to be low quality.)
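If you would rather script the FFmpeg step, a small Python wrapper is one option. This is a sketch under assumptions: FFmpeg must already be installed and on your PATH, and the file name `my_video.mp4`, the folder `frames`, and both function names are made up for illustration.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(video_path, output_dir):
    """Build an FFmpeg command that writes every frame as a numbered PNG."""
    pattern = str(Path(output_dir) / "frame_%05d.png")
    return ["ffmpeg", "-i", str(video_path), pattern]


def split_video(video_path, output_dir):
    """Create the output folder, then run FFmpeg (raises an error if it fails)."""
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_ffmpeg_command(video_path, output_dir), check=True)


# Nothing is executed until split_video() is called; this only shows the command.
print(" ".join(build_ffmpeg_command("my_video.mp4", "frames")))
```

Calling `split_video("my_video.mp4", "frames")` would then fill the folder with `frame_00001.png`, `frame_00002.png`, and so on. A long video produces a lot of frames, so check your disk space first.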

For more resources: Dataset Demo YouTube Playlist

146 of 217

FUN MEDIA

Listen to a Grammy-nominated AI-generated album. If you’re curious, you can watch the talk on how it was made (1 hr).
If you’re interested in AI music, play around with some Magenta demos!

147 of 217

AI makers to follow

Artists

Curator & AI art organizer

Luba Elliott

Creative AI Newsletter

ML music

Coders

Stay up to date: AI Artwork mailing list

148 of 217

BREAKOUT

Generate Images & Video from a StyleGAN model

149 of 217

BREAKOUT: GENERATE IMAGES & VIDEO FROM A STYLEGAN MODEL.

Generate new images.
From the generated images, pick out 4-5 seeds that you like. Also order them in a way that you like.
Generate a Latent Walk video of those seeds in that order. It will be saved out to a folder called ‘walk-w’. Remember to refresh the files if you don’t see it there!
Upload your result to Drive, duplicate a slide, and share your results!

150 of 217

Share Your Result Here!

https://drive.google.com/drive/my-drive

YOUR NAME(S)!

154 of 217

Image Segmentation

155 of 217

Image Segmentation

156 of 217

Image Segmentation

157 of 217

Image Segmentation

158 of 217

SPADE COCO:

Household objects

159 of 217

160 of 217

161 of 217

162 of 217

163 of 217

SPADE Landscapes:

natural landscapes

164 of 217

HANDS-ON:

Play with SPADE!

15 mins

165 of 217

Share Your Result Here!

YOUR NAMES!

168 of 217

Vector Input with StyleGAN

169 of 217

Latent Space

170 of 217

Latent Space

171 of 217

Latent Space

172 of 217

Latent Space

173 of 217

Latent Space

174 of 217

Latent Space

175 of 217

Latent Space

StyleGAN uses

512 dimensions

(that’s like three dimensions, but way more)

176 of 217

Latent Space

Runway tries to visualize this high-dimensional space with a 2D image grid using the vector input.

177 of 217

LATENT SPACE

StyleGAN uses

512 dimensions

(that’s like three dimensions, but way more)

178 of 217

IN GROUPS:

Generate images with a StyleGAN model.

15 mins

groups

179 of 217

Share Your Results Here!

YOUR NAMES!

184 of 217

INTERPOLATION/

LATENT WALKS

185 of 217

LINEAR INTERPOLATION

186 of 217

LINEAR INTERPOLATION

187 of 217

LINEAR INTERPOLATION

Moves from seed to seed, equal number of frames between each seed.
Control of tween points.
If loop=true, it loops.
Speed can feel uneven due to distance between points.

188 of 217

IN GROUPS:

Make a latent walk video.

Upload the video to the class shared Drive,

then share it in a slide.

15 mins

groups

189 of 217

Share Your Result Here!

YOUR NAMES!

195 of 217

EXPLORE RUNWAY

20 mins

groups

196 of 217

BREAK!

197 of 217

198 of 217

Hold up! Why am I even making this?

Be honest with yourself about your goals for using AI, even if you're just looking to learn and play!

Objectives?
Pros & cons of using AI?
Making “Computer Critical Computer Art”? (Sarah Groff Hennigh-Palermo)
Checking self for “Creative Savior Complex”? (Omayeli Arenyeka)

199 of 217

Checkpoint 1: Dataset

Where does the training data come from?

History and social context of data
Why the dataset was created

How diverse is the dataset?

What is/isn’t shown? How might it be skewed?
Will it create near-copies of original works?

Am I respecting data creators and subjects?

Can I get their consent? Collaboration?

200 of 217

The Internet Furry Drama Raising Big Questions About Artificial Intelligence (Gizmodo)

[Emily] For example, last year a Seattle-based programmer named Arfa created This Fursona Does Not Exist, inspired by This Person Does Not Exist. They used StyleGAN2 to train a fursona-generating model on more than 55,000 images pulled (without permission) from a furry art forum. But what exactly went into that dataset? Whose identities and labor were represented in that community forum? Within days the site had a DMCA copyright infringement complaint.

“Furry fandom has long been a close-knit community of independent creators supported by individual commissions. A project aimed at mass-producing fursonas—using original art as training material, no less—could be seen as a threat to creators’ livelihood. Some commenters accused Arfa of disrespect and asked for the choice to opt out of the project. Others complained that their work had been uploaded to e621 without their permission in the first place.”

201 of 217

DISCUSSION

5 mins

groups

202 of 217

Checkpoint 2: Model Code

Whose code are you depending on for your work?

Relationship to creators of the tools/libraries I’m using?
How was this codebase developed and labelled, and by whom?

Am I respecting the people who contributed to the model code?

What people and labor went into the code?

203 of 217

How a Teenager's Code Spawned a $432,500 Piece of Art (Wired)

204 of 217

DISCUSSION

5 mins

groups

205 of 217

Checkpoint 3: Training Resources

What are the environmental costs of my training?

What is min/max time to train?
How energy efficient are my model and configurations?
What are predicted emissions? (Machine Learning Emissions Calculator)
Transfer learning vs. training from scratch?
Can I use a pre-trained model?

206 of 217

Training a single AI model can emit as much carbon as five cars in their lifetimes (MIT Technology Review)

[Emily]

Machine learning has a carbon footprint. Emma Strubell, a PhD candidate at the University of Massachusetts, Amherst, and her colleagues measured it, pointing especially to certain inefficiencies like neural architecture search. In practice, other researchers continue to train from scratch, with many rounds of fine-tuning for publishable results. As an individual artist with a certain tolerance for glitch, you likely won’t run into the astronomical figures of these commercial and academic models, if only because using that much energy is literally expensive and requires resources just for cooling, as some of the most hardcore AI artists who train on their own GPUs around the clock are finding.

Still, as a practice to check yourself, there are tools to compute your GPU's carbon emissions. This project was created to push for more transparency in our field by including the results in your publication (research paper, blog post etc.) and help you be mindful of when you’re doing duplicative or otherwise wasteful training.
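As a back-of-the-envelope version of what such tools compute, emissions can be approximated as energy used multiplied by the carbon intensity of the local power grid. The sketch below is an assumption-laden simplification. The function name is made up, the example numbers (a 250 W GPU, 48 hours, 0.4 kg CO2 per kWh) are hypothetical, and a real calculator accounts for more factors than these three.

```python
def estimate_emissions_kg(gpu_watts, hours, kg_co2_per_kwh):
    """Energy in kWh times grid carbon intensity gives kg of CO2-equivalent."""
    energy_kwh = gpu_watts / 1000 * hours
    return energy_kwh * kg_co2_per_kwh


# Hypothetical run: a 250 W GPU training for 48 hours on a 0.4 kg CO2/kWh grid.
emissions = estimate_emissions_kg(gpu_watts=250, hours=48, kg_co2_per_kwh=0.4)
print(f"Roughly {emissions:.1f} kg CO2-equivalent")
```

Halving the training time, or starting from a pre-trained model so that fewer hours are needed, lowers the estimate proportionally.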

207 of 217

From Memo Akten’s The Unreasonable Ecological Cost of #CryptoArt

Let’s Talk Energy Usage of Generative Machine Learning (Derrick Schultz)

208 of 217

Let’s Talk Energy Usage of Generative Machine Learning (Derrick Schultz)

209 of 217

Checkpoint 4: Publishing

Who might benefit from this work?

Will I get $ or publicity from this?
How am I crediting people involved in the model/code/dataset?

What are unintended consequences of releasing my model/code/dataset?

Who might misuse my work, how, and why?
Where/how long will the model/code/dataset be stored?

How might I make my work accessible to others?

Have I documented my work thoroughly?
Are outputs accessible to people with varying access needs using video and image descriptions?

210 of 217

https://rarible.com/token/0xd07dc4262bcdbf85190c01c996b4c06a461d2430:108561:0x8ebe8b747cab6d473483aa3889870cf31c05d1c6

[Emily]

Over the last few years, former Google AI ethics leads developed documentation templates, partly to address the fact that “the machine learning community currently has no standardized process for documenting datasets, which can lead to severe consequences in high-stakes domains.” They compare this to the electronics industry’s use of datasheets: “In the electronics industry, every component, no matter how simple or complex, is accompanied with a datasheet that describes its operating characteristics, test results, recommended uses, and other information. By analogy, we propose that every dataset be accompanied with a datasheet that documents its motivation, composition, collection process, recommended uses, and so on. Datasheets for datasets will facilitate better communication between dataset creators and dataset consumers, and encourage the machine learning community to prioritize transparency and accountability.”

Model cards: a framework for transparent model reporting. Model cards are short documents accompanying trained machine learning models that provide benchmarked evaluation in a variety of conditions, such as across different cultural, demographic, or phenotypic groups (e.g., race, geographic location, sex, Fitzpatrick skin type) and intersectional groups (e.g., age and race, or sex and Fitzpatrick skin type) that are relevant to the intended application domains. Model cards also disclose the context in which models are intended to be used, details of the performance evaluation procedures, and other relevant information. While we focus primarily on human-centered machine learning models in the application fields of computer vision and natural language processing, this framework can be used to document any trained machine learning model.
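For a small personal dataset, a datasheet does not need to be elaborate. The sketch below records the four headings named in the datasheet notes above (motivation, composition, collection process, recommended uses) as a simple Python structure. The class and every example entry are hypothetical, and this is not an official datasheet template.

```python
from dataclasses import dataclass, asdict


@dataclass
class Datasheet:
    """Minimal record using the four headings named in the datasheet notes."""
    motivation: str
    composition: str
    collection_process: str
    recommended_uses: str


# Hypothetical entries for a small, self-made image dataset.
sheet = Datasheet(
    motivation="Train a StyleGAN model on my own leaf scans for an art project.",
    composition="1,200 flatbed scans of tree leaves, cropped to 1024 x 1024 px.",
    collection_process="Collected and scanned by me; no third-party images.",
    recommended_uses="Non-commercial art and teaching.",
)

for heading, text in asdict(sheet).items():
    print(f"{heading}: {text}")
```

Publishing even a short record like this alongside a model or dataset answers several of the Checkpoint 4 questions in one place.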

211 of 217

DISCUSSION

5 mins

groups

212 of 217

Best Practices

213 of 217

In Runway, other models to check out:

214 of 217

Automatic Sketch Colorization

Style2Paints

215 of 217

Rotoscoping Green Screen

216 of 217

FUN MEDIA

play.aidungeon.io

10 mins

Breakout rooms

217 of 217

IN GROUPS:

READ & DISCUSS

Article on Kishi Yuma

15 mins

groups