1 of 19

Exploratory Data Analysis

2 of 19

Learning Goals

  1. Visualizing RNA-seq
  2. Lunch Exercises
  3. Python Ecosystem

3 of 19

Visualizing RNA-seq

4 of 19

r4ds.had.co.nz

5 of 19

6 of 19

Lott, et al., 2011 PLoS Biology

7 of 19

Lott, et al., 2011 PLoS Biology

8 of 19

Lott, et al., 2011 PLoS Biology

9 of 19

ncbi.nlm.nih.gov/bioproject/PRJNA134445

10 of 19

$ head ~/qbb2021/data/fpkms.csv

t_name,gene_name,male_10,male_11,male_12,male_13,male_14A,male_14

FBtr0114258,CR41571,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.

FBtr0346770,CG45784,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.

FBtr0302440,CR12798,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.

FBtr0302347,CR40182,23.712564,11.967821,20.767498,13.566818,18.8

FBtr0346769,CG45783,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.

FBtr0345282,CR45220,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.

FBtr0345281,CR45220,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.

FBtr0300207,spok,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0

FBtr0113885,Parp,10.997173,0.0,0.0,0.0,14.805335000000001,0.0,17.60

FBtr0301810,Alg-2,19.067923999999998,0.0,14.832370000000001,27.79

FBtr0113895,Tim17b,6.315716,3.93539,6.244336,69.94291700000001,6

FBtr0345179,Tim17b,119.325905,11.631035,120.13729099999999,75.77

FBtr0301812,CG41128,6.421,0.0,10.366037,22.998929999999998,25.15

FBtr0113990,CG41099,26.929454999999997,0.0,0.0,28.79262,29.601716

FBtr0113991,CG41099,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.
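
The layout of the file above can be sketched with the standard library alone: two identifier columns (`t_name`, `gene_name`) followed by one FPKM column per sample. The `excerpt` string below is an assumption made for illustration. It copies three rows from the slide and keeps only the first four sample columns, because the remaining columns are cropped in the screenshot.

```python
import csv
import io

# Excerpt copied from the slide, limited to the first four sample columns
# (the rest of each row is cropped in the screenshot).
excerpt = """t_name,gene_name,male_10,male_11,male_12,male_13
FBtr0114258,CR41571,0.0,0.0,0.0,0.0
FBtr0302347,CR40182,23.712564,11.967821,20.767498,13.566818
FBtr0113895,Tim17b,6.315716,3.93539,6.244336,69.94291700000001
"""

reader = csv.reader(io.StringIO(excerpt))
header = next(reader)

# The first two columns identify the transcript; the rest are samples.
id_cols, sample_cols = header[:2], header[2:]

# Map each transcript ID to (gene name, list of FPKM values as floats).
fpkms = {}
for t_name, gene_name, *values in reader:
    fpkms[t_name] = (gene_name, [float(v) for v in values])

print(sample_cols)
print(fpkms["FBtr0302347"])
```

Several transcripts can share one `gene_name` (for example the two Tim17b rows above), so the transcript ID is the safer key.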

11 of 19

Lunch Exercises

12 of 19

Basic Exercises

gander2.wustl.edu/cgi-bin/hgTracks?db=dm6 // Lott, et al., 2011 PLoS Biology

13 of 19

Advanced Exercise

gander2.wustl.edu/cgi-bin/hgTracks?db=dm6 // Lott, et al., 2011 PLoS Biology

14 of 19

Python Ecosystem

15 of 19

speakerdeck.com/jakevdp/the-unexpected-effectiveness-of-python-in-science

16 of 19

pandas.pydata.org

17 of 19

pandas.pydata.org/docs/user_guide/10min.html

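import pandas as pd

# Load the FPKM table, using the transcript ID column as the row index
df = pd.read_csv(
    "fpkms.csv",
    index_col="t_name"
)

# (number of rows, number of columns)
df.shape

# Select all columns for one transcript of interest
goi = "FBtr0331261"
df.loc[goi, :]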
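
A self-contained variant of the same steps may be easier to experiment with, since it does not need `fpkms.csv` on disk. This is a minimal sketch under stated assumptions: the in-memory `excerpt` holds three rows and the first four sample columns copied from the `head` output shown earlier, and the transcript of interest is switched to `FBtr0302347` because the original `goi` is not among the rows visible on the slide.

```python
import io

import pandas as pd

# In-memory stand-in for fpkms.csv (assumption: rows and first four
# sample columns copied from the cropped `head` output).
excerpt = io.StringIO(
    "t_name,gene_name,male_10,male_11,male_12,male_13\n"
    "FBtr0114258,CR41571,0.0,0.0,0.0,0.0\n"
    "FBtr0302347,CR40182,23.712564,11.967821,20.767498,13.566818\n"
    "FBtr0113895,Tim17b,6.315716,3.93539,6.244336,69.94291700000001\n"
)

# t_name becomes the row index, so it is not counted as a column.
df = pd.read_csv(excerpt, index_col="t_name")
print(df.shape)

# Label-based lookup: one row, all columns (returned as a Series).
goi = "FBtr0302347"
row = df.loc[goi, :]
print(row["gene_name"])

# Drop the text column to keep only the numeric FPKM values.
expr = df.drop(columns="gene_name").loc[goi]
print(expr.idxmax(), expr.max())
```

Because `gene_name` holds strings, `df.loc[goi, :]` returns a Series of mixed type. Dropping that column first gives a purely numeric Series, which is usually what plotting and summary functions expect.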

18 of 19

matplotlib.org

19 of 19

matplotlib.org/tutorials/introductory/usage.html

import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

# plt.subplots() (plural) returns a Figure and an Axes
fig, ax = plt.subplots()
ax.plot( x, y )
ax.set_title( "y = x^2" )

plt.show()
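
The same pattern carries over to expression data. The sketch below is an illustration, not part of the original slides: it plots the four FPKM values for `FBtr0302347` (CR40182) that are visible in the cropped `head` output, with the column names used as categorical x positions. The axis labels are assumptions chosen for readability.

```python
import matplotlib.pyplot as plt

# Sample columns and FPKM values for FBtr0302347, limited to what is
# visible on the cropped slide.
samples = ["male_10", "male_11", "male_12", "male_13"]
fpkm = [23.712564, 11.967821, 20.767498, 13.566818]

fig, ax = plt.subplots()

# Strings on the x axis are treated as evenly spaced categories.
ax.plot(samples, fpkm, marker="o")

ax.set_xlabel("sample")
ax.set_ylabel("FPKM")
ax.set_title("FBtr0302347 (CR40182)")
fig.tight_layout()
```

Call `plt.show()` afterwards to open an interactive window, or `fig.savefig(...)` with a file name of your choice to write the figure to disk.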