1 of 11

Layout of the theoretical part

1. Data visualization of biomedical data: concepts, current challenges and misconceptions

2. Data visualization principles

2.1 Data volume

2.2 Data complexity

2.3 Data integration and tailored visualizations

3. 10 rules for better figures and common pitfalls

DataVis workshop for UCLA Collaboratory

2 of 11

Anscombe’s quartet

We cannot jump from analysis to discovery without visualization of all relevant data!

These four data sets have:

  • Nearly identical means, variances and correlation coefficients
  • Nearly the same fitted linear regression line

DataVis is a necessary and rate-limiting step for discovery
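The claim about Anscombe’s quartet can be checked numerically. The sketch below is not part of the original slides: it uses the commonly published values of the quartet (Anscombe, 1973) and a hypothetical helper, `summarize`, written in plain Python to compute the statistics named above for each data set.

```python
from math import sqrt

# Commonly published values of Anscombe's quartet (not taken from the slides).
# Sets I-III share the same x values; set IV has its own.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(x, y):
    """Hypothetical helper: summary statistics and least-squares line for one data set."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squared deviations and cross-deviations about the means.
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return {
        "mean_x": mean_x,
        "mean_y": mean_y,
        "var_x": sxx / (n - 1),   # sample variance
        "var_y": syy / (n - 1),
        "r": sxy / sqrt(sxx * syy),  # Pearson correlation coefficient
        "slope": slope,
        "intercept": mean_y - slope * mean_x,
    }


SUMMARIES = {name: summarize(x, y) for name, (x, y) in QUARTET.items()}

for name, stats in SUMMARIES.items():
    print(name, {key: round(value, 2) for key, value in stats.items()})
```

The four summaries agree closely even though the scatter plots look completely different, which is the point of the slide: the numbers alone do not reveal the structure.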

3 of 11

  • Rapid increase in volume and complexity of biomedical data requires development of new methods and practices to visualize data
  • If no changes are made, many biomedical discoveries will remain buried in data already collected, and many misdiagnoses (10-30% of all diagnoses) will remain unrecognized

Since 2000, unifying term: Data Visualization

Use of computer-aided, interactive visual representations of data to amplify cognition and accelerate discovery and communication

DataVis: why scientists need to get better at it

Scientific Visualization

Visualization of data that directly map into 2 or 3 spatial dimensions (e.g. cartography, tomography scans)

Information Visualization

Visualization of abstract data (e.g. 2-dimensional data plots, network graphs)

4 of 11

  • Relatively few scientists use data visualization resources because of the following misconceptions:
  • Misconception 1: “The goal of data visualization is to impress”

The goal of DataVis is not aesthetics, but to reveal patterns in data

  • Misconception 2: “Data visualization is easy”

Well-designed DataVis is easy to understand, but not easy to create!

  • Misconception 3: “Studying data visualization is unnecessary”

Underestimating the difficulty of DataVis can lead us to overestimate our current skills and conclude that studying it would not benefit us

  • Misconception 4: “Visualization is just a synonym for imaging”

Broader meaning! DataVis encompasses abstract data, interactive analysis, design, and visual and cognitive abilities. Its purpose is insight, not pictures!

DataVis resources are underused

5 of 11

  • How can visualization help us deal with the increasingly large volume of data sets?

  • The second option (greater data density) is better given journals’ space limits and the fact that the human eye can resolve features down to about 0.1 mm
  • Visualization research provides clear guidelines and specialist tools for visualizing large data volumes at high data density

DataVis principles - Data Volume

Get more pixels:

larger displays with higher resolution

Create visualizations with greater data density
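A rough back-of-the-envelope calculation shows how much room the 0.1 mm figure leaves for data density. The sketch below is illustrative only: the helper names and the 180 mm × 120 mm figure size are assumptions, not values from the slides or from any particular journal.

```python
def resolvable_cells(width_mm, height_mm, resolution_um=100):
    """Hypothetical helper: count distinguishable positions in a printed figure.

    resolution_um is the smallest feature the eye can resolve, in micrometres
    (100 um = 0.1 mm, the value quoted on the slide). Integer arithmetic in
    micrometres avoids floating-point rounding surprises.
    """
    columns = (width_mm * 1000) // resolution_um
    rows = (height_mm * 1000) // resolution_um
    return columns, rows, columns * rows


def equivalent_dpi(resolution_um=100):
    """Dots per inch needed to print features of the given size (1 inch = 25.4 mm)."""
    return 25400 / resolution_um


# Assumed figure size for illustration only.
columns, rows, cells = resolvable_cells(180, 120)
print(columns, rows, cells)
print(equivalent_dpi())
```

Under these assumptions a single figure offers on the order of millions of distinguishable positions, so increasing data density within the available space is a realistic alternative to larger displays.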

6 of 11

Guidelines

7 of 11

Tools

8 of 11

  • Biomedical data sets are complex: multivariate, multiscale, highly interconnected and dependent on very specific conditions
  • A common strategy is to use analytical methods to reduce dimensionality (clustering, PCA…), but we need to be careful about drawing conclusions before inspecting all relevant data!
  • Multivariate data of any dimensionality can be visualized in 2 dimensions without loss of information, but data patterns can be hard to recognize!
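The slides do not say which technique the last point refers to. One technique that fits the description is parallel coordinates, where each dimension becomes a vertical axis and each data point becomes a polyline across those axes. The sketch below assumes that technique and uses hypothetical helper names to show that the mapping to 2D can be inverted, so no information is lost.

```python
def to_polylines(rows):
    """Map n-dimensional rows to 2D polylines (parallel-coordinates style).

    Each value is min-max scaled to [0, 1] on its own axis; the axis index
    is the horizontal position. Returns the polylines plus the per-axis
    ranges needed to undo the scaling.
    """
    n_dims = len(rows[0])
    lows = [min(row[d] for row in rows) for d in range(n_dims)]
    highs = [max(row[d] for row in rows) for d in range(n_dims)]
    polylines = []
    for row in rows:
        line = []
        for d in range(n_dims):
            span = highs[d] - lows[d]
            # A constant axis has no range; place its values mid-axis.
            height = (row[d] - lows[d]) / span if span else 0.5
            line.append((d, height))
        polylines.append(line)
    return polylines, lows, highs


def from_polyline(line, lows, highs):
    """Recover the original n-dimensional point from its 2D polyline."""
    return [lows[d] + height * (highs[d] - lows[d]) for d, height in line]


# Made-up four-dimensional example data.
data = [(1.0, 10.0, 100.0, 5.0), (2.0, 30.0, 50.0, 5.0), (4.0, 20.0, 75.0, 5.0)]
polylines, lows, highs = to_polylines(data)
recovered = [from_polyline(line, lows, highs) for line in polylines]
print(polylines[0])
print(recovered[0])
```

The round trip recovers the original values, but with many rows the polylines overlap heavily, which is why patterns can still be hard to recognize.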

DataVis principles - Data Complexity

9 of 11

DataVis principles - Data Integration and Tailored Visualizations

Data patterns are hard to recognize and interpret when encoded in 2 dimensions!

We need tailored visualization!

10 of 11

  1. Know your audience
  2. Identify your message
  3. Adapt the figure to the support medium
  4. Captions are not optional
  5. Do not trust the defaults
  6. Use color effectively
  7. Do not mislead the reader
  8. Avoid “Chartjunk”
  9. Message Trumps Beauty
  10. Get the right tool

10 rules for better figures

Rougier et al., PLOS Comput. Biol. 2014

Expand on this!

https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1003833

11 of 11

  • Nature Methods one-page articles focused on specific visualization issues faced by life scientists:

Evanko D. 2013. Data visualization: a view of every Points of View column. Methagora: A Blog from Nature Methods

  • Nature Methods special issue on visualizing biological data, covering molecular biology, biomedical science and evolution

O’Donoghue SI et al. 2010. Visualizing biological data - now and in the future. Nat Methods 7:S2-4

  • Concise guide to principles and tools for creating scientific figures

Rougier NP et al. 2014. Ten simple rules for better figures. PLOS Comput. Biol. 10:e1003833

References and resources