First major assignment

Data consideration and visualization

Sept 1 at midnight

Goals

  • So far we’ve worked through examples where you could find answers. I think of this as training wheels.

  • Now our goal is to have you work independently on actual data – ideally data that you’re invested in.

  • This is an opportunity to integrate everything we will have learned up to a week from Wednesday and unleash it on a data set.

Think of this as a take home test (rather than a research paper)

  • As such you should focus on connecting your data to what we learned.
  • The only biology background you need to provide is what we need to appreciate and responsibly interpret the research.
    • What biological question? How does the data relate to this question?
    • Experiment or observational study?
    • Consider potential for bias, non-independence, and confounds
  • Spend more time on the stats and on explaining them.
  • No need to fully analyze large, complex datasets (you can reuse them later).
  • Try to stay away from techniques / approaches we haven’t covered.

What’s a good data set?

  • Anything you care about
  • No such thing as too small or too big
  • No such thing as too simple
  • If it’s too complex, just do a bit
  • If you’ve already analyzed the data, be sure you get something out of this that you haven’t already gotten.

Sections

Data consideration and visualization

What to turn in

  • A roughly two-page presentation of your work, including two high-quality explanatory figures with figure legends.
  • A separate reflection [~1 page] on figure design, discussing the
    • Story each figure is telling
    • Why you chose the formats you did and how they follow the best practices in figure making
    • What you did to go from exploratory to explanatory figure [intermediate figures welcome]
  • The raw data set you analyzed.
  • An R script that can be used to recreate your figures and summaries.
  • RMarkdown optional
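
As a rough illustration of what a script that recreates figures and summaries might look like, here is a minimal sketch in base R. It is not a required template. The built-in ToothGrowth data set and the output file name figure1.pdf are stand-ins chosen for illustration; your own script would load your raw data file instead.

```r
# Sketch of a reproducible analysis script (base R only).
# ASSUMPTION: ToothGrowth, a data set that ships with R, stands in for your
# raw data. With your own file this line would be a read.csv() call.
dat <- ToothGrowth

# 1. Summaries reported in the write-up: mean and SD of length at each dose.
dose_mean <- tapply(dat$len, dat$dose, mean)
dose_sd   <- tapply(dat$len, dat$dose, sd)
dose_summary <- data.frame(dose     = names(dose_mean),
                           mean_len = as.vector(dose_mean),
                           sd_len   = as.vector(dose_sd))
print(dose_summary)

# 2. Figure 1, written to a file so it can be regenerated exactly.
# ASSUMPTION: a temporary folder is used here so the sketch runs anywhere;
# a real script would save the figure next to the write-up.
fig_path <- file.path(tempdir(), "figure1.pdf")
pdf(fig_path, width = 6, height = 4)
boxplot(len ~ dose, data = dat,
        xlab = "Dose (mg/day)", ylab = "Tooth length",
        main = "Tooth length by dose")
invisible(dev.off())
```

Keeping each summary and each figure in its own commented step makes it obvious which code produces which result.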

Presentation of your work

  • Introduction

  • Figures and figure legends

  • Analysis / results

  • Conclusions

Rubric: Introduction

Introduction:

  • Explain the motivation for the study. What is the key motivating question? What do you hope to learn? Why might it be interesting? What are the implications of alternative results? How were the data collected? How worried should we be about sampling bias and/or sampling error?

Full credit (10 points) Clearly states hypothesis and gives adequate but concise background beyond superficial explanation of the data / question.

Good try (5-9 points) Clearly states hypothesis and gives adequate but concise background beyond superficial explanation of the data / question, with only a few missing pieces. 

At least you tried (1-4 points) Scientific question is unclear and little to no background is provided. 

Nothing turned in (0 points)

Rubric: Figures ++

Figures

Two high-quality explanatory plots, with legends, that are appropriately described in the text and follow best practices of figure design.

Full credit (10 points): Two high-quality explanatory figures that highlight the results and follow best practices in figure making, with appropriate legends and results clearly communicated in the text.

Good try (5-9 points): Some deficiencies in these areas e.g. figure does not follow best practices, or is exploratory rather than explanatory, or unclear legend, or improper communication in text.

At least you tried (1-4 points): A large number of deficiencies in these areas.

Nothing turned in (0 points)

Rubric: Analysis / results

Full credit (10 points): Provides well-explained and relevant summary statistics (central tendency, variability, and, if relevant, association), and explains how they relate to the motivating question and how to interpret them in light of the shape of the data.

Nice job! (7-9 points): Does not consider the shape of the data, or summaries are inappropriate for the scientific question.

Ok! (4-6 points): Two mistakes from the list above.

You tried (1-3 points): More than two deficiencies from the list above.

Nothing turned in (0 points)

Analysis / results

  • Explain results and figures in narrative format and how they relate to the scientific question. Responsibly present summary statistics, making sure to bring this back to our motivating ideas.
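
As one possible way to produce the summaries named in the rubric (central tendency, variability, association, and a check on the shape of the data), here is a minimal base R sketch. The built-in faithful data set is a stand-in chosen for illustration; which summaries are appropriate depends on your own data and question.

```r
# ASSUMPTION: faithful (Old Faithful eruption data, shipped with R) stands in
# for your data. Columns: eruptions (minutes) and waiting (minutes).
dat <- faithful

# Central tendency: report both so a gap between them can flag skew.
center <- c(mean = mean(dat$eruptions), median = median(dat$eruptions))

# Variability: SD pairs with the mean, IQR pairs with the median.
spread <- c(sd = sd(dat$eruptions), iqr = IQR(dat$eruptions))

# Association between two numeric variables (Pearson correlation).
assoc <- cor(dat$eruptions, dat$waiting)

# Shape: bin counts from a histogram. Drop plot = FALSE to draw it.
bin_counts <- hist(dat$eruptions, breaks = 20, plot = FALSE)$counts

print(round(center, 2))
print(round(spread, 2))
print(round(assoc, 2))
print(bin_counts)
```

If the histogram shows skew or more than one peak, the median and IQR are usually the safer summaries, and the write-up should say why.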

Rubric: Conclusions

Conclusions

  • Explain results and figures in narrative format and how they relate to the scientific question. Responsibly present summary statistics, making sure to bring this back to our motivating ideas.

Full credit (5 points): Explains how results tie back into the question in a concise conclusion, and includes some discussion of shortcomings.

At least you tried (1-4 points): Missing shortcomings and/or does not tie results back to the scientific question.

Nothing turned in (0 points)

Rubric: Figure Reflection

Figure Reflection

Your description (outside of main text) of the thought behind your figure design, explaining what choices you made to follow best practices, and how you went from an exploratory to an explanatory figure.

Full credit (5 points): Clear communication and explanation of the decisions you made and how these decisions followed best practices, demonstrating a mastery of best practice in making explanatory figures.

Solid (3-4 points): Clear communication and explanation of the decisions you made but does not show a mastery of best practice in making explanatory figures.

Good try (1-2 points): A superficial description of figure choices (e.g. largely describing why it’s pretty rather than why it’s effective).

Nothing turned in (0 points)

Rubric: Code

Code

Your well-commented code should run with minimal effort on my computer and reproduce the results and figures you present.

Full credit (10 points): Clear code works without issue. It is obvious what code generates which results, and how it works.

Solid (7-9 points): Code works with minimal issues. All results and figures are recreated, but it’s difficult to follow.

Good try (4-6 points): Code takes some effort to work or does not all work. Some but not all of the results and figures can be reproduced.

Be better (1-3 points): Code hardly works. Very little of the work can be understood or recovered.

Nothing turned in (0 points)

Suggested timeline

  • Decide on Dataset ASAP
  • Dataset identified, with a rough go of exploratory figures and summaries, by end of class Monday.
  • Draft code and data summaries before class Wednesday. Stabilize them during class.
  • Better figures by Friday, Sept 27.
  • Due Monday, Sept 30.