Lion in a what?

Digital Media, Organizational, and Personal Development


Coursera - Reproducible Research - Week 1 - Organizing Your Analysis

Data analysis files

  • Data - raw and processed
  • Figures - exploratory and final
  • R code - raw/unused scripts, final scripts, R Markdown files
  • Text - README files, text of analysis/report
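
The layout above could be created with a few lines of base R. This is only a sketch: the folder names and the create_analysis_layout helper are assumptions made for illustration, not names given in the course.

```r
# Hypothetical folder names mirroring the four categories above.
analysis_dirs <- c(
  "data/raw", "data/processed",
  "figures/exploratory", "figures/final",
  "code/raw", "code/final", "code/rmarkdown",
  "text"
)

# Create the layout under `root`; folders that already exist are left alone.
create_analysis_layout <- function(root = ".") {
  paths <- file.path(root, analysis_dirs)
  for (p in paths) {
    dir.create(p, recursive = TRUE, showWarnings = FALSE)
  }
  invisible(paths)
}

# Example: build the layout inside a temporary directory.
project_root <- file.path(tempdir(), "my_analysis")
created <- create_analysis_layout(project_root)
```

Keeping raw and processed data in separate folders makes it harder to overwrite the raw files by accident.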

Raw Data

  • should be stored in your analysis folder
  • if accessed from the web, include the URL, a description, and the date accessed in the README
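
A minimal sketch of recording where web data came from, assuming a plain-text README. The log_raw_data helper, the example.com URL, and the description are hypothetical placeholders.

```r
# Append a provenance entry (URL, description, date accessed) to a README.
# `log_raw_data` is a hypothetical helper, not part of any package.
log_raw_data <- function(url, description, readme, accessed = Sys.Date()) {
  entry <- c(
    paste("Source URL:", url),
    paste("Description:", description),
    paste("Date accessed:", format(accessed, "%Y-%m-%d")),
    ""  # blank line separates consecutive entries
  )
  cat(paste0(entry, "\n"), file = readme, sep = "", append = TRUE)
  invisible(entry)
}

# Example with a placeholder URL. The actual download would be something like
# download.file(url, destfile = "data/raw/survey.csv")
readme_path <- tempfile(fileext = ".md")
entry <- log_raw_data(
  url = "https://example.com/survey.csv",
  description = "Hypothetical survey responses",
  readme = readme_path,
  accessed = as.Date("2014-06-01")
)
```

Logging the entry in the same script that downloads the file keeps the README and the raw data from drifting apart.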

Processed Data

  • should be named so it is easy to see which script generated the data
  • the mapping from each processing script to the processed data it produces should be recorded in the README
  • should be tidy
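
One possible naming convention, sketched in base R: give the processed file the same stem as the script that writes it. The helper names, folder names, and the script name 01_clean_survey.R are assumptions for illustration.

```r
# Derive the processed-data file name from the script that produces it,
# so the script-to-output mapping is visible in the name itself.
processed_path <- function(script, dir = "data/processed", ext = "csv") {
  stem <- tools::file_path_sans_ext(basename(script))
  file.path(dir, paste0(stem, ".", ext))
}

# Build one README line that records the same mapping.
readme_mapping <- function(script) {
  paste(script, "->", processed_path(script))
}

out <- processed_path("code/final/01_clean_survey.R")
mapping <- readme_mapping("code/final/01_clean_survey.R")
```

Here out is "data/processed/01_clean_survey.csv", and mapping is a line that could be pasted into the README.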

Exploratory figures

  • made during the course of your analysis, not necessarily part of your final report
  • do not need to be 'pretty'

Final figures

  • usually a small subset of the original figures
  • axes/colors set to make the figure clear
  • possibly multiple panels
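
As an illustration of these points, here is a sketch of a two-panel final figure in base graphics. It uses R's built-in mtcars data set; the colors, labels, and output file name are arbitrary choices, not part of the course material.

```r
# Write a two-panel figure to a PDF using base graphics and mtcars.
fig_path <- file.path(tempdir(), "final_mpg_figure.pdf")

pdf(fig_path, width = 8, height = 4)
par(mfrow = c(1, 2))  # one row, two panels

# Panel 1: labeled axes with units, points colored by cylinder count.
cyl_colors <- c("4" = "steelblue", "6" = "darkorange", "8" = "firebrick")
plot(mtcars$wt, mtcars$mpg,
     col = cyl_colors[as.character(mtcars$cyl)], pch = 19,
     xlab = "Weight (1000 lbs)", ylab = "Fuel economy (miles per gallon)",
     main = "Fuel economy vs. weight")
legend("topright", legend = names(cyl_colors), col = cyl_colors,
       pch = 19, title = "Cylinders")

# Panel 2: the same response and the same colors, grouped by cylinders.
boxplot(mpg ~ cyl, data = mtcars, col = cyl_colors,
        xlab = "Cylinders", ylab = "Fuel economy (miles per gallon)",
        main = "Fuel economy by cylinders")

invisible(dev.off())  # close the device so the file is written
```

An exploratory version of the same plot could be a bare plot(mtcars$wt, mtcars$mpg); the labels, legend, and consistent colors are what turn it into a final figure.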

Raw scripts

  • may be less commented (but comments help you!)
  • may be multiple versions
  • may include analyses that are later discarded

Final scripts

  • clearly commented: small comments used liberally (what, when, why, how), bigger comment blocks for whole sections
  • include processing details
  • only analyses that appear in the final write-up
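
A short sketch of what this commenting style might look like. The analysis itself (converting mtcars fuel economy to km/L and fitting a linear model) is invented purely to have something to comment on.

```r
## ---------------------------------------------------------------
## Section 1: Processing
## Why: this hypothetical write-up reports fuel economy in km/L.
## How: convert US miles per gallon using 1 mpg = 0.425144 km/L.
## ---------------------------------------------------------------
cars <- mtcars
cars$kpl <- cars$mpg * 0.425144  # what: new column, fuel economy in km/L

## ---------------------------------------------------------------
## Section 2: Analysis reported in the write-up
## ---------------------------------------------------------------
# what: linear model of fuel economy on weight
# why: weight is the only predictor discussed in the results
fit <- lm(kpl ~ wt, data = cars)
slope <- coef(fit)[["wt"]]  # change in km/L per 1000 lbs of weight
```

The block comments introduce each section, while the short inline comments explain individual steps, including the processing detail (the unit conversion).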

R Markdown Files

  • can be used to generate reproducible reports
  • text and R code are integrated
  • very easy to create in RStudio
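
To show how text and R code sit together, the sketch below writes a minimal R Markdown file from R. The report title, chunk name, and file name are made up, and the render step assumes the rmarkdown package and pandoc are installed.

```r
# Build a minimal R Markdown document as a character vector.
# The chunk fence is assembled with strrep() only to keep this example
# easy to embed; in an .Rmd file you would type three backticks directly.
fence <- strrep("`", 3)
rmd_lines <- c(
  "---",
  "title: \"Fuel economy report\"",
  "output: html_document",
  "---",
  "",
  "Heavier cars tend to use more fuel. The model below quantifies this.",
  "",
  paste0(fence, "{r fit-model}"),
  "fit <- lm(mpg ~ wt, data = mtcars)",
  "summary(fit)$coefficients",
  fence,
  ""
)

rmd_path <- file.path(tempdir(), "report.Rmd")
writeLines(rmd_lines, rmd_path)

# Rendering needs the rmarkdown package and pandoc (assumed installed):
# rmarkdown::render(rmd_path)
```

When the file is rendered, the chunk is executed and its output is placed in the report next to the prose, so the numbers in the text always come from the code.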

README files

  • not necessary if you use R Markdown
  • should contain step-by-step instructions for analysis
  • Example: http://github.com/jtleek/swfdr/blob/master/REAMDE.md

Text of the document

  • should include a title, introduction (motivation), methods (statistics you used), results (including measures of uncertainty), and conclusions (including potential problems)
  • should tell a story
  • should not include every analysis you performed
  • references should be included for statistical methods