These slides are available at: https://arcus.github.io/demystifying_r_rstudio_skills_series/session_2.html
Arcus is an initiative by the Research Institute aimed at promoting data discovery and reuse and increasing research reproducibility.
Among the many teams in Arcus, I represent Arcus Education!
Arcus Education provides data science training to researchers …
(and often this is useful to non-researchers too!).
https://arcus.chop.edu/i-want-to/arcus-education
Email us! arcus-education@chop.edu
Arcus Education provides “Skills Series” for the entire CHOP community.
This Skills Series is a short, 2-session series aimed at Demystifying R and RStudio!
Literate Statistical Programming
Goals:
R – Programming language for data analysis
RStudio – Integrated development environment (IDE)
Literate programming – Donald Knuth’s term for programming that is effective not just for computers but for people.
Statistical programming – when you analyze data statistically, using a programming language.
Literate statistical programming – when you create scripts (for example in R) that describe for the computer and for human readers the analysis you’re doing and why and how you’re doing each step.
We describe what we do and use headers, bullet points, and other formatting to make it easier for humans to make sense of the code.
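As a small illustration of the idea above, here is a sketch of a literate-style R script. It uses R's built-in `mtcars` dataset purely as a stand-in; the question, the variable names (`car_data`, `fit`, `slope`), and the choice of a linear model are assumptions made for this example, not part of the workshop materials.

```r
# Question (illustrative): in R's built-in mtcars data, do heavier cars
# tend to get fewer miles per gallon?

# Step 1: keep only the columns the question needs, so readers can see
# exactly what goes into the model.
car_data <- mtcars[, c("wt", "mpg")]

# Step 2: fit a simple linear regression of mpg on weight.
# In mtcars, wt is recorded in units of 1000 lbs.
fit <- lm(mpg ~ wt, data = car_data)

# Step 3: pull out the slope, the estimated change in mpg
# for each additional 1000 lbs of weight.
slope <- coef(fit)[["wt"]]
print(round(slope, 2))
```

Each comment says why the step exists, not only what the code does, which is the part future readers usually cannot recover from the code alone.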
Duke | MD Anderson |
---|---|
"1881_at" | "1882_g_at" |
"31321_at" | "31322_at" |
"31725_s_at" | "31726_at" |
"32307_r_at" | "32308_r_at" |
Do you see the off-by-one indexing error?
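To see how this kind of mistake can creep in, here is a minimal sketch in R. The probe names, the flagged row numbers, and the variable names are all invented for illustration; this is not the code from either analysis, only one plausible way a position-based lookup can slip by one.

```r
# Hypothetical identifiers, invented for this example.
probe_ids <- c("probe_A", "probe_B", "probe_C", "probe_D")

# Rows that a (hypothetical) analysis flagged as important.
important_rows <- c(2, 3)

# Intended lookup: R vectors are 1-based, so index directly.
correct <- probe_ids[important_rows]

# Off-by-one lookup: for instance, a header row was counted by mistake,
# so every position is shifted down by one.
shifted <- probe_ids[important_rows + 1]

print(correct)
print(shifted)

# One possible guard: look rows up by identifier instead of by position.
reported <- c("probe_B", "probe_C")
positions <- match(reported, probe_ids)
print(positions)
```

Nothing in the shifted result looks wrong on its own: it is still a list of valid identifiers. Matching on the identifier itself rather than on row position makes this class of error much harder to introduce.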
Off-by-one indexing error
Sensitive / resistant label reversal
Confounding in experimental design
Inclusion of data from non-reported sources
Wrong figure shown
… add up to huge patient consequences!
Your closest collaborator is you from 6 months ago…
Quarto allows you to create documents that interlace:
Which helps future-you AND your colleagues!
Five Mondays at 1 pm, in April and on the first Monday in May:
Session 2: Projects and File Ingestion
Session 3: Exploring Data Visually Using ggplot2
As you can tell from our data analysis, we like to measure our effectiveness.
Goals:
If you want, totally optional additional learning:
Module | Description | Duration |
---|---|---|
Learning to Learn Data Science | Discover how learning data science is different than learning other subjects. | 20 min |
Reproducibility, Generalizability, and Reuse | This module provides learners with an approachable introduction to the concepts and impact of research reproducibility…. | 60 min |
R Basics | Are you brand new to R, and ready to get started? This module teaches concepts and vocabulary related to R, RStudio, and R Markdown…. | 60 min |
Arcus Education, Children’s Hospital of Philadelphia