
This page includes a complete list of our published modules.

We’re also building a self-service tool to help you find the modules most relevant to you. Test out our prototype module discovery application, and please leave feedback to help us improve!

| Training Course | Description | Estimated Time | Collection | Coding Language | Task | Domain |
| --- | --- | --- | --- | --- | --- | --- |
| Bash: Combining Commands | This module will teach you how to combine two or more Bash commands to build more complex pipelines. | 30 min | Learn to code | Bash | | |
| Bash / Command Line 101 | This course teaches learners to navigate their computer, as well as view and edit files, from the command line using Bash. | 40 min | Learn to code | Bash | | |
| Bash: Searching and Organizing Files | This module will teach you how to use the Bash shell to search and organize your files. | 30 min | Learn to code | Bash | Data management | |
| Bash: Conditionals and Loops | This module teaches you how to write "for" loops and conditional statements in Bash. | 60 min | Learn to code | Bash | | |
| Bash: Reusable Scripts | This module will teach you how to create and use simple Bash scripts to make repetitive tasks as simple as possible. | 60 min | Learn to code | Bash | | |
| Understanding the Bias-Variance Tradeoff | The bias-variance tradeoff is a central issue in nearly all machine learning analyses. This module explains what the tradeoff is, why it matters for machine learning, and what you can do to manage it in your own analyses. | 20 min | Machine Learning, Statistics | | | |
| Citizen Science | This is an overview of citizen science for biomedical researchers. | 45 min | Intro to data science | | | |
| Research Data Management Basics | Learn the basics of research data management. | 40 min | Intro to data science | | Data management | |
| Types of Data Storage Models | This course will focus on different data storage solutions available to an end user and the unique characteristics of each type. This course will also cover how each storage type impacts one’s access to data and computing capabilities. | 30 min | Infrastructure and technology | | Data management | |
| Data Visualization in ggplot2 | This module includes code and explanations for several popular data visualizations, using R’s ggplot2 package. It also includes examples of how to modify ggplot2 plots to customize them for different uses (e.g., adhering to journal requirements for visualizations). | 60 min | Learn to code | R | Data visualization | |
| Data Visualization in Open Source Software | Introduction to principles of data visualization and typical data visualization workflows using two common open source libraries: ggplot2 and seaborn. | 20 min | | | Data visualization | |
| Data Visualization in seaborn | This module includes code and explanations for several popular data visualizations using Python’s seaborn library. It also includes examples of how to modify seaborn plots to customize them for different uses. | 60 min | Learn to code | Python | Data visualization | |
| Database Normalization | Learn about the concept of normalization and why it’s important for organizing complicated data in relational databases. | 40 min | | | Data management | EHR data |
| Demystifying Application Programming Interfaces (APIs) | Understand what an application programming interface (API) is and why APIs are useful! | 30 min | Demystifying, Infrastructure and technology | | | |
| Demystifying the Command Line Interface | Understand what the command line interface is and why it’s useful! | 15 min | Demystifying, Infrastructure and technology | | | |
| Demystifying Containers | Containers can be a useful tool for reproducible workflows and collaboration. This module describes what containers are, why a researcher might want to use them, and what your options are for implementation. | 20 min | Demystifying | | | |
| Demystifying Geospatial Data | This module is a brief introduction to geospatial (location) data. | 15 min | Demystifying | | | Geospatial data |
| Demystifying Large Language Models | Learn about large language models (LLMs) like ChatGPT. | 60 min | Demystifying, Machine Learning | | | Text data |
| Demystifying Machine Learning | An approachable and practical introduction to machine learning for biomedical researchers. | 60 min | Demystifying, Machine Learning | | | |
| Demystifying Python | This module introduces the Python programming language, explores why Python is useful in research, and describes how to download Python and Jupyter. | 20 min | Demystifying | Python | | |
| Demystifying Regular Expressions | Learn about pattern matching using regular expressions, or regex. | 30 min | Demystifying | | | Text data |
| Demystifying SQL | SQL is a relational database solution that has been around for decades. Learn more about this technology at a high level, without having to write code. | 40 min | Demystifying | | | EHR data |
| Directories and File Paths | In this module, learners will explore what a directory is and how to describe the location of a file using its file path. | 15 min | Infrastructure and technology | | | |
| Getting Started with Docker for Research | This tutorial combines a hands-on interactive Docker tutorial published by Docker Inc with an academic article outlining best practices for using Docker for research. | 60 min | Infrastructure and technology | Bash | | |
| The Elements of Maps | This is a general overview of ways that geospatial data can be communicated visually using maps. | 45 min | | | Data visualization | Geospatial data |
| Generalized Linear Regression | What is generalized linear regression (including logistic regression) and when might you need it? | 60 min | Statistics | | Data analysis | |
| Genomics Tools and Methods: Quality Control | Get started with genomics! This module walks you through how to analyze FASTQ files to assess read quality, the first step in a common genomics workflow: identifying variants among sequencing samples taken from multiple individuals within a population (variant calling). | 40 min | | Bash | | Omics data |
| Genomics Tools and Methods: Computing Setup | This module walks you through setting up your own copy of a genomics analysis AMI (Amazon Machine Image) to run genomics analyses in the cloud. | 30 min | Infrastructure and technology | Bash | | Omics data |
| Encoding Geospatial Data: Latitude and Longitude | This is an introduction to latitude and longitude and the importance of geocoding (encoding geospatial data in the coordinate system). | 15 min | | | Data visualization | Geospatial data |
| Git Command Line Interface versus Graphical User Interface | Compare the two ways of interacting with Git to decide which is best for you. | 30 min | Infrastructure and technology | | Data management | |
| Creating a Git Repository | Create a new Git repository and get started with version control. | 60 min | Learn to code | Git, Bash | | |
| Exploring the History of your Git Repository | This module will teach you how to look at past versions of your work on Git and compare your project with previous versions. | 30 min | | Git, Bash | | |
| Intro to Version Control | An introduction to what version control systems do and why you might want to use one. | 15 min | Infrastructure and technology | | Data management | |
| Setting Up Git on Mac and Linux | This module provides recommendations and examples to help new users configure Git for the first time on a Mac or Linux computer. | 15 min | Infrastructure and technology | Git | Data management | |
| Setting Up Git on Windows | This module provides recommendations and examples to help new users configure Git on their Windows computer for the first time. | 25 min | Infrastructure and technology | Git, Bash | Data management | |
| How to Troubleshoot | Learning to use technical methods like coding and version control in your research inevitably means running into problems. Learn practical methods for troubleshooting and moving past error codes and other difficulties. | 30 min | Intro to data science | | | |
| Introduction to Null Hypothesis Significance Testing | This is an introduction to NHST for biomedical researchers. | 40 min | Statistics | | Data analysis | |
| Learning to Learn Data Science | Discover how learning data science is different from learning other subjects. | 20 min | Intro to data science | | | |
| Omics Orientation | This module provides a brief introduction to omics and its associated fields. | 15 min | Demystifying | | | Omics data |
| Transform Data with pandas | This is an introduction to transforming data using a Python library named pandas. | 60 min | | Python | Data wrangling | |
| Python Basics: Exercise | Practice the skills acquired in the Python Basics sequence by working through an exercise. | 30 min | Learn to code | Python | | |
| Python Basics: Lists and Dictionaries | Learn about collection objects, specifically lists and dictionaries, in Python. | 15 min | Learn to code | Python | | |
| Python Basics: Loops and Conditionals | Learn how to use loops and conditional statements in Python. | 20 min | Learn to code | Python | | |
| Python Basics: Functions, Methods, and Variables | Learn the foundations of writing Python code, including the use of functions, methods, and variables. | 20 min | Learn to code | Python | | |
| Python Practice | Use the basics of Python coding, data transformation, and data visualization to work with real data. | 60 min | Learn to code | Python | | |
| R Basics: Introduction | Introduction to R and hands-on first steps for brand new beginners. | 60 min | Infrastructure and technology, Learn to code, Intro to data science | R | | |
| R Basics Practice | Use the basics of R coding, data transformation, and data visualization to work with real data. | 60 min | | R | Data visualization, Data wrangling | |
| R Basics: Transforming Data With dplyr | Learn how to transform (or wrangle) data using R’s dplyr package. | 60 min | Learn to code | R | Data wrangling | |
| R Basics: Visualizing Data With ggplot2 | Learn how to visualize data using R’s ggplot2 package. | 60 min | Learn to code | R | Data visualization | |
| Missing Values in R | A practical demonstration of how missing values show up in R and how to deal with them. Note that this module does not cover statistical approaches for handling missing data, but instead focuses on the code you need to find, work with, and assign missing values in R. | 45 min | Learn to code | R | Data wrangling | |
| R Practice | Use the basics of R coding, data transformation, and data visualization to work with real data. | 60 min | Learn to code | R | | |
| Reshaping Data in R: Long and Wide Data | A module that teaches how to reshape tabular data in R, concentrating on some typical shapes known as "long" and "wide" data. | 60 min | Learn to code | R | Data wrangling | |
| Summary Statistics in R | Learn how to calculate summary statistics in R and present them in a table for publication. | 30 min | Learn to code, Statistics | R | Data analysis | |
| Regular Expressions Basics | Begin to use regular expressions, or regex, for simple pattern matching. | 60 min | Learn to code | | | Text data |
| Regular Expressions: Flags, Anchors, and Boundaries | Use flags, anchors, and boundaries in regular expressions, or regex, for complex pattern matching. | 45 min | Learn to code | | | Text data |
| Regular Expressions: Groups | Use regular expressions, or regex, for complex pattern matching involving capturing and non-capturing groups. | 30 min | Learn to code | | | Text data |
| Regular Expressions: Lookaheads | Use regular expressions, or regex, for complex pattern matching involving lookaheads. | 30 min | Learn to code | | | Text data |
| Reproducibility, Generalizability, and Reuse | This module provides learners with an approachable introduction to the concepts and impact of research reproducibility, generalizability, and data reuse, and how technical approaches can help make these goals more attainable. | 60 min | Intro to data science | | | |
| SQL Basics | Structured Query Language, or SQL, is a relational database solution that has been around for decades. Learn how to write basic SQL queries on single tables through hands-on coding. | 60 min | Learn to code | SQL | Data wrangling | EHR data |
| SQL, Intermediate Level | Learn how to write intermediate SQL queries on single tables through hands-on coding. | 60 min | Learn to code | SQL | Data wrangling | EHR data |
| SQL Joins | Learn about SQL joins: what they accomplish, and how to write them. | 60 min | Learn to code | SQL | Data wrangling | EHR data |
| Statistical Tests in Open Source Software | This module provides an overview of the most commonly used kinds of statistical tests and links to code for running many of them in both R and Python. | 20 min | Statistics | R, Python | Data analysis | |
| Tidy Data | "Tidy" is a technical term in data analysis that describes an optimal way of organizing data that will be analyzed computationally. | 45 min | Intro to data science, Demystifying | | | |
| Using the REDCap API | REDCap is a research data capture tool used by many researchers in basic, translational, and clinical research efforts. Learn how to use the REDCap API in this module. | 60 min | Infrastructure and technology | R, Python | | |