Data Exploration (DECS-922-5)
0.50 Credit
The ability to manipulate and analyze real-world data quickly and efficiently has become a basic skill. In this course we will use R, which has become a standard tool for bridging the last mile to the corporate data center, to explore the basic elements of data management and exploration. Specifically, we will discuss data acquisition and cleaning; data manipulation, including subsetting, summaries, and the rationale for long and wide formats; data normalization; visualization; basic statistical analysis; reproducibility; and reporting. We will emphasize the R packages dplyr for data manipulation and ggplot2 for visualization, and illustrate how to query SQL database servers from R. The only prerequisite is an interest in data analysis. No previous experience with R or SQL is assumed, and you should expect to acquire basic fluency in R during the course. Although it is not required, students are encouraged to come to the course with their own data projects in mind.


