The ability to manipulate and analyze real-world data quickly and efficiently has become a basic skill. In this course we will use R, which has become a standard tool for bridging the last mile to the corporate data center, to explore the basic elements of data management and exploration. Specifically, we will discuss data acquisition and cleaning; data manipulation, including subsetting, summaries, and the rationale for long and wide formats; data normalization; visualization; basic statistical analysis; reproducibility; and reporting. We will emphasize the R packages dplyr for data manipulation and ggplot2 for visualization, and illustrate how to use R to query SQL data servers. The only prerequisite for the course is an interest in data analysis. No previous experience with R or SQL is assumed, and students should acquire basic fluency in R during the course. While not required, students are encouraged to come to the course with their own data projects in mind.