As data are increasingly available online, data analysis has replaced data acquisition as the bottleneck to empirical research in the social sciences. By a common estimate, about 80% of the time in an empirical project is spent sourcing, cleaning, and preparing often noisy data, while the remaining 20% goes to the actual analysis. Extracting knowledge from heterogeneous datasets requires not only computational tools but also the programming skills to use them effectively.
This course introduces the computational methods needed for data generation, data manipulation, data visualization, and data reproducibility, and gives students the ability to apply them to their own projects. The course is organized in three parts. The first part introduces ways to effectively extract, load, transform, and visualize structured and unstructured data. The second and third parts focus on practical applications of data science methods in academic research and in industry, respectively.
There is increasing demand, inside and outside of academia, for the skills to analyze data effectively and to present results to a range of audiences. This makes the course equally relevant for students seeking scientific or business careers.
Prerequisites: The course is intended for students with experience working with R. Ideally, students should also have taken Statistics II.
This course is open to second-year MIA and MPP students only.