The Internet allows students to access resources from all over the world. Although some instructors see this as a challenge to what they do, I believe that online education is a very important tool that complements what happens in the classroom. Online courses also allow students to engage in lifelong learning. Therefore, in addition to teaching traditional face-to-face courses, I have recently started designing online courses. These courses are more technical in nature than my traditional courses. They teach students how to use Stata, a powerful statistical software package, to analyze large data sets, a practice now commonly referred to as data analytics.
An Introduction to Stata
This is an introductory course to Stata. The course assumes no previous knowledge of the software and no statistical knowledge. The course does not teach statistics. Its goal is to teach students the basic functionality of Stata and how it can be used to analyze large data sets. The course contains two projects for students to work on. It takes a step-by-step approach to the material, in which I go through the commands one by one. In addition to the video lectures, I have included the scripts of the lectures so that students can study and revise the material without having to watch the videos. Although Stata comes with many data sets, this course uses my own data sets in order to explain to students the thought process involved in collecting data.
Visualizing Data in Stata
This course introduces students to the graphical capabilities of Stata. The course assumes only basic knowledge of data management in Stata. Students should be familiar with the graphical user interface, as well as with loading data sets into memory. The goal of this course is to teach students the logic of extracting meaning from data sets using visualization tools. This is accomplished by using a single data set from the start of the course to the very end. Students will learn how to use histograms, quantile plots, and symmetry plots, and how to use these tools to investigate whether group differences exist. The course then introduces students to bar graphs, box plots, and dot plots, and shows how these graphs can be used to study differences between groups that are divided along more than one dimension. Finally, the course shows students how to produce graphs that describe the relationship between two variables. Students are taught how to decide which type of plot is best suited to their needs. Throughout the course, students will also learn how to customize the colors and shapes used in the graphs.
Linear Regression using Stata
The course is divided into two parts. In the first part, students are introduced to the theory behind linear regression. The theory is explained in an intuitive way. No math is involved other than a few equations that use addition and subtraction. The purpose of this part of the course is for students to understand what linear regression is and when it is used. Students will learn the differences between simple linear regression and multiple linear regression. They will be able to understand the output of linear regression and to test model accuracy and assumptions. Students will also learn how to include different types of variables in the model, such as categorical variables and quadratic terms. All this theory is explained in the slides, which are made available to the students, as well as in the e-book that is freely available to students who enroll in the course.
In the second part of the course, students will learn how to apply what they learned using Stata. In this part, students will use Stata to fit multiple regression models, produce graphs that describe model fit and assumptions, and use variable-specific commands that make the output more readable. This part assumes only very basic knowledge of Stata.
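As a rough illustration of what simple linear regression computes, the sketch below fits a line by ordinary least squares and reports R-squared as one common measure of model accuracy. It is written in Python purely for illustration (the course itself uses Stata), and the function names and the small data set are hypothetical, not taken from the course materials.

```python
def fit_simple_linear_regression(x, y):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope = covariation of x and y divided by the variation of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(x, y, intercept, slope):
    """Share of the variation in y that the fitted line accounts for."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Hypothetical data: hours studied and exam scores for five students.
hours_studied = [1, 2, 3, 4, 5]
exam_score = [52, 58, 61, 67, 73]

intercept, slope = fit_simple_linear_regression(hours_studied, exam_score)
print("intercept:", round(intercept, 2))
print("slope:", round(slope, 2))
print("R-squared:", round(r_squared(hours_studied, exam_score, intercept, slope), 3))
```

The slope is read as the expected change in the outcome for a one-unit increase in the predictor. Multiple linear regression extends the same idea to several predictors at once.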
Logistic Regression Using Stata
Included in this course are an e-book and a set of slides. The course is divided into two parts. In the first part, students are introduced to the theory behind logistic regression. The theory is explained in an intuitive way, and the math is kept to a minimum. The course starts with an introduction to contingency tables, in which students learn how to calculate and interpret odds and odds ratios. From there, the course moves on to logistic regression, where students will learn when and how to use this regression technique. Topics such as model building, prediction, and assessment of model fit are covered. The course also covers diagnostics, including residuals and influential observations.
In the second part of the course, students learn how to apply what they learned using Stata. In this part, students will walk through a large project in order to understand the types of questions that arise throughout the process and which commands to use to address them.
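To make the odds and odds ratio from a contingency table concrete, here is a minimal sketch in Python (chosen only for illustration; the course itself works in Stata). The function names and the counts in the 2x2 table are hypothetical and are not drawn from the course data.

```python
def odds(events, non_events):
    """Odds = number of cases with the outcome divided by the number without it."""
    return events / non_events


def odds_ratio(exposed_events, exposed_non_events,
               unexposed_events, unexposed_non_events):
    """Odds ratio for a 2x2 table: odds in the exposed group over odds in the unexposed group."""
    return (odds(exposed_events, exposed_non_events)
            / odds(unexposed_events, unexposed_non_events))


# Hypothetical 2x2 table:
#               outcome   no outcome
# exposed          30         70
# unexposed        10         90
print("odds (exposed):", round(odds(30, 70), 3))
print("odds (unexposed):", round(odds(10, 90), 3))
print("odds ratio:", round(odds_ratio(30, 70, 10, 90), 3))
```

An odds ratio above 1 indicates higher odds of the outcome in the exposed group, and a value below 1 indicates lower odds. In a logistic regression with a single binary predictor, the exponentiated coefficient equals this same odds ratio.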
Modeling Count Data using Stata
Included in this course are an e-book and a set of slides. The course is divided into two parts. In the first part, students are introduced to the theory behind count models. The theory is explained in an intuitive way while keeping the math to a minimum. The course starts with an introduction to count tables, where students learn how to calculate the incidence-rate ratio. From there, the course moves on to Poisson regression, where students learn how to include continuous, binary, and categorical variables. Students are then introduced to the concept of overdispersion and to the use of negative binomial models to address it. Other count models, such as truncated models and zero-inflated models, are also discussed.
In the second part of the course, students learn how to apply what they have learned using Stata. In this part, students will walk through a large project in order to fit Poisson, negative binomial, and zero-inflated models. The tools used to compare these models are also introduced.
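The incidence-rate ratio and the idea of overdispersion can be sketched with a few lines of Python (used here only for illustration; the course itself works in Stata). The function names and all numbers are hypothetical. The dispersion check is a crude, unconditional one that simply compares the sample variance of the counts with their mean, which a Poisson model assumes to be equal; it is not the formal comparison between fitted models.

```python
from statistics import mean, variance


def incidence_rate_ratio(events_a, exposure_a, events_b, exposure_b):
    """Ratio of two incidence rates, where each rate is events per unit of exposure."""
    rate_a = events_a / exposure_a
    rate_b = events_b / exposure_b
    return rate_a / rate_b


def dispersion_ratio(counts):
    """Sample variance divided by the mean; values well above 1 hint at overdispersion."""
    return variance(counts) / mean(counts)


# Hypothetical count table: 40 events in 1,000 person-years versus 20 events in 2,000.
print("incidence-rate ratio:", round(incidence_rate_ratio(40, 1000, 20, 2000), 2))

# Hypothetical counts, e.g. number of doctor visits per person.
visits = [0, 0, 1, 1, 2, 3, 8, 0, 0, 5]
print("variance / mean:", round(dispersion_ratio(visits), 2))
```

When the variance is much larger than the mean, as in the hypothetical visit counts, a Poisson model tends to understate the uncertainty of its estimates, which is the situation a negative binomial model is meant to handle.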
An Introduction to Predictive Analytics for Data Scientists
Included in this course are an e-book and a set of slides. The purpose of the course is to introduce students to regression techniques. The course covers linear regression, logistic regression, and count model regression. The theory behind each of these three techniques is described in an intuitive and non-mathematical way. Students will learn when to use each technique, how to test the assumptions, how to build models, how to assess the goodness of fit of the models, and how to interpret the results. The course does not assume the use of any specific statistical software, so it should be useful to anyone intending to apply regression techniques, no matter which software they use. The course also walks students through three detailed case studies.