Data cleaning is a critical part of any data analysis. Without properly prepared data, the analysis will yield inaccurate results, and correcting errors later in the analysis adds to the time, effort, and cost of the project.
In this training, you’ll get a software-agnostic overview of the major steps required to clean data, with examples of the issues you are likely to encounter along the way.
Key components covered in the presentation are:
- Understanding the workflow
- How to measure data quality
- Outliers and missing data
- Data dependencies
- Eliminating duplicates
- Altering data
Note: This training is an exclusive benefit to members of the Statistically Speaking Membership Program and part of the Stat’s Amore Trainings Series. Each Stat’s Amore Training is approximately 90 minutes long.
About the Instructor
John Williams is an R expert and biostatistician at the Columbia University Vagelos College of Physicians and Surgeons. He followed a long career in software development with an M.S. in Applied Statistics from Columbia University’s Teachers College. A lifelong musician, he led a variety of bands in the 1980s and continues to perform in New York City.
Just head over and sign up for Statistically Speaking. You'll get access to this training webinar, 130+ other stats trainings, and a pathway to work through the trainings that you need, plus expert guidance to build your statistical skills through live Q&A sessions and an ask-a-mentor forum.