Data Science Skills: Data Visualization
A picture is worth 1000 words.
This is the second article in a series about the skills required to be a successful data scientist. One of the most basic is representing data visually. The same concepts and principles apply whether you program in R, Python, or another language, and whether your tool is Canva, PowerPoint, Tableau, Power BI, or something else.
What is the 20 percent of this skill that will get you 80 percent of the results you need?
Familiarity with basic plotting packages: If you program in R, start by learning the basics of ggplot2 and base R graphics packages. These two packages alone will allow you to create a wide variety of visualizations.
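As a rough illustration of how the two approaches compare, here is a minimal sketch that draws the same scatter plot with ggplot2 and with base R graphics. It assumes ggplot2 is installed and uses the mtcars dataset that ships with R. The choice of variables and the axis labels are illustrative only.

```r
library(ggplot2)

# mtcars ships with R, so no external data is assumed here.
# ggplot2 version: map columns to aesthetics, then add a point layer.
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(
    x = "Weight (1000 lbs)",
    y = "Miles per gallon",
    title = "Fuel economy vs. weight"
  )
print(p)

# Base R version of the same plot: pass the vectors directly to plot().
plot(
  mtcars$wt, mtcars$mpg,
  xlab = "Weight (1000 lbs)",
  ylab = "Miles per gallon",
  main = "Fuel economy vs. weight"
)
```

The ggplot2 version builds the plot as an object (p) that you can extend with more layers later, while the base R call draws immediately. Either is a reasonable starting point.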
Data wrangling skills: Before you can create effective visualizations, you need to be able to prepare and manipulate your data. If you program with R, learn the basics of data wrangling using packages like dplyr and tidyr.
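To make this concrete, here is a small sketch of two wrangling steps that often come before plotting: summarising by group with dplyr and reshaping from wide to long with tidyr. It assumes both packages are installed and again uses the built-in mtcars dataset. The column names mean_mpg, variable, and value are chosen for illustration.

```r
library(dplyr)
library(tidyr)

# Summarise: average fuel economy and car count per cylinder group.
mpg_by_cyl <- mtcars %>%
  group_by(cyl) %>%
  summarise(mean_mpg = mean(mpg), n = n())

# Reshape: turn three numeric columns into long format
# (one row per car-variable pair), which is the shape
# ggplot2 facets and grouped plots usually expect.
long <- mtcars %>%
  select(mpg, hp, wt) %>%
  pivot_longer(
    cols = everything(),
    names_to = "variable",
    values_to = "value"
  )

print(mpg_by_cyl)
print(head(long))
```

A summary table like mpg_by_cyl feeds naturally into a bar chart, and a long table like long is convenient when you want one panel per variable.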
Understanding data types: It’s important to understand the different types of data (e.g., categorical, ordinal, continuous) and how each can be represented visually. This knowledge will help you choose the appropriate type of plot for your data.
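One way to see how data type drives plot choice is to encode the types explicitly in R and pick a plot for each. The sketch below assumes ggplot2 is installed and uses mtcars. Treating am as categorical and cyl as ordinal is an illustrative choice, not the only reasonable one.

```r
library(ggplot2)

cars <- mtcars

# Categorical (nominal): transmission type has no natural order.
cars$am <- factor(cars$am, levels = c(0, 1),
                  labels = c("automatic", "manual"))

# Ordinal: cylinder count is treated here as ordered categories.
cars$cyl <- factor(cars$cyl, levels = c(4, 6, 8), ordered = TRUE)

# Continuous: mpg stays numeric.

# Categorical -> bar chart of counts per category.
bar_plot <- ggplot(cars, aes(x = am)) + geom_bar()

# Ordinal vs. continuous -> box plots, kept in category order.
box_plot <- ggplot(cars, aes(x = cyl, y = mpg)) + geom_boxplot()

# Continuous -> histogram of the distribution.
hist_plot <- ggplot(cars, aes(x = mpg)) + geom_histogram(bins = 10)

print(bar_plot)
print(box_plot)
print(hist_plot)
```

Declaring the type up front (factor, ordered factor, numeric) means the plotting code can stay short, because ggplot2 picks discrete or continuous scales from the column type.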
Choosing appropriate visualization types: There are many types of visualizations, each with their…