Data Science sits at the intersection of several different fields.
This means you need a lot of different skills.
In my experience, most Data Scientists work in teams and do not necessarily need to be experts in all areas. Hence, take the list of skills as a rough guide to what you need.
A great way to look at it is to split it into hard and soft skills.
I would say that you learn the soft skills through experience, but you need to take an interest in them. The hard skills are the ones you need to get good, or at least decent, at.
Looking at the hard skills, you do not need to master all aspects of them.
Before we dive into the hard skills, let’s also understand what a Data Scientist does.
I understand that many people get scared of statistics, and if you take a formal education in Data Science, you will learn a lot of it.
Experience shows that only a few specialists need a high level of statistics as Data Scientists. That said, you still need to understand some aspects in depth.
What does that include?
The most important is.
You should also understand box plots.
And what correlation means.
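To make this concrete, here is a minimal sketch using pandas. The data is made up for illustration, and the column names are my own assumptions. It computes a few descriptive statistics, the quartiles that a box plot draws, and the correlation between two columns.

```python
import pandas as pd

# Made-up illustrative data: hours studied and exam scores for eight students.
df = pd.DataFrame({
    "hours": [1, 2, 3, 4, 5, 6, 7, 8],
    "score": [52, 55, 61, 64, 70, 74, 79, 85],
})

# Basic descriptive statistics.
mean_score = df["score"].mean()
std_score = df["score"].std()  # sample standard deviation

# The three quartiles are what a box plot draws as the edges of the box
# and the line in the middle of it.
q1, median, q3 = df["score"].quantile([0.25, 0.5, 0.75])

# Pearson correlation: close to +1 means the columns rise together,
# close to -1 means one falls as the other rises,
# close to 0 means there is no linear relationship.
r = df["hours"].corr(df["score"])

print(f"mean={mean_score:.1f}, std={std_score:.1f}")
print(f"Q1={q1}, median={median}, Q3={q3}")
print(f"correlation={r:.3f}")
```

In this made-up data the score rises steadily with the hours, so the correlation comes out close to 1. Keep in mind that correlation only describes how two columns move together. It does not tell you that one causes the other.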
You can learn more about it here.
Python is used in scientific communities for a number of reasons.
Python is the most popular programming language in the Scientific Community, including Data Science. It is a solid choice to learn.
But do you need to master Python programming at a high level?
No. You need to understand Python programming at a basic level, where you master the following.
This sounds like a lot but can be broken down into steps.
Most beginner courses in Python will do fine, although some specialize too much. What you need is to understand, and get a feeling for, how Python code works.
Some common things you learn in a basic Python course.
Other things you learn that are good to understand, but that you do not need to master.
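To give a feeling for the level I mean, here is a small sketch of the kind of code a basic Python course usually covers. The choice of topics (lists, dictionaries, loops, conditions, functions, list comprehensions) is my assumption about a typical course, and the names and values are made up.

```python
# Variables holding a list and a dictionary.
heights_cm = [172, 181, 165, 190]
person = {"name": "Ada", "age": 36}


def count_above(values, threshold):
    """Return how many values are strictly greater than threshold."""
    count = 0
    for value in values:        # a loop over the list
        if value > threshold:   # a condition
            count += 1
    return count


# A list comprehension builds a new list from an existing one.
heights_m = [h / 100 for h in heights_cm]

tall = count_above(heights_cm, 175)
print(f"{person['name']} is {person['age']} years old")
print(f"{tall} of {len(heights_cm)} heights are above 175 cm")
```

If you can read code like this and predict what it prints, you have the level you need to start working with data.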
A great source is this free course.
For the most part, you get really far with pandas DataFrames as a Data Scientist. If you understand them and can use them to work with data, you have come a long way.
You can think of NumPy as an extension on top of DataFrames, even though the implementation is the other way around: pandas is built on top of NumPy.
But what are DataFrames and NumPy?
They are data structures used to contain the data you work with as a Data Scientist.
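As a minimal sketch of what that looks like in practice, the example below builds a small DataFrame, filters it, adds a column, and pulls out the underlying NumPy array. The cities and temperatures are made up for illustration.

```python
import numpy as np
import pandas as pd

# A DataFrame is a table: labeled columns, one row per observation.
# The values below are made up for illustration.
df = pd.DataFrame({
    "city": ["Oslo", "Rome", "Cairo"],
    "temp_c": [4.0, 18.0, 29.0],
})

# Select the rows that satisfy a condition.
warm = df[df["temp_c"] > 10]

# Add a derived column computed from an existing one.
df["temp_f"] = df["temp_c"] * 9 / 5 + 32

# The numeric data in a column can be handed over as a NumPy array.
temps = df["temp_c"].to_numpy()

print(df)
print(warm["city"].tolist())
print(type(temps), temps.mean())
```

The DataFrame gives you labels and convenient table operations, while the NumPy array gives you fast numerical computation on the raw values.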
A great way to learn about DataFrames is to follow this free course.
The Machine Learning models you create are the ones that produce the insights that deliver value to your clients. Therefore, you need the skills to use them and to understand how they work.
There are a lot of models and you don’t need to be an expert in all of them. But it is a great idea to understand them.
A few examples could be.
And be knowledgeable in frameworks like.
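To show what working with a model looks like, here is a minimal sketch that assumes scikit-learn as the framework and linear regression as the model. Both are my choices for illustration, and the data is synthetic. It shows the usual pattern: split the data, fit the model on one part, and evaluate it on the other.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic data: y depends linearly on x (slope 2, intercept 1) plus noise.
rng = np.random.default_rng(seed=0)
x = rng.uniform(0, 10, size=200)
y = 2.0 * x + 1.0 + rng.normal(0, 0.5, size=200)

# scikit-learn expects a 2D feature matrix: one row per sample.
X = x.reshape(-1, 1)

# Hold back a quarter of the data to check the model on unseen samples.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = LinearRegression()
model.fit(X_train, y_train)

slope = model.coef_[0]
intercept = model.intercept_
r2 = model.score(X_test, y_test)  # R^2 on the held-out data

print(f"slope={slope:.2f}, intercept={intercept:.2f}, R^2={r2:.3f}")
```

The fitted slope and intercept should land close to the values used to generate the data. Understanding why, and what the R^2 score tells you, matters more than memorizing the API.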
You can build up your skills in this free course.
Domain knowledge is actually often the key to getting a job as a Data Scientist.
If you know a lot about windmills, power prediction patterns, and so forth, then it will be easier for you to get a job as a Data Scientist at a company predicting power production by windmills.
Or, if you are an expert in weather forecasting, you can also get a job as a Data Scientist predicting power production by windmills.
The point is twofold.
First, if you have worked in an industry for a few years, then you have deep domain knowledge about that field. Is there a cross-field where you can apply Data Science? If so, find those jobs and you will have a great edge in getting them.
Why?
Well, most say that it is easier to train people to do Data Science than to give them 3-4 years of experience in a domain.
Take advantage of that.
Second, if you have an interest in some specific area of Data Science, focus on it. Become an expert.
Again, having Domain Knowledge is crucial to set yourself apart from the other applicants.
Data Visualization is often misunderstood by beginners in Data Science.
It is actually crucial in 3 different aspects: data quality, data exploration, and data presentation.
Most focus only on Data Presentation, that is, presenting your findings. While this is an art in itself, most do not fully grasp the importance of the other two at first.
The human brain is not wired to understand data as digits, but when we see the data on a chart, we can understand it immediately.
Just look at this one.
What is wrong? Well, it looks like some heights do not fit with the other heights.
This tells you something about Data Quality. Is there something wrong with it?
The chart would tell you something is wrong no matter how many data points you have. But imagine you had to look through 10,000 data points manually in a table. That would take hours and you might miss it.
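As a minimal sketch of this kind of data quality check, assuming pandas and matplotlib, the example below plots a few made-up heights. Most are in meters, but two were entered in centimeters by mistake. The file name heights.png is just a placeholder.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Made-up heights, mostly in meters; two rows were entered in centimeters.
heights = pd.Series(
    [1.72, 1.80, 1.65, 178.0, 1.91, 1.58, 169.0, 1.75], name="height"
)

# One point per row: the wrong values stand far away from the rest.
fig, ax = plt.subplots()
ax.scatter(heights.index, heights)
ax.set_xlabel("row")
ax.set_ylabel("height")
fig.savefig("heights.png")  # placeholder file name

# Once the chart has shown the problem, a simple plausibility rule
# picks out the same rows. The 3 m limit is an assumption.
suspicious = heights[heights > 3]
print(suspicious)
```

On the chart the two wrong values jump out immediately, and the same plot would work just as well with 10,000 rows.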
When it comes to exploring data, seeing it visually on a chart shows you patterns.
Again, you would not notice that by looking at the data in a table.
Finally, data presentation is an art in itself.
Does this one tell you a story?
A great resource to learn about Data Visualization can be found here.
This gives you the hard skills you need as a Data Scientist.
A great way to think of it is also to understand the Data Science Workflow.
It gives you an idea of what steps a Data Science Project goes through.