Data science is a pretty intense field to get into. Most degree programs really prefer you to have an undergraduate degree in computer science, statistics, or some other related field.
Some schools require college credit in certain topics, but most are OK with professional experience, passing a proficiency exam, or a certificate from an online course. Some of them are even flexible enough to accept free courses, like the MITx courses.
The schools want some mix of knowledge in:
- Object-oriented programming concepts – any kind of C++, Java, or Python that uses classes, objects, and methods
- Python programming – at least enough to open, parse, and process text data in files. Being able to work with Jupyter notebooks helps too.
- Data structures and algorithms – the importance and use of abstract data types; designing and implementing elementary data structures such as arrays, trees, stacks, queues, and hash tables; explaining the best, average, and worst cases of an algorithm using Big-O notation; and describing the differences between sequential and binary search algorithms.
- Linear algebra – at least systems of equations, vector spaces, determinants, eigenvalues, similarity, and positive definite matrices.
- Single-variable calculus – functions, limits, and continuity; differentiation rules; applications to graphing, rates, approximations, and extremum problems; definite and indefinite integration; the Fundamental Theorem of Calculus; applications to geometry (area, volume, and arc length); applications to science (average values, work, and probability); techniques of integration; approximation of definite integrals; and improper integrals.
- Statistics – regression analysis and statistical inference
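
To give a feel for what the data structures and algorithms item is asking for, here's a minimal sketch of sequential versus binary search in Python. The function names and the sample list are my own, made up for illustration; they don't come from any school's requirements.

```python
from typing import Optional, Sequence


def sequential_search(items: Sequence[int], target: int) -> Optional[int]:
    """Check every element in order: O(n) comparisons in the worst case."""
    for index, value in enumerate(items):
        if value == target:
            return index
    return None


def binary_search(sorted_items: Sequence[int], target: int) -> Optional[int]:
    """Halve the search range each step: O(log n), but the input must be sorted."""
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        mid = (low + high) // 2
        if sorted_items[mid] == target:
            return mid
        if sorted_items[mid] < target:
            low = mid + 1   # target can only be in the upper half
        else:
            high = mid - 1  # target can only be in the lower half
    return None


data = [3, 8, 15, 23, 42, 57, 91]  # hypothetical sample data, already sorted
print(sequential_search(data, 42))  # 4
print(binary_search(data, 42))      # 4
```

Both find the same index, but sequential search works on any list while binary search only works on sorted data. In exchange, binary search needs roughly log2(n) comparisons in the worst case instead of n.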
Like I said, it’s a LOT… especially if I want to finish it all before I hopefully start classes in January.
I got myself a Coursera unlimited subscription and I’m working through a few specializations to cover all of this, plus I snuck in a data science one to help get me ahead in whatever program I end up in. I figure it’s a good thing to have at least seen some data cleaning, exploratory analysis, and machine learning before I’m being graded on it.
I plan on taking courses from now until I start classes… and might even find the course load easier than what I’m doing now! (maybe?)