Cheat Sheets #1: Machine learning, Python, Visualization, Data Science Libraries, Jupyter Notebook, Big-O & Math
Machine learning and deep learning are difficult for newcomers to pick up, and deep learning libraries can be hard to understand as well. I am creating this series of cheat sheets, which I collected from different sources.
Over the past few months, I have totally redesigned the cheat sheets. The goal was to make them easy to read and beautiful, so you will want to look at them, print them, and share them.
Do read this and contribute cheat sheets if you have any. If you like this post, give it a ❤️! Here we go:
1. Machine Learning Overview
2. Algorithm Pro/Con
Source: https://blog.dataiku.com/machine-learning-explained-algorithms-are-your-friend
3. Scikit-Learn
Scikit-learn (formerly scikits.learn) is a free software machine learning library for the Python programming language. It features various classification, regression and clustering algorithms including support vector machines, random forests, gradient boosting, k-means and DBSCAN, and is designed to interoperate with the Python numerical and scientific libraries NumPy and SciPy.
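As a rough illustration of the description above, here is a minimal sketch that fits one supervised estimator (a random forest) and one unsupervised estimator (k-means) on NumPy arrays. The synthetic data, the cluster centers, and all parameter values are assumptions chosen for the example, not recommendations.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic data (an assumption for this sketch): 300 points around
# three well-separated centers. X and y are plain NumPy arrays.
centers = [[-5.0, -5.0], [0.0, 5.0], [5.0, -5.0]]
X, y = make_blobs(n_samples=300, centers=centers, cluster_std=0.5, random_state=0)

# Supervised: fit a random forest on a training split, score on the held-out part.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)

# Unsupervised: k-means only sees X and assigns each point to a cluster.
km = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_ids = km.fit_predict(X)

print("held-out accuracy:", accuracy)
print("points per cluster:", [int((cluster_ids == k).sum()) for k in range(3)])
```

Both estimators follow the same pattern: construct the object, call fit (or fit_predict), then query it. That shared interface is what makes it easy to swap one algorithm for another.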
4. Machine Learning: Scikit-learn algorithm
This machine learning cheat sheet will help you find the right estimator for the job, which is often the most difficult part. The flowchart points you to the documentation and a rough guide for each estimator, so you can learn more about the problems and how to solve them.
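To give a feel for how such a flowchart works, here is a small sketch that encodes only its top-level branches as a plain Python function. The function name, its parameters, and the 50-sample threshold are assumptions based on my recollection of the scikit-learn chart; check the cheat sheet itself for the authoritative version and for the finer-grained choices within each family.

```python
def suggest_estimator_family(
    n_samples,
    predicting_category=False,
    has_labels=False,
    predicting_quantity=False,
    just_looking=False,
):
    """Return the estimator family for the top-level flowchart branches.

    Hypothetical helper for illustration; it is not part of scikit-learn.
    """
    # The chart starts by asking whether there is enough data at all.
    if n_samples <= 50:
        return "get more data"
    # Predicting a category: labels decide classification vs. clustering.
    if predicting_category:
        return "classification" if has_labels else "clustering"
    # Predicting a continuous quantity.
    if predicting_quantity:
        return "regression"
    # Exploring the data without a prediction target.
    if just_looking:
        return "dimensionality reduction"
    return "no clear match"


print(suggest_estimator_family(1000, predicting_category=True, has_labels=True))
print(suggest_estimator_family(1000, predicting_category=True))
print(suggest_estimator_family(20, predicting_quantity=True))
```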
5. [Microsoft Azure] Machine Learning: Algorithm Cheat Sheet
This machine learning cheat sheet from Microsoft Azure will help you choose the appropriate machine learning algorithm for your predictive analytics solution. First, the cheat sheet asks you about the nature of your data, and then it suggests the best algorithm for the job.
6. Python Basics
Source: http://datasciencefree.com/python.pdf
Source: https://www.datacamp.com/community/tutorials/python-data-science-cheat-sheet-basics#gs.0x1rxEA
7. Python for Data Science
8. Numpy
NumPy targets the CPython reference implementation of Python, which is a non-optimizing bytecode interpreter. Mathematical algorithms written for this version of Python often run much slower than compiled equivalents. NumPy addresses the slowness problem partly by providing multidimensional arrays, along with functions and operators that work efficiently on them. Taking advantage of this requires rewriting some code, mostly inner loops, to use NumPy.
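The "rewriting inner loops" idea can be sketched with a dot product written two ways. The function names and array sizes below are made up for illustration; the speedup you see will depend on your machine, so no timing is claimed here.

```python
import numpy as np


def dot_loop(a, b):
    """Dot product with an explicit Python loop (runs in the interpreter)."""
    total = 0.0
    for i in range(len(a)):
        total += a[i] * b[i]
    return total


def dot_vectorized(a, b):
    """Same computation, with the inner loop handed to NumPy's compiled code."""
    return float(np.dot(a, b))


rng = np.random.default_rng(0)
a = rng.random(10_000)
b = rng.random(10_000)

loop_result = dot_loop(a, b)
vec_result = dot_vectorized(a, b)

# The two results agree up to floating-point rounding.
print(np.isclose(loop_result, vec_result))
```

To compare speeds yourself, wrap each call with the standard library's timeit module.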
9. Pandas
The name ‘Pandas’ is derived from the term “panel data”, an econometrics term for multidimensional structured data sets.
10. Data Wrangling
The term “data wrangler” is starting to infiltrate pop culture. In the 2017 movie Kong: Skull Island, one of the characters, played by actor Marc Evan Jackson, is introduced as “Steve Woodward, our data wrangler”.
11. Data Wrangling with dplyr and tidyr
12. Scipy
SciPy builds on the NumPy array object and is part of the NumPy stack which includes tools like Matplotlib, pandas and SymPy, and an expanding set of scientific computing libraries. This NumPy stack has similar users to other applications such as MATLAB, GNU Octave, and Scilab. The NumPy stack is also sometimes referred to as the SciPy stack.
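A short sketch of what "builds on the NumPy array object" means in practice: SciPy routines take NumPy arrays as input and hand NumPy arrays back. The matrix, vector, and objective function here are arbitrary values chosen for the example.

```python
import numpy as np
from scipy import linalg, optimize

# A small linear system A x = b, stored as ordinary NumPy arrays.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([9.0, 8.0])

# scipy.linalg.solve accepts the arrays directly and returns a NumPy array.
x = linalg.solve(A, b)
print(x)  # approximately [2. 3.], since 3*2 + 1*3 = 9 and 1*2 + 2*3 = 8

# scipy.optimize works the same way: minimize a simple one-variable quadratic
# whose minimum is at t = 1.5.
result = optimize.minimize_scalar(lambda t: (t - 1.5) ** 2)
print(result.x)
```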
13. Data Visualization
Matplotlib
Matplotlib is a plotting library for the Python programming language and its numerical mathematics extension NumPy. It provides an object-oriented API for embedding plots into applications using general-purpose GUI toolkits like Tkinter, wxPython, Qt, or GTK+. There is also a procedural “pylab” interface based on a state machine (like OpenGL), designed to closely resemble that of MATLAB, though its use is discouraged. SciPy makes use of Matplotlib.
pyplot is a Matplotlib module that provides a MATLAB-like interface. Matplotlib is designed to be as usable as MATLAB, with the added advantages that it works with Python and is free.
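The two interfaces described above can be compared side by side. This is a minimal sketch: the Agg backend and the in-memory buffer are assumptions made so that the script runs without a display and without writing files, and the plotted functions are arbitrary.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is required
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)

# pyplot (state-machine) style: each call acts on the "current" figure and axes.
plt.figure()
plt.plot(x, np.sin(x), label="sin(x)")
plt.title("pyplot interface")
plt.legend()
state_ax = plt.gca()  # the axes that pyplot has been drawing on implicitly

# Object-oriented style: keep explicit Figure and Axes objects and call their methods.
fig, ax = plt.subplots()
ax.plot(x, np.cos(x), label="cos(x)")
ax.set_title("object-oriented interface")
ax.legend()

# Render the second figure to PNG bytes in memory.
buf = io.BytesIO()
fig.savefig(buf, format="png")
print(buf.getbuffer().nbytes, "bytes of PNG data")
```

The object-oriented style tends to scale better once a script has several figures or subplots, because it is always explicit which axes a call affects.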
14. PySpark
15. Big-O
16. Math
If you really want to understand Machine Learning, you need a solid understanding of Statistics (especially Probability), Linear Algebra, and some Calculus. I minored in Math during undergrad, but I definitely needed a refresher. These cheat sheets provide most of what you need to understand the Math behind the most common Machine Learning algorithms.
Probability
Linear Algebra
Source: https://minireference.com/static/tutorials/linear_algebra_in_4_pages.pdf
Statistics
Source: http://web.mit.edu/~csvoss/Public/usabo/stats_handout.pdf
Calculus
Source: http://tutorial.math.lamar.edu/getfile.aspx?file=B,41,N
17. Jupyter Notebook
If you like this post, give it a 👏 and ❤️. Many thanks for your genuine support; it matters.
Till then, keep learning, keep sharing, keep growing.