NumPy is the fundamental library for scientific computing in Python, and many other libraries use NumPy arrays as their basic inputs and outputs. It is an open source extension module for Python that provides powerful data structures for efficient computation on multi-dimensional arrays and matrices.
Understanding NumPy arrays and array-oriented computing will also help you work with data sets more effectively.
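As a small illustration of array-oriented computing, the sketch below builds a two-dimensional array and applies element-wise arithmetic, an axis reduction and a matrix product without any explicit loop. The variable names and values are made up for the example.

```python
import numpy as np

# A 2-D array (matrix) with 2 rows and 3 columns.
matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])

# Element-wise arithmetic applies to every entry at once.
doubled = matrix * 2

# Reduction along axis 0 collapses the rows, giving one sum per column.
column_sums = matrix.sum(axis=0)

# Matrix product of the (2, 3) array with its (3, 2) transpose gives a (2, 2) result.
gram = matrix @ matrix.T

print(matrix.shape)
print(doubled)
print(column_sums)
print(gram)
```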
Pandas is an open source data analysis library that provides high-level data structures for manipulating data in a simple way. It is well suited to tabular data and to ordered or unordered time series, and it provides tools for shaping, merging, slicing and reshaping datasets. It is a popular choice for data munging and for handling missing data.
With Pandas, you can load your data into data frames, select rows or columns by specific values, perform statistical operations, merge data frames and so on.
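The operations listed above might look roughly like this in practice. The two small data frames, their column names and the choice to fill the missing price with the column mean are all assumptions made for the example, not a recommended recipe.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; np.nan marks a missing price.
orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "customer_id": [10, 11, 10, 12],
    "price": [20.0, np.nan, 15.0, 30.0],
})
customers = pd.DataFrame({
    "customer_id": [10, 11, 12],
    "city": ["Oslo", "Lima", "Oslo"],
})

# Handle missing data: replace the missing price with the mean of the known prices.
orders["price"] = orders["price"].fillna(orders["price"].mean())

# Select the rows where a column has a specific value.
customer_10 = orders[orders["customer_id"] == 10]

# Merge the two data frames on their shared key column.
merged = orders.merge(customers, on="customer_id", how="left")

# Statistical operation: total price per city.
total_by_city = merged.groupby("city")["price"].sum()

print(customer_10)
print(merged)
print(total_by_city)
```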
Matplotlib is one of the best-known Python data visualization libraries. It allows you to quickly make line graphs, histograms, pie charts and more, and you can customize every aspect of a figure. It exports graphics to common vector and raster formats such as PDF, SVG, PNG and JPG.
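A minimal sketch of this workflow is shown below: it draws a line graph and a histogram side by side, customizes a few visual properties, and saves the figure as both a raster file (PNG) and a vector file (SVG). The data, colors and file names are arbitrary, and the non-interactive `Agg` backend is selected only so that the script also runs on a machine without a display.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no display needed
import matplotlib.pyplot as plt

x = [0, 1, 2, 3, 4]
y = [0, 1, 4, 9, 16]

# One figure with two side-by-side axes.
fig, (ax_line, ax_hist) = plt.subplots(1, 2, figsize=(8, 3))

# Line graph with customized color, line style, markers and labels.
ax_line.plot(x, y, color="tab:red", linestyle="--", marker="o", label="y = x squared")
ax_line.set_title("Line graph")
ax_line.set_xlabel("x")
ax_line.set_ylabel("y")
ax_line.legend()

# Histogram of a small made-up sample.
ax_hist.hist([1, 2, 2, 3, 3, 3, 4, 4, 5], bins=5, color="tab:blue")
ax_hist.set_title("Histogram")

fig.tight_layout()

# The output format is inferred from the file extension.
out_dir = tempfile.mkdtemp()
png_path = os.path.join(out_dir, "figure.png")
svg_path = os.path.join(out_dir, "figure.svg")
fig.savefig(png_path, dpi=150)
fig.savefig(svg_path)
plt.close(fig)
```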
4. SciKit Learn
Scikit-learn is a free, open source machine learning library for Python. It implements many supervised and unsupervised learning algorithms and provides efficient tools for data analysis and data mining. You can apply algorithms such as support vector machines, naïve Bayes, random forests, DBSCAN and more.
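To give a feel for how these algorithms are used, here is a short sketch that trains three of the classifiers mentioned above on the Iris data set bundled with the library, then runs DBSCAN on the same features without labels. The data set, the train/test split and the `eps` and `min_samples` values are choices made for the example only.

```python
from sklearn.cluster import DBSCAN
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

# Load a small built-in data set and hold out 25% of it for testing.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Supervised learning: every estimator shares the same fit/predict interface.
models = {
    "support vector machine": SVC(),
    "naive Bayes": GaussianNB(),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
}
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: accuracy {scores[name]:.2f}")

# Unsupervised learning: DBSCAN groups points by density; label -1 marks noise.
cluster_labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(X)
n_clusters = len(set(cluster_labels) - {-1})
print(f"DBSCAN found {n_clusters} clusters")
```

Because all estimators expose the same `fit` and `predict` methods, swapping one algorithm for another usually means changing a single line.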
SciPy is a Python library that is widely used for scientific computing and builds on NumPy's N-dimensional arrays. It is a free library that provides many of the core mathematical routines that more complex machine learning workflows rely on. It contains modules for linear algebra, interpolation, image processing and more.
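The three areas named above correspond to the `scipy.linalg`, `scipy.interpolate` and `scipy.ndimage` submodules. The sketch below touches each one briefly; the input values are invented for the example.

```python
import numpy as np
from scipy import linalg, ndimage
from scipy.interpolate import CubicSpline

# Linear algebra: solve the system A @ solution = b.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([9.0, 8.0])
solution = linalg.solve(A, b)

# Interpolation: fit a cubic spline through samples of y = x**2
# and evaluate it between two of the sample points.
xs = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
ys = xs ** 2
spline = CubicSpline(xs, ys)
y_mid = float(spline(1.5))

# Image processing: smooth a tiny "image" holding a single bright pixel.
image = np.zeros((5, 5))
image[2, 2] = 1.0
smoothed = ndimage.gaussian_filter(image, sigma=1.0)

print(solution)
print(y_mid)
print(smoothed.round(3))
```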