Machine Learning libraries (NumPy, SciPy, matplotlib, scikit-learn, pandas)

A short introduction to the Python libraries that are widely used for machine learning: NumPy, SciPy, matplotlib, scikit-learn, and pandas


Category: Machine Learning Tags: Python, Python 3


    Until now, I have written every tutorial without libraries. From here on, we take our journey to the next level and use Python libraries for classification, visualization, and clustering. This article gives a short introduction to NumPy, SciPy, matplotlib, scikit-learn, and pandas.

NumPy

    NumPy provides an n-dimensional array object, along with mathematical functions that operate on those arrays and can be used in many calculations.

Command to install: pip install numpy

import numpy as np
arr = np.array([[1,2,3],[4,5,6]])
print("Numpy array\n {}".format(arr))

Output

Numpy array
 [[1 2 3]
 [4 5 6]]
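
The mathematical functions mentioned above can be illustrated with a minimal sketch that reuses the same array. The variable names (doubled, total, column_means, roots) are only illustrative and are not part of the original example.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

# Element-wise arithmetic: every entry is doubled without an explicit loop
doubled = arr * 2

# Reductions: the sum of all entries and the mean of each column
total = arr.sum()
column_means = arr.mean(axis=0)

# A mathematical function applied element-wise
roots = np.sqrt(arr)

print("shape: {}".format(arr.shape))
print("doubled:\n{}".format(doubled))
print("total: {}".format(total))
print("column means: {}".format(column_means))
print("square roots:\n{}".format(roots))
```

Operations like these work on whole arrays at once, which is usually shorter and faster than looping over Python lists.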

SciPy

    SciPy is a collection of scientific computing functions. It provides advanced linear algebra routines, mathematical function optimization, signal processing, special mathematical functions, and statistical distributions.

Command to install: pip install scipy

import numpy as np
from scipy import sparse
# Create a 2D NumPy array with a diagonal of ones, and zeros everywhere else
eye = np.eye(3)
print("NumPy array:\n{}".format(eye))
sparse_matrix = sparse.csr_matrix(eye)
print("\nSciPy sparse CSR matrix:\n{}".format(sparse_matrix))

Output

NumPy array:
[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]

SciPy sparse CSR matrix:
  (0, 0)        1.0
  (1, 1)        1.0
  (2, 2)        1.0
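
Beyond sparse matrices, a few of the other areas listed above (linear algebra, optimization, statistical distributions) can be sketched as follows. The small system of equations and the function being minimized are made up for illustration.

```python
import numpy as np
from scipy import linalg, optimize, stats

# Linear algebra: solve the system A @ x = b
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([9.0, 8.0])
x = linalg.solve(A, b)
print("solution of A x = b: {}".format(x))

# Optimization: find the minimum of a one-variable function
# (an illustrative parabola whose minimum is at t = 2)
result = optimize.minimize_scalar(lambda t: (t - 2) ** 2 + 1)
print("minimum found near t = {:.4f}".format(result.x))

# Statistical distributions: probability that a standard normal value is below 0
p = stats.norm.cdf(0)
print("P(Z < 0) = {}".format(p))
```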

matplotlib

    matplotlib is a scientific plotting library used to visualize data, and visualization is an important part of analyzing data. You can plot histograms, scatter plots, line charts, and more.

Command to install: pip install matplotlib

import matplotlib.pyplot as plt
x = [1,2,3]
y = [4,5,6]
plt.scatter(x,y)
plt.show()

Output

[Figure: matplotlib scatter plot]
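
The other plot types mentioned above, histograms and lines, follow the same pattern. The sketch below is only one way to do it: it assumes the non-interactive Agg backend and writes the figure to a file instead of opening a window, and both the sample values and the filename library_plots.png are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

values = [1, 2, 2, 3, 3, 3, 4, 4, 5]  # illustrative data for the histogram
x = [1, 2, 3]
y = [4, 5, 6]

# Two plots side by side in one figure
fig, (ax_hist, ax_line) = plt.subplots(1, 2, figsize=(8, 3))

# Histogram: counts how many values fall into each of 5 equal-width bins
counts, bin_edges, _ = ax_hist.hist(values, bins=5)
ax_hist.set_title("Histogram")

# Line plot of the same points used in the scatter example
ax_line.plot(x, y, marker="o")
ax_line.set_title("Line plot")

fig.tight_layout()
fig.savefig("library_plots.png")  # hypothetical output filename
print("values per bin: {}".format(counts))
```

To see the figure in a window instead, drop the matplotlib.use("Agg") line and call plt.show() as in the scatter example.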

scikit-learn

    scikit-learn is built on NumPy, SciPy, and matplotlib, and provides tools for data analysis and data mining. It has built-in classification and clustering algorithms, plus some practice datasets such as the iris dataset, the Boston house prices dataset (removed in scikit-learn 1.2), and the diabetes dataset.

Command to install: pip install scikit-learn

from sklearn import datasets
iris_data = datasets.load_iris()
sample = iris_data['data'][:3]
print("iris dataset sample data: \n{}".format(iris_data['feature_names']))
print("{}".format(sample))

Output

iris dataset sample data:
['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)']
[[5.1 3.5 1.4 0.2]
 [4.9 3.  1.4 0.2]
 [4.7 3.2 1.3 0.2]]
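
The built-in classification and clustering algorithms can be tried on the same iris data. The following is a minimal sketch, not a recommended setup: the choice of k-nearest neighbors and k-means, the 25% test split, and the parameter values are all assumptions made for illustration.

```python
from sklearn import datasets
from sklearn.cluster import KMeans
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

iris_data = datasets.load_iris()
X, y = iris_data['data'], iris_data['target']

# Classification: hold out 25% of the samples to check the trained model
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)
accuracy = knn.score(X_test, y_test)
print("test accuracy: {:.2f}".format(accuracy))

# Clustering: group the samples into 3 clusters without using the labels
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_labels = kmeans.fit_predict(X)
print("cluster of first 5 samples: {}".format(cluster_labels[:5]))
```

The classifier learns from the known species labels, while k-means only sees the measurements and assigns each sample to one of three groups on its own.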

pandas

    pandas is used for data analysis. It can take multi-dimensional arrays as input and produce charts and graphs. A pandas table (DataFrame) can have columns of different data types, and pandas can ingest data from various files and databases, such as SQL, Excel, and CSV.

Command to install: pip install pandas

import pandas as pd
age = {'age': [4, 6, 8, 34, 5, 30, 41] }
dataframe = pd.DataFrame(age)
print("all age:\n{}".format(dataframe))
filtered = dataframe[dataframe.age > 20]
print("age above 20:\n{}".format(filtered))

Output

all age:
   age
0    4
1    6
2    8
3   34
4    5
5   30
6   41
age above 20:
   age
3   34
5   30
6   41
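
Reading a file with columns of different data types might look like the sketch below. To keep it self-contained, the CSV content is an in-memory string with made-up rows; with a real file you would pass its path to pd.read_csv instead.

```python
import io
import pandas as pd

# Illustrative CSV content; the names and values are made up
csv_text = """name,age,height_m,member
Asha,34,1.62,True
Ben,5,1.10,False
Chen,41,1.75,True
"""

# read_csv infers a data type for each column (text, integer, float, boolean)
people = pd.read_csv(io.StringIO(csv_text))
print("column types:\n{}".format(people.dtypes))

# The same filtering style as above works on a table with several columns
adults = people[people.age > 20]
mean_adult_age = adults.age.mean()
print("age above 20:\n{}".format(adults))
print("mean age above 20: {}".format(mean_adult_age))
```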


Last modified on 11 October 2018
Nikhil Joshi

Ceo & Founder at Dotnetlovers