Practical NumPy — Understanding Python library through its functions

Killol Govani, Jan 10

Before embarking on the journey of data science and machine learning, it is important to learn a few Python libraries that are ubiquitous in data science, such as NumPy, Pandas and Matplotlib.

NumPy is a powerful library for array processing that comes with a large collection of high-level mathematical functions to operate on these arrays.

These functions fall into categories like Linear Algebra, Trigonometry, Statistics, Matrix manipulation, etc.

Today we will look at examples of a few of the most important functions.

Getting NumPy

To install NumPy on your local machine, I would suggest downloading the Anaconda distribution, which installs Python along with other important Python libraries useful for machine learning, including NumPy, Pandas and Matplotlib.

Anaconda supports Windows, Mac and Linux.

To quickly get started with NumPy without installing anything on your local machine, check out Google Colab.

It provides free Jupyter notebooks hosted in the cloud, linked to your Google Drive account, with all the important packages pre-installed.

You can also run your code on a GPU, which speeds up computation, though we don't need GPU computation for this tutorial.

To quickly get started with Google Colab, check out this amazing article on the same.

Getting familiar with the basics

NumPy’s main object is a homogeneous multidimensional array.

Unlike Python’s built-in array class, which only handles one-dimensional arrays, NumPy’s ndarray class can handle multidimensional arrays and provides more functionality.

NumPy’s dimensions are known as axes.

For example, the array below has 2 dimensions or 2 axes, namely rows and columns.

Sometimes the number of dimensions is also called the rank of the array (not to be confused with the rank of a matrix in linear algebra).

    [[1, 4, 7],
     [2, 5, 8],
     [3, 6, 9]]

Importing NumPy

NumPy is imported using the following command.

Note that np is the conventional alias, so that we don't need to write numpy every time.

    import numpy as np

Creating Arrays

There are many ways to create an array with NumPy, but the most common is to use the array function.

Once an array is created, we can also check its number of dimensions using the ndim attribute.

    # creating a one-dimensional array
    a = np.array([1, 2, 3])
    print(a)
    # output
    # [1 2 3]

    a.ndim
    # output
    # 1

    # creating a two-dimensional array
    b = np.array([[1, 5, 7], [2, 4, 6]])
    print(b)
    # output
    # [[1 5 7]
    #  [2 4 6]]

    b.ndim
    # output
    # 2

    # creating a three-dimensional array
    c = np.array([[[1, 2, 3], [3, 4, 5]], [[5, 6, 7], [7, 8, 9]]])
    print(c)
    # output
    # [[[1 2 3]
    #   [3 4 5]]
    #
    #  [[5 6 7]
    #   [7 8 9]]]

    c.ndim
    # output
    # 3

You can also specify the data type of an array at creation time using the dtype argument, and check the data type of an existing array using its dtype attribute.

    d = np.array([1, 4, 7], dtype=float)
    print(d)
    # output
    # [1. 4. 7.]

    a.dtype
    # output (platform-dependent; may be int32 on some systems)
    # dtype('int64')

Some special ways of creating Arrays

NumPy provides multiple ways to create an array.

Special functions like zeros and ones create arrays filled entirely with zeros or ones, respectively.

You specify the length (for a one-dimensional array) or the shape (for a multidimensional array) as an argument.

The arange function creates arrays with evenly spaced values within a given interval.

By default the spacing between values is 1, but we can pass the spacing as a parameter along with the start and end of the interval.

Note that arange does not include the end value of the interval in the result.

    zeros_array = np.zeros(3, dtype=int)
    print(zeros_array)
    # output
    # [0 0 0]

    zeros_array_nd = np.zeros((3, 2), dtype=int)
    print(zeros_array_nd)
    # output
    # [[0 0]
    #  [0 0]
    #  [0 0]]

    ones_array = np.ones(4, dtype=int)
    print(ones_array)
    # output
    # [1 1 1 1]

    ones_array_nd = np.ones((2, 3), dtype=int)
    print(ones_array_nd)
    # output
    # [[1 1 1]
    #  [1 1 1]]

    range_array = np.arange(2, 5)
    print(range_array)
    # output
    # [2 3 4]

    range_array_space = np.arange(1, 7, 2)
    print(range_array_space)
    # output
    # [1 3 5]

Shape of an Array

The shape attribute of an array returns a tuple giving its size along each dimension.

Refer to the example above, where we created the two-dimensional array b: its shape is (2, 3), which means it has 2 rows and 3 columns.

The shape attribute of array c returns (2, 2, 3) because c is a 3-dimensional array made up of two blocks with 2 rows and 3 columns each.

NumPy also provides the reshape method, which returns an array with the same data arranged in a new shape.
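The difference between reshape and the similarly named resize method is easy to trip over, so here is a small sketch of it. The variable names are only illustrative; the array values match b from earlier.

```python
import numpy as np

b = np.array([[1, 5, 7], [2, 4, 6]])

# reshape returns a new array object; b itself keeps its shape.
reshaped = b.reshape(3, 2)

# -1 asks NumPy to infer that dimension from the total number of elements.
inferred = b.reshape(-1, 2)

# ndarray.resize changes the array in place and returns None,
# so its return value should not be assigned and printed.
r = np.array([[1, 5, 7], [2, 4, 6]])
result = r.resize(3, 2)

print(reshaped.shape, b.shape)   # (3, 2) (2, 3)
print(inferred.shape)            # (3, 2)
print(result, r.shape)           # None (3, 2)
```

In both cases the total number of elements stays the same; only the way they are arranged into rows and columns changes.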

    b.shape
    # output
    # (2, 3)

    B = b.reshape(3, 2)
    print(B)
    # output
    # [[1 5]
    #  [7 2]
    #  [4 6]]

    c.shape
    # output
    # (2, 2, 3)

    # reshape is used here too: c.resize(...) works in place and returns None
    C = c.reshape(2, 3, 2)
    print(C)
    # output
    # [[[1 2]
    #   [3 3]
    #   [4 5]]
    #
    #  [[5 6]
    #   [7 7]
    #   [8 9]]]

Indexing of Arrays

NumPy arrays can be indexed using the standard Python syntax x[obj], where x is the array and obj is the selection.

Just as in Python, all indices in NumPy arrays are zero-based.

Slicing in NumPy is similar to slicing in Python.

Basic slicing occurs when obj is a slice object constructed with start:stop:step notation inside the square brackets.

Not all three of start, stop and step need to be present; any that are omitted take their default values.

If a negative integer j is used as an index or slice bound, it is interpreted as n + j, where n is the number of elements along that dimension.

For a multidimensional array, slicing is done by passing a tuple inside the square brackets, with one entry per dimension using the same notation.

In advanced indexing, the first list gives the row index of each element we want to select and the second list gives the corresponding column index, so the two lists are paired element by element.
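The two rules that tend to cause confusion, negative indices and the pairing of index lists, can be checked directly. This is a minimal sketch; the names rows, cols and by_hand are chosen just for illustration.

```python
import numpy as np

x = np.array([9, 8, 7, 6, 5, 4, 3, 2, 1, 0])
n = len(x)

# A negative index j refers to position n + j.
print(x[-3], x[n - 3])   # 2 2

# The same rule applies to slice bounds.
print(x[-5:])            # [4 3 2 1 0]
print(x[n - 5:])         # [4 3 2 1 0]

y = np.array([[1, 3], [4, 6], [7, 9]])
rows = [0, 1, 2]
cols = [1, 0, 1]

# Advanced indexing pairs the lists: (0, 1), (1, 0), (2, 1).
picked = y[rows, cols]
by_hand = np.array([y[r, c] for r, c in zip(rows, cols)])
print(picked)            # [3 4 9]
print(np.array_equal(picked, by_hand))   # True
```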

    # Basic slicing
    x = np.array([9, 8, 7, 6, 5, 4, 3, 2, 1, 0])
    print(x[2:5])     # output: [7 6 5]
    print(x[5:])      # output: [4 3 2 1 0]
    print(x[:4])      # output: [9 8 7 6]
    print(x[1:7:3])   # output: [8 5]
    print(x[-5:10])   # output: [4 3 2 1 0]

    # Boolean indexing
    print(x[x > 4])   # output: [9 8 7 6 5]

    # Slicing in a multidimensional array
    y = np.array([[1, 3], [4, 6], [7, 9]])
    print(y[:2, 1:2])   # output: [[3] [6]]

    # Advanced indexing
    print(y[[0, 1, 2], [1, 0, 1]])   # output: [3 4 9]

Vectors, Matrices and their basic operations

In linear algebra and machine learning, a one-dimensional array is known as a vector and a two-dimensional array is known as a matrix.

A higher-ranked, n-dimensional array is known as an n-dimensional tensor.

The data fed as input to machine learning and deep learning models is usually in the form of matrices and tensors, so learning about matrix operations is very important.
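As a rough illustration of this terminology, the sketch below builds one array of each kind and prints its number of axes. The batch of 8x8 images is a made-up example of model input, not something from a real dataset.

```python
import numpy as np

vector = np.array([1.0, 2.0, 3.0])            # 1 axis
matrix = np.array([[1.0, 2.0], [3.0, 4.0]])   # 2 axes
# Hypothetical input batch: 4 grayscale images of 8x8 pixels.
batch = np.zeros((4, 8, 8))                   # 3 axes

for name, arr in [("vector", vector), ("matrix", matrix), ("batch", batch)]:
    print(name, arr.ndim, arr.shape)
```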

Transpose of the matrix

Earlier in this article, we saw shape and reshape for a 2-dimensional array, where the shape attribute returns a tuple giving the number of rows and columns of the matrix.

The transpose of a matrix is a new matrix whose rows are the columns of the original.

Equivalently, the columns of the new matrix are the rows of the original.

Here is a matrix and its transpose.

Let us continue with the example of the matrix b.

    print(b)
    # output
    # [[1 5 7]
    #  [2 4 6]]

    b.shape
    # output
    # (2, 3)

    b_transpose = b.T
    print(b_transpose)
    # output
    # [[1 2]
    #  [5 4]
    #  [7 6]]

    b_transpose.shape
    # output
    # (3, 2)

Arithmetic Operations

Matrices can be added or subtracted if they have the same shape.

Each element of one matrix is added to, or subtracted from, the element at the same position in the other matrix.

    A_ij + B_ij = C_ij

It is also possible to add a scalar to a matrix, which adds that value to every element of the matrix.

Elementwise multiplication is carried out using the * operator or the multiply function.

Elementwise multiplication is different from matrix multiplication, which we will see in the section on Linear Algebra.

Similarly, we can divide two matrices elementwise and compute the square root and exponential of each element of a matrix, as shown in the example below.
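To make the difference between elementwise and matrix multiplication concrete before moving on, here is a short side-by-side sketch. It uses the @ operator for the matrix product, which is not introduced elsewhere in this article.

```python
import numpy as np

A = np.array([[5, 6], [7, 8]])
B = np.array([[4, 3], [2, 1]])

elementwise = A * B   # multiplies entries at matching positions
matrix_prod = A @ B   # combines rows of A with columns of B

print(elementwise)
# [[20 18]
#  [14  8]]
print(matrix_prod)
# [[32 21]
#  [44 29]]
```

The two results differ in every entry here, so mixing up the operators usually shows up quickly in the output.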

    A = np.array([[5, 6], [7, 8]])
    B = np.array([[4, 3], [2, 1]])

    add = A + B   # np.add(A, B) can also be used
    print(add)
    # output
    # [[9 9]
    #  [9 9]]

    sub = A - B   # np.subtract(A, B) can also be used
    print(sub)
    # output
    # [[1 3]
    #  [5 7]]

    add_scalar = A + 5
    print(add_scalar)
    # output
    # [[10 11]
    #  [12 13]]

    multiply = A * B   # np.multiply(A, B) can also be used
    print(multiply)
    # output
    # [[20 18]
    #  [14  8]]

    divide = A / B   # np.divide(A, B) can also be used
    print(divide)
    # output
    # [[1.25 2.  ]
    #  [3.5  8.  ]]

    square_root = np.sqrt(A)
    print(square_root)
    # output
    # [[2.23606798 2.44948974]
    #  [2.64575131 2.82842712]]

    exponential = np.exp(A)
    print(exponential)
    # output
    # [[ 148.4131591   403.42879349]
    #  [1096.63315843 2980.95798704]]

Broadcasting

Broadcasting is an important concept in machine learning.

Broadcasting means performing an arithmetic operation, such as addition, on two arrays of different but compatible shapes.

When two such arrays are added, the smaller one behaves as if it were stretched to the shape of the bigger one.

In the example above where we added a scalar to the matrix A, we were already using broadcasting: the scalar was treated as if it had the shape of A.

Let us look at the example below.

Here matrix X has shape (3, 3) and matrix Y has shape (3, 1), but the result of adding them is a new (3, 3) matrix because the single column of Y is repeated to make it a (3, 3) matrix.
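The stretching described above can be made visible with np.broadcast_to, which shows what the smaller array looks like once expanded. This is only a sketch for illustration; in normal use NumPy does this implicitly and you never need to call it yourself.

```python
import numpy as np

X = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
Y = np.array([[2], [4], [6]])

# What Y looks like once stretched to the shape of X.
Y_stretched = np.broadcast_to(Y, X.shape)
print(Y_stretched)
# [[2 2 2]
#  [4 4 4]
#  [6 6 6]]

# Adding the stretched array gives the same result as X + Y.
print(np.array_equal(X + Y, X + Y_stretched))   # True

# Shapes that cannot be aligned raise a ValueError.
try:
    X + np.array([1, 2])
except ValueError as err:
    print("cannot broadcast:", err)
```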

    X = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
    Y = np.array([[2], [4], [6]])
    matrix_broad = X + Y
    print(matrix_broad)
    # output
    # [[ 3  4  5]
    #  [ 8  9 10]
    #  [13 14 15]]

Functions for Linear Algebra

The dot product (also known as the inner product) is the sum of the products of corresponding elements of a row and a column, as shown in the figure below.

NumPy provides the dot function to calculate the dot product of two matrices.

To calculate the dot product of two matrices, the number of columns of the first matrix must equal the number of rows of the second matrix.

If this rule is violated, NumPy raises a ValueError stating that the shapes are not aligned.
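The shape rule can be tried out directly. In this sketch m1 and m2 are illustrative names; the last call deliberately breaks the rule to show the error.

```python
import numpy as np

m1 = np.array([[2, 3], [1, 4], [4, 5]])   # shape (3, 2)
m2 = np.array([[2, 3, 5], [1, 6, 7]])     # shape (2, 3)

# Inner dimensions match (2 == 2), so the product has shape (3, 3).
print(np.dot(m1, m2).shape)   # (3, 3)

# For 2-D arrays the @ operator gives the same result as np.dot.
print(np.array_equal(m1 @ m2, np.dot(m1, m2)))   # True

# (3, 2) times (3, 2): 2 columns against 3 rows, so NumPy raises ValueError.
try:
    np.dot(m1, m1)
except ValueError as err:
    print("shape mismatch:", err)
```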

Matrix multiplication or Dot product

    matrix_1 = np.array([[2, 3], [1, 4], [4, 5]])
    matrix_2 = np.array([[2, 3, 5], [1, 6, 7]])
    dot_product = np.dot(matrix_1, matrix_2)
    print(dot_product)
    # output
    # [[ 7 24 31]
    #  [ 6 27 33]
    #  [13 42 55]]

In the above example, the shape of matrix_1 is (3, 2) and that of matrix_2 is (2, 3), i.e. the number of columns of matrix_1 is equal to the number of rows of matrix_2, and the resultant matrix has shape (3, 3).

Determinant, Inverse and Norm of a Matrix

Functions for linear algebra can be found in the linalg module.

Some of the functions are listed below.
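As a quick sanity check on what these functions compute, the sketch below verifies a few standard relationships numerically. The comparisons use np.allclose and np.isclose because floating-point results are rarely exact.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
inv_A = np.linalg.inv(A)

# A matrix times its inverse should be (numerically) the identity.
print(np.allclose(A @ inv_A, np.eye(2)))   # True

# det(A^-1) equals 1 / det(A), up to floating-point rounding.
print(np.isclose(np.linalg.det(inv_A), 1 / np.linalg.det(A)))   # True

# With default arguments, norm is the square root of the sum of squares.
print(np.isclose(np.linalg.norm(A), np.sqrt((A ** 2).sum())))   # True
```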

    # Determinant of a 2-dimensional matrix
    matrix_A = np.array([[1, 2], [3, 4]])
    det_A = np.linalg.det(matrix_A)
    print(det_A)
    # output (approximately, up to floating-point rounding)
    # -2.0

    # Determinant of a 3-dimensional tensor (stack of matrices)
    matrix_A_3d = np.arange(1, 13).reshape(3, 2, 2)
    det_A_3d = np.linalg.det(matrix_A_3d)
    print(det_A_3d)
    # output
    # [-2. -2. -2.]

    # Inverse of a 2-D matrix and a 3-D tensor
    inv_A = np.linalg.inv(matrix_A)
    inv_A_3d = np.linalg.inv(matrix_A_3d)
    print(inv_A)
    print(inv_A_3d)
    # output
    # [[-2.   1. ]
    #  [ 1.5 -0.5]]
    # [[[-2.   1. ]
    #   [ 1.5 -0.5]]
    #
    #  [[-4.   3. ]
    #   [ 3.5 -2.5]]
    #
    #  [[-6.   5. ]
    #   [ 5.5 -4.5]]]

    # Norm of a 2-D matrix and a 3-D tensor
    norm_A = np.linalg.norm(matrix_A)
    norm_A_3d = np.linalg.norm(matrix_A_3d)
    print(norm_A)
    print(norm_A_3d)
    # output
    # 5.47722557505
    # 25.49509756796

Statistical Functions

NumPy has a rich set of functions for performing statistical operations.

The rand and randn functions of NumPy’s random module generate matrices and tensors of the required dimensions, filled with values drawn uniformly from [0, 1) and from the standard normal distribution, respectively.

These functions come in handy when we want to generate random initial weights for a deep learning model before its first forward pass.
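A minimal sketch of that use case follows. The layer size of 3 inputs and 2 outputs and the 0.01 scaling factor are assumptions made for illustration, not recommendations from this article.

```python
import numpy as np

np.random.seed(0)   # fixing the seed makes the draws repeatable

# Hypothetical layer with 3 inputs and 2 outputs.
uniform_weights = np.random.rand(3, 2)    # uniform on [0, 1)
normal_weights = np.random.randn(3, 2)    # standard normal

# Scaling down is an assumed choice to keep the initial weights small.
small_weights = 0.01 * normal_weights

print(uniform_weights.shape, normal_weights.shape)   # (3, 2) (3, 2)
print(uniform_weights.min() >= 0.0, uniform_weights.max() < 1.0)   # True True
```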

We can also calculate the mean, median and standard deviation of input data as shown below.

    # Create an array of the desired shape with random values
    # (the values will differ on every run)
    random_array = np.random.rand(3, 2)
    print(random_array)
    # output
    # [[0.42598214 0.49227853]
    #  [0.06742446 0.46793263]
    #  [0.23422854 0.80702256]]

    # Create an array with values from a normal distribution
    random_normal_array = np.random.randn(3, 2)
    print(random_normal_array)
    # output
    # [[ 1.99670851  0.40954136]
    #  [ 0.5125924  -0.04957141]
    #  [ 0.33359663  0.26610965]]

    P = np.array([[10, 12, 14], [8, 10, 12], [3, 5, 7]])

    # Calculate the mean of the whole array, columns and rows respectively
    P_mean = np.mean(P)                  # 9.0
    P_mean_column = np.mean(P, axis=0)   # [ 7.  9. 11.]
    P_mean_row = np.mean(P, axis=1)      # [12. 10.  5.]

    # Calculate the median of the whole array, columns and rows respectively
    print(np.median(P))           # output: 10.0
    print(np.median(P, axis=0))   # output: [ 8. 10. 12.]
    print(np.median(P, axis=1))   # output: [12. 10.  5.]

    # Calculate the standard deviation of the whole array, columns and rows respectively
    print(np.std(P))
    print(np.std(P, axis=0))
    print(np.std(P, axis=1))
    # output
    # 3.366501646120693
    # [2.94392029 2.94392029 2.94392029]
    # [1.63299316 1.63299316 1.63299316]

Conclusion

NumPy is the fundamental library for scientific computing in Python, and this article illustrates some of its most frequently used functions.

Understanding NumPy is the first major step in the journey of machine learning and deep learning.

Hope this article was useful to you.

Please express your views, comments and appreciation.
