# NumPy

[NumPy](https://numpy.org/) is a fundamental library for numerical computing in Python.

Additional resources: 

* [NumPy Quickstart](https://numpy.org/doc/stable/user/quickstart.html) 
* [NumPy absolute basics](https://numpy.org/doc/stable/user/absolute_beginners.html)
* [NumPy for MATLAB users](https://numpy.org/doc/stable/user/numpy-for-matlab-users.html)


In [None]:
import numpy as np

## Python list

Let's compare regular Python lists and NumPy arrays.

In [None]:
# A list is created with [.., ..]
myvals = [1.0, 2.0, 1.5]
myvals

In [None]:
type(myvals)

## NumPy 1D array (vector)

In [None]:
myvals_np = np.array([1.2, 3.0, 4.0])

In [None]:
myvals_np

In [None]:
type(myvals_np)

In [None]:
myvals_np.dtype

In [None]:
myvals_np.sum()

## Indexing

In [None]:
x = np.array([1.0, 1.5, 2.0, 5.3])
x

In [None]:
x[1]

In [None]:
x[-1]

In [None]:
x[1] = 2.0 # modify the second value in the array
x

## Slicing

In [None]:
x[:2]

**Inline exercise**

1. Create an array `x` with three values: 1, 2, 3
2. What is the data type of `x`?
3. Create a new array: `y = x/2`
4. What is the data type of `y`?


## Math operations

Python is a general-purpose language that was *not* designed with numerical computing in mind.

NumPy, however, is designed for numerical computing!

In [None]:
[1.2, 4.5] + [2.3, 4.3] # is this the result you expected??

In [None]:
np.array([1.2, 4.5]) + np.array([2.3, 4.3]) 

In [None]:
np.array([1.2, 4.5]) * np.array([2.3, 4.3]) 

*Note for MATLAB users: all arithmetic operators such as `*` are element-wise.*
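
To make the difference concrete for 2D arrays, here is a minimal sketch comparing `*` with `@`. The matrices `A` and `B` are made-up illustration values, not part of the original notebook.

```python
import numpy as np

# Hypothetical 2x2 matrices, chosen only for illustration
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[0.0, 1.0],
              [1.0, 0.0]])

elementwise = A * B   # multiplies matching entries (MATLAB: A .* B)
matmul = A @ B        # matrix product (MATLAB: A * B)

print(elementwise)
print(matmul)
```

The two results differ: `A * B` keeps only the entries where `B` is non-zero, while `A @ B` swaps the columns of `A`.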

In [None]:
np.array([1.2, 4.5]) @ np.array([2.3, 4.3]) # in case you actually wanted to do a dot product

In [None]:
x = np.arange(5, 100, 5)
x

In [None]:
x.dtype # Integers!

In [None]:
x + 1 # add 1!

In [None]:
x = x + 3.0 # add a float to some integers, can we do that?
x

In [None]:
x.dtype # but now it became floats!

In [None]:
xr = np.random.random(10)
xr

In [None]:
xr.mean()

In [None]:
xr.std()

In [None]:
xr.max()

In [None]:
xr - xr.mean()

In [None]:
xn = np.random.normal(loc=5.0, scale=2.0, size=100)
xn[30] = 99.0

In [None]:
mu = xn.mean()
sigma = xn.std()

xn[xn < mu - 3*sigma]

In [None]:
xn[xn > mu + 3*sigma]

## Missing values (NaN)

NumPy represents missing values in floating-point arrays with the special value `np.nan` ("not a number").
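
As a small sketch of how `np.nan` behaves, the example below counts and filters missing entries with `np.isnan`. The array values are made up for illustration, and the integer-array case is included only to show that NaN needs a floating-point dtype.

```python
import numpy as np

# Hypothetical data with two missing entries
y = np.array([0.5, np.nan, 1.5, np.nan])

missing = np.isnan(y)            # boolean mask, True where a value is missing
n_missing = int(missing.sum())   # number of missing values
valid = y[~missing]              # keep only the non-missing values

# NaN never compares equal to anything, including itself,
# so use np.isnan rather than == to find missing values
nan_equals_itself = bool(np.nan == np.nan)

# Integer arrays cannot store NaN
ints = np.array([1, 2, 3])
try:
    ints[0] = np.nan
    int_accepts_nan = True
except ValueError:
    int_accepts_nan = False

print(n_missing, valid, nan_equals_itself, int_accepts_nan)
```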

In [None]:
y = np.random.random(10)
y

In [None]:
y[5:] = np.nan
y

In [None]:
y.mean()

In [None]:
np.nanmean(y)

In [None]:
y * np.pi

## Boolean indexing

In [None]:
z = np.random.normal(loc=0.0, scale=3.0, size=10)

z_sorted = np.sort(z)
z_sorted

In [None]:
z<0.0

In [None]:
z_sorted<0.0

In [None]:
z_sorted[z_sorted<0.0]

In [None]:
z_sorted[z_sorted<0.0] = 0.0

In [None]:
z_sorted

In [None]:
np.where(z<0.0)

In [None]:
xn = np.random.normal(loc=5.0, scale=2.0, size=100)

xn[30] = 99.0 # outlier

median = np.median(xn)
sigma = xn.std()

In [None]:
sigma # standard deviation of the sample, inflated by the outlier

In [None]:
xn[xn > median + 3*sigma] # but 1 abnormally high value

In [None]:
xn[xn > median + 3*sigma] = np.nan

In [None]:
np.nanstd(xn) # much closer to the true std==2.0
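
The outlier-masking steps above can be collected into one reusable function. This is only a sketch: the name `mask_outliers`, its `n_sigma` parameter, and the seeded generator `np.random.default_rng(0)` are assumptions added here so that the example is reproducible.

```python
import numpy as np

def mask_outliers(values, n_sigma=3.0):
    """Return a copy with values above median + n_sigma * std set to NaN."""
    values = np.asarray(values, dtype=float)
    threshold = np.median(values) + n_sigma * values.std()
    cleaned = values.copy()          # leave the input untouched
    cleaned[cleaned > threshold] = np.nan
    return cleaned

rng = np.random.default_rng(0)       # seeded for reproducibility (assumption)
xn = rng.normal(loc=5.0, scale=2.0, size=100)
xn[30] = 99.0                        # inject one outlier

cleaned = mask_outliers(xn)
n_masked = int(np.isnan(cleaned).sum())
print(n_masked, np.nanstd(cleaned))
```

Working on a copy keeps the original data available, which helps when comparing statistics before and after masking.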

## 2D arrays

In [None]:
X = np.array([
    [0.0, 1.0, 2.0],
    [3.0, 4.0, 5.0],
])

In [None]:
X

In [None]:
X.shape

In [None]:
nrows = X.shape[0]
nrows

In [None]:
ncols = X.shape[1]
ncols

In [None]:
X[0,0]

In [None]:
X[1,1]

In [None]:
X[-1,-1]

In [None]:
X[0,:]

In [None]:
X[0]

In [None]:
X.mean()

In [None]:
colmeans = X.mean(axis=0)
colmeans

In [None]:
colmeans.shape

In [None]:
rowmeans = X.mean(axis=1)
rowmeans

In [None]:
X - colmeans

In [None]:
# X - rowmeans    # fails: shapes (2, 3) and (2,) cannot be broadcast together

See [NumPy broadcasting](https://numpy.org/doc/stable/user/basics.broadcasting.html) for a detailed explanation of how arrays with different shapes are combined in expressions.
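
A short sketch of the rule at work: shapes are compared from the last dimension backwards, and two dimensions are compatible when they are equal or one of them is 1. The example rebuilds `X` so it runs on its own, and it uses `keepdims=True` as one possible way to get a column-shaped result.

```python
import numpy as np

X = np.array([[0.0, 1.0, 2.0],
              [3.0, 4.0, 5.0]])
colmeans = X.mean(axis=0)   # shape (3,)
rowmeans = X.mean(axis=1)   # shape (2,)

# (2, 3) vs (3,): last dimensions match, so this works
print((X - colmeans).shape)

# (2, 3) vs (2,): last dimensions are 3 and 2, so this raises ValueError
try:
    X - rowmeans
    row_subtraction_failed = False
except ValueError as err:
    row_subtraction_failed = True
    print("broadcast failed:", err)

# keepdims=True keeps the reduced axis with length 1, giving shape (2, 1),
# which broadcasts against (2, 3)
rowmeans_2d = X.mean(axis=1, keepdims=True)
centered = X - rowmeans_2d
print(centered)
```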

In [None]:
R = rowmeans[:, np.newaxis] # add a new dimension to create a 2D array
R

In [None]:
np.expand_dims(rowmeans, 1) # same result

In [None]:
X.shape

In [None]:
R.shape

In [None]:
X - R

## Reshaping

In [None]:
x = X.flatten()

In [None]:
x

In [None]:
x.reshape(2,3)
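
Two details are worth knowing when reshaping. `flatten` returns a copy, and `reshape` accepts `-1` for one dimension that NumPy should infer. A minimal sketch, rebuilding `X` so the example is self-contained:

```python
import numpy as np

X = np.array([[0.0, 1.0, 2.0],
              [3.0, 4.0, 5.0]])

flat = X.flatten()           # 1D copy of the data, shape (6,)
flat[0] = 100.0              # modifying the copy leaves X unchanged

auto = flat.reshape(-1, 2)   # -1 lets NumPy infer the number of rows

print(X[0, 0], auto.shape)
```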