Introduction to NumPy

In [1]:
import numpy as np

with np.load("weather_data.npz") as weather:
    rain = weather["rain"]
    uk_mask = weather["uk"]
    irl_mask = weather["ireland"]
    spain_mask = weather["spain"]
In [2]:
np.mean(rain)
Out[2]:
0.003113156467013889

The mean is very small because the values are in metres, so let's convert them to millimetres:

In [3]:
rain *= 1000
In [4]:
np.mean(rain)
Out[4]:
3.113156467013889
In [5]:
uk_mask.dtype
Out[5]:
dtype('bool')
In [6]:
uk_mask.shape
Out[6]:
(75, 75)

This matches the shape of the rain array:

In [7]:
rain.shape
Out[7]:
(75, 75)
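
The shape check matters because a boolean mask has to match the array it indexes. The sketch below illustrates this with small made-up arrays (values, good_mask and bad_mask are hypothetical names, not part of the weather data):

```python
import numpy as np

# Hypothetical small arrays, not the weather data.
values = np.arange(6.0).reshape(2, 3)
good_mask = values > 2.5                 # same shape as values: (2, 3)
bad_mask = np.ones((2, 2), dtype=bool)   # deliberately the wrong shape

# A mask of the same shape selects the elements where it is True.
selected = values[good_mask]

# A mask of a different shape is rejected with an IndexError.
try:
    values[bad_mask]
    mismatch_raised = False
except IndexError:
    mismatch_raised = True

print(selected, mismatch_raised)
```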

Which filtering method should we use?

  • Indexing with [] loses the original shape, but since we're averaging over the whole selection that doesn't matter
  • Using np.where is tricky: if we fill the masked-out areas with 0, the zeros are included in the mean and skew it
  • Using masked_array would work fine
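
To make the trade-offs concrete, here is a minimal sketch comparing the three approaches on a tiny made-up array. The names rain_demo and mask_demo are hypothetical stand-ins for the real rain and mask arrays, and the NaN variant of np.where is an extra option not listed above:

```python
import numpy as np

# Hypothetical stand-in data; the real arrays come from weather_data.npz.
rain_demo = np.array([[1.0, 2.0], [3.0, 4.0]])
mask_demo = np.array([[True, False], [True, False]])

# 1. Boolean indexing: returns a flat 1-D array of the selected values.
mean_index = np.mean(rain_demo[mask_demo])

# 2. np.where with a 0 fill: keeps the shape, but the zeros count towards the mean.
mean_where_zero = np.mean(np.where(mask_demo, rain_demo, 0))

# 2b. Filling with NaN and using np.nanmean avoids that skew.
mean_where_nan = np.nanmean(np.where(mask_demo, rain_demo, np.nan))

# 3. Masked array: mask=True means "hide this value", so the selection mask is inverted.
mean_masked = np.ma.masked_array(rain_demo, mask=~mask_demo).mean()

print(mean_index, mean_where_zero, mean_where_nan, mean_masked)
```

Only the zero-filled np.where result differs from the others, because it averages over all four cells rather than the two selected ones.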

For questions like this, using [] is often simplest:

In [8]:
uk_rain = rain[uk_mask]
In [9]:
np.mean(uk_rain)
Out[9]:
7.5836526862097005
In [10]:
np.mean(rain[irl_mask])
Out[10]:
4.611112633529975
In [11]:
np.mean(rain[spain_mask])
Out[11]:
0.8478509374411709

In this dataset the UK has the highest mean rainfall, followed by Ireland and then Spain.
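
The ranking can also be produced programmatically rather than read off by eye. This is a small sketch that assumes the means are copied from the outputs of cells 9 to 11; the dictionary name mean_rain_mm is made up for illustration:

```python
# Mean rainfall in mm, copied from the outputs of cells 9 to 11 above.
mean_rain_mm = {
    "UK": 7.5836526862097005,
    "Ireland": 4.611112633529975,
    "Spain": 0.8478509374411709,
}

# Sort the region names by their mean rainfall, wettest first.
ranking = sorted(mean_rain_mm, key=mean_rain_mm.get, reverse=True)
print(ranking)
```

In the notebook itself the dictionary could be built directly from np.mean(rain[uk_mask]) and the equivalent expressions for the other masks.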