from pandas import DataFrame
data = {
    "city": ["Paris", "Paris", "Paris", "Paris",
             "London", "London", "London", "London",
             "Rome", "Rome", "Rome", "Rome"],
    "year": [2001, 2008, 2009, 2010,
             2001, 2006, 2011, 2015,
             2001, 2006, 2009, 2012],
    "pop": [2.148, 2.211, 2.234, 2.244,
            7.322, 7.657, 8.174, 8.615,
            2.547, 2.627, 2.734, 2.627]
}
census = DataFrame(data)
We start by selecting the rows for the year that we care about:
census[census["year"] == 2001]
We can see that the smallest population was in Paris that year, but let's try to extract it using pandas.
First, we get the population data:
pop = census[census["year"] == 2001]["pop"]
pop
If we call min on the Series, we get back the smallest value:
pop.min()
But what we actually want is the index of the row on which the smallest value was found, not the value itself. For this we can use the idxmin method:
pop.idxmin()
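It may help to see that idxmin returns an index label rather than a row position. The sketch below uses a small hypothetical Series (the name s and its string labels are not part of the census example) and compares idxmin with argmin, which returns the integer position instead:

```python
import pandas as pd

# Hypothetical Series labelled by city name instead of the default 0, 1, 2, ...
s = pd.Series([2.148, 7.322, 2.547], index=["Paris", "London", "Rome"])

label = s.idxmin()     # index label of the smallest value
position = s.argmin()  # integer position of the smallest value

print(label, position)
```

In the census DataFrame the index is the default 0, 1, 2, ..., so the label returned by pop.idxmin() happens to coincide with the row position.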
We can then take that value and pass it back into the city column to find out which city is on row 0:
census["city"][pop.idxmin()]
And indeed we see that the answer is Paris.
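As a possible alternative, the same lookup can be written with .loc, which selects by row label and column name in one step. The sketch below rebuilds the census data so that it runs on its own. The names pop_2001, smallest_city and smallest_per_year are illustrative, and the groupby step at the end is one assumed way to repeat the lookup for every year, not something the steps above depend on:

```python
from pandas import DataFrame

# Same data as at the top of the post
census = DataFrame({
    "city": ["Paris"] * 4 + ["London"] * 4 + ["Rome"] * 4,
    "year": [2001, 2008, 2009, 2010,
             2001, 2006, 2011, 2015,
             2001, 2006, 2009, 2012],
    "pop": [2.148, 2.211, 2.234, 2.244,
            7.322, 7.657, 8.174, 8.615,
            2.547, 2.627, 2.734, 2.627],
})

# Population values for 2001 only; the original row labels are kept
pop_2001 = census.loc[census["year"] == 2001, "pop"]

# Look up the city on the row whose label idxmin returns
smallest_city = census.loc[pop_2001.idxmin(), "city"]
print(smallest_city)

# Optional extension: the row with the smallest population within each year
smallest_per_year = census.loc[census.groupby("year")["pop"].idxmin(),
                               ["year", "city"]]
print(smallest_per_year)
```

Using .loc with a row label and a column name avoids chaining two separate indexing operations, which tends to matter more if you later want to assign values rather than just read them.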