Selecting rows and columns in pandas DataFrames
Often, it makes sense to select certain columns or rows for our analysis. Let's have a look at how we can do so.
import pandas as pd
To demonstrate, we create a dictionary with some arbitrary example values and turn it into a table:
data = {
'A': [0, 1, 22, 21, 12, 23],
'B': [2, 3, 2, 2, 12, 22],
'C': [2, 3, 44, 2, 52, 52],
}
table = pd.DataFrame(data)
table
 | A | B | C |
---|---|---|---|
0 | 0 | 2 | 2 |
1 | 1 | 3 | 3 |
2 | 22 | 2 | 44 |
3 | 21 | 2 | 2 |
4 | 12 | 12 | 52 |
5 | 23 | 22 | 52 |
Selecting columns
Now we can select one or more columns by passing a list of their names, written as strings, in square brackets:
selected_columns = table[['B', 'C']]
selected_columns
 | B | C |
---|---|---|
0 | 2 | 2 |
1 | 3 | 3 |
2 | 2 | 44 |
3 | 2 | 2 |
4 | 12 | 52 |
5 | 22 | 52 |
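The double square brackets above are worth a closer look: the outer pair indexes the table, and the inner pair is an ordinary Python list of column names. The following is a minimal sketch of how that differs from passing a single name. It rebuilds the same example table so that it runs on its own; the variable names `column_b`, `only_b` and `b_and_c` are made up for illustration.

```python
import pandas as pd

# Same example table as above
table = pd.DataFrame({
    'A': [0, 1, 22, 21, 12, 23],
    'B': [2, 3, 2, 2, 12, 22],
    'C': [2, 3, 44, 2, 52, 52],
})

# A single column name returns a one-dimensional Series
column_b = table['B']
print(type(column_b).__name__)   # Series

# A list of names returns a DataFrame, even if the list has only one entry
only_b = table[['B']]
print(type(only_b).__name__)     # DataFrame
print(only_b.shape)              # (6, 1)

# .loc selects the same columns by label; ':' stands for "all rows"
b_and_c = table.loc[:, ['B', 'C']]
print(b_and_c.equals(table[['B', 'C']]))   # True
```

Which form is preferable depends on what comes next: a Series is convenient for computing on a single column, while a DataFrame keeps the table layout.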
Selecting rows
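Next, we select the rows whose value in column ‘A’ is higher than 20. The comparison yields one True/False value per row, and indexing the table with that result keeps only the rows marked True: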
selected_rows = table['A'] > 20
table[selected_rows]
 | A | B | C |
---|---|---|---|
2 | 22 | 2 | 44 |
3 | 21 | 2 | 2 |
5 | 23 | 22 | 52 |
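To see why this works, it can help to inspect the result of the comparison itself. The sketch below rebuilds the example table so that it runs on its own; the names `mask` and `selected` are chosen here for illustration only.

```python
import pandas as pd

# Same example table as above
table = pd.DataFrame({
    'A': [0, 1, 22, 21, 12, 23],
    'B': [2, 3, 2, 2, 12, 22],
    'C': [2, 3, 44, 2, 52, 52],
})

# The comparison is applied element-wise and gives a boolean Series
mask = table['A'] > 20
print(mask.tolist())            # [False, False, True, True, False, True]

# True counts as 1, so summing the mask counts the matching rows
print(int(mask.sum()))          # 3

# Indexing with the mask keeps the rows where it is True;
# the original row labels are preserved
selected = table[mask]
print(selected.index.tolist())  # [2, 3, 5]
```

Note that the selected rows keep their original index labels (2, 3, 5) rather than being renumbered from 0.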
We can also combine creating the selection and applying it into a single line. For example, to get the rows whose value in column ‘A’ is lower than 20:
table[table['A'] < 20]
 | A | B | C |
---|---|---|---|
0 | 0 | 2 | 2 |
1 | 1 | 3 | 3 |
4 | 12 | 12 | 52 |
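The same pattern extends to several conditions and to selecting rows and columns in one step. The following is a sketch rather than the only way to do it. It rebuilds the example table so that it runs on its own; the thresholds and the names `both`, `either` and `subset` are invented for illustration.

```python
import pandas as pd

# Same example table as above
table = pd.DataFrame({
    'A': [0, 1, 22, 21, 12, 23],
    'B': [2, 3, 2, 2, 12, 22],
    'C': [2, 3, 44, 2, 52, 52],
})

# Combine conditions with & (and) or | (or).
# Each condition needs its own parentheses, because & and |
# bind more tightly than the comparison operators.
both = table[(table['A'] < 20) & (table['B'] > 2)]
print(both.index.tolist())      # [1, 4]

either = table[(table['A'] > 20) | (table['C'] > 50)]
print(either.index.tolist())    # [2, 3, 4, 5]

# .loc takes a row selection and a column selection at once
subset = table.loc[table['A'] < 20, ['B', 'C']]
print(subset.values.tolist())   # [[2, 2], [3, 3], [12, 52]]
```

Python's `and` and `or` keywords do not work element-wise on pandas columns, which is why the `&` and `|` operators are used here.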