Selecting rows and columns in pandas DataFrames

Often, an analysis only needs certain columns or rows of a table. Let's have a look at how we can select them.

import pandas as pd

First, we create a dictionary with some example values and turn it into a table (a pandas DataFrame):

data = {
    'A': [0, 1, 22, 21, 12, 23],
    'B': [2, 3, 2,  2,  12, 22],
    'C': [2, 3, 44,  2,  52, 52],
}

table = pd.DataFrame(data)
table
    A   B   C
0   0   2   2
1   1   3   3
2  22   2  44
3  21   2   2
4  12  12  52
5  23  22  52

Selecting columns

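Now we can select one or more columns by passing a list of column names (as strings) inside square brackets. Note the double brackets: the outer pair is the indexing operator and the inner pair is the Python list of names.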

selected_columns = table[['B', 'C']]
selected_columns
    B   C
0   2   2
1   3   3
2   2  44
3   2   2
4  12  52
5  22  52
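
The difference between single and double brackets can be checked with a short sketch. It rebuilds the same table so that it runs on its own; the variable names `single_column` and `one_column_table` are only illustrative.

```python
import pandas as pd

# Same example table as above, rebuilt so this snippet is self-contained.
table = pd.DataFrame({
    'A': [0, 1, 22, 21, 12, 23],
    'B': [2, 3, 2, 2, 12, 22],
    'C': [2, 3, 44, 2, 52, 52],
})

# A single string in the brackets returns one column as a Series.
single_column = table['B']

# A list with one name returns a DataFrame that has a single column.
one_column_table = table[['B']]

print(type(single_column).__name__)
print(type(one_column_table).__name__)
```

Which form is more convenient depends on what comes next: a Series is handy for comparisons and arithmetic, while a DataFrame keeps the table layout.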

Selecting rows

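Now we select the rows whose value in column ‘A’ is greater than 20. The comparison produces a boolean Series with one True/False entry per row, and indexing the table with it keeps only the rows marked True: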

selected_rows = table['A'] > 20
table[selected_rows]
    A   B   C
2  22   2  44
3  21   2   2
5  23  22  52
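
Boolean conditions like this can also be combined. The following is a minimal sketch, assuming we additionally want to filter on column ‘C’ (a condition chosen only for illustration). pandas uses `&` for "and" and `|` for "or", and each comparison needs its own parentheses.

```python
import pandas as pd

# Same example table as above, rebuilt so this snippet is self-contained.
table = pd.DataFrame({
    'A': [0, 1, 22, 21, 12, 23],
    'B': [2, 3, 2, 2, 12, 22],
    'C': [2, 3, 44, 2, 52, 52],
})

# Rows where A > 20 AND C > 40 (illustrative second condition).
both = table[(table['A'] > 20) & (table['C'] > 40)]

# Rows where A > 20 OR C > 40.
either = table[(table['A'] > 20) | (table['C'] > 40)]

print(both)
print(either)
```

The Python keywords `and` and `or` do not work element-wise on a Series, which is why the operators `&` and `|` are used here.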

We can also combine these two lines into one by writing the condition directly inside the brackets. For example, to get the rows whose value in column ‘A’ is lower than 20:

table[table['A'] < 20]
    A   B   C
0   0   2   2
1   1   3   3
4  12  12  52
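
Row and column selection can also be done in a single step. One possible way, sketched here, is the `.loc` indexer, which takes the row condition first and the list of column names second. The variable name `subset` is only illustrative.

```python
import pandas as pd

# Same example table as above, rebuilt so this snippet is self-contained.
table = pd.DataFrame({
    'A': [0, 1, 22, 21, 12, 23],
    'B': [2, 3, 2, 2, 12, 22],
    'C': [2, 3, 44, 2, 52, 52],
})

# Keep rows where A < 20, and of those rows only columns B and C.
subset = table.loc[table['A'] < 20, ['B', 'C']]

print(subset)
```

Chaining the two earlier steps, as in `table[table['A'] < 20][['B', 'C']]`, selects the same data; `.loc` expresses it as one indexing operation.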