{
"cells": [
{
"cell_type": "markdown",
"id": "f190d0c2-9b60-4af2-a17d-d94547de8c7f",
"metadata": {},
"source": [
"# Selecting rows and columns in pandas DataFrames"
]
},
{
"cell_type": "markdown",
"id": "a1916d87-2731-45b2-ba28-a211bc29a4df",
"metadata": {},
"source": [
"Often, it makes sense to select certain columns or rows for our analysis. Lets have a look on how we can do so."
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "bff2f03f-5392-46ba-b77e-faa510428675",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"import pandas as pd"
]
},
{
"cell_type": "markdown",
"id": "462f6a8d-9544-4586-b5c3-4dcd4ecc11fe",
"metadata": {},
"source": [
"Therefore, we create a dictionary with rando values and turn it into a table"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "f0b71329-5775-496f-9bd2-8d69355d7292",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" A | \n",
" B | \n",
" C | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" 0 | \n",
" 2 | \n",
" 2 | \n",
"
\n",
" \n",
" 1 | \n",
" 1 | \n",
" 3 | \n",
" 3 | \n",
"
\n",
" \n",
" 2 | \n",
" 22 | \n",
" 2 | \n",
" 44 | \n",
"
\n",
" \n",
" 3 | \n",
" 21 | \n",
" 2 | \n",
" 2 | \n",
"
\n",
" \n",
" 4 | \n",
" 12 | \n",
" 12 | \n",
" 52 | \n",
"
\n",
" \n",
" 5 | \n",
" 23 | \n",
" 22 | \n",
" 52 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" A B C\n",
"0 0 2 2\n",
"1 1 3 3\n",
"2 22 2 44\n",
"3 21 2 2\n",
"4 12 12 52\n",
"5 23 22 52"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data = {\n",
" 'A': [0, 1, 22, 21, 12, 23],\n",
" 'B': [2, 3, 2, 2, 12, 22],\n",
" 'C': [2, 3, 44, 2, 52, 52],\n",
"}\n",
"\n",
"table = pd.DataFrame(data)\n",
"table"
]
},
{
"cell_type": "markdown",
"id": "5bec258d-8dd5-4229-944b-c200aeb7c310",
"metadata": {},
"source": [
"## Selecting columns"
]
},
{
"cell_type": "markdown",
"id": "fb49cd6a-c159-4491-a4c5-24432858dbc8",
"metadata": {},
"source": [
"Now we can select one or more columns by putting them as 'strings' into [square brackets]:"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "0ff85932-0cff-4d6b-97e5-764b84127b4e",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" B | \n",
" C | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" 2 | \n",
" 2 | \n",
"
\n",
" \n",
" 1 | \n",
" 3 | \n",
" 3 | \n",
"
\n",
" \n",
" 2 | \n",
" 2 | \n",
" 44 | \n",
"
\n",
" \n",
" 3 | \n",
" 2 | \n",
" 2 | \n",
"
\n",
" \n",
" 4 | \n",
" 12 | \n",
" 52 | \n",
"
\n",
" \n",
" 5 | \n",
" 22 | \n",
" 52 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" B C\n",
"0 2 2\n",
"1 3 3\n",
"2 2 44\n",
"3 2 2\n",
"4 12 52\n",
"5 22 52"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"selected_columns = table[['B', 'C']]\n",
"selected_columns"
]
},
{
"cell_type": "markdown",
"id": "6eea6d6a-e2f9-4395-9866-dfded6507b8e",
"metadata": {},
"source": [
"## Selecting rows"
]
},
{
"cell_type": "markdown",
"id": "63e2eb8e-f2a0-4e5e-ad09-3e6ba28ffc7b",
"metadata": {},
"source": [
"Now we are selecting rows which have in column 'A' a value higher than 20:"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "11550485-3305-42d6-b752-5332e4f77976",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" A | \n",
" B | \n",
" C | \n",
"
\n",
" \n",
" \n",
" \n",
" 2 | \n",
" 22 | \n",
" 2 | \n",
" 44 | \n",
"
\n",
" \n",
" 3 | \n",
" 21 | \n",
" 2 | \n",
" 2 | \n",
"
\n",
" \n",
" 5 | \n",
" 23 | \n",
" 22 | \n",
" 52 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" A B C\n",
"2 22 2 44\n",
"3 21 2 2\n",
"5 23 22 52"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"selected_rows = table['A'] > 20\n",
"table[selected_rows]"
]
},
{
"cell_type": "markdown",
"id": "ba05a667-7c5f-402b-b709-9ceca87d9ec5",
"metadata": {},
"source": [
"We can also shorten these two lines of code into one line. See for example here if we want to get the rows which have in column 'A' a value lower than 20."
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "0a6328a6-affc-40a2-a711-55941b728bf0",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" A | \n",
" B | \n",
" C | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" 0 | \n",
" 2 | \n",
" 2 | \n",
"
\n",
" \n",
" 1 | \n",
" 1 | \n",
" 3 | \n",
" 3 | \n",
"
\n",
" \n",
" 4 | \n",
" 12 | \n",
" 12 | \n",
" 52 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" A B C\n",
"0 0 2 2\n",
"1 1 3 3\n",
"4 12 12 52"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"table[table['A'] < 20]"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "af52bbb8-80b5-47a0-b377-170aef3daebe",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.16"
}
},
"nbformat": 4,
"nbformat_minor": 5
}