{ "cells": [ { "cell_type": "markdown", "id": "f190d0c2-9b60-4af2-a17d-d94547de8c7f", "metadata": {}, "source": [ "# Selecting rows and columns in pandas DataFrames" ] }, { "cell_type": "markdown", "id": "a1916d87-2731-45b2-ba28-a211bc29a4df", "metadata": {}, "source": [ "Often, it makes sense to select certain columns or rows for our analysis. Lets have a look on how we can do so." ] }, { "cell_type": "code", "execution_count": 6, "id": "bff2f03f-5392-46ba-b77e-faa510428675", "metadata": { "tags": [] }, "outputs": [], "source": [ "import pandas as pd" ] }, { "cell_type": "markdown", "id": "462f6a8d-9544-4586-b5c3-4dcd4ecc11fe", "metadata": {}, "source": [ "Therefore, we create a dictionary with rando values and turn it into a table" ] }, { "cell_type": "code", "execution_count": 7, "id": "f0b71329-5775-496f-9bd2-8d69355d7292", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
ABC
0022
1133
222244
32122
4121252
5232252
\n", "
" ], "text/plain": [ " A B C\n", "0 0 2 2\n", "1 1 3 3\n", "2 22 2 44\n", "3 21 2 2\n", "4 12 12 52\n", "5 23 22 52" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data = {\n", " 'A': [0, 1, 22, 21, 12, 23],\n", " 'B': [2, 3, 2, 2, 12, 22],\n", " 'C': [2, 3, 44, 2, 52, 52],\n", "}\n", "\n", "table = pd.DataFrame(data)\n", "table" ] }, { "cell_type": "markdown", "id": "5bec258d-8dd5-4229-944b-c200aeb7c310", "metadata": {}, "source": [ "## Selecting columns" ] }, { "cell_type": "markdown", "id": "fb49cd6a-c159-4491-a4c5-24432858dbc8", "metadata": {}, "source": [ "Now we can select one or more columns by putting them as 'strings' into [square brackets]:" ] }, { "cell_type": "code", "execution_count": 8, "id": "0ff85932-0cff-4d6b-97e5-764b84127b4e", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
BC
022
133
2244
322
41252
52252
\n", "
" ], "text/plain": [ " B C\n", "0 2 2\n", "1 3 3\n", "2 2 44\n", "3 2 2\n", "4 12 52\n", "5 22 52" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "selected_columns = table[['B', 'C']]\n", "selected_columns" ] }, { "cell_type": "markdown", "id": "6eea6d6a-e2f9-4395-9866-dfded6507b8e", "metadata": {}, "source": [ "## Selecting rows" ] }, { "cell_type": "markdown", "id": "63e2eb8e-f2a0-4e5e-ad09-3e6ba28ffc7b", "metadata": {}, "source": [ "Now we are selecting rows which have in column 'A' a value higher than 20:" ] }, { "cell_type": "code", "execution_count": 10, "id": "11550485-3305-42d6-b752-5332e4f77976", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
ABC
222244
32122
5232252
\n", "
" ], "text/plain": [ " A B C\n", "2 22 2 44\n", "3 21 2 2\n", "5 23 22 52" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "selected_rows = table['A'] > 20\n", "table[selected_rows]" ] }, { "cell_type": "markdown", "id": "ba05a667-7c5f-402b-b709-9ceca87d9ec5", "metadata": {}, "source": [ "We can also shorten these two lines of code into one line. See for example here if we want to get the rows which have in column 'A' a value lower than 20." ] }, { "cell_type": "code", "execution_count": 5, "id": "0a6328a6-affc-40a2-a711-55941b728bf0", "metadata": { "tags": [] }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
ABC
0022
1133
4121252
\n", "
" ], "text/plain": [ " A B C\n", "0 0 2 2\n", "1 1 3 3\n", "4 12 12 52" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "table[table['A'] < 20]" ] }, { "cell_type": "code", "execution_count": null, "id": "af52bbb8-80b5-47a0-b377-170aef3daebe", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.16" } }, "nbformat": 4, "nbformat_minor": 5 }