Rename columns in pandas data-frame

pandas is an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language.

To select a … in a pandas data-frame, we use bracket notation with the full name of the column. Sometimes a column name is long or contains spaces, so it is convenient to rename it. We can do this with the following pandas commands.

import pandas as pd
ufo = pd.read_csv('http://bit.ly/uforeports')
ufo.head()
City Colors Reported Shape Reported State Time
0 Ithaca NaN TRIANGLE NY 6/1/1930 22:00
1 Willingboro NaN OTHER NJ 6/30/1930 20:00
2 Holyoke NaN OVAL CO 2/15/1931 14:00
3 Abilene NaN DISK KS 6/1/1931 13:00
4 New York Worlds Fair NaN LIGHT NY 4/18/1933 19:00
ufo.columns
Index(['City', 'Colors Reported', 'Shape Reported', 'State', 'Time'], dtype='object')
ufo.rename(columns={'Colors Reported' : 'colors_reported',
'Shape Reported' : 'shape_reported'}, inplace=True)

This renames only the columns listed in the dictionary (old name mapped to new name) and leaves the other columns unchanged.
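As a self-contained sketch of the same call that runs without downloading the file, the snippet below builds a tiny stand-in data-frame (two rows copied from the output above; the names `small` and `renamed` are made up for illustration). It also shows that, without `inplace=True`, `rename` returns a renamed copy and leaves the original alone.

```python
import pandas as pd

# Hypothetical two-row stand-in for the UFO data, so no download is needed.
small = pd.DataFrame({
    'City': ['Ithaca', 'Willingboro'],
    'Colors Reported': [None, None],
    'Shape Reported': ['TRIANGLE', 'OTHER'],
    'State': ['NY', 'NJ'],
})

# Without inplace=True, rename() returns a renamed copy.
renamed = small.rename(columns={'Colors Reported': 'colors_reported',
                                'Shape Reported': 'shape_reported'})

print(list(renamed.columns))
print(list(small.columns))  # the original names are untouched
```

Assigning the result to a new name, as here, is an alternative to `inplace=True` when the original column names are still needed.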

We can also rename columns without specifying the old names. To do so, we create a Python list of new names and assign it to the columns attribute.

ufo_cols = ['city', 'colors reported', 'shape reported', 'state', 'time']
ufo.columns = ufo_cols

This replaces all of the old column names at once, so the list must contain exactly one name per column, in the same order as the existing columns.
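The length requirement can be checked on a small example. This is a minimal sketch on a made-up one-row data-frame (`small` and `mismatch_rejected` are illustrative names, not part of the original post): a list of the right length is accepted, while a shorter one raises a `ValueError`.

```python
import pandas as pd

# Hypothetical one-row stand-in with three columns.
small = pd.DataFrame({'City': ['Ithaca'], 'State': ['NY'], 'Time': ['6/1/1930 22:00']})

# One name per column, in the same order as the existing columns.
small.columns = ['city', 'state', 'time']

# A list of the wrong length is rejected instead of silently misaligning.
try:
    small.columns = ['city', 'state']
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True

print(list(small.columns), mismatch_rejected)
```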

If a data-frame has too many columns to rename one by one, we can apply string methods to all of the column names at once through ufo.columns.str.

The following command converts every column name to lower case and replaces spaces with underscores:

ufo.columns = ufo.columns.str.lower().str.replace(' ', '_')
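The same clean-up can be tried on an empty data-frame that only carries the five column names from the output above. This is a sketch under that assumption; `small` and `via_rename` are illustrative names. The second form passes a function to `rename`, which is one alternative way to express the same transformation.

```python
import pandas as pd

# Empty frame that only carries the column labels shown earlier.
small = pd.DataFrame(columns=['City', 'Colors Reported', 'Shape Reported',
                              'State', 'Time'])

# Alternative: rename() also accepts a function applied to each label.
via_rename = small.rename(columns=lambda name: name.lower().replace(' ', '_'))

# .str applies string methods to every column label at once.
small.columns = small.columns.str.lower().str.replace(' ', '_')

print(list(small.columns))
print(list(via_rename.columns))
```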

Create new column from Pandas data-frame

For data analysis we sometimes need to add a derived column to an existing data-frame. We can do that easily with the following commands:

import pandas as pd
ufo = pd.read_table('http://bit.ly/uforeports', sep=',')

Alternatively, we can use the read_csv() function, which uses a comma as the separator by default.

ufo = pd.read_csv('http://bit.ly/uforeports')

ufo.head()
City Colors Reported Shape Reported State Time
0 Ithaca NaN TRIANGLE NY 6/1/1930 22:00
1 Willingboro NaN OTHER NJ 6/30/1930 20:00
2 Holyoke NaN OVAL CO 2/15/1931 14:00
3 Abilene NaN DISK KS 6/1/1931 13:00
4 New York Worlds Fair NaN LIGHT NY 4/18/1933 19:00
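To see that the two ways of loading the file give the same result without relying on the network, here is a small sketch that parses an in-memory string instead of the URL. The text in `csv_text` is an assumption: two rows copied from the output above and written out as comma-separated values.

```python
import io
import pandas as pd

# Two rows copied from the output above, written as comma-separated text.
csv_text = (
    "City,Colors Reported,Shape Reported,State,Time\n"
    "Ithaca,,TRIANGLE,NY,6/1/1930 22:00\n"
    "Willingboro,,OTHER,NJ,6/30/1930 20:00\n"
)

# read_table needs the separator spelled out; read_csv assumes a comma.
from_table = pd.read_table(io.StringIO(csv_text), sep=',')
from_csv = pd.read_csv(io.StringIO(csv_text))

print(from_table.equals(from_csv))
```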

To create a new column by concatenating two other columns:

ufo['Location'] = ufo['City'] + ', ' + ufo['State']
ufo.head()
City Colors Reported Shape Reported State Time Location
0 Ithaca NaN TRIANGLE NY 6/1/1930 22:00 Ithaca, NY
1 Willingboro NaN OTHER NJ 6/30/1930 20:00 Willingboro, NJ
2 Holyoke NaN OVAL CO 2/15/1931 14:00 Holyoke, CO
3 Abilene NaN DISK KS 6/1/1931 13:00 Abilene, KS
4 New York Worlds Fair NaN LIGHT NY 4/18/1933 19:00 New York Worlds Fair, NY
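One thing worth checking when building a column this way is how missing values behave. The rows shown above have no missing City or State, so the following is a sketch on made-up data (`small` and the `Location_filled` column are hypothetical): with `+`, a missing value on either side makes the whole result missing, and `fillna('')` is one possible way around that.

```python
import pandas as pd

# Hypothetical data with one missing state, to show how + treats NaN.
small = pd.DataFrame({'City': ['Ithaca', 'Holyoke'], 'State': ['NY', None]})

# Plain + propagates missing values: the second row becomes NaN.
small['Location'] = small['City'] + ', ' + small['State']

# One option: substitute an empty string for missing states first.
small['Location_filled'] = small['City'] + ', ' + small['State'].fillna('')

print(small)
```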

Selecting series in a data-frame

The most commonly used data structure in pandas is the data-frame, and each of its columns is a series. We can select values from a data-frame with a few basic commands.

import pandas as pd
ufo = pd.read_table('http://bit.ly/uforeports', sep=',')

Alternatively, we can use the read_csv() function, which uses a comma as the separator by default.

ufo = pd.read_csv('http://bit.ly/uforeports')

ufo.head()
City Colors Reported Shape Reported State Time
0 Ithaca NaN TRIANGLE NY 6/1/1930 22:00
1 Willingboro NaN OTHER NJ 6/30/1930 20:00
2 Holyoke NaN OVAL CO 2/15/1931 14:00
3 Abilene NaN DISK KS 6/1/1931 13:00
4 New York Worlds Fair NaN LIGHT NY 4/18/1933 19:00

We can select a single column as a series with bracket notation:

ufo['City']
0                      Ithaca
1                 Willingboro
2                     Holyoke
3                     Abilene
4        New York Worlds Fair
5                 Valley City
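A quick way to confirm what bracket notation returns is to inspect the type of the result. This sketch uses a made-up two-row data-frame (`small`, `city` and `both` are illustrative names): a single column name gives a series, while a list of names gives a data-frame.

```python
import pandas as pd

# Hypothetical two-row stand-in for the UFO data.
small = pd.DataFrame({'City': ['Ithaca', 'Willingboro'], 'State': ['NY', 'NJ']})

city = small['City']             # single name -> Series
both = small[['City', 'State']]  # list of names -> DataFrame

print(type(city).__name__, type(both).__name__)
```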

We can also concatenate two columns with the ordinary Python + operator. The result is a new series; the data-frame itself is not changed.

ufo['City'] + ', ' + ufo['State']
0                      Ithaca, NY
1                 Willingboro, NJ
2                     Holyoke, CO
3                     Abilene, KS
4        New York Worlds Fair, NY
5                 Valley City, ND