How to Convert a Column in a Pandas DataFrame from Object to Integer
Pandas is a popular data manipulation library in Python used for data analysis and data science studies. It provides a fast and efficient way to work with structured data, e.g. CSV files, Excel spreadsheets, SQL tables, and many others.
One of the common tasks in data analysis is to convert a column data type from object to integer in a pandas DataFrame. In this article, we will show you how to do that in a few easy steps.
Why would you need to convert a column from object to integer?
In many cases, datasets that you’re working with have columns that are in string format (object in pandas).
While string data is useful for storing text information, you cannot use string data in numerical calculations. Therefore, you may need to convert a column to integer to perform arithmetic operations, plot graphs, or for other analysis purposes.
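To make this limitation concrete, here is a small sketch showing that "adding" an object column of digit strings joins text rather than doing arithmetic, while the same expression after conversion adds numbers. The Series name points and its values are illustrative assumptions, not data from a real dataset.

```python
import pandas as pd

# Illustrative data: digit strings stored with the object dtype
points = pd.Series(['20', '15', '23'], dtype=object)

# "Adding" strings concatenates them instead of doing arithmetic
as_text = (points + points).tolist()

# After conversion, the same expression performs integer addition
points_int = points.astype(int)
as_numbers = (points_int + points_int).tolist()

print(as_text)
print(as_numbers)
```

The first list holds strings such as '2020', while the second holds the numeric results 40, 30, and 46.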
Syntax for conversion
The syntax to convert a column's data type in a pandas DataFrame is as follows:
df['column_name'] = df['column_name'].astype(int)
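This converts the column_name column of a DataFrame called df to the integer data type. The conversion succeeds only if every value in the column can be parsed as an integer; otherwise astype raises an error.
Example 1: Convert One Column from Object to Integer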
Let’s say you have a DataFrame containing data on player performances in a basketball game, and there is a column called ‘points’ that contains the number of points scored by each player in the game, but it is in the object data type.
You can convert it to integer format using the following code:
import pandas as pd
# create a sample DataFrame with an object column
data = {
    'player': ['John', 'Mike', 'Sara', 'Bill', 'Adam'],
    'points': ['20', '15', '23', '17', '19']
}
df = pd.DataFrame(data)
# display the current data types
print(df.dtypes)
# convert the 'points' column to integer type
df['points'] = df['points'].astype(int)
print(df.dtypes)
In this example, we created a sample DataFrame called df with two columns: ‘player’ and ‘points’. We then displayed the data type of each column before and after the conversion.
The output of the code looks like this:
player    object
points    object
dtype: object
player    object
points     int64
dtype: object
As can be seen from the output, the ‘points’ column was successfully converted to the int64 data type.
Example 2: Convert Multiple Columns to Integer
In some cases, you may need to convert multiple columns in a pandas DataFrame to the integer data type.
The steps are the same as in the first example, except that you select a list of columns to convert:
import pandas as pd
# create a sample DataFrame with object columns
data = {
    'player': ['John', 'Mike', 'Sara', 'Bill', 'Adam'],
    'points': ['20', '15', '23', '17', '19'],
    'assists': ['6', '3', '5', '8', '4']
}
df = pd.DataFrame(data)
# display the current data types
print(df.dtypes)
# convert the 'points' and 'assists' columns to integer type
df[['points', 'assists']] = df[['points', 'assists']].astype(int)
print(df.dtypes)
In this example, we created a sample DataFrame called df with three columns: ‘player’, ‘points’, and ‘assists’. We then displayed the data type of each column before and after the conversion.
The output of the code looks like this:
player     object
points     object
assists    object
dtype: object
player     object
points      int64
assists     int64
dtype: object
As can be seen from the output, the ‘points’ and ‘assists’ columns were successfully converted to the int64 data type.
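Both examples assume that every value is a clean digit string. Real columns often contain missing or non-numeric entries, in which case astype(int) fails. The following is a minimal sketch of one common workaround, using pd.to_numeric with errors='coerce' followed by the nullable Int64 dtype; the Series raw and its values are hypothetical.

```python
import pandas as pd

# Hypothetical column with one non-numeric entry and one missing value
raw = pd.Series(['20', 'N/A', None, '17'], dtype=object)

# astype(int) is strict: a value that cannot be parsed raises an error
try:
    raw.astype(int)
    strict_failed = False
except (ValueError, TypeError):
    strict_failed = True

# errors='coerce' turns unparseable values into NaN, and the nullable
# 'Int64' dtype keeps integers alongside missing values
cleaned = pd.to_numeric(raw, errors='coerce').astype('Int64')

print(strict_failed)
print(cleaned)
```

Whether coercing bad values to missing is appropriate depends on the dataset; in some analyses it may be better to inspect or fix the offending rows first.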
Additional Resources for Common Conversions in Pandas
While the above examples show how to convert columns from object to integer in a pandas DataFrame, there are many other types of conversions you may encounter in data analysis tasks. Often, you might need to convert data to or from a specific format, such as binary, hexadecimal, or date-time.
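As a rough illustration of two such conversions, the sketch below parses hexadecimal strings into integers and date strings into datetime values. The column names hex_code, hex_value, and game_date are made up for this example.

```python
import pandas as pd

# Hypothetical columns: hexadecimal strings and ISO-formatted date strings
df = pd.DataFrame({
    'hex_code': ['ff', '1a', '10'],
    'game_date': ['2023-01-15', '2023-02-20', '2023-03-05'],
})

# Parse each hexadecimal string with Python's built-in int(value, 16)
df['hex_value'] = df['hex_code'].map(lambda v: int(v, 16))

# Parse the date strings into pandas datetime values
df['game_date'] = pd.to_datetime(df['game_date'])

print(df.dtypes)
```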
To help you with common conversions, here are some additional resources that may be useful:
- Pandas docs on data types: https://pandas.pydata.org/pandas-docs/stable/user_guide/basics.html#data-types
- Datacamp tutorial on data type conversions: https://www.datacamp.com/community/tutorials/python-data-type-conversion
- RealPython article on data type conversions: https://realpython.com/python-data-types/
Conclusion
In this article, we discussed how to convert a column in a pandas DataFrame from the object data type to the integer data type. We covered the basic astype syntax, worked through examples for single and multiple columns, and listed additional resources for other common conversions in pandas.
Converting data types is a routine step in data analysis, and knowing how to do it in pandas can save considerable time and effort. We hope you found this article useful and that it helps you with your data analysis projects.