Extracting Quarter Information from a Date Column in Pandas
Do you have a date column in your Pandas DataFrame and want to extract the quarter information? No problem! In this article, we will explore two methods to extract the quarter from a date column in a Pandas DataFrame.
Method 1: Get Quarter from Date (Year & Quarter Format)
The first method involves extracting the year and quarter information and combining them into a new column in a year and quarter format. We’ll be using the pd.PeriodIndex
function with the freq='Q'
parameter to do this.
Let’s say we have a DataFrame containing sales data with a “date” column:
import pandas as pd
df = pd.DataFrame({'date': ['2019-01-01', '2019-04-01', '2019-07-01', '2019-10-01']})
To get the quarter information in a year and quarter format, we can create a new column called “quarter” using the following code:
df['quarter'] = pd.PeriodIndex(df['date'], freq='Q')
The result will be a new column containing the year and quarter information in the format “2019Q1”, “2019Q2”, etc.
Method 2: Get Quarter from Date (Quarter Number Format)
The second method involves extracting the quarter number directly from the date column using the dt.quarter
function.
Let’s say we have the same DataFrame as before:
import pandas as pd
df = pd.DataFrame({'date': ['2019-01-01', '2019-04-01', '2019-07-01', '2019-10-01']})
To get the quarter information as a number, we can create a new column called “quarter” using the following code:
df['quarter'] = df['date'].dt.quarter
The result will be a new column containing the quarter number for each date.
Example 1: Get Quarter from Date (Year & Quarter Format)
Let’s go back to Method 1 and see how we can create a new column for the year and quarter format.
We’ll use the same DataFrame as before:
import pandas as pd
df = pd.DataFrame({'date': ['2019-01-01', '2019-04-01', '2019-07-01', '2019-10-01']})
To create a new column for the year and quarter format, we can use the pd.PeriodIndex
function with the freq='Q'
parameter:
df['quarter'] = pd.PeriodIndex(df['date'], freq='Q')
The resulting DataFrame will look like this:
date quarter
0 2019-01-01 2019Q1
1 2019-04-01 2019Q2
2 2019-07-01 2019Q3
3 2019-10-01 2019Q4
The “quarter” column now contains the year and quarter information in the format “YYYYQX”.
Method 2: Get Quarter from Date (Quarter Number Format) – In Depth
As a reminder, the second method involves extracting the quarter number directly from the date column using the dt.quarter
function.
Let’s dive deeper into how this works. To use this method, you’ll need a DataFrame containing a date column.
Let’s create a simple example:
import pandas as pd
df = pd.DataFrame({
'date': ['2021-01-01', '2021-04-01', '2021-07-01', '2021-10-01']
})
Now let’s create a new column “quarter” that contains the quarter information as a number:
df['quarter'] = df['date'].dt.quarter
This will create a new column containing the quarter number for each date:
date quarter
0 2021-01-01 1
1 2021-04-01 2
2 2021-07-01 3
3 2021-10-01 4
That’s it – it’s that simple! You now have a new column containing the quarter information in a quarter number format.
Example 2: Get Quarter from Date (Quarter Number Format)
Let’s use a more practical example to demonstrate how to extract the quarter from a date column.
Suppose you have a DataFrame containing sales data for a company, and you want to see how sales vary by quarter. You have a “date” column containing the date of the sale, and a “sales” column containing the amount of the sale:
import pandas as pd
df = pd.DataFrame({
'date': ['2021-01-01', '2021-01-15', '2021-02-01', '2021-03-01', '2021-04-01', '2021-07-01'],
'sales': [100, 200, 150, 300, 250, 400]
})
To extract the quarter information as a number, we can use the dt.quarter
function to create a new column “quarter”:
df['quarter'] = df['date'].dt.quarter
This will create a new column “quarter” containing the quarter number for each sale:
date sales quarter
0 2021-01-01 100 1
1 2021-01-15 200 1
2 2021-02-01 150 1
3 2021-03-01 300 1
4 2021-04-01 250 2
5 2021-07-01 400 3
Now we can group the data by quarter and sum the sales to see how sales vary by quarter:
sales_by_quarter = df.groupby('quarter')['sales'].sum()
print(sales_by_quarter)
This will output:
quarter
1 750
2 250
3 400
Name: sales, dtype: int64
We can see that sales were highest in Q1 (750), followed by Q3 (400), and then Q2 (250).
Conclusion
Extracting quarter information from a date column in a Pandas DataFrame is easy with the two methods we’ve discussed. Whether you prefer the year and quarter format or the quarter number format, there’s a method for you.
By using the pd.PeriodIndex
function or the dt.quarter
function, you can create a new column containing the quarter information for each date in your DataFrame. Happy coding!
Additional Resources
- Official Pandas documentation: https://pandas.pydata.org/pandas-docs/stable/index.html
- pandas Tutorials on DataCamp: https://www.datacamp.com/courses/pandas-foundations
- Pandas Cheat Sheet: https://pandas.pydata.org/Pandas_Cheat_Sheet.pdf
With these resources, you’ll be well on your way to mastering Pandas and performing any operation you need on your data.