Mean Absolute Percentage Error (MAPE): Understanding and Calculation in Python
The Mean Absolute Percentage Error (MAPE) is a widely used metric in predictive analytics to evaluate the accuracy of a particular model’s predictions. It is a measure of how well predictions match actual results, expressed as a percentage.
In this article, we will explore MAPE in detail and learn how to calculate it using Python.
Calculation of MAPE
To calculate MAPE, we need to first determine the absolute percentage difference between the actual values and the predicted values. This is done by subtracting the predicted value from the actual value, taking the absolute value of the result, and then dividing by the actual value.
We can then calculate the mean of these differences to get the overall MAPE. The formula for calculating MAPE is:
MAPE = (1/n) * Σ |(Actual - Predicted)/Actual| * 100
where n is the sample size, Actual is the actual value, Predicted is the predicted value, and the sum Σ runs over all n observations.
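The steps above can be sketched in plain Python. This is a minimal illustration rather than a library routine; the name mape_by_hand and the sample numbers are assumptions made for the example.

```python
def mape_by_hand(actual, predicted):
    """Follow the formula step by step (illustrative helper, not a library API)."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    total = 0.0
    for a, p in zip(actual, predicted):
        # Absolute difference divided by the actual value, for one observation
        total += abs(a - p) / abs(a)
    # Average over the n observations, then convert to a percentage
    return total / len(actual) * 100


# Hypothetical values: each prediction is off by 10% of the actual value
example_mape = mape_by_hand([100, 200], [110, 180])
print(round(example_mape, 2))  # 10.0
```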
It is important to note that MAPE is expressed as a percentage. It can never be negative, but it has no upper limit: when predictions miss by more than the actual values themselves, MAPE exceeds 100%.
The closer the MAPE value is to 0, the better the prediction accuracy. Conversely, the higher the MAPE value, the more inaccurate the prediction.
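A quick check of the range, using made-up numbers: MAPE cannot fall below 0, but a forecast that misses by more than the actual value itself pushes it above 100. The helper and the data below are assumptions for illustration only.

```python
def mape(actual, predicted):
    """Plain-Python MAPE in percent (illustrative sketch)."""
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return sum(errors) / len(errors) * 100


# Hypothetical forecasts that overshoot badly: 30 vs 10 and 50 vs 20
overshoot = mape([10, 20], [30, 50])
print(round(overshoot, 1))  # 175.0

# A perfect forecast gives the lower bound
perfect = mape([10, 20], [10, 20])
print(perfect)  # 0.0
```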
Interpretation and Importance of MAPE
Understanding Prediction Accuracy
The MAPE metric provides an understanding of the accuracy of predictive models. The lower the MAPE, the more accurate the model is in predicting future outcomes.
For example, a MAPE of 5% indicates that, on average, the model’s predictions differ from the actual values by 5% of the actual value. Individual predictions can still be off by more or less than that.
Importance in Model Evaluation
MAPE is an important tool in evaluating predictive models because it takes into account the size of the error relative to the actual value. This is important because a large error may not be significant if the actual value is also large. Because it is scale-free, MAPE allows for meaningful comparison of errors across datasets whose values differ in magnitude or units.
Limitations of MAPE
However, one thing to keep in mind is that MAPE does not account for directionality of the error. A prediction that is higher than the actual value will be penalized the same as a prediction that is lower than the actual value, even though the direction of the error is opposite.
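Metrics such as Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE) do not solve this, because they also discard the sign of the error. To see whether a model systematically over- or under-predicts, report a signed measure such as the mean percentage error (MPE) alongside MAPE.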
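To illustrate the point about direction, the sketch below compares MAPE with a signed counterpart, the mean percentage error (MPE). The function names, the sign convention (actual minus predicted) and the numbers are assumptions chosen for the example.

```python
def mape(actual, predicted):
    """MAPE in percent: the sign of each error is discarded."""
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return sum(errors) / len(errors) * 100


def mpe(actual, predicted):
    """Mean percentage error in percent: the sign of each error is kept."""
    errors = [(a - p) / a for a, p in zip(actual, predicted)]
    return sum(errors) / len(errors) * 100


actual = [100, 100]
mixed = [110, 90]    # one forecast too high, one too low
biased = [110, 110]  # both forecasts too high

# MAPE cannot tell the two cases apart
print(round(mape(actual, mixed), 1), round(mape(actual, biased), 1))  # 10.0 10.0

# MPE shows no bias in the first case and a consistent over-forecast in the second
print(round(mpe(actual, mixed), 1), round(mpe(actual, biased), 1))  # 0.0 -10.0
```

With this sign convention, a negative MPE means the model tends to predict too high.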
Creation of Python Function for MAPE Calculation
Python is a popular language for data analysis and machine learning. numpy and pandas do not include a dedicated MAPE function, although scikit-learn provides sklearn.metrics.mean_absolute_percentage_error, which returns the result as a fraction rather than a percentage. Here we will write our own Python function that calculates the MAPE value for any pair of actual and predicted datasets.
Defining the MAPE Function
To create the MAPE function, we’ll need to import numpy and define a function that takes two arrays as inputs, one for actual data and the other for predicted data. The function will then calculate the absolute percentage differences and then take the mean of those values to calculate the overall MAPE.
Python Function Example
import numpy as np

def mape(actual, predicted):
    """ Mean Absolute Percentage Error """
    actual, predicted = np.array(actual), np.array(predicted)
    return np.mean(np.abs((actual-predicted) / actual)) * 100
The code above defines the MAPE function using numpy. The function accepts two arrays, one with the actual data, and the other with the predicted data.
It first converts both to numpy arrays before applying the formula we learned above.
Example of Using the Python Function to Calculate MAPE
Sample Dataset
Now that we have defined the MAPE function in Python, let’s use it to calculate the MAPE for a sample dataset. Let’s say we have actual sales data and predicted sales data for a specific product:
actual_sales = [10, 20, 30, 40, 50]
predicted_sales = [11, 19, 31, 37, 49]
Calculating MAPE
We can use the MAPE function we defined earlier to calculate the overall MAPE for this dataset:
mape(actual_sales, predicted_sales)
This will output the MAPE value, which in this case is about 5.57%.
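The result can be checked by hand. The plain-Python sketch below, which does not depend on numpy, lists the absolute percentage error of each observation in the sample dataset and then averages them; the variable names ape and result are introduced here for illustration.

```python
actual_sales = [10, 20, 30, 40, 50]
predicted_sales = [11, 19, 31, 37, 49]

# Absolute percentage error of each observation, in percent
ape = [abs(a - p) / a * 100 for a, p in zip(actual_sales, predicted_sales)]
print([round(e, 2) for e in ape])  # [10.0, 5.0, 3.33, 7.5, 2.0]

# Their mean is the MAPE
result = sum(ape) / len(ape)
print(round(result, 2))  # 5.57
```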
Conclusion
In this article, we have explored the Mean Absolute Percentage Error (MAPE) and learned how to calculate it using Python. We have also seen how important this metric is in evaluating predictive models and understanding their accuracy.
MAPE is a simple yet powerful tool that can be used to compare the performance of different models, and Python makes it easy to implement and calculate MAPE for any dataset. Whether you are a data analyst or a predictive modeler, MAPE should be an important part of your toolkit.
Potential Drawbacks of MAPE
Understanding Limitations
While MAPE is a useful tool for evaluating predictive models, it does have some potential drawbacks that users should be aware of.
Undefined Values
MAPE is undefined when one or more actual values are equal to zero.
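This is because the MAPE formula divides by the actual value, and division by zero is undefined. The numpy function defined earlier does not raise an error in this case; it emits a runtime warning and returns inf (or nan when the corresponding prediction is also zero). When using MAPE, it is important to ensure that the actual values are never zero.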
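One way to enforce this is to check the inputs before dividing. The sketch below is a hypothetical helper named safe_mape, written in plain Python; raising an error is just one possible policy, and dropping or flagging the zero rows may suit some datasets better.

```python
def safe_mape(actual, predicted):
    """MAPE in percent that refuses zero actual values (hypothetical helper)."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if len(actual) == 0:
        raise ValueError("at least one observation is required")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return sum(errors) / len(errors) * 100


# A zero in the actual values is reported instead of producing inf or nan
try:
    safe_mape([0, 20], [1, 19])
except ValueError as exc:
    message = str(exc)
    print(message)
```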
Actual Values Close to Zero
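MAPE can be highly sensitive to actual values that are close to zero. Because each error is divided by the actual value, even a small absolute error becomes a very large percentage error when the actual value is tiny, and a few such observations can dominate the average. This is problematic when the actual values of a dataset are generally small or occasionally drop near zero.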
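A small made-up example shows the effect. Both predictions below miss by exactly 1 unit, yet the observation with the small actual value contributes 200% while the other contributes 1%. The numbers are assumptions chosen to make the arithmetic easy to follow.

```python
actual = [0.5, 100]
predicted = [1.5, 101]

# The absolute errors are identical
abs_errors = [abs(a - p) for a, p in zip(actual, predicted)]
print(abs_errors)  # [1.0, 1]

# The percentage errors are not: the small actual value inflates the first one
pct_errors = [abs(a - p) / abs(a) * 100 for a, p in zip(actual, predicted)]
print(pct_errors)  # [200.0, 1.0]

mape_value = sum(pct_errors) / len(pct_errors)
print(mape_value)  # 100.5
```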
Sample Size
MAPE can be inappropriate for datasets with a low number of observations. With only a few data points, a single unusually large or small percentage error shifts the mean considerably, so the resulting MAPE may overstate or understate the model’s typical error.
Limitations of MAPE for Forecast Error Estimation
In addition to the limitations mentioned above, there are other potential problems with using MAPE when estimating the forecast error.
Directionality of Error
One of the key limitations of MAPE is that it does not consider the direction of the error, only the magnitude.
Demand and Forecast
In many real-world applications, actual demand for a given product or service may be influenced by many factors, such as marketing efforts, economic conditions, and other external factors. Given these complexities, the actual demand can vary from the expected forecast.
Absolute Percent Error (APE)
A closely related measure is the absolute percent error (APE), which measures the percentage difference between the actual and forecast values for a single observation, without taking into account the direction of the error.
The two are directly connected: APE is computed for each observation, and MAPE is the mean of those APE values.
APE = |(Actual - Forecast)/Actual| * 100
Like MAPE, APE is expressed as a percentage, and a lower value indicates a better forecasting performance.
Because it uses the absolute value, APE reflects only the magnitude of the error, not its direction. Its advantage over MAPE is that it shows how the error is distributed across individual observations instead of hiding it in an average.
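To make the relationship concrete, here is a minimal sketch of APE for a single observation. The function name ape and the demand and forecast figures are hypothetical. It shows that an over-forecast and an under-forecast of the same size give the same APE, and that averaging the APE values yields the MAPE.

```python
def ape(actual, forecast):
    """Absolute percent error of one observation, in percent (illustrative)."""
    return abs(actual - forecast) / abs(actual) * 100


# Same-sized misses in opposite directions give the same APE
over = ape(100, 110)
under = ape(100, 90)
print(over == under)  # True

# Hypothetical demand and forecast figures
demand = [120, 80, 200]
forecast = [100, 100, 210]
apes = [ape(d, f) for d, f in zip(demand, forecast)]
print([round(e, 2) for e in apes])  # [16.67, 25.0, 5.0]

# The mean of the per-observation APE values is the MAPE
mape_from_ape = sum(apes) / len(apes)
print(round(mape_from_ape, 2))  # 15.56
```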
Choosing the Right Metric
While MAPE is still a valuable tool in evaluating and comparing predictive models, there are some limitations that users should be aware of. The key drawback of MAPE is that it does not account for the direction of the error.
APE complements MAPE by showing the error of each individual observation, but it also ignores direction. When direction matters, pair either metric with a signed measure such as the mean percentage error. When choosing a metric, it is important to consider the specific application and the characteristics of the dataset.
Thus, it is always good practice to evaluate different methods and their performance before choosing a specific metric.
Summary and Final Thoughts
In summary, Mean Absolute Percentage Error (MAPE) is a valuable metric in evaluating the accuracy of predictive models.
However, there are some potential drawbacks and limitations to its use. MAPE is undefined when actual values are equal to zero, and it can be sensitive to small actual values or low sample sizes.
Additionally, MAPE does not account for the direction of error. Absolute Percent Error (APE) gives a per-observation view of the same quantity but likewise ignores direction, so a signed error measure is needed when bias matters. Therefore, choosing the appropriate metric based on the dataset characteristics is essential.
Consistent evaluation of the methods and their performance can help us make better-informed decisions. Ultimately, a thorough understanding of MAPE’s benefits, cautions, and limitations can aid in developing more accurate predictions and forecasts.