Need advice on evaluating forecast accuracy in R

I'm trying to evaluate some software for forecast accuracy. It works by summing up the orders from a number of locations for each month, then determines the best model out of a set of candidates as the one that generates the minimum MSE. It then uses that model to forecast the demand for each location. For example, for Jan-Jun, Location A has demand (1,0,2,0,0,3) and Location B has demand (2,1,0,0,3,1). The aggregate would be A+B = (3,1,2,0,3,4). The software would then build models using SES, Holt, a moving average, Croston's method and a weighted average. The one that produces the smallest in-sample MSE would be chosen to build the forecast for July. It would then do the same thing again for August, once the actual demand for July is known. It continues this way and may change the forecasting method each month based on the minimum MSE, so it might generate the forecasts for Jul-Dec using, for example, the sequence (SES, SES, MA, Croston's, SES, Holt).



I currently have data from Jan 2016 to Dec 2017 (24 months), and I'm looking for advice on how to assess how well the tool forecasts. I thought about using tsCV, but that assumes the same model is applied at every step of the rolling forecast, which isn't the case here.
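For concreteness, here is a rough sketch in R (using the forecast package) of the month-by-month selection scheme described above. The 24-month toy series and the use of meanf() as a stand-in for the tool's proprietary weighted average are assumptions for illustration, not the actual software:

    library(forecast)

    # Aggregate monthly demand across locations; the first six values follow the
    # A+B example above, the rest are invented purely for illustration.
    agg <- ts(c(3, 1, 2, 0, 3, 4,  2, 0, 1, 3, 0, 2,
                1, 4, 0, 2, 3, 1,  0, 2, 1, 3, 2, 4),
              start = c(2016, 1), frequency = 12)

    # Candidate methods, re-estimated at every forecast origin.
    candidates <- list(
      ses     = function(y) ses(y, h = 1),
      holt    = function(y) holt(y, h = 1),
      mean    = function(y) meanf(y, h = 1),    # stand-in for the weighted average
      croston = function(y) croston(y, h = 1)
    )

    results <- data.frame()
    for (i in 6:(length(agg) - 1)) {            # origins: end of month 6, 7, ...
      train <- subset(agg, end = i)
      fits  <- lapply(candidates, function(f) f(train))
      mses  <- sapply(fits, function(fc) mean(residuals(fc)^2, na.rm = TRUE))
      best  <- names(which.min(mses))           # minimum in-sample MSE, as the tool does
      results <- rbind(results, data.frame(
        origin   = i,
        method   = best,
        forecast = as.numeric(fits[[best]]$mean[1]),
        actual   = as.numeric(agg[i + 1])))
    }

    results                                             # which method was picked at each origin
    sqrt(mean((results$actual - results$forecast)^2))   # out-of-sample RMSE of the whole procedure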










time-series forecasting cross-validation






asked 2 hours ago (edited 2 hours ago) by Angus







  • @SecretAgentMan: MAD/Mean is not a good idea, especially for intermittent demands. If you try to minimize this error measure, you may end up with "optimal" flat zero forecasts. See here for details and a few pointers to literature, and here for why this effect occurs.
    – Stephan Kolassa
    52 mins ago










  • @SecretAgentMan: the Smart-Willemain method is nice, but it cannot deal with dynamics in the time series, like trend, seasonality or causal factors. In addition, it is patented, which may be an IP problem for some practitioners.
    – Stephan Kolassa
    50 mins ago










  • @StephanKolassa, Thank you for the clarification. Since you're probably the SK from the forecasting book I have, I'll remove my comment until I find evidence to present with it.
    – SecretAgentMan
    44 mins ago










  • @StephanKolassa at the Foresight Practitioner conference, there were talks specifically encouraging using this metric for ID (intermittent demand). Further, Willemain gives out R code for the Markov bootstrap (without the jittering), but your point on the IP is well-taken. Thanks for the links.
    – SecretAgentMan
    41 mins ago
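A tiny numerical illustration of the point made in the first comment above (the toy series is invented, not from the thread): on an intermittent series where most months are zero, a flat zero forecast scores better on MAD/Mean than an unbiased forecast of the historical mean, even though it is useless for planning.

    # Toy intermittent-demand series: 9 of 12 months are zero.
    y <- c(0, 0, 3, 0, 0, 0, 2, 0, 0, 1, 0, 0)

    mad_over_mean <- function(actual, forecast) {
      mean(abs(actual - forecast)) / mean(actual)
    }

    mad_over_mean(y, 0)        # flat zero forecast:  1.0
    mad_over_mean(y, mean(y))  # forecast the mean:   1.5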
















1 Answer
First off, don't use the in-sample accuracy to choose a model. This will invariably lead to overfitting. In-sample accuracy is not a good guide to out-of-sample prediction. Instead, use a holdout sample.



Regarding your main question: again, use a holdout sample to see how well your algorithm performs on truly new data.



Thus, if you are interested in $h$-month-ahead forecasts:



  1. Fit your models to the data except for the last $2h$ months.

  2. Forecast all of them out to a horizon of $h$ months. Note the forecast error of each model, using RMSE or a similar measure.

  3. Pick the model that performed best. Re-fit this model to the data except the last $h$ months. Forecast $h$ months ahead. Note the forecast error.

Do this for all your time series. Check how well this algorithm worked, and compare it to the performance of a few very simple benchmark methods, like always forecasting the historical mean or the last observation. Also consider taking the average of all your candidate models' forecasts: averages of forecasts often outperform choosing the "best" method by some criterion.
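A minimal R sketch of this two-stage holdout scheme, assuming the forecast package, $h = 6$, and an illustrative candidate set (the tool's actual weighted-average method is not reproduced here):

    library(forecast)

    evaluate_selection <- function(y, h) {
      n <- length(y)
      candidates <- list(ses = ses, holt = holt, mean = meanf, croston = croston)

      # Steps 1-2: fit on everything except the last 2h months, then score
      # each candidate on the following h months.
      train1 <- subset(y, end = n - 2 * h)
      valid  <- subset(y, start = n - 2 * h + 1, end = n - h)
      rmse1  <- sapply(candidates, function(f)
        sqrt(mean((valid - f(train1, h = h)$mean)^2)))

      # Step 3: refit the winner on everything except the last h months and
      # measure its error on the final, untouched h months.
      best   <- names(which.min(rmse1))
      train2 <- subset(y, end = n - h)
      test   <- subset(y, start = n - h + 1)
      fc     <- candidates[[best]](train2, h = h)$mean
      list(method = best, rmse = sqrt(mean((test - fc)^2)))
    }

    # Example with a made-up 24-month aggregate series like the one in the question.
    agg <- ts(c(3, 1, 2, 0, 3, 4,  2, 0, 1, 3, 0, 2,
                1, 4, 0, 2, 3, 1,  0, 2, 1, 3, 2, 4),
              start = c(2016, 1), frequency = 12)
    evaluate_selection(agg, h = 6)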






answered 2 hours ago by Stephan Kolassa
  • Stephan Kolassa is perfectly right. I'd just like to add a purely business criterion: which forecast will have the least negative impact on the business in case of error?
    – AlainD
    1 hour ago










  • +1, " don't use the in-sample accuracy to choose a model"
    – SecretAgentMan
    58 mins ago










  • Stephan, if I only have 12 months of data to forecast month 13, could I still use the forecast package with tsCV?
    – Angus
    9 mins ago









