Is using both training and test sets for hyperparameter tuning overfitting?

You have a training set and a test set. You combine them and run something like GridSearch on the combined data to decide the hyperparameters of the model. Then you fit a model on the training set using these hyperparameters and use the test set to evaluate it.

Is this overfitting? Ultimately, the model was not fitted on the test set, but the test set was considered when deciding the hyperparameters.
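For concreteness, here is a minimal sketch of the procedure described above, assuming scikit-learn; the data, estimator (SVC) and parameter grid are illustrative placeholders, not part of the original question.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Placeholder data standing in for the real training and test sets.
rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(80, 5)), rng.integers(0, 2, size=80)
X_test, y_test = rng.normal(size=(20, 5)), rng.integers(0, 2, size=20)

# The step in question: hyperparameters are chosen on the *combined* data.
X_all = np.vstack([X_train, X_test])
y_all = np.concatenate([y_train, y_test])
search = GridSearchCV(SVC(), {"C": [0.1, 1, 10]}, cv=5).fit(X_all, y_all)

# The final model is fit on the training set only and scored on the test set,
# but the chosen C was already influenced by the test data.
final_model = SVC(**search.best_params_).fit(X_train, y_train)
print(final_model.score(X_test, y_test))
```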










Tags: machine-learning, cross-validation, overfitting






4 Answers






It is an "in-sample" forecast, since you eventually make the forecast on observations that are already part of your training set. Why not use n-fold cross-validation? By doing that, each time you are making an "out-of-sample" forecast, in which the test set and the training set are separate.
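A minimal sketch of what that looks like, assuming scikit-learn (the dataset, estimator and candidate values of C are placeholders): every fold is scored on observations that were held out of that fold's fit.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X_train, y_train = load_iris(return_X_y=True)  # stands in for the training data

# 5-fold CV: each score comes from observations held out of that fold's fit.
for C in (0.1, 1, 10):
    scores = cross_val_score(SVC(C=C), X_train, y_train, cv=5)
    print(C, scores.mean())
```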






Yes, you are overfitting. The test set should be used only for testing, not for hyperparameter tuning. Searching for parameters on the test set will learn the rules present in that particular test set and eventually overfit it.
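A hedged sketch of the corrected workflow, assuming scikit-learn (the dataset, model and grid are illustrative placeholders): the hyperparameter search only ever sees the training portion, and the test set is scored exactly once.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Tuning happens via internal cross-validation on the training data only.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
search = GridSearchCV(model, {"logisticregression__C": [0.01, 0.1, 1, 10]}, cv=5)
search.fit(X_train, y_train)

# The test set is touched exactly once, for the final performance estimate.
print(search.best_params_, search.score(X_test, y_test))
```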






The idea behind holdout and cross validation is to estimate the generalization performance of a learning algorithm, that is, the expected performance on future data drawn from the same distribution as the training data. This estimate can be used to tune hyperparameters or to report the final performance. Its validity depends on the independence of the data used for training from the data used for estimating performance. If this independence is violated, the performance estimate will be overoptimistically biased. The most egregious way this can happen is by estimating performance on data that has already been used for training or hyperparameter tuning, but there are many more subtle and insidious ways too.

The procedure you asked about goes wrong in multiple ways. First, the same data is used for both training and hyperparameter tuning. The goal of hyperparameter tuning is to select hyperparameters that will give good generalization performance. Typically, this works by estimating the generalization performance for different choices of hyperparameters (e.g. using a validation set) and then choosing the best. But, as above, this estimate will be overoptimistic if the same data has been used for training. The consequence is that sub-optimal hyperparameters will be chosen. In particular, there will be a bias toward high-capacity models that will overfit.

Second, data that has already been used to tune hyperparameters is being re-used to estimate performance. This will give a deceptive estimate, as above. This isn't overfitting in itself, but it means that, if overfitting is happening (and it probably is, as above), you won't know it.

The remedy is to use three separate datasets: a training set for training, a validation set for hyperparameter tuning, and a test set for estimating the final performance. Or, use nested cross validation, which will give better estimates and is necessary if there isn't enough data.
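A minimal nested cross-validation sketch, assuming scikit-learn (the dataset, estimator and grid are placeholders): the inner search tunes the hyperparameters, and the outer loop estimates performance on folds the search never saw.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Inner loop: hyperparameter tuning. Outer loop: performance estimation.
inner_search = GridSearchCV(SVC(), {"C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1]}, cv=3)
outer_scores = cross_val_score(inner_search, X, y, cv=5)
print(outer_scores.mean(), outer_scores.std())
```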






I would say you are not necessarily overfitting, because overfitting is a term that is normally used to indicate that your model does not generalise well. For example, if you do linear regression on something like MNIST images, you are probably still underfitting (the model does not generalise enough) even when training on both the training and test data.

What you are doing, however, is still not a good thing. The test set is normally the part of the data you use to check how well the final, trained model performs on data it has never seen before. If you use this data to choose hyperparameters, you give the model a chance to "see" the test data and to develop a bias towards it. You therefore lose the possibility of finding out how good your model would actually be on unseen data (because it has already seen the test data).

It might be that you do not really care how well your model performs, but then you would not need a test set either. Because in most scenarios you do want an idea of how good a model is, it is best to lock the test data away before you start doing anything with the data. Something as small as using the test data during pre-processing will probably lead to a biased model.

Now you might be asking yourself: "How should I find hyperparameters then?" The easiest way is to split the available data (assuming you have already safely put away some data for testing) into a training set and a so-called validation set, as sketched below. If you have little data to work with, it probably makes more sense to take a look at cross validation.
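A minimal sketch of such a three-way split, assuming scikit-learn; the dataset and the 60/20/20 proportions are arbitrary illustrative choices.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split

X, y = load_wine(return_X_y=True)  # placeholder dataset

# Lock the test set away first, then carve a validation set out of the rest.
X_rest, X_test, y_rest, y_test = train_test_split(X, y, test_size=0.20, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X_rest, y_rest, test_size=0.25, random_state=0)
# 0.25 of the remaining 80% gives roughly a 60/20/20 train/validation/test split.
```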





