Can it be overfitting when validation loss and validation accuracy are both increasing?
I'm training a simple neural network on a very sparse matrix (2,400 features and 18,000 training rows) for a binary classification problem. At the end of the 1st epoch the validation loss started to increase, while the validation accuracy is also increasing. Can I call this overfitting? I'm thinking of stopping the training after the 6th epoch; my criterion would be: stop if the accuracy is decreasing. Is something really wrong going on?
PS: I have a perfectly balanced binary classification dataset, and a random classifier would score around 50% accuracy.
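For reference, here is a minimal sketch of the setup described above, assuming Keras (no code was posted, so the architecture and the accuracy-based stopping rule are illustrative guesses):

```python
# Hypothetical reconstruction of the setup: a small dense network on a
# sparse 18,000 x 2,400 matrix, stopping when validation accuracy stops
# improving (the criterion proposed in the question). All details assumed.
import numpy as np
from tensorflow import keras

n_rows, n_features = 18_000, 2_400
X = np.random.binomial(1, 0.01, size=(n_rows, n_features)).astype("float32")  # toy sparse data
y = np.random.randint(0, 2, size=n_rows)  # balanced binary labels

model = keras.Sequential([
    keras.layers.Dense(64, activation="relu", input_shape=(n_features,)),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# "Stop if the accuracy is decreasing": monitor val_accuracy, keep best weights.
stop_on_accuracy = keras.callbacks.EarlyStopping(
    monitor="val_accuracy", mode="max", patience=2, restore_best_weights=True
)
model.fit(X, y, epochs=20, validation_split=0.2, callbacks=[stop_on_accuracy])
```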
classification neural-networks overfitting sparse
Based on the second graph you would say that your model is overfitting, since a similar increase in accuracy from the training set does not happen on the test set.
– user2974951
1 hour ago
Yes, but the accuracy is still increasing a bit, up to the 65% level. If I stop my network when the validation loss starts increasing, I get 55% accuracy. Wouldn't a 65%-accurate network be better than a 55% one? I am questioning the stopping criterion.
– betelgeuse
1 hour ago
I would say that from epoch 3 onward the increase is marginal. Maybe this validation-loss criterion is not the best in your case.
– user2974951
1 hour ago
asked 1 hour ago by betelgeuse, edited 50 mins ago
1 Answer
Yes, absolutely. First of all, overfitting is best judged by looking at loss, rather than accuracy, for a series of reasons including the fact that accuracy is not a good way to estimate the performance of classification models. See here:
https://stats.stackexchange.com/a/312787/58675
Why is accuracy not the best measure for assessing classification models?
Classification probability threshold
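To make the loss-versus-accuracy point concrete, here is a small sketch (assuming scikit-learn; the predictions are made up for illustration) where two sets of predictions have identical accuracy but very different losses:

```python
# Two prediction sets with the same thresholded accuracy but very different
# confidence. Accuracy can't tell them apart; log loss and Brier score can.
import numpy as np
from sklearn.metrics import accuracy_score, brier_score_loss, log_loss

y_true = np.array([0, 0, 1, 1, 1, 0])
p_confident = np.array([0.05, 0.10, 0.95, 0.90, 0.85, 0.20])   # well separated
p_borderline = np.array([0.40, 0.45, 0.55, 0.60, 0.52, 0.49])  # barely past 0.5

for name, p in [("confident", p_confident), ("borderline", p_borderline)]:
    acc = accuracy_score(y_true, (p > 0.5).astype(int))
    print(f"{name}: accuracy={acc:.2f}  "
          f"log_loss={log_loss(y_true, p):.3f}  "
          f"brier={brier_score_loss(y_true, p):.3f}")
# Both print accuracy=1.00, but the borderline predictions have much higher
# log loss and Brier score -- the loss exposes what accuracy hides.
```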
Secondly, even if you use accuracy rather than loss to judge overfitting (and you shouldn't), you can't just look at the (smoothed) derivative of the accuracy on the test curve, i.e., whether it's increasing on average or not. You should first of all look at the gap between training accuracy and test accuracy. And in your case this gap is very large: you'd be better off using a scale that starts either at 0 or at the accuracy of the random classifier (i.e., the classifier that assigns each instance to the majority class). But even on your scale, we're talking a training accuracy of nearly 100% vs. a test accuracy that doesn't even reach 65%.
TL;DR: you don't want to hear it, but your model is as overfit as they get.
PS: you're focusing on the wrong problem. The issue here is not whether to do early stopping at the 1st epoch for a test accuracy of 55%, or at epoch 7 for an accuracy of 65%. The real issue is that your training accuracy (though again, I would focus on the test loss) is way too high relative to your test accuracy. 55%, 65%, or even 75% are all crap next to 99%. This is a textbook case of overfitting. You need to do something about it, not hunt for the "least bad" epoch for early stopping.
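As a sketch of what that looks like in practice: monitor validation loss rather than accuracy, and add regularization to attack the gap itself (assuming Keras; the layer sizes, dropout rate, and L2 strength are placeholders, not tuned recommendations):

```python
# Sketch: stop on validation *loss*, and regularize to narrow the train/test
# gap rather than just picking the least bad epoch. All values are placeholders.
from tensorflow import keras
from tensorflow.keras import layers, regularizers

def build_regularized_model(n_features: int) -> keras.Model:
    model = keras.Sequential([
        layers.Dense(64, activation="relu",
                     kernel_regularizer=regularizers.l2(1e-4),
                     input_shape=(n_features,)),
        layers.Dropout(0.5),  # one lever for shrinking the train/test gap
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

# Roll back to the epoch with the best validation loss when training stops.
stop_on_loss = keras.callbacks.EarlyStopping(
    monitor="val_loss", mode="min", patience=3, restore_best_weights=True
)
# Usage (X_train, y_train, X_val, y_val are assumed to exist):
# model = build_regularized_model(X_train.shape[1])
# model.fit(X_train, y_train, epochs=50,
#           validation_data=(X_val, y_val), callbacks=[stop_on_loss])
```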
What would be the best way to approach the problem? I have tried adding noise, dropout, etc. I wouldn't expect a test accuracy better than 65% for this dataset. Should I be preventing the network from reaching a high training accuracy? Or should I prevent the gap from widening so much?
– betelgeuse
1 hour ago
As a side note: I have a perfectly balanced binary classification dataset.
– betelgeuse
52 mins ago
answered 1 hour ago by DeltaIV, edited 1 hour ago