Why is the 0-1 loss function intractable?
In Ian Goodfellow's deep learning book, it is written that
Sometimes, the loss function we actually care about (say, classification error) is not one that can be optimized efficiently. For example, exactly minimizing expected 0-1 loss is typically intractable (exponential in the input dimension), even for a linear classifier. In such situations, one typically optimizes a surrogate loss function instead, which acts as a proxy but has advantages.
I do not understand why the 0-1 loss is intractable, or how it is exponential in the input dimension.
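If it helps to fix notation, this is my understanding of the quantity in question (the formalization is mine, not the book's): for a linear classifier with weights $\beta$, inputs $\mathbf{x}_i$, and labels $y_i \in \{-1, +1\}$, the empirical 0-1 loss is the fraction of misclassified points, $\frac{1}{n} \sum_{i=1}^{n} \mathbf{1}(y_i \beta^\top \mathbf{x}_i \leq 0)$.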
neural-networks deep-learning loss-functions
asked Sep 5 at 3:30 by samra irshad
edited Sep 5 at 4:44 by Gavin Simpson
1 Answer
Accepted answer (answered Sep 5 at 4:08 by Don Walpola):
The 0-1 loss function is non-convex and discontinuous, so (sub)gradient methods cannot be applied. For binary classification with a linear separator, minimizing this loss can be formulated as finding the $\beta$ that minimizes the average value of the indicator function $\mathbf{1}(y_i \beta^\top \mathbf{x}_i \leq 0)$ over all samples $i$. This is exponential in the inputs: each sample point is either classified correctly or incorrectly, so there are $2^n$ possible correct/incorrect configurations for $n$ total sample points, and exact minimization amounts to a search over this combinatorial space. The problem is known to be NP-hard. Moreover, knowing the current value of the loss function gives no clue about how to modify the current solution to improve it, unlike the directional information you could derive from the gradients of a convex or continuous function.
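To make the discontinuity concrete, here is a minimal numpy sketch (my own toy illustration, not part of the original answer): it sweeps a one-parameter family of linear classifiers and evaluates both the empirical 0-1 loss and a logistic surrogate. The 0-1 curve is piecewise constant, jumping only when a point crosses the decision boundary, so its gradient is zero almost everywhere, while the surrogate varies smoothly.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy binary classification data: 2-D inputs, labels in {-1, +1}.
n = 50
X = rng.normal(size=(n, 2))
y = np.where(X[:, 0] + 0.5 * X[:, 1] + 0.3 * rng.normal(size=n) > 0, 1, -1)

def zero_one_loss(beta):
    # Average of the indicator 1(y_i * beta^T x_i <= 0).
    margins = y * (X @ beta)
    return np.mean(margins <= 0)

def logistic_loss(beta):
    # Smooth, convex surrogate: log(1 + exp(-margin)).
    margins = y * (X @ beta)
    return np.mean(np.log1p(np.exp(-margins)))

# Sweep the angle of a unit-norm beta. The 0-1 loss changes only in
# jumps (when a sample crosses the decision boundary); the logistic
# surrogate changes smoothly and so provides gradient information.
for theta in np.linspace(0.0, np.pi, 9):
    beta = np.array([np.cos(theta), np.sin(theta)])
    print(f"theta={theta:4.2f}  0-1 loss={zero_one_loss(beta):.2f}  "
          f"logistic={logistic_loss(beta):.3f}")
```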
Comments:
Very good point - in practice random search or exhaustive search are the only methods which could be used to find the minimum of such a loss function, right? – DeltaIV, Sep 5 at 6:21
^^ or evolutionary/swarm-based intelligence methods maybe? – samra irshad, Sep 5 at 6:32
@samrairshad Yes, in fact 0-1 loss is not that uncommon to see in evolutionary methods. – John Doucette, Sep 5 at 11:14
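To illustrate what the comment thread is suggesting, here is a tiny random-search loop over weight vectors (a sketch of mine, not from the thread; it reuses `rng`, the toy data, and `zero_one_loss` from the snippet above):

```python
# Random search directly on the 0-1 loss, as suggested in the comments.
# Workable for toy problems, but the loss value gives no direction to
# improve in, so this scales poorly with dimension.
best_beta, best_loss = None, np.inf
for _ in range(10_000):
    candidate = rng.normal(size=2)          # random point in weight space
    candidate /= np.linalg.norm(candidate)  # scale does not affect 0-1 loss
    loss = zero_one_loss(candidate)
    if loss < best_loss:
        best_beta, best_loss = candidate, loss
print(f"best 0-1 loss found by random search: {best_loss:.2f}")
```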