What it means to have a higher cost for a local minima than the global minima?

The name of the pictureThe name of the pictureThe name of the pictureClash Royale CLAN TAG#URR8PPP





.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty margin-bottom:0;







up vote
1
down vote

favorite












In the following lines, I do not understand what it means to have a high cost for local minima than a global minima?




Local minima can be problematic if they have high cost in comparison to the global minimum. One can construct small neural networks, even without hidden units, that have local minima with higher cost than the global minimum. If local minima with high cost are common, this could pose a serious problem for gradient-based optimization algorithms.








share|cite|improve this question


























    up vote
    1
    down vote

    favorite












    In the following lines, I do not understand what it means to have a high cost for local minima than a global minima?




    Local minima can be problematic if they have high cost in comparison to the global minimum. One can construct small neural networks, even without hidden units, that have local minima with higher cost than the global minimum. If local minima with high cost are common, this could pose a serious problem for gradient-based optimization algorithms.








    share|cite|improve this question






















      up vote
      1
      down vote

      favorite









      up vote
      1
      down vote

      favorite











      In the following lines, I do not understand what it means to have a high cost for local minima than a global minima?




      Local minima can be problematic if they have high cost in comparison to the global minimum. One can construct small neural networks, even without hidden units, that have local minima with higher cost than the global minimum. If local minima with high cost are common, this could pose a serious problem for gradient-based optimization algorithms.








      share|cite|improve this question












      In the following lines, I do not understand what it means to have a high cost for local minima than a global minima?




      Local minima can be problematic if they have high cost in comparison to the global minimum. One can construct small neural networks, even without hidden units, that have local minima with higher cost than the global minimum. If local minima with high cost are common, this could pose a serious problem for gradient-based optimization algorithms.










      share|cite|improve this question











      share|cite|improve this question




      share|cite|improve this question










      asked Sep 7 at 0:46









      samra irshad

      947




      947




















          1 Answer
          1






          active

          oldest

          votes

















          up vote
          12
          down vote



          accepted










          An optimization algorithm's goal is to minimize a cost function $C$. Some optimization techniques (such as gradient-based methods) are only guaranteed to find a local minimum rather than the global minimum of $C$. So if $C$ looks like the picture below, a gradient-based method might find the local minimum $x_0$, which has much higher cost than the global minimum $x^*$.



          In this example, you could avoid the problem by running the optimization several times with random starting points. But as the quotation mentions, this is not effective when there are many high-cost local minima.



          enter image description here






          share|cite|improve this answer
















          • 2




            Many thanks for this cute illustration!
            – samra irshad
            Sep 7 at 2:14










          • Tiny (perhaps straightforward) addition from my side, in case the last line of your quote is still unclear: Many algorithms (especially gradient-based ones, as you mention) try to find the global minimum by finding the direction in model space in which $C(x)$ decreases most rapidly. This explains why it can get stuck in $C(x_0)$ in the illustration above, that direction does not exist. This is what the authors meant in "If local minima with high cost are common, this could pose a serious problem for gradient-based optimization algorithms.". There are many such places to get stuck.
            – Floris SA
            Sep 7 at 9:33











          Your Answer




          StackExchange.ifUsing("editor", function ()
          return StackExchange.using("mathjaxEditing", function ()
          StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
          StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\\(","\\)"]]);
          );
          );
          , "mathjax-editing");

          StackExchange.ready(function()
          var channelOptions =
          tags: "".split(" "),
          id: "65"
          ;
          initTagRenderer("".split(" "), "".split(" "), channelOptions);

          StackExchange.using("externalEditor", function()
          // Have to fire editor after snippets, if snippets enabled
          if (StackExchange.settings.snippets.snippetsEnabled)
          StackExchange.using("snippets", function()
          createEditor();
          );

          else
          createEditor();

          );

          function createEditor()
          StackExchange.prepareEditor(
          heartbeatType: 'answer',
          convertImagesToLinks: false,
          noModals: false,
          showLowRepImageUploadWarning: true,
          reputationToPostImages: null,
          bindNavPrevention: true,
          postfix: "",
          onDemand: true,
          discardSelector: ".discard-answer"
          ,immediatelyShowMarkdownHelp:true
          );



          );













           

          draft saved


          draft discarded


















          StackExchange.ready(
          function ()
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstats.stackexchange.com%2fquestions%2f365731%2fwhat-it-means-to-have-a-higher-cost-for-a-local-minima-than-the-global-minima%23new-answer', 'question_page');

          );

          Post as a guest






























          1 Answer
          1






          active

          oldest

          votes








          1 Answer
          1






          active

          oldest

          votes









          active

          oldest

          votes






          active

          oldest

          votes








          up vote
          12
          down vote



          accepted










          An optimization algorithm's goal is to minimize a cost function $C$. Some optimization techniques (such as gradient-based methods) are only guaranteed to find a local minimum rather than the global minimum of $C$. So if $C$ looks like the picture below, a gradient-based method might find the local minimum $x_0$, which has much higher cost than the global minimum $x^*$.



          In this example, you could avoid the problem by running the optimization several times with random starting points. But as the quotation mentions, this is not effective when there are many high-cost local minima.



          enter image description here






          share|cite|improve this answer
















          • 2




            Many thanks for this cute illustration!
            – samra irshad
            Sep 7 at 2:14










          • Tiny (perhaps straightforward) addition from my side, in case the last line of your quote is still unclear: Many algorithms (especially gradient-based ones, as you mention) try to find the global minimum by finding the direction in model space in which $C(x)$ decreases most rapidly. This explains why it can get stuck in $C(x_0)$ in the illustration above, that direction does not exist. This is what the authors meant in "If local minima with high cost are common, this could pose a serious problem for gradient-based optimization algorithms.". There are many such places to get stuck.
            – Floris SA
            Sep 7 at 9:33















          up vote
          12
          down vote



          accepted










          An optimization algorithm's goal is to minimize a cost function $C$. Some optimization techniques (such as gradient-based methods) are only guaranteed to find a local minimum rather than the global minimum of $C$. So if $C$ looks like the picture below, a gradient-based method might find the local minimum $x_0$, which has much higher cost than the global minimum $x^*$.



          In this example, you could avoid the problem by running the optimization several times with random starting points. But as the quotation mentions, this is not effective when there are many high-cost local minima.



          enter image description here






          share|cite|improve this answer
















          • 2




            Many thanks for this cute illustration!
            – samra irshad
            Sep 7 at 2:14










          • Tiny (perhaps straightforward) addition from my side, in case the last line of your quote is still unclear: Many algorithms (especially gradient-based ones, as you mention) try to find the global minimum by finding the direction in model space in which $C(x)$ decreases most rapidly. This explains why it can get stuck in $C(x_0)$ in the illustration above, that direction does not exist. This is what the authors meant in "If local minima with high cost are common, this could pose a serious problem for gradient-based optimization algorithms.". There are many such places to get stuck.
            – Floris SA
            Sep 7 at 9:33













          up vote
          12
          down vote



          accepted







          up vote
          12
          down vote



          accepted






          An optimization algorithm's goal is to minimize a cost function $C$. Some optimization techniques (such as gradient-based methods) are only guaranteed to find a local minimum rather than the global minimum of $C$. So if $C$ looks like the picture below, a gradient-based method might find the local minimum $x_0$, which has much higher cost than the global minimum $x^*$.



          In this example, you could avoid the problem by running the optimization several times with random starting points. But as the quotation mentions, this is not effective when there are many high-cost local minima.



          enter image description here






          share|cite|improve this answer












          An optimization algorithm's goal is to minimize a cost function $C$. Some optimization techniques (such as gradient-based methods) are only guaranteed to find a local minimum rather than the global minimum of $C$. So if $C$ looks like the picture below, a gradient-based method might find the local minimum $x_0$, which has much higher cost than the global minimum $x^*$.



          In this example, you could avoid the problem by running the optimization several times with random starting points. But as the quotation mentions, this is not effective when there are many high-cost local minima.



          enter image description here







          share|cite|improve this answer












          share|cite|improve this answer



          share|cite|improve this answer










          answered Sep 7 at 0:57









          tddevlin

          1,067415




          1,067415







          • 2




            Many thanks for this cute illustration!
            – samra irshad
            Sep 7 at 2:14










          • Tiny (perhaps straightforward) addition from my side, in case the last line of your quote is still unclear: Many algorithms (especially gradient-based ones, as you mention) try to find the global minimum by finding the direction in model space in which $C(x)$ decreases most rapidly. This explains why it can get stuck in $C(x_0)$ in the illustration above, that direction does not exist. This is what the authors meant in "If local minima with high cost are common, this could pose a serious problem for gradient-based optimization algorithms.". There are many such places to get stuck.
            – Floris SA
            Sep 7 at 9:33













          • 2




            Many thanks for this cute illustration!
            – samra irshad
            Sep 7 at 2:14










          • Tiny (perhaps straightforward) addition from my side, in case the last line of your quote is still unclear: Many algorithms (especially gradient-based ones, as you mention) try to find the global minimum by finding the direction in model space in which $C(x)$ decreases most rapidly. This explains why it can get stuck in $C(x_0)$ in the illustration above, that direction does not exist. This is what the authors meant in "If local minima with high cost are common, this could pose a serious problem for gradient-based optimization algorithms.". There are many such places to get stuck.
            – Floris SA
            Sep 7 at 9:33








          2




          2




          Many thanks for this cute illustration!
          – samra irshad
          Sep 7 at 2:14




          Many thanks for this cute illustration!
          – samra irshad
          Sep 7 at 2:14












          Tiny (perhaps straightforward) addition from my side, in case the last line of your quote is still unclear: Many algorithms (especially gradient-based ones, as you mention) try to find the global minimum by finding the direction in model space in which $C(x)$ decreases most rapidly. This explains why it can get stuck in $C(x_0)$ in the illustration above, that direction does not exist. This is what the authors meant in "If local minima with high cost are common, this could pose a serious problem for gradient-based optimization algorithms.". There are many such places to get stuck.
          – Floris SA
          Sep 7 at 9:33





          Tiny (perhaps straightforward) addition from my side, in case the last line of your quote is still unclear: Many algorithms (especially gradient-based ones, as you mention) try to find the global minimum by finding the direction in model space in which $C(x)$ decreases most rapidly. This explains why it can get stuck in $C(x_0)$ in the illustration above, that direction does not exist. This is what the authors meant in "If local minima with high cost are common, this could pose a serious problem for gradient-based optimization algorithms.". There are many such places to get stuck.
          – Floris SA
          Sep 7 at 9:33


















           

          draft saved


          draft discarded















































           


          draft saved


          draft discarded














          StackExchange.ready(
          function ()
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstats.stackexchange.com%2fquestions%2f365731%2fwhat-it-means-to-have-a-higher-cost-for-a-local-minima-than-the-global-minima%23new-answer', 'question_page');

          );

          Post as a guest













































































          Comments

          Popular posts from this blog

          What does second last employer means? [closed]

          Installing NextGIS Connect into QGIS 3?

          Confectionery