Sets of solutions which it is hard to uniformly sample from, but easy to integrate functions over? (Or compute expectations over?)

I'm curious whether there is a problem (e.g. the set of perfect matchings of a given graph, the set of solutions to a Boolean formula, etc.; for a precise framework see JVV86) such that:



1) It is hard to sample approximately uniformly at random from the set of all solutions.



2) It is easy to approximately integrate any (polynomially computable) function over the solutions.



I would also be happy with replacing 2) by the weaker:



2') It is easy to approximate the expectation of any polynomially computable function over the uniform distribution on the solutions.



My thoughts:



  • If 2) holds, then by integrating the constant function $1$, one can approximately count the number of solutions to the problem.


  • If 1) and 2) both hold, then the problem cannot be self-reducible: otherwise the reductions in JVV86, combined with the observation in the previous bullet, would give an efficient algorithm for approximate uniform sampling.


  • A reasonable candidate technique for accomplishing 2) without relying on uniform sampling is importance sampling.


  • I know of situations where sampling is easy and counting is hard (so integration is hard); the case of uniformly sampling solutions to a DNF formula is explained in JVV86, Section 4.


  • A specific candidate problem I would be interested in is sampling directed simple cycles from a digraph. (It is shown in Section 5 of JVV86 that this problem is hard.) However, I know that counting directed simple cycles is NP-hard, by an argument similar to the one in JVV86, and therefore so is integrating over them. Perhaps 2') can still be solved in this case?


--



Any references that seem relevant would be much appreciated, even if they are not a direct answer to my question.










Tags: counting-complexity, randomness, probabilistic-computation, probabilistic-complexity






asked by Lorenzo; edited by D.W.




















1 Answer

















Accepted answer (by D.W.):










There is no such problem. If it's hard to sample, it's hard to integrate.

Here is a sketch of the reason why. Represent every solution $x$ by an $n$-bit string $x_1,\dots,x_n$. If you can integrate over the set of all solutions, here is an algorithm to sample from all solutions:

1. Count the number of solutions; call it $N_0$. (As you say, this can be done efficiently, as it is just an integral.) Count the number of solutions $x$ such that $x_1=1$; call it $N'_0$. (This is an integral of the function $f(x)$ over all solutions, where $f(x_1,\dots,x_n) = x_1$, so it can be computed efficiently.) With probability $N'_0/N_0$, set $y_1 = 1$, else set $y_1 = 0$.

2. Count the number of solutions $x$ such that $x_1=y_1$; call it $N_1$. (This is another integral.) Count the number of solutions $x$ such that $x_1=y_1$ and $x_2=1$; call it $N'_1$. With probability $N'_1/N_1$, set $y_2 = 1$, else set $y_2 = 0$.

3. Count the number $N_2$ of solutions $x$ such that $x_1=y_1$ and $x_2=y_2$, and the number $N'_2$ of solutions $x$ such that $x_1=y_1$ and $x_2=y_2$ and $x_3=1$. With probability $N'_2/N_2$, set $y_3=1$, else set $y_3=0$.

4. And so on. When this procedure finishes, output $y_1,y_2,\dots,y_n$. This will be a valid solution, sampled uniformly at random from the set of all solutions.

It's easy to see that if you have a way to compute the exact integral, you'll obtain a way to sample exactly uniformly at random. I expect it will follow that if you can approximate the integral, you can sample approximately uniformly at random; I'll let you compute the parameters and relation between the two approximations.
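For concreteness, here is a minimal Python sketch of this bit-by-bit reduction. It assumes a hypothetical oracle `count(prefix)` returning the number of solutions whose first bits agree with `prefix`; the oracle and the bit-string encoding of solutions are assumptions for illustration, not part of the question.

```python
import random

def count(prefix):
    """Hypothetical counting/integration oracle: the number of solutions x
    whose first len(prefix) bits equal prefix. Problem-specific."""
    raise NotImplementedError

def sample_uniform_solution(n):
    """Sample one solution (an n-bit list) using only counting queries,
    following the bit-by-bit procedure sketched above."""
    y = []
    remaining = count(y)                  # N_0: total number of solutions
    if remaining == 0:
        raise ValueError("no solutions")
    for _ in range(n):
        ones = count(y + [1])             # N'_i: solutions extending y with a 1
        bit = 1 if random.random() < ones / remaining else 0
        y.append(bit)
        # N_{i+1}: solutions consistent with the extended prefix
        remaining = ones if bit == 1 else remaining - ones
    return y
```

With an exact oracle the output is exactly uniform; with only approximate counts, the errors compound across the $n$ bits, which is the parameter bookkeeping the answer leaves to the reader.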




You can also implement this algorithm using expectations. Let $X$ be a r.v. that is uniformly distributed on the set of all solutions. Set $y_1=1$ with probability $\mathbb{E}[X_1]$; set $y_2=1$ with probability $\mathbb{E}[\mathbf{1}_{X_1 = y_1} \cdot X_2] / \mathbb{E}[\mathbf{1}_{X_1 = y_1}]$; and so on. Here I use $\mathbf{1}_{e(X)}$ to represent a function that is 1 if $e(X)$ holds, and 0 otherwise. Why does this work? Well, $\Pr[X_1=1] = \mathbb{E}[X_1]$, and

$$\Pr[X_2=1 \mid X_1=y_1] = \frac{\Pr[X_1 = y_1 \land X_2=1]}{\Pr[X_1=y_1]}
= \frac{\mathbb{E}[\mathbf{1}_{X_1 = y_1} \cdot X_2]}{\mathbb{E}[\mathbf{1}_{X_1 = y_1}]}.$$
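The same chain rule can be written against an expectation oracle instead of a counting one. A sketch, again assuming a hypothetical oracle `expectation(f)` that returns $\mathbb{E}[f(X)]$ for $X$ uniform over the solution set, with solutions encoded as bit lists:

```python
import random

def expectation(f):
    """Hypothetical oracle: E[f(X)] for X uniform over the solution set."""
    raise NotImplementedError

def sample_via_expectations(n):
    """Sample an n-bit solution using only expectation queries, via
    Pr[X_i = 1 | prefix] = E[1{prefix matches} * X_i] / E[1{prefix matches}]."""
    y = []
    for i in range(n):
        prefix = list(y)  # freeze the current prefix for the indicator
        indicator = lambda x: 1.0 if x[:len(prefix)] == prefix else 0.0
        numer = expectation(lambda x: indicator(x) * x[i])
        denom = expectation(indicator)
        y.append(1 if random.random() < numer / denom else 0)
    return y
```

Each query here is the expectation of a polynomially computable function of $X$, so an oracle of the kind in 2') suffices.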




























• Got it, thanks! The thing I was missing is that you don't need self-reducibility to reduce sampling to counting if you can integrate over all functions (your construction is essentially the reduction in JVV86 for self-reducible problems). What about computing expectations against the uniform distribution, that is, 1) and 2')? I don't think the same idea will work for that.
– Lorenzo, 2 hours ago











• @Lorenzo, see the edited answer, at the bottom.
– D.W., 2 hours ago









