Sets of solutions which it is hard to uniformly sample from, but easy to integrate functions over? (Or compute expectations over?)
I'm curious whether there is a problem (e.g. perfect matchings in a given graph, solutions to a Boolean formula, etc.; for a precise framework see JVV86) such that:
1) It is hard to sample approximately uniformly at random from the set of all solutions.
2) It is easy to approximately integrate any (polynomially computable) function over the solutions.
I would also be happy with replacing 2) by the weaker:
2') It is easy to approximate the expectation of any polynomially computable function over the uniform distribution on the solutions.
My thoughts:
If 2) holds, then by integrating the constant function $1$, one can approximately count the number of solutions to the problem.
If 2) holds but 1) does not, then the problem cannot be self-reducible: otherwise, by the reductions in JVV86 combined with the observation in the previous bullet, there would be an efficient algorithm for approximate uniform sampling.
A reasonable candidate technique for accomplishing 2) without 1) is importance sampling.
I know of situations where sampling is easy and counting is hard (so integration is hard); the case of uniformly sampling solutions to a DNF formula is explained in Section 4 of JVV86.
A specific candidate problem I would be interested in is sampling directed simple cycles from a digraph. (Section 5 of JVV86 shows that this sampling problem is hard.) Counting the number of directed simple cycles is NP-hard, by an argument similar to the one in JVV86, and therefore so is integrating over them. However, perhaps 2') can still be solved in this case?
--
Any references that seem relevant would be much appreciated, even if they are not a direct answer to my question.
counting-complexity randomness probabilistic-computation probabilistic-complexity
asked 5 hours ago by Lorenzo; edited 3 hours ago by D.W.
1 Answer
There is no such problem. If it's hard to sample, it's hard to integrate.
Here is a sketch of the reason why. Represent every solution $x$ by an $n$-bit string $x_1,\dots,x_n$. If you can integrate over the set of all solutions, here is an algorithm to sample from all solutions:
Count the number of solutions; call it $N_0$. (As you say, this can be done efficiently, as it is just an integral.) Count the number of solutions $x$ such that $x_1=1$; call it $N'_0$. (This is an integral of the function $f(x)$ over all solutions, where $f(x_1,\dots,x_n) = x_1$, so it can be computed efficiently.) With probability $N'_0/N_0$, set $y_1 = 1$, else set $y_1 = 0$.
Count the number of solutions $x$ such that $x_1=y_1$; call it $N_1$. (This is another integral.) Count the number of solutions $x$ such that $x_1=y_1$ and $x_2=1$; call it $N'_1$. With probability $N'_1/N_1$, set $y_2 = 1$, else set $y_2 = 0$.
Count the number $N_2$ of solutions $x$ such that $x_1=y_1$ and $x_2=y_2$, and the number $N'_2$ of solutions $x$ such that $x_1=y_1$ and $x_2=y_2$ and $x_3=1$. With probability $N'_2/N_2$, set $y_3=1$, else set $y_3=0$.
And so on. When this procedure finishes, output $y_1,y_2,\dots,y_n$. This will be a valid solution, sampled uniformly at random from the set of all solutions.
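To make the procedure concrete, here is a minimal Python sketch. The oracle `integrate` is a hypothetical stand-in for assumption 2): given a function $f$, it returns the sum of $f(x)$ over all solutions $x$ (so counting the solutions is just integrating the constant function 1).

```python
import random

def sample_uniform_solution(n, integrate):
    """Sample a solution uniformly at random, one bit at a time.

    `integrate(f)` is a hypothetical oracle standing in for assumption 2):
    it returns the sum of f(x) over all solutions x (viewed as n-bit tuples).
    Counting all solutions is just integrate(lambda x: 1).
    """
    y = []
    for i in range(n):
        # Only look at solutions consistent with the bits fixed so far.
        def prefix_ok(x, fixed=tuple(y)):
            return all(x[j] == b for j, b in enumerate(fixed))
        # N  = number of consistent solutions; N1 = those whose i-th bit is 1.
        N = integrate(lambda x: 1 if prefix_ok(x) else 0)
        N1 = integrate(lambda x: 1 if prefix_ok(x) and x[i] == 1 else 0)
        # Fix bit i with the exact conditional probability Pr[x_i = 1 | prefix].
        y.append(1 if random.random() < N1 / N else 0)
    return y
```

Note that each of the $2n$ oracle calls integrates a polynomially computable 0/1 function, so the whole procedure is efficient given the oracle.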
It's easy to see that if you have a way to compute the exact integral, you'll obtain a way to sample exactly uniformly at random. I expect it will follow that if you can approximate the integral, you can sample approximately uniformly at random; I'll let you compute the parameters and relation between the two approximations.
You can also implement this algorithm using expectations. Let $X$ be a random variable that is uniformly distributed on the set of all solutions. Set $y_1=1$ with probability $\mathbb{E}[X_1]$; set $y_2=1$ with probability $\mathbb{E}[\mathbf{1}_{X_1 = y_1} \cdot X_2] / \mathbb{E}[\mathbf{1}_{X_1 = y_1}]$; and so on. Here $\mathbf{1}_{e(X)}$ denotes the function that is 1 if $e(X)$ holds, and 0 otherwise. Why does this work? Well, $\Pr[X_1=1] = \mathbb{E}[X_1]$, and
$$\Pr[X_2=1 \mid X_1=y_1] = \frac{\Pr[X_1 = y_1 \land X_2=1]}{\Pr[X_1=y_1]} = \frac{\mathbb{E}[\mathbf{1}_{X_1 = y_1} \cdot X_2]}{\mathbb{E}[\mathbf{1}_{X_1 = y_1}]}.$$
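For completeness, here is the same bit-fixing sketch phrased with a hypothetical expectation oracle `expect` standing in for assumption 2'); the ratio computed for each bit is exactly the conditional probability in the displayed equation.

```python
import random

def sample_via_expectations(n, expect):
    """Same bit-fixing reduction, using an expectation oracle.

    `expect(f)` is a hypothetical oracle for assumption 2'): it returns
    E[f(X)] where X is uniform on the set of solutions.
    """
    y = []
    for i in range(n):
        def prefix_ok(x, fixed=tuple(y)):
            return all(x[j] == b for j, b in enumerate(fixed))
        p_prefix = expect(lambda x: 1.0 if prefix_ok(x) else 0.0)              # Pr[prefix matches y]
        p_both = expect(lambda x: 1.0 if prefix_ok(x) and x[i] == 1 else 0.0)  # Pr[prefix matches and X_i = 1]
        y.append(1 if random.random() < p_both / p_prefix else 0)
    return y
```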
Got it, thanks! The thing I was missing is that you don't need self-reducibility to reduce sampling to counting if you can integrate arbitrary functions (your construction is essentially the reduction in JVV86 for self-reducible problems). What about computing expectations against the uniform distribution, i.e. 1) and 2')? I don't think the same idea will work for that.
– Lorenzo, 2 hours ago
@Lorenzo, see edited answer, at the bottom.
– D.W., 2 hours ago