How to convert an array of numbers into probability values?
I would like some help with respect to certain numerical computation. I have certain arrays which look like:
Array 1:
[0.81893085, 0.54768653, 0.14973508]
Array 2:
[0.48078357, 0.92219683, 1.02359911]
Each of the three numbers in an array represents the distance of a data point from a cluster centroid in the k-means algorithm. I want to convert these numbers into probabilities, where an element with a large distance maps to a low probability. For example, [0.81893085, 0.54768653, 0.14973508] could become a probability vector like [0.13, 0.22, 0.65]. As can be seen, elements with a high value in the original array have a low value in the probability array (and of course the values in the probability array sum to 1).
Is there any mathematical technique that will achieve this result?
What I have tried so far: I took the inverse of each value in the original array:
1/[0.81893085, 0.54768653, 0.14973508] = [1.22110431, 1.82586195, 6.67846172]
Then I passed the resulting array through the softmax function, which converts an array of numbers into probabilities: https://en.wikipedia.org/wiki/Softmax_function
This gives a probability vector of [0.00421394, 0.00771491, 0.98807115].
Is this a good approach? Is there any other approach?
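The inverse-then-softmax pipeline described above can be sketched in a few lines of NumPy (the helper name `distances_to_probs` is mine, not a standard function):

```python
import numpy as np

def distances_to_probs(distances):
    """Invert distances (closer -> larger score), then apply softmax.

    This mirrors the approach in the question; it is one of several
    reasonable choices, not a canonical method.
    """
    scores = 1.0 / np.asarray(distances, dtype=float)
    # Subtract the max before exponentiating for numerical stability;
    # this does not change the resulting probabilities.
    exp_scores = np.exp(scores - scores.max())
    return exp_scores / exp_scores.sum()

probs = distances_to_probs([0.81893085, 0.54768653, 0.14973508])
# -> approximately [0.00421394, 0.00771491, 0.98807115]
```

Subtracting the maximum avoids overflow when a point is very close to a centroid, where the inverted distance becomes large.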
probability
asked Aug 23 at 5:47
Sujeeth Kumaravel
211
3 Answers
Any survival function (1 minus a CDF) will have the desired property. The exponential is a potentially good candidate here, as it is sometimes used to describe distances, but it's hard to say without more information:
$$S(x) = \exp(-ax)$$
The parameter $a$ can be tuned or possibly estimated from the data.
For reference, if $a = 1$ you get
$$[0.44, 0.58, 0.86]$$
$$[0.62, 0.40, 0.36]$$
for the first and second arrays respectively. (Note that these scores do not yet sum to 1; dividing each array by its sum turns them into probabilities.)
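A quick sketch of the survival-function scoring above (the function name is mine; `a` defaults to 1 as in the reference values):

```python
import numpy as np

def survival(x, a=1.0):
    """Exponential survival function S(x) = exp(-a*x).

    `a` is a tunable rate parameter; a = 1 is just a default,
    not a value estimated from the data.
    """
    return np.exp(-a * np.asarray(x, dtype=float))

s1 = survival([0.81893085, 0.54768653, 0.14973508])  # approx. [0.44, 0.58, 0.86]
s2 = survival([0.48078357, 0.92219683, 1.02359911])  # approx. [0.62, 0.40, 0.36]
```

Dividing each score array by its sum (e.g. `s1 / s1.sum()`) yields a proper probability vector.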
edited Aug 23 at 17:04
answered Aug 23 at 6:09
knrumsey
1214
This is a fairly general question; there are many ways to normalise a given set of values. For example:
- Normal distribution: you can inspect your values by plotting them, then standardise them using their mean and variance before converting to probabilities.
- A simple exponential form, as given in the other answers, such as $\exp(-ax)$ divided by the sum of $\exp(-ax)$ over all values, or simply apply the softmax function directly.
- The inverse transform you used, followed by dividing by the sum of all the inverted values.
The point I am trying to make is that there are hundreds of ways to convert an array into a probability distribution; you need to choose what works best for you. It is also important to note that if you use this probability distribution to compute a loss and then optimise your model with gradient descent, you must make sure the loss is convex, which means either your loss function or your probability transformation has to be chosen so that the overall loss is convex.
answered Aug 23 at 6:40
DuttaA
398115
Your approach is quite good. Any alternative amounts to choosing another function that maps the values into [0, 1] such that they sum to 1.
You can also use just the inverse, as you did, and divide the inverted values by their sum, which gives [0.1255579, 0.18774104, 0.68670106].
You can also apply exp(-x) to your array, giving [0.44090279, 0.57828611, 0.86093603], and then divide by the sum of these values, giving [0.23450718, 0.30757856, 0.45791426].
So, as knrumsey suggested in his answer, you just need a survival function (1 minus a CDF) to achieve this. Which result is meaningful for your task depends on you: my exp(-x) approach gives values that are close to each other, while your approach gives values from which you can confidently say that the data point belongs to the third cluster. The choice of mathematical function, and the interpretation of its results, depends on you and your task.
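Both normalisations above can be reproduced in a few lines of NumPy (variable names are mine):

```python
import numpy as np

d = np.array([0.81893085, 0.54768653, 0.14973508])

# Option 1: inverse distances, normalised by their sum.
inv = 1.0 / d
p_inv = inv / inv.sum()   # approx. [0.1256, 0.1877, 0.6867]

# Option 2: exp(-x), normalised by its sum.
e = np.exp(-d)
p_exp = e / e.sum()       # approx. [0.2345, 0.3076, 0.4579]
```

Note how Option 2 produces a flatter distribution than Option 1 on the same input, which illustrates the point about choosing the transformation to match the task.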
answered Aug 23 at 6:24
Ankit Seth
834115