How to convert an array of numbers into probability values?
I would like some help with respect to certain numerical computation. I have certain arrays which look like:
Array 1:
[0.81893085, 0.54768653, 0.14973508]
Array 2:
[0.48078357, 0.92219683, 1.02359911]
Each of the three numbers in an array represents the distance of a data point from a cluster centroid in the k-means algorithm. I want to convert these numbers into probabilities, where an element with a large distance maps to a low probability. For example, [0.81893085, 0.54768653, 0.14973508] could become a probability vector like [0.13, 0.22, 0.65]. As can be seen, elements with a high value in the original array have a low value in the probability array (and of course the values in the probability array sum to 1).
Is there any mathematical technique that will achieve this result?
What I have tried so far: I took the inverse of each value in the original array:
1/[0.81893085, 0.54768653, 0.14973508] = [1.22110431, 1.82586195, 6.67846172]
Then I passed the resulting array through the softmax function, which converts an array of numbers into probabilities: https://en.wikipedia.org/wiki/Softmax_function
This gives a probability vector of [0.00421394, 0.00771491, 0.98807115].
Is this a good approach? Is there any other approach?
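The inverse-then-softmax pipeline described above can be sketched in a few lines of NumPy (the helper name `distances_to_probs` is mine, not a standard function):

```python
import numpy as np

def distances_to_probs(distances):
    """Invert distances (closer -> larger score), then apply softmax.

    This mirrors the approach in the question; it is one of several
    reasonable choices, not a canonical method.
    """
    scores = 1.0 / np.asarray(distances, dtype=float)
    # Subtract the max before exponentiating for numerical stability;
    # this does not change the resulting probabilities.
    exp_scores = np.exp(scores - scores.max())
    return exp_scores / exp_scores.sum()

probs = distances_to_probs([0.81893085, 0.54768653, 0.14973508])
# -> approximately [0.00421394, 0.00771491, 0.98807115]
```

Subtracting the maximum avoids overflow when a point is very close to a centroid, where the inverted distance becomes large.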
probability
asked Aug 23 at 5:47
Sujeeth Kumaravel
211
3 Answers
Any survival function (1 minus a CDF) will have the desired property. The exponential is a potentially good candidate here, as it is sometimes used to describe distances, but it's hard to say without more information:
$$S(x) = \exp(-ax)$$
The parameter $a$ can be tuned or possibly estimated from the data.
For reference, if $a = 1$ you get
$$[0.44, 0.58, 0.86]$$
$$[0.62, 0.40, 0.36]$$
for the first and second arrays respectively. (Note that these scores do not yet sum to 1; dividing each array by its sum turns them into probabilities.)
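A quick sketch of the survival-function scoring above (the function name is mine; `a` defaults to 1 as in the reference values):

```python
import numpy as np

def survival(x, a=1.0):
    """Exponential survival function S(x) = exp(-a*x).

    `a` is a tunable rate parameter; a = 1 is just a default,
    not a value estimated from the data.
    """
    return np.exp(-a * np.asarray(x, dtype=float))

s1 = survival([0.81893085, 0.54768653, 0.14973508])  # approx. [0.44, 0.58, 0.86]
s2 = survival([0.48078357, 0.92219683, 1.02359911])  # approx. [0.62, 0.40, 0.36]
```

Dividing each score array by its sum (e.g. `s1 / s1.sum()`) yields a proper probability vector.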
edited Aug 23 at 17:04
answered Aug 23 at 6:09
knrumsey
1214
This is a fairly general question; there are many ways to normalise a given set of values. For example:
- Normal distribution: you can inspect your values by plotting them, then standardise them using their mean and variance before converting to probabilities.
- A simple exponential form, as given in the other answers, such as $\exp(-ax)$ divided by the sum of $\exp(-ax)$ over all values, or simply apply the softmax function directly.
- The inverse transform you used, followed by dividing by the sum of all the inverted values.
The point I am trying to make is that there are hundreds of ways to convert an array into a probability distribution; you need to choose what works best for you. It is also important to note that if you use this probability distribution to compute a loss and then optimise your model with gradient descent, you must make sure the loss is convex, which means either your loss function or your probability transformation has to be chosen so that the overall loss is convex.
answered Aug 23 at 6:40
DuttaA
398115
Your approach is quite good. Any alternative amounts to choosing another function that maps the values into [0, 1] such that they sum to 1.
You can also use just the inverse, as you did, and divide the inverted values by their sum, which gives [0.1255579, 0.18774104, 0.68670106].
You can also apply exp(-x) to your array, giving [0.44090279, 0.57828611, 0.86093603], and then divide by the sum of these values, giving [0.23450718, 0.30757856, 0.45791426].
So, as knrumsey suggested in his answer, you just need a survival function (1 minus a CDF) to achieve this. Which result is meaningful for your task depends on you: my exp(-x) approach gives values that are close to each other, while your approach gives values from which you can confidently say that the data point belongs to the third cluster. The choice of mathematical function, and the interpretation of its results, depends on you and your task.
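Both normalisations above can be reproduced in a few lines of NumPy (variable names are mine):

```python
import numpy as np

d = np.array([0.81893085, 0.54768653, 0.14973508])

# Option 1: inverse distances, normalised by their sum.
inv = 1.0 / d
p_inv = inv / inv.sum()   # approx. [0.1256, 0.1877, 0.6867]

# Option 2: exp(-x), normalised by its sum.
e = np.exp(-d)
p_exp = e / e.sum()       # approx. [0.2345, 0.3076, 0.4579]
```

Note how Option 2 produces a flatter distribution than Option 1 on the same input, which illustrates the point about choosing the transformation to match the task.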
answered Aug 23 at 6:24
Ankit Seth
834115