Constructing higher order transition probability matrix

The name of the pictureThe name of the pictureThe name of the pictureClash Royale CLAN TAG#URR8PPP











up vote
5
down vote

favorite












Recently I asked a question here about how to construct a transition probability matrix given the following list:



x = "A", "A", "A", "E", "D", "D", "D", "C", "B", "E", "E", "E", "D", 
"B", "A", "D", "B", "E", "C", "A", "D", "A", "A", "A", "A", "C",
"C", "C", "D", "D", "E"


For which one can get the following matrix: (see the detail from the previous question)



$$beginarraycccccc
& textA & textB & textC & textD & textE \
textA & frac59 & 0 & frac19 & frac29 & frac19 \
textB & frac13 & 0 & 0 & 0 & frac23 \
textC & frac15 & frac15 & frac25 & frac15 & 0 \
textD & frac18 & frac14 & frac18 & frac38 & frac18 \
textE & 0 & 0 & frac15 & frac25 & frac25 \
endarray$$



above is equivalent of partitioning list $x$ into sublists with size 2 and offset of 1, then counting each element and divide it by the sum of the row. The command to find the right partition is Partition[x, 2, 1] (again I refer you to the previous question). Now what if we want to find the higher order transition matrix? For example the second order would be Partition[x, 3, 1] and the expected matrix shall look like:



$$beginarraycccccc
& A & B & C & D &E \
AA & P_AA,A & P_AA,B & P_AA,C & P_AA,D &P_AA,E\
AB & P_AB,A & P_AB,B & P_AB,C & P_AB,D &P_AB,E\
AC & P_AC,A & P_AC,B & P_AC,C & P_AC,D &P_AC,E\
AD & P_AD,A & P_AD,B & P_AD,C & P_AD,D &P_AD,E\
AE & P_AE,A & P_AE,B & P_AE,C & P_AE,D &P_AE,E\
vdots & vdots & vdots & vdots &vdots & vdots\
EC & P_EC,A & P_EC,B & P_EC,C & P_EC,D &P_EC,E\
ED &P_ED,A & P_ED,B & P_ED,C & P_ED,D &P_ED,E\
EE & P_EE,A & P_EE,B & P_EE,C & P_EE,D &P_EE,E\
endarray$$



In general the dimension of the matrix follows $S$, where n is the order of the Markov chain.







share|improve this question






















  • Thanks for the edit @rhermans it was really bugging me.
    – William
    Aug 8 at 9:59














up vote
5
down vote

favorite












Recently I asked a question here about how to construct a transition probability matrix given the following list:



x = "A", "A", "A", "E", "D", "D", "D", "C", "B", "E", "E", "E", "D", 
"B", "A", "D", "B", "E", "C", "A", "D", "A", "A", "A", "A", "C",
"C", "C", "D", "D", "E"


For which one can get the following matrix: (see the detail from the previous question)



$$beginarraycccccc
& textA & textB & textC & textD & textE \
textA & frac59 & 0 & frac19 & frac29 & frac19 \
textB & frac13 & 0 & 0 & 0 & frac23 \
textC & frac15 & frac15 & frac25 & frac15 & 0 \
textD & frac18 & frac14 & frac18 & frac38 & frac18 \
textE & 0 & 0 & frac15 & frac25 & frac25 \
endarray$$



above is equivalent of partitioning list $x$ into sublists with size 2 and offset of 1, then counting each element and divide it by the sum of the row. The command to find the right partition is Partition[x, 2, 1] (again I refer you to the previous question). Now what if we want to find the higher order transition matrix? For example the second order would be Partition[x, 3, 1] and the expected matrix shall look like:



$$beginarraycccccc
& A & B & C & D &E \
AA & P_AA,A & P_AA,B & P_AA,C & P_AA,D &P_AA,E\
AB & P_AB,A & P_AB,B & P_AB,C & P_AB,D &P_AB,E\
AC & P_AC,A & P_AC,B & P_AC,C & P_AC,D &P_AC,E\
AD & P_AD,A & P_AD,B & P_AD,C & P_AD,D &P_AD,E\
AE & P_AE,A & P_AE,B & P_AE,C & P_AE,D &P_AE,E\
vdots & vdots & vdots & vdots &vdots & vdots\
EC & P_EC,A & P_EC,B & P_EC,C & P_EC,D &P_EC,E\
ED &P_ED,A & P_ED,B & P_ED,C & P_ED,D &P_ED,E\
EE & P_EE,A & P_EE,B & P_EE,C & P_EE,D &P_EE,E\
endarray$$



In general the dimension of the matrix follows $S$, where n is the order of the Markov chain.







share|improve this question






















  • Thanks for the edit @rhermans it was really bugging me.
    – William
    Aug 8 at 9:59












up vote
5
down vote

favorite









up vote
5
down vote

favorite











Recently I asked a question here about how to construct a transition probability matrix given the following list:



x = "A", "A", "A", "E", "D", "D", "D", "C", "B", "E", "E", "E", "D", 
"B", "A", "D", "B", "E", "C", "A", "D", "A", "A", "A", "A", "C",
"C", "C", "D", "D", "E"


For which one can get the following matrix: (see the detail from the previous question)



$$beginarraycccccc
& textA & textB & textC & textD & textE \
textA & frac59 & 0 & frac19 & frac29 & frac19 \
textB & frac13 & 0 & 0 & 0 & frac23 \
textC & frac15 & frac15 & frac25 & frac15 & 0 \
textD & frac18 & frac14 & frac18 & frac38 & frac18 \
textE & 0 & 0 & frac15 & frac25 & frac25 \
endarray$$



above is equivalent of partitioning list $x$ into sublists with size 2 and offset of 1, then counting each element and divide it by the sum of the row. The command to find the right partition is Partition[x, 2, 1] (again I refer you to the previous question). Now what if we want to find the higher order transition matrix? For example the second order would be Partition[x, 3, 1] and the expected matrix shall look like:



$$beginarraycccccc
& A & B & C & D &E \
AA & P_AA,A & P_AA,B & P_AA,C & P_AA,D &P_AA,E\
AB & P_AB,A & P_AB,B & P_AB,C & P_AB,D &P_AB,E\
AC & P_AC,A & P_AC,B & P_AC,C & P_AC,D &P_AC,E\
AD & P_AD,A & P_AD,B & P_AD,C & P_AD,D &P_AD,E\
AE & P_AE,A & P_AE,B & P_AE,C & P_AE,D &P_AE,E\
vdots & vdots & vdots & vdots &vdots & vdots\
EC & P_EC,A & P_EC,B & P_EC,C & P_EC,D &P_EC,E\
ED &P_ED,A & P_ED,B & P_ED,C & P_ED,D &P_ED,E\
EE & P_EE,A & P_EE,B & P_EE,C & P_EE,D &P_EE,E\
endarray$$



In general the dimension of the matrix follows $S$, where n is the order of the Markov chain.







share|improve this question














Recently I asked a question here about how to construct a transition probability matrix given the following list:



x = "A", "A", "A", "E", "D", "D", "D", "C", "B", "E", "E", "E", "D", 
"B", "A", "D", "B", "E", "C", "A", "D", "A", "A", "A", "A", "C",
"C", "C", "D", "D", "E"


For which one can get the following matrix: (see the detail from the previous question)



$$beginarraycccccc
& textA & textB & textC & textD & textE \
textA & frac59 & 0 & frac19 & frac29 & frac19 \
textB & frac13 & 0 & 0 & 0 & frac23 \
textC & frac15 & frac15 & frac25 & frac15 & 0 \
textD & frac18 & frac14 & frac18 & frac38 & frac18 \
textE & 0 & 0 & frac15 & frac25 & frac25 \
endarray$$



above is equivalent of partitioning list $x$ into sublists with size 2 and offset of 1, then counting each element and divide it by the sum of the row. The command to find the right partition is Partition[x, 2, 1] (again I refer you to the previous question). Now what if we want to find the higher order transition matrix? For example the second order would be Partition[x, 3, 1] and the expected matrix shall look like:



$$beginarraycccccc
& A & B & C & D &E \
AA & P_AA,A & P_AA,B & P_AA,C & P_AA,D &P_AA,E\
AB & P_AB,A & P_AB,B & P_AB,C & P_AB,D &P_AB,E\
AC & P_AC,A & P_AC,B & P_AC,C & P_AC,D &P_AC,E\
AD & P_AD,A & P_AD,B & P_AD,C & P_AD,D &P_AD,E\
AE & P_AE,A & P_AE,B & P_AE,C & P_AE,D &P_AE,E\
vdots & vdots & vdots & vdots &vdots & vdots\
EC & P_EC,A & P_EC,B & P_EC,C & P_EC,D &P_EC,E\
ED &P_ED,A & P_ED,B & P_ED,C & P_ED,D &P_ED,E\
EE & P_EE,A & P_EE,B & P_EE,C & P_EE,D &P_EE,E\
endarray$$



In general the dimension of the matrix follows $S$, where n is the order of the Markov chain.









share|improve this question













share|improve this question




share|improve this question








edited Aug 8 at 9:52









rhermans

21.6k439103




21.6k439103










asked Aug 8 at 9:41









William

35517




35517











  • Thanks for the edit @rhermans it was really bugging me.
    – William
    Aug 8 at 9:59
















  • Thanks for the edit @rhermans it was really bugging me.
    – William
    Aug 8 at 9:59















Thanks for the edit @rhermans it was really bugging me.
– William
Aug 8 at 9:59




Thanks for the edit @rhermans it was really bugging me.
– William
Aug 8 at 9:59










4 Answers
4






active

oldest

votes

















up vote
5
down vote



accepted










The following code is just brute-force. But at least yields the expected results. Also, it can be used for any order.



The first parameter is the data. The second parameter is the order.



probM[data_, ord_] := 
Module[uniques = Union[data], acc = 0, len, trans, trPre, tData,
toCount, toGather, toNormalize,
trans = Dispatch@Thread[uniques -> Range[len = Length[uniques]]];
trPre = Dispatch@Flatten[Array[## -> ++acc &, ConstantArray[len, ord]]];
tData = Replace[data, trans, 1];
toCount = Partition[tData, ord + 1, 1];
toGather = Map[Replace[#[[1, ;; -2]], trPre], #[[1, -1]] -> #[[2]] &,
Tally[toCount]];
toNormalize = GatherBy[toGather, #[[1, 1]] &];
SparseArray[
Flatten@Map[
With[tot = 1/Plus @@ #[[All, 2]],
Map[#[[1]] -> #[[2]] tot &, #]] &, toNormalize]]];


Let us check the dimensions of the first three orders.



Table[probM[x, i] // Dimensions, i, 3]
(*5, 5, 25, 5, 125, 5*)


As for the efficiency of probM, I tried replacing some of the Map with ParallelMap but it did not yield any improvement. You might want to combine with niceties from the other answer. For example, use ArrayComponents instead of dispatch tables.



In any case, check the second order table:



$$
beginarraycccccc
text & textA & textB & textC & textD & textE \
textAA & frac35 & 0 & frac15 & 0 & frac15 \
textAB & 0 & 0 & 0 & 0 & 0 \
textAC & 0 & 0 & 1 & 0 & 0 \
textAD & frac12 & frac12 & 0 & 0 & 0 \
textAE & 0 & 0 & 0 & 1 & 0 \
textBA & 0 & 0 & 0 & 1 & 0 \
textBB & 0 & 0 & 0 & 0 & 0 \
textBC & 0 & 0 & 0 & 0 & 0 \
textBD & 0 & 0 & 0 & 0 & 0 \
textBE & 0 & 0 & frac12 & 0 & frac12 \
textCA & 0 & 0 & 0 & 1 & 0 \
textCB & 0 & 0 & 0 & 0 & 1 \
textCC & 0 & 0 & frac12 & frac12 & 0 \
textCD & 0 & 0 & 0 & 1 & 0 \
textCE & 0 & 0 & 0 & 0 & 0 \
textDA & 1 & 0 & 0 & 0 & 0 \
textDB & frac12 & 0 & 0 & 0 & frac12 \
textDC & 0 & 1 & 0 & 0 & 0 \
textDD & 0 & 0 & frac13 & frac13 & frac13 \
textDE & 0 & 0 & 0 & 0 & 0 \
textEA & 0 & 0 & 0 & 0 & 0 \
textEB & 0 & 0 & 0 & 0 & 0 \
textEC & 1 & 0 & 0 & 0 & 0 \
textED & 0 & frac12 & 0 & frac12 & 0 \
textEE & 0 & 0 & 0 & frac12 & frac12 \
endarray
$$






share|improve this answer






















  • Thanks a lot, what command did you use to get the final table? when I run your code I get 5, 5, 22, 5, 109, 5
    – William
    Aug 8 at 11:53










  • @William Try probM[x, 2] // MatrixForm. As for your result, it seems that you run Table[probM[x, i] // Dimensions, i, 3]. But that should have returned 5, 5, 25, 5, 125, 5. What version of MMA are you running? Mine is 9.0.
    – Hector
    Aug 8 at 11:57











  • Perfect, and how can I change the orders (this is second order, if I want to check the code with 1, 3 and other orders)? and the offset (possibly)?
    – William
    Aug 8 at 12:00










  • I am using 8.0 ver
    – William
    Aug 8 at 12:01










  • @William I'll edit the answer to make it more clear.
    – Hector
    Aug 8 at 12:02

















up vote
5
down vote













Update: Using EmpiricalDistribution and MarginalDistribution to compute the conditional probabilities:



ClearAll[transitionProb]
transitionProb[step_: 1][x_] := Module[states = DeleteDuplicates@x,
ed = EmpiricalDistribution[Partition[ArrayComponents @ x, step + 1, 1]],
ordering, tuples, md, condpdF,
ordering = Ordering[states]; tuples = Tuples[ordering, step];
md = MarginalDistribution[ed, Range[step]];
condpdF[u__, w_] := If[PDF[md, u] === 0, 0, PDF[ed, u, w]/PDF[md, u]];
Prepend[Row @ states[[##]],
## & @@ Table[## & @@ condpdF[##, i], i, ordering] & @@@ tuples,
Prepend[states[[ordering]], ""]]]


Examples:



transitionProb[2][x] // Grid[#, Dividers -> All] & // TeXForm



$beginarray
hline
text & textA & textB & textC & textD & textE \
hline
textAtextA & frac35 & 0 & frac15 & 0 & frac15 \
hline
textAtextB & 0 & 0 & 0 & 0 & 0 \
hline
textAtextC & 0 & 0 & 1 & 0 & 0 \
hline
textAtextD & frac12 & frac12 & 0 & 0 & 0 \
hline
textAtextE & 0 & 0 & 0 & 1 & 0 \
hline
textBtextA & 0 & 0 & 0 & 1 & 0 \
hline
textBtextB & 0 & 0 & 0 & 0 & 0 \
hline
textBtextC & 0 & 0 & 0 & 0 & 0 \
hline
textBtextD & 0 & 0 & 0 & 0 & 0 \
hline
textBtextE & 0 & 0 & frac12 & 0 & frac12 \
hline
textCtextA & 0 & 0 & 0 & 1 & 0 \
hline
textCtextB & 0 & 0 & 0 & 0 & 1 \
hline
textCtextC & 0 & 0 & frac12 & frac12 & 0 \
hline
textCtextD & 0 & 0 & 0 & 1 & 0 \
hline
textCtextE & 0 & 0 & 0 & 0 & 0 \
hline
textDtextA & 1 & 0 & 0 & 0 & 0 \
hline
textDtextB & frac12 & 0 & 0 & 0 & frac12 \
hline
textDtextC & 0 & 1 & 0 & 0 & 0 \
hline
textDtextD & 0 & 0 & frac13 & frac13 & frac13 \
hline
textDtextE & 0 & 0 & 0 & 0 & 0 \
hline
textEtextA & 0 & 0 & 0 & 0 & 0 \
hline
textEtextB & 0 & 0 & 0 & 0 & 0 \
hline
textEtextC & 1 & 0 & 0 & 0 & 0 \
hline
textEtextD & 0 & frac12 & 0 & frac12 & 0 \
hline
textEtextE & 0 & 0 & 0 & frac12 & frac12 \
hline
endarray$




transitionProb[1][x] // Grid[#, Dividers -> All] & // TeXForm



$beginarray
hline
text & textA & textB & textC & textD & textE \
hline
textA & frac59 & 0 & frac19 & frac29 & frac19 \
hline
textB & frac13 & 0 & 0 & 0 & frac23 \
hline
textC & frac15 & frac15 & frac25 & frac15 & 0 \
hline
textD & frac18 & frac14 & frac18 & frac38 & frac18 \
hline
textE & 0 & 0 & frac15 & frac25 & frac25 \
hline
endarray$




Original answer:



states = DeleteDuplicates[x];
ordering = Ordering[states];
data = ArrayComponents@x ;
estproc = EstimatedProcess[data, DiscreteMarkovProcess[Length@states]];
tuples = Tuples[Range[5][[ordering]], 2];
table = Row@states[[##]], ## & @@
Table[Probability[p[3] == s [Conditioned] p[1] == # && p[2] == #2,
p [Distributed] estproc], s, Range[Length @ states]] & @@@ tuples ;

TeXForm @ Grid[Prepend[table, Prepend[states[[ordering]], ""]], Dividers -> All]



$beginarray
hline
text & textA & textB & textC & textD & textE \
hline
textAA & frac59 & frac19 & frac29 & frac19 & 0 \
hline
textAB & 0 & 0 & 0 & 0 & 0 \
hline
textAC & frac15 & 0 & frac15 & frac25 & frac15 \
hline
textAD & frac18 & frac18 & frac38 & frac18 & frac14 \
hline
textAE & 0 & frac25 & frac25 & frac15 & 0 \
hline
textBA & 0 & 0 & 0 & 0 & 0 \
hline
textBB & 0 & 0 & 0 & 0 & 0 \
hline
textBC & 0 & 0 & 0 & 0 & 0 \
hline
textBD & 0 & 0 & 0 & 0 & 0 \
hline
textBE & 0 & 0 & 0 & 0 & 0 \
hline
textCA & frac59 & frac19 & frac29 & frac19 & 0 \
hline
textCB & frac13 & frac23 & 0 & 0 & 0 \
hline
textCC & frac15 & 0 & frac15 & frac25 & frac15 \
hline
textCD & frac18 & frac18 & frac38 & frac18 & frac14 \
hline
textCE & 0 & 0 & 0 & 0 & 0 \
hline
textDA & frac59 & frac19 & frac29 & frac19 & 0 \
hline
textDB & frac13 & frac23 & 0 & 0 & 0 \
hline
textDC & frac15 & 0 & frac15 & frac25 & frac15 \
hline
textDD & frac18 & frac18 & frac38 & frac18 & frac14 \
hline
textDE & 0 & frac25 & frac25 & frac15 & 0 \
hline
textEA & 0 & 0 & 0 & 0 & 0 \
hline
textEB & 0 & 0 & 0 & 0 & 0 \
hline
textEC & frac15 & 0 & frac15 & frac25 & frac15 \
hline
textED & frac18 & frac18 & frac38 & frac18 & frac14 \
hline
textEE & 0 & frac25 & frac25 & frac15 & 0 \
hline
endarray$







share|improve this answer






















  • Thanks a lot for this, however as a check I tried 'Partition[x, 3, 1] // Counts' which shows that I have 3 AAA instead of 5, this happens all over the table my worry is that the state ordering and data which you defined are not matched, for example in ordering E has value of 5, while in data it is 2.
    – William
    Aug 8 at 10:28










  • @William, re ordering of states, that's why we sort them using ordering so that 2 corresponds to E. Re the discrepancy between Partition[x, 3, 1] // Counts and the Prob[A|AA] in the table above, i think it is because table is based on the TransitionMatrix of estproc and estproc is based on one-step transitions (Partition[x,2,1]).
    – kglr
    Aug 8 at 10:39










  • So basically DiscreteMarkovProcess doesn't have a memory, because in second order the transition is remembering the two steps behind that's why it is AA,A and not for example A,A,A, interesting.
    – William
    Aug 8 at 10:51










  • @William, please see the update.
    – kglr
    Aug 8 at 12:26










  • thank you @kglr, it indeed works fine and is shorter yet if you change the variable in Partition[data, 3, 1] for example to Partition[data, 2, 1] or else, the final table will not come out appropriate. I think hector's answer is good because you can change the order
    – William
    Aug 8 at 12:37

















up vote
2
down vote













As a variant of my answer to the linked question, the following should work correctly and efficiently.



Some random data to work with:



x = RandomChoice[Alphabet["English", "IndexCharacters"], 1000000];


Creating the probability tensor P:



n = 2;
data = Flatten[ToCharacterCode[x]] - (ToCharacterCode["A"][[1]] - 1); // AbsoluteTiming // First
A = With[spopt = SystemOptions["SparseArrayOptions"],
Internal`WithLocalSettings[
(*switch to additive assembly*)
SetSystemOptions["SparseArrayOptions" -> "TreatRepeatedEntries" -> Total],

(*assemble matrix*)
SparseArray[Partition[data, n + 1, 1] -> 1, ConstantArray[Max[data], n + 1] ],

(*reset "SparseArrayOptions" to previous value*)
SetSystemOptions[spopt]]]; // AbsoluteTiming // First
P = #/N[Total[Abs[#], n + 1] /. 0 -> 1] &@Flatten[A, n - 1];



0.717521



0.184357




The row labels of P should be



Tuples[Sort[DeleteDuplicates[x]], n]





share|improve this answer





























    up vote
    1
    down vote













    You can use CrossTensorate from the package CrossTabulate.m, which I used and referenced in my answer of the previous question.



    The making of contingency tensors with that function is discussed in this blog post: "Contingency tables creation examples".



    In general, though, I would say it is better to use Tries with Frequencies or nested associations.



    tmat3 = CrossTensorate[Count == 1 + 2 + 3, Partition[x, 3, 1]];

    tmat4 = CrossTensorate[Count == 1 + 2 + 3 + 4, Partition[x, 4, 1]];

    tmat3["XTABTensor"] = #/(Total[#, Length[Dimensions[#]]] /. 0 -> 1) &@tmat3["XTABTensor"];
    tmat4["XTABTensor"] = #/(Total[#, Length[Dimensions[#]]] /. 0 -> 1) &@tmat4["XTABTensor"];

    Grid["tmat3", "tmat4", MatrixForm[tmat3], MatrixForm[tmat4]]


    enter image description here



    ArrayRules[tmat3["XTABTensor"]]

    (* 1, 1, 1 -> 3/5, 1, 1, 5 -> 1/5, 1, 5, 4 ->
    1, 1, 4, 2 -> 1/2, 1, 4, 1 -> 1/2, 1, 1, 3 -> 1/
    5, 1, 3, 3 -> 1, 2, 5, 5 -> 1/2, 2, 1, 4 -> 1, 2, 5, 3 -> 1/
    2, 3, 2, 5 -> 1, 3, 1, 4 -> 1, 3, 3, 3 -> 1/2, 3, 3, 4 -> 1/
    2, 3, 4, 4 -> 1, 4, 4, 4 -> 1/3, 4, 4, 3 -> 1/3, 4, 3, 2 ->
    1, 4, 2, 1 -> 1/2, 4, 2, 5 -> 1/2, 4, 1, 1 -> 1, 4, 4, 5 ->
    1/3, 5, 4, 4 -> 1/2, 5, 5, 5 -> 1/2, 5, 5, 4 -> 1/
    2, 5, 4, 2 -> 1/2, 5, 3, 1 -> 1, _, _, _ -> 0 *)





    share|improve this answer






















      Your Answer




      StackExchange.ifUsing("editor", function ()
      return StackExchange.using("mathjaxEditing", function ()
      StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
      StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\\(","\\)"]]);
      );
      );
      , "mathjax-editing");

      StackExchange.ready(function()
      var channelOptions =
      tags: "".split(" "),
      id: "387"
      ;
      initTagRenderer("".split(" "), "".split(" "), channelOptions);

      StackExchange.using("externalEditor", function()
      // Have to fire editor after snippets, if snippets enabled
      if (StackExchange.settings.snippets.snippetsEnabled)
      StackExchange.using("snippets", function()
      createEditor();
      );

      else
      createEditor();

      );

      function createEditor()
      StackExchange.prepareEditor(
      heartbeatType: 'answer',
      convertImagesToLinks: false,
      noModals: false,
      showLowRepImageUploadWarning: true,
      reputationToPostImages: null,
      bindNavPrevention: true,
      postfix: "",
      onDemand: true,
      discardSelector: ".discard-answer"
      ,immediatelyShowMarkdownHelp:true
      );



      );













       

      draft saved


      draft discarded


















      StackExchange.ready(
      function ()
      StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmathematica.stackexchange.com%2fquestions%2f179684%2fconstructing-higher-order-transition-probability-matrix%23new-answer', 'question_page');

      );

      Post as a guest






























      4 Answers
      4






      active

      oldest

      votes








      4 Answers
      4






      active

      oldest

      votes









      active

      oldest

      votes






      active

      oldest

      votes








      up vote
      5
      down vote



      accepted










      The following code is just brute-force. But at least yields the expected results. Also, it can be used for any order.



      The first parameter is the data. The second parameter is the order.



      probM[data_, ord_] := 
      Module[uniques = Union[data], acc = 0, len, trans, trPre, tData,
      toCount, toGather, toNormalize,
      trans = Dispatch@Thread[uniques -> Range[len = Length[uniques]]];
      trPre = Dispatch@Flatten[Array[## -> ++acc &, ConstantArray[len, ord]]];
      tData = Replace[data, trans, 1];
      toCount = Partition[tData, ord + 1, 1];
      toGather = Map[Replace[#[[1, ;; -2]], trPre], #[[1, -1]] -> #[[2]] &,
      Tally[toCount]];
      toNormalize = GatherBy[toGather, #[[1, 1]] &];
      SparseArray[
      Flatten@Map[
      With[tot = 1/Plus @@ #[[All, 2]],
      Map[#[[1]] -> #[[2]] tot &, #]] &, toNormalize]]];


      Let us check the dimensions of the first three orders.



      Table[probM[x, i] // Dimensions, i, 3]
      (*5, 5, 25, 5, 125, 5*)


      As for the efficiency of probM, I tried replacing some of the Map with ParallelMap but it did not yield any improvement. You might want to combine with niceties from the other answer. For example, use ArrayComponents instead of dispatch tables.



      In any case, check the second order table:



      $$
      beginarraycccccc
      text & textA & textB & textC & textD & textE \
      textAA & frac35 & 0 & frac15 & 0 & frac15 \
      textAB & 0 & 0 & 0 & 0 & 0 \
      textAC & 0 & 0 & 1 & 0 & 0 \
      textAD & frac12 & frac12 & 0 & 0 & 0 \
      textAE & 0 & 0 & 0 & 1 & 0 \
      textBA & 0 & 0 & 0 & 1 & 0 \
      textBB & 0 & 0 & 0 & 0 & 0 \
      textBC & 0 & 0 & 0 & 0 & 0 \
      textBD & 0 & 0 & 0 & 0 & 0 \
      textBE & 0 & 0 & frac12 & 0 & frac12 \
      textCA & 0 & 0 & 0 & 1 & 0 \
      textCB & 0 & 0 & 0 & 0 & 1 \
      textCC & 0 & 0 & frac12 & frac12 & 0 \
      textCD & 0 & 0 & 0 & 1 & 0 \
      textCE & 0 & 0 & 0 & 0 & 0 \
      textDA & 1 & 0 & 0 & 0 & 0 \
      textDB & frac12 & 0 & 0 & 0 & frac12 \
      textDC & 0 & 1 & 0 & 0 & 0 \
      textDD & 0 & 0 & frac13 & frac13 & frac13 \
      textDE & 0 & 0 & 0 & 0 & 0 \
      textEA & 0 & 0 & 0 & 0 & 0 \
      textEB & 0 & 0 & 0 & 0 & 0 \
      textEC & 1 & 0 & 0 & 0 & 0 \
      textED & 0 & frac12 & 0 & frac12 & 0 \
      textEE & 0 & 0 & 0 & frac12 & frac12 \
      endarray
      $$






      share|improve this answer






















      • Thanks a lot, what command did you use to get the final table? when I run your code I get 5, 5, 22, 5, 109, 5
        – William
        Aug 8 at 11:53










      • @William Try probM[x, 2] // MatrixForm. As for your result, it seems that you run Table[probM[x, i] // Dimensions, i, 3]. But that should have returned 5, 5, 25, 5, 125, 5. What version of MMA are you running? Mine is 9.0.
        – Hector
        Aug 8 at 11:57











      • Perfect, and how can I change the orders (this is second order, if I want to check the code with 1, 3 and other orders)? and the offset (possibly)?
        – William
        Aug 8 at 12:00










      • I am using 8.0 ver
        – William
        Aug 8 at 12:01










      • @William I'll edit the answer to make it more clear.
        – Hector
        Aug 8 at 12:02














      up vote
      5
      down vote



      accepted










      The following code is just brute-force. But at least yields the expected results. Also, it can be used for any order.



      The first parameter is the data. The second parameter is the order.



      probM[data_, ord_] := 
      Module[uniques = Union[data], acc = 0, len, trans, trPre, tData,
      toCount, toGather, toNormalize,
      trans = Dispatch@Thread[uniques -> Range[len = Length[uniques]]];
      trPre = Dispatch@Flatten[Array[## -> ++acc &, ConstantArray[len, ord]]];
      tData = Replace[data, trans, 1];
      toCount = Partition[tData, ord + 1, 1];
      toGather = Map[Replace[#[[1, ;; -2]], trPre], #[[1, -1]] -> #[[2]] &,
      Tally[toCount]];
      toNormalize = GatherBy[toGather, #[[1, 1]] &];
      SparseArray[
      Flatten@Map[
      With[tot = 1/Plus @@ #[[All, 2]],
      Map[#[[1]] -> #[[2]] tot &, #]] &, toNormalize]]];


      Let us check the dimensions of the first three orders.



      Table[probM[x, i] // Dimensions, i, 3]
      (*5, 5, 25, 5, 125, 5*)


      As for the efficiency of probM, I tried replacing some of the Map with ParallelMap but it did not yield any improvement. You might want to combine with niceties from the other answer. For example, use ArrayComponents instead of dispatch tables.



      In any case, check the second order table:



      $$
      beginarraycccccc
      text & textA & textB & textC & textD & textE \
      textAA & frac35 & 0 & frac15 & 0 & frac15 \
      textAB & 0 & 0 & 0 & 0 & 0 \
      textAC & 0 & 0 & 1 & 0 & 0 \
      textAD & frac12 & frac12 & 0 & 0 & 0 \
      textAE & 0 & 0 & 0 & 1 & 0 \
      textBA & 0 & 0 & 0 & 1 & 0 \
      textBB & 0 & 0 & 0 & 0 & 0 \
      textBC & 0 & 0 & 0 & 0 & 0 \
      textBD & 0 & 0 & 0 & 0 & 0 \
      textBE & 0 & 0 & frac12 & 0 & frac12 \
      textCA & 0 & 0 & 0 & 1 & 0 \
      textCB & 0 & 0 & 0 & 0 & 1 \
      textCC & 0 & 0 & frac12 & frac12 & 0 \
      textCD & 0 & 0 & 0 & 1 & 0 \
      textCE & 0 & 0 & 0 & 0 & 0 \
      textDA & 1 & 0 & 0 & 0 & 0 \
      textDB & frac12 & 0 & 0 & 0 & frac12 \
      textDC & 0 & 1 & 0 & 0 & 0 \
      textDD & 0 & 0 & frac13 & frac13 & frac13 \
      textDE & 0 & 0 & 0 & 0 & 0 \
      textEA & 0 & 0 & 0 & 0 & 0 \
      textEB & 0 & 0 & 0 & 0 & 0 \
      textEC & 1 & 0 & 0 & 0 & 0 \
      textED & 0 & frac12 & 0 & frac12 & 0 \
      textEE & 0 & 0 & 0 & frac12 & frac12 \
      endarray
      $$






      share|improve this answer






















      • Thanks a lot, what command did you use to get the final table? when I run your code I get 5, 5, 22, 5, 109, 5
        – William
        Aug 8 at 11:53










      • @William Try probM[x, 2] // MatrixForm. As for your result, it seems that you run Table[probM[x, i] // Dimensions, i, 3]. But that should have returned 5, 5, 25, 5, 125, 5. What version of MMA are you running? Mine is 9.0.
        – Hector
        Aug 8 at 11:57











      • Perfect, and how can I change the orders (this is second order, if I want to check the code with 1, 3 and other orders)? and the offset (possibly)?
        – William
        Aug 8 at 12:00










      • I am using 8.0 ver
        – William
        Aug 8 at 12:01










      • @William I'll edit the answer to make it more clear.
        – Hector
        Aug 8 at 12:02












      up vote
      5
      down vote



      accepted







      up vote
      5
      down vote



      accepted






      The following code is just brute-force. But at least yields the expected results. Also, it can be used for any order.



      The first parameter is the data. The second parameter is the order.



      probM[data_, ord_] := 
      Module[uniques = Union[data], acc = 0, len, trans, trPre, tData,
      toCount, toGather, toNormalize,
      trans = Dispatch@Thread[uniques -> Range[len = Length[uniques]]];
      trPre = Dispatch@Flatten[Array[## -> ++acc &, ConstantArray[len, ord]]];
      tData = Replace[data, trans, 1];
      toCount = Partition[tData, ord + 1, 1];
      toGather = Map[Replace[#[[1, ;; -2]], trPre], #[[1, -1]] -> #[[2]] &,
      Tally[toCount]];
      toNormalize = GatherBy[toGather, #[[1, 1]] &];
      SparseArray[
      Flatten@Map[
      With[tot = 1/Plus @@ #[[All, 2]],
      Map[#[[1]] -> #[[2]] tot &, #]] &, toNormalize]]];


      Let us check the dimensions of the first three orders.



      Table[probM[x, i] // Dimensions, i, 3]
      (*5, 5, 25, 5, 125, 5*)


      As for the efficiency of probM, I tried replacing some of the Map with ParallelMap but it did not yield any improvement. You might want to combine with niceties from the other answer. For example, use ArrayComponents instead of dispatch tables.



      In any case, check the second order table:



      $$
      beginarraycccccc
      text & textA & textB & textC & textD & textE \
      textAA & frac35 & 0 & frac15 & 0 & frac15 \
      textAB & 0 & 0 & 0 & 0 & 0 \
      textAC & 0 & 0 & 1 & 0 & 0 \
      textAD & frac12 & frac12 & 0 & 0 & 0 \
      textAE & 0 & 0 & 0 & 1 & 0 \
      textBA & 0 & 0 & 0 & 1 & 0 \
      textBB & 0 & 0 & 0 & 0 & 0 \
      textBC & 0 & 0 & 0 & 0 & 0 \
      textBD & 0 & 0 & 0 & 0 & 0 \
      textBE & 0 & 0 & frac12 & 0 & frac12 \
      textCA & 0 & 0 & 0 & 1 & 0 \
      textCB & 0 & 0 & 0 & 0 & 1 \
      textCC & 0 & 0 & frac12 & frac12 & 0 \
      textCD & 0 & 0 & 0 & 1 & 0 \
      textCE & 0 & 0 & 0 & 0 & 0 \
      textDA & 1 & 0 & 0 & 0 & 0 \
      textDB & frac12 & 0 & 0 & 0 & frac12 \
      textDC & 0 & 1 & 0 & 0 & 0 \
      textDD & 0 & 0 & frac13 & frac13 & frac13 \
      textDE & 0 & 0 & 0 & 0 & 0 \
      textEA & 0 & 0 & 0 & 0 & 0 \
      textEB & 0 & 0 & 0 & 0 & 0 \
      textEC & 1 & 0 & 0 & 0 & 0 \
      textED & 0 & frac12 & 0 & frac12 & 0 \
      textEE & 0 & 0 & 0 & frac12 & frac12 \
      endarray
      $$






      share|improve this answer














      The following code is just brute-force. But at least yields the expected results. Also, it can be used for any order.



      The first parameter is the data. The second parameter is the order.



      probM[data_, ord_] := 
      Module[uniques = Union[data], acc = 0, len, trans, trPre, tData,
      toCount, toGather, toNormalize,
      trans = Dispatch@Thread[uniques -> Range[len = Length[uniques]]];
      trPre = Dispatch@Flatten[Array[## -> ++acc &, ConstantArray[len, ord]]];
      tData = Replace[data, trans, 1];
      toCount = Partition[tData, ord + 1, 1];
      toGather = Map[Replace[#[[1, ;; -2]], trPre], #[[1, -1]] -> #[[2]] &,
      Tally[toCount]];
      toNormalize = GatherBy[toGather, #[[1, 1]] &];
      SparseArray[
      Flatten@Map[
      With[tot = 1/Plus @@ #[[All, 2]],
      Map[#[[1]] -> #[[2]] tot &, #]] &, toNormalize]]];


      Let us check the dimensions of the first three orders.



      Table[probM[x, i] // Dimensions, i, 3]
      (*5, 5, 25, 5, 125, 5*)


      As for the efficiency of probM, I tried replacing some of the Map with ParallelMap but it did not yield any improvement. You might want to combine with niceties from the other answer. For example, use ArrayComponents instead of dispatch tables.



      In any case, check the second order table:



      $$
      beginarraycccccc
      text & textA & textB & textC & textD & textE \
      textAA & frac35 & 0 & frac15 & 0 & frac15 \
      textAB & 0 & 0 & 0 & 0 & 0 \
      textAC & 0 & 0 & 1 & 0 & 0 \
      textAD & frac12 & frac12 & 0 & 0 & 0 \
      textAE & 0 & 0 & 0 & 1 & 0 \
      textBA & 0 & 0 & 0 & 1 & 0 \
      textBB & 0 & 0 & 0 & 0 & 0 \
      textBC & 0 & 0 & 0 & 0 & 0 \
      textBD & 0 & 0 & 0 & 0 & 0 \
      textBE & 0 & 0 & frac12 & 0 & frac12 \
      textCA & 0 & 0 & 0 & 1 & 0 \
      textCB & 0 & 0 & 0 & 0 & 1 \
      textCC & 0 & 0 & frac12 & frac12 & 0 \
      textCD & 0 & 0 & 0 & 1 & 0 \
      textCE & 0 & 0 & 0 & 0 & 0 \
      textDA & 1 & 0 & 0 & 0 & 0 \
      textDB & frac12 & 0 & 0 & 0 & frac12 \
      textDC & 0 & 1 & 0 & 0 & 0 \
      textDD & 0 & 0 & frac13 & frac13 & frac13 \
      textDE & 0 & 0 & 0 & 0 & 0 \
      textEA & 0 & 0 & 0 & 0 & 0 \
      textEB & 0 & 0 & 0 & 0 & 0 \
      textEC & 1 & 0 & 0 & 0 & 0 \
      textED & 0 & frac12 & 0 & frac12 & 0 \
      textEE & 0 & 0 & 0 & frac12 & frac12 \
      endarray
      $$







      share|improve this answer














      share|improve this answer



      share|improve this answer








      edited Aug 8 at 12:05

























      answered Aug 8 at 11:45









      Hector

      5,2121033




      5,2121033











      • Thanks a lot, what command did you use to get the final table? when I run your code I get 5, 5, 22, 5, 109, 5
        – William
        Aug 8 at 11:53










      • @William Try probM[x, 2] // MatrixForm. As for your result, it seems that you run Table[probM[x, i] // Dimensions, i, 3]. But that should have returned 5, 5, 25, 5, 125, 5. What version of MMA are you running? Mine is 9.0.
        – Hector
        Aug 8 at 11:57











      • Perfect, and how can I change the orders (this is second order, if I want to check the code with 1, 3 and other orders)? and the offset (possibly)?
        – William
        Aug 8 at 12:00










      • I am using 8.0 ver
        – William
        Aug 8 at 12:01










      • @William I'll edit the answer to make it more clear.
        – Hector
        Aug 8 at 12:02
















      • Thanks a lot, what command did you use to get the final table? when I run your code I get 5, 5, 22, 5, 109, 5
        – William
        Aug 8 at 11:53










      • @William Try probM[x, 2] // MatrixForm. As for your result, it seems that you run Table[probM[x, i] // Dimensions, i, 3]. But that should have returned 5, 5, 25, 5, 125, 5. What version of MMA are you running? Mine is 9.0.
        – Hector
        Aug 8 at 11:57











      • Perfect, and how can I change the orders (this is second order, if I want to check the code with 1, 3 and other orders)? and the offset (possibly)?
        – William
        Aug 8 at 12:00










      • I am using 8.0 ver
        – William
        Aug 8 at 12:01










      • @William I'll edit the answer to make it more clear.
        – Hector
        Aug 8 at 12:02















      Thanks a lot, what command did you use to get the final table? when I run your code I get 5, 5, 22, 5, 109, 5
      – William
      Aug 8 at 11:53




      Thanks a lot, what command did you use to get the final table? when I run your code I get 5, 5, 22, 5, 109, 5
      – William
      Aug 8 at 11:53












      @William Try probM[x, 2] // MatrixForm. As for your result, it seems that you run Table[probM[x, i] // Dimensions, i, 3]. But that should have returned 5, 5, 25, 5, 125, 5. What version of MMA are you running? Mine is 9.0.
      – Hector
      Aug 8 at 11:57





      @William Try probM[x, 2] // MatrixForm. As for your result, it seems that you run Table[probM[x, i] // Dimensions, i, 3]. But that should have returned 5, 5, 25, 5, 125, 5. What version of MMA are you running? Mine is 9.0.
      – Hector
      Aug 8 at 11:57













      Perfect, and how can I change the orders (this is second order, if I want to check the code with 1, 3 and other orders)? and the offset (possibly)?
      – William
      Aug 8 at 12:00




      Perfect, and how can I change the orders (this is second order, if I want to check the code with 1, 3 and other orders)? and the offset (possibly)?
      – William
      Aug 8 at 12:00












      I am using 8.0 ver
      – William
      Aug 8 at 12:01




      I am using 8.0 ver
      – William
      Aug 8 at 12:01












      @William I'll edit the answer to make it more clear.
      – Hector
      Aug 8 at 12:02




      @William I'll edit the answer to make it more clear.
      – Hector
      Aug 8 at 12:02










      up vote
      5
      down vote













      Update: Using EmpiricalDistribution and MarginalDistribution to compute the conditional probabilities:



      ClearAll[transitionProb]
      transitionProb[step_: 1][x_] := Module[states = DeleteDuplicates@x,
      ed = EmpiricalDistribution[Partition[ArrayComponents @ x, step + 1, 1]],
      ordering, tuples, md, condpdF,
      ordering = Ordering[states]; tuples = Tuples[ordering, step];
      md = MarginalDistribution[ed, Range[step]];
      condpdF[u__, w_] := If[PDF[md, u] === 0, 0, PDF[ed, u, w]/PDF[md, u]];
      Prepend[Row @ states[[##]],
      ## & @@ Table[## & @@ condpdF[##, i], i, ordering] & @@@ tuples,
      Prepend[states[[ordering]], ""]]]


      Examples:



      transitionProb[2][x] // Grid[#, Dividers -> All] & // TeXForm



      $beginarray
      hline
      text & textA & textB & textC & textD & textE \
      hline
      textAtextA & frac35 & 0 & frac15 & 0 & frac15 \
      hline
      textAtextB & 0 & 0 & 0 & 0 & 0 \
      hline
      textAtextC & 0 & 0 & 1 & 0 & 0 \
      hline
      textAtextD & frac12 & frac12 & 0 & 0 & 0 \
      hline
      textAtextE & 0 & 0 & 0 & 1 & 0 \
      hline
      textBtextA & 0 & 0 & 0 & 1 & 0 \
      hline
      textBtextB & 0 & 0 & 0 & 0 & 0 \
      hline
      textBtextC & 0 & 0 & 0 & 0 & 0 \
      hline
      textBtextD & 0 & 0 & 0 & 0 & 0 \
      hline
      textBtextE & 0 & 0 & frac12 & 0 & frac12 \
      hline
      textCtextA & 0 & 0 & 0 & 1 & 0 \
      hline
      textCtextB & 0 & 0 & 0 & 0 & 1 \
      hline
      textCtextC & 0 & 0 & frac12 & frac12 & 0 \
      hline
      textCtextD & 0 & 0 & 0 & 1 & 0 \
      hline
      textCtextE & 0 & 0 & 0 & 0 & 0 \
      hline
      textDtextA & 1 & 0 & 0 & 0 & 0 \
      hline
      textDtextB & frac12 & 0 & 0 & 0 & frac12 \
      hline
      textDtextC & 0 & 1 & 0 & 0 & 0 \
      hline
      textDtextD & 0 & 0 & frac13 & frac13 & frac13 \
      hline
      textDtextE & 0 & 0 & 0 & 0 & 0 \
      hline
      textEtextA & 0 & 0 & 0 & 0 & 0 \
      hline
      textEtextB & 0 & 0 & 0 & 0 & 0 \
      hline
      textEtextC & 1 & 0 & 0 & 0 & 0 \
      hline
      textEtextD & 0 & frac12 & 0 & frac12 & 0 \
      hline
      textEtextE & 0 & 0 & 0 & frac12 & frac12 \
      hline
      endarray$




      transitionProb[1][x] // Grid[#, Dividers -> All] & // TeXForm



      $beginarray
      hline
      text & textA & textB & textC & textD & textE \
      hline
      textA & frac59 & 0 & frac19 & frac29 & frac19 \
      hline
      textB & frac13 & 0 & 0 & 0 & frac23 \
      hline
      textC & frac15 & frac15 & frac25 & frac15 & 0 \
      hline
      textD & frac18 & frac14 & frac18 & frac38 & frac18 \
      hline
      textE & 0 & 0 & frac15 & frac25 & frac25 \
      hline
      endarray$




      Original answer:



      states = DeleteDuplicates[x];
      ordering = Ordering[states];
      data = ArrayComponents@x ;
      estproc = EstimatedProcess[data, DiscreteMarkovProcess[Length@states]];
      tuples = Tuples[Range[5][[ordering]], 2];
      table = Row@states[[##]], ## & @@
      Table[Probability[p[3] == s [Conditioned] p[1] == # && p[2] == #2,
      p [Distributed] estproc], s, Range[Length @ states]] & @@@ tuples ;

      TeXForm @ Grid[Prepend[table, Prepend[states[[ordering]], ""]], Dividers -> All]



      $beginarray
      hline
      text & textA & textB & textC & textD & textE \
      hline
      textAA & frac59 & frac19 & frac29 & frac19 & 0 \
      hline
      textAB & 0 & 0 & 0 & 0 & 0 \
      hline
      textAC & frac15 & 0 & frac15 & frac25 & frac15 \
      hline
      textAD & frac18 & frac18 & frac38 & frac18 & frac14 \
      hline
      textAE & 0 & frac25 & frac25 & frac15 & 0 \
      hline
      textBA & 0 & 0 & 0 & 0 & 0 \
      hline
      textBB & 0 & 0 & 0 & 0 & 0 \
      hline
      textBC & 0 & 0 & 0 & 0 & 0 \
      hline
      textBD & 0 & 0 & 0 & 0 & 0 \
      hline
      textBE & 0 & 0 & 0 & 0 & 0 \
      hline
      textCA & frac59 & frac19 & frac29 & frac19 & 0 \
      hline
      textCB & frac13 & frac23 & 0 & 0 & 0 \
      hline
      textCC & frac15 & 0 & frac15 & frac25 & frac15 \
      hline
      textCD & frac18 & frac18 & frac38 & frac18 & frac14 \
      hline
      textCE & 0 & 0 & 0 & 0 & 0 \
      hline
      textDA & frac59 & frac19 & frac29 & frac19 & 0 \
      hline
      textDB & frac13 & frac23 & 0 & 0 & 0 \
      hline
      textDC & frac15 & 0 & frac15 & frac25 & frac15 \
      hline
      textDD & frac18 & frac18 & frac38 & frac18 & frac14 \
      hline
      textDE & 0 & frac25 & frac25 & frac15 & 0 \
      hline
      textEA & 0 & 0 & 0 & 0 & 0 \
      hline
      textEB & 0 & 0 & 0 & 0 & 0 \
      hline
      textEC & frac15 & 0 & frac15 & frac25 & frac15 \
      hline
      textED & frac18 & frac18 & frac38 & frac18 & frac14 \
      hline
      textEE & 0 & frac25 & frac25 & frac15 & 0 \
      hline
      endarray$







      share|improve this answer






















      • Thanks a lot for this, however as a check I tried 'Partition[x, 3, 1] // Counts' which shows that I have 3 AAA instead of 5, this happens all over the table my worry is that the state ordering and data which you defined are not matched, for example in ordering E has value of 5, while in data it is 2.
        – William
        Aug 8 at 10:28










      • @William, re ordering of states, that's why we sort them using ordering so that 2 corresponds to E. Re the discrepancy between Partition[x, 3, 1] // Counts and the Prob[A|AA] in the table above, i think it is because table is based on the TransitionMatrix of estproc and estproc is based on one-step transitions (Partition[x,2,1]).
        – kglr
        Aug 8 at 10:39










      • So basically DiscreteMarkovProcess doesn't have a memory, because in second order the transition is remembering the two steps behind that's why it is AA,A and not for example A,A,A, interesting.
        – William
        Aug 8 at 10:51










      • @William, please see the update.
        – kglr
        Aug 8 at 12:26










      • thank you @kglr, it indeed works fine and is shorter yet if you change the variable in Partition[data, 3, 1] for example to Partition[data, 2, 1] or else, the final table will not come out appropriate. I think hector's answer is good because you can change the order
        – William
        Aug 8 at 12:37














      up vote
      5
      down vote













      Update: Using EmpiricalDistribution and MarginalDistribution to compute the conditional probabilities:



      ClearAll[transitionProb]
      transitionProb[step_: 1][x_] := Module[states = DeleteDuplicates@x,
      ed = EmpiricalDistribution[Partition[ArrayComponents @ x, step + 1, 1]],
      ordering, tuples, md, condpdF,
      ordering = Ordering[states]; tuples = Tuples[ordering, step];
      md = MarginalDistribution[ed, Range[step]];
      condpdF[u__, w_] := If[PDF[md, u] === 0, 0, PDF[ed, u, w]/PDF[md, u]];
      Prepend[Row @ states[[##]],
      ## & @@ Table[## & @@ condpdF[##, i], i, ordering] & @@@ tuples,
      Prepend[states[[ordering]], ""]]]


      Examples:



      transitionProb[2][x] // Grid[#, Dividers -> All] & // TeXForm



      $beginarray
      hline
      text & textA & textB & textC & textD & textE \
      hline
      textAtextA & frac35 & 0 & frac15 & 0 & frac15 \
      hline
      textAtextB & 0 & 0 & 0 & 0 & 0 \
      hline
      textAtextC & 0 & 0 & 1 & 0 & 0 \
      hline
      textAtextD & frac12 & frac12 & 0 & 0 & 0 \
      hline
      textAtextE & 0 & 0 & 0 & 1 & 0 \
      hline
      textBtextA & 0 & 0 & 0 & 1 & 0 \
      hline
      textBtextB & 0 & 0 & 0 & 0 & 0 \
      hline
      textBtextC & 0 & 0 & 0 & 0 & 0 \
      hline
      textBtextD & 0 & 0 & 0 & 0 & 0 \
      hline
      textBtextE & 0 & 0 & frac12 & 0 & frac12 \
      hline
      textCtextA & 0 & 0 & 0 & 1 & 0 \
      hline
      textCtextB & 0 & 0 & 0 & 0 & 1 \
      hline
      textCtextC & 0 & 0 & frac12 & frac12 & 0 \
      hline
      textCtextD & 0 & 0 & 0 & 1 & 0 \
      hline
      textCtextE & 0 & 0 & 0 & 0 & 0 \
      hline
      textDtextA & 1 & 0 & 0 & 0 & 0 \
      hline
      textDtextB & frac12 & 0 & 0 & 0 & frac12 \
      hline
      textDtextC & 0 & 1 & 0 & 0 & 0 \
      hline
      textDtextD & 0 & 0 & frac13 & frac13 & frac13 \
      hline
      textDtextE & 0 & 0 & 0 & 0 & 0 \
      hline
      textEtextA & 0 & 0 & 0 & 0 & 0 \
      hline
      textEtextB & 0 & 0 & 0 & 0 & 0 \
      hline
      textEtextC & 1 & 0 & 0 & 0 & 0 \
      hline
      textEtextD & 0 & frac12 & 0 & frac12 & 0 \
      hline
      textEtextE & 0 & 0 & 0 & frac12 & frac12 \
      hline
      endarray$




      transitionProb[1][x] // Grid[#, Dividers -> All] & // TeXForm



      $beginarray
      hline
      text & textA & textB & textC & textD & textE \
      hline
      textA & frac59 & 0 & frac19 & frac29 & frac19 \
      hline
      textB & frac13 & 0 & 0 & 0 & frac23 \
      hline
      textC & frac15 & frac15 & frac25 & frac15 & 0 \
      hline
      textD & frac18 & frac14 & frac18 & frac38 & frac18 \
      hline
      textE & 0 & 0 & frac15 & frac25 & frac25 \
      hline
      endarray$




      Original answer:



      states = DeleteDuplicates[x];
      ordering = Ordering[states];
      data = ArrayComponents@x ;
      estproc = EstimatedProcess[data, DiscreteMarkovProcess[Length@states]];
      tuples = Tuples[Range[5][[ordering]], 2];
      table = Row@states[[##]], ## & @@
      Table[Probability[p[3] == s [Conditioned] p[1] == # && p[2] == #2,
      p [Distributed] estproc], s, Range[Length @ states]] & @@@ tuples ;

      TeXForm @ Grid[Prepend[table, Prepend[states[[ordering]], ""]], Dividers -> All]



      $beginarray
      hline
      text & textA & textB & textC & textD & textE \
      hline
      textAA & frac59 & frac19 & frac29 & frac19 & 0 \
      hline
      textAB & 0 & 0 & 0 & 0 & 0 \
      hline
      textAC & frac15 & 0 & frac15 & frac25 & frac15 \
      hline
      textAD & frac18 & frac18 & frac38 & frac18 & frac14 \
      hline
      textAE & 0 & frac25 & frac25 & frac15 & 0 \
      hline
      textBA & 0 & 0 & 0 & 0 & 0 \
      hline
      textBB & 0 & 0 & 0 & 0 & 0 \
      hline
      textBC & 0 & 0 & 0 & 0 & 0 \
      hline
      textBD & 0 & 0 & 0 & 0 & 0 \
      hline
      textBE & 0 & 0 & 0 & 0 & 0 \
      hline
      textCA & frac59 & frac19 & frac29 & frac19 & 0 \
      hline
      textCB & frac13 & frac23 & 0 & 0 & 0 \
      hline
      textCC & frac15 & 0 & frac15 & frac25 & frac15 \
      hline
      textCD & frac18 & frac18 & frac38 & frac18 & frac14 \
      hline
      textCE & 0 & 0 & 0 & 0 & 0 \
      hline
      textDA & frac59 & frac19 & frac29 & frac19 & 0 \
      hline
      textDB & frac13 & frac23 & 0 & 0 & 0 \
      hline
      textDC & frac15 & 0 & frac15 & frac25 & frac15 \
      hline
      textDD & frac18 & frac18 & frac38 & frac18 & frac14 \
      hline
      textDE & 0 & frac25 & frac25 & frac15 & 0 \
      hline
      textEA & 0 & 0 & 0 & 0 & 0 \
      hline
      textEB & 0 & 0 & 0 & 0 & 0 \
      hline
      textEC & frac15 & 0 & frac15 & frac25 & frac15 \
      hline
      textED & frac18 & frac18 & frac38 & frac18 & frac14 \
      hline
      textEE & 0 & frac25 & frac25 & frac15 & 0 \
      hline
      endarray$







      share|improve this answer






















      • Thanks a lot for this, however as a check I tried 'Partition[x, 3, 1] // Counts' which shows that I have 3 AAA instead of 5, this happens all over the table my worry is that the state ordering and data which you defined are not matched, for example in ordering E has value of 5, while in data it is 2.
        – William
        Aug 8 at 10:28










      • @William, re ordering of states, that's why we sort them using ordering so that 2 corresponds to E. Re the discrepancy between Partition[x, 3, 1] // Counts and the Prob[A|AA] in the table above, i think it is because table is based on the TransitionMatrix of estproc and estproc is based on one-step transitions (Partition[x,2,1]).
        – kglr
        Aug 8 at 10:39










      • So basically DiscreteMarkovProcess doesn't have a memory, because in second order the transition is remembering the two steps behind that's why it is AA,A and not for example A,A,A, interesting.
        – William
        Aug 8 at 10:51










      • @William, please see the update.
        – kglr
        Aug 8 at 12:26










      • thank you @kglr, it indeed works fine and is shorter yet if you change the variable in Partition[data, 3, 1] for example to Partition[data, 2, 1] or else, the final table will not come out appropriate. I think hector's answer is good because you can change the order
        – William
        Aug 8 at 12:37












      up vote
      5
      down vote










      up vote
      5
      down vote









      Update: Using EmpiricalDistribution and MarginalDistribution to compute the conditional probabilities:



      ClearAll[transitionProb]
      transitionProb[step_: 1][x_] := Module[states = DeleteDuplicates@x,
      ed = EmpiricalDistribution[Partition[ArrayComponents @ x, step + 1, 1]],
      ordering, tuples, md, condpdF,
      ordering = Ordering[states]; tuples = Tuples[ordering, step];
      md = MarginalDistribution[ed, Range[step]];
      condpdF[u__, w_] := If[PDF[md, u] === 0, 0, PDF[ed, u, w]/PDF[md, u]];
      Prepend[Row @ states[[##]],
      ## & @@ Table[## & @@ condpdF[##, i], i, ordering] & @@@ tuples,
      Prepend[states[[ordering]], ""]]]


      Examples:



      transitionProb[2][x] // Grid[#, Dividers -> All] & // TeXForm



      $beginarray
      hline
      text & textA & textB & textC & textD & textE \
      hline
      textAtextA & frac35 & 0 & frac15 & 0 & frac15 \
      hline
      textAtextB & 0 & 0 & 0 & 0 & 0 \
      hline
      textAtextC & 0 & 0 & 1 & 0 & 0 \
      hline
      textAtextD & frac12 & frac12 & 0 & 0 & 0 \
      hline
      textAtextE & 0 & 0 & 0 & 1 & 0 \
      hline
      textBtextA & 0 & 0 & 0 & 1 & 0 \
      hline
      textBtextB & 0 & 0 & 0 & 0 & 0 \
      hline
      textBtextC & 0 & 0 & 0 & 0 & 0 \
      hline
      textBtextD & 0 & 0 & 0 & 0 & 0 \
      hline
      textBtextE & 0 & 0 & frac12 & 0 & frac12 \
      hline
      textCtextA & 0 & 0 & 0 & 1 & 0 \
      hline
      textCtextB & 0 & 0 & 0 & 0 & 1 \
      hline
      textCtextC & 0 & 0 & frac12 & frac12 & 0 \
      hline
      textCtextD & 0 & 0 & 0 & 1 & 0 \
      hline
      textCtextE & 0 & 0 & 0 & 0 & 0 \
      hline
      textDtextA & 1 & 0 & 0 & 0 & 0 \
      hline
      textDtextB & frac12 & 0 & 0 & 0 & frac12 \
      hline
      textDtextC & 0 & 1 & 0 & 0 & 0 \
      hline
      textDtextD & 0 & 0 & frac13 & frac13 & frac13 \
      hline
      textDtextE & 0 & 0 & 0 & 0 & 0 \
      hline
      textEtextA & 0 & 0 & 0 & 0 & 0 \
      hline
      textEtextB & 0 & 0 & 0 & 0 & 0 \
      hline
      textEtextC & 1 & 0 & 0 & 0 & 0 \
      hline
      textEtextD & 0 & frac12 & 0 & frac12 & 0 \
      hline
      textEtextE & 0 & 0 & 0 & frac12 & frac12 \
      hline
      endarray$




      transitionProb[1][x] // Grid[#, Dividers -> All] & // TeXForm



      $beginarray
      hline
      text & textA & textB & textC & textD & textE \
      hline
      textA & frac59 & 0 & frac19 & frac29 & frac19 \
      hline
      textB & frac13 & 0 & 0 & 0 & frac23 \
      hline
      textC & frac15 & frac15 & frac25 & frac15 & 0 \
      hline
      textD & frac18 & frac14 & frac18 & frac38 & frac18 \
      hline
      textE & 0 & 0 & frac15 & frac25 & frac25 \
      hline
      endarray$




      Original answer:



      states = DeleteDuplicates[x];
      ordering = Ordering[states];
      data = ArrayComponents@x ;
      estproc = EstimatedProcess[data, DiscreteMarkovProcess[Length@states]];
      tuples = Tuples[Range[5][[ordering]], 2];
      table = Row@states[[##]], ## & @@
      Table[Probability[p[3] == s [Conditioned] p[1] == # && p[2] == #2,
      p [Distributed] estproc], s, Range[Length @ states]] & @@@ tuples ;

      TeXForm @ Grid[Prepend[table, Prepend[states[[ordering]], ""]], Dividers -> All]



      $beginarray
      hline
      text & textA & textB & textC & textD & textE \
      hline
      textAA & frac59 & frac19 & frac29 & frac19 & 0 \
      hline
      textAB & 0 & 0 & 0 & 0 & 0 \
      hline
      textAC & frac15 & 0 & frac15 & frac25 & frac15 \
      hline
      textAD & frac18 & frac18 & frac38 & frac18 & frac14 \
      hline
      textAE & 0 & frac25 & frac25 & frac15 & 0 \
      hline
      textBA & 0 & 0 & 0 & 0 & 0 \
      hline
      textBB & 0 & 0 & 0 & 0 & 0 \
      hline
      textBC & 0 & 0 & 0 & 0 & 0 \
      hline
      textBD & 0 & 0 & 0 & 0 & 0 \
      hline
      textBE & 0 & 0 & 0 & 0 & 0 \
      hline
      textCA & frac59 & frac19 & frac29 & frac19 & 0 \
      hline
      textCB & frac13 & frac23 & 0 & 0 & 0 \
      hline
      textCC & frac15 & 0 & frac15 & frac25 & frac15 \
      hline
      textCD & frac18 & frac18 & frac38 & frac18 & frac14 \
      hline
      textCE & 0 & 0 & 0 & 0 & 0 \
      hline
      textDA & frac59 & frac19 & frac29 & frac19 & 0 \
      hline
      textDB & frac13 & frac23 & 0 & 0 & 0 \
      hline
      textDC & frac15 & 0 & frac15 & frac25 & frac15 \
      hline
      textDD & frac18 & frac18 & frac38 & frac18 & frac14 \
      hline
      textDE & 0 & frac25 & frac25 & frac15 & 0 \
      hline
      textEA & 0 & 0 & 0 & 0 & 0 \
      hline
      textEB & 0 & 0 & 0 & 0 & 0 \
      hline
      textEC & frac15 & 0 & frac15 & frac25 & frac15 \
      hline
      textED & frac18 & frac18 & frac38 & frac18 & frac14 \
      hline
      textEE & 0 & frac25 & frac25 & frac15 & 0 \
      hline
      endarray$







      share|improve this answer














      Update: Using EmpiricalDistribution and MarginalDistribution to compute the conditional probabilities:



      ClearAll[transitionProb]
      transitionProb[step_: 1][x_] := Module[states = DeleteDuplicates@x,
      ed = EmpiricalDistribution[Partition[ArrayComponents @ x, step + 1, 1]],
      ordering, tuples, md, condpdF,
      ordering = Ordering[states]; tuples = Tuples[ordering, step];
      md = MarginalDistribution[ed, Range[step]];
      condpdF[u__, w_] := If[PDF[md, u] === 0, 0, PDF[ed, u, w]/PDF[md, u]];
      Prepend[Row @ states[[##]],
      ## & @@ Table[## & @@ condpdF[##, i], i, ordering] & @@@ tuples,
      Prepend[states[[ordering]], ""]]]


      Examples:



      transitionProb[2][x] // Grid[#, Dividers -> All] & // TeXForm



      $beginarray
      hline
      text & textA & textB & textC & textD & textE \
      hline
      textAtextA & frac35 & 0 & frac15 & 0 & frac15 \
      hline
      textAtextB & 0 & 0 & 0 & 0 & 0 \
      hline
      textAtextC & 0 & 0 & 1 & 0 & 0 \
      hline
      textAtextD & frac12 & frac12 & 0 & 0 & 0 \
      hline
      textAtextE & 0 & 0 & 0 & 1 & 0 \
      hline
      textBtextA & 0 & 0 & 0 & 1 & 0 \
      hline
      textBtextB & 0 & 0 & 0 & 0 & 0 \
      hline
      textBtextC & 0 & 0 & 0 & 0 & 0 \
      hline
      textBtextD & 0 & 0 & 0 & 0 & 0 \
      hline
      textBtextE & 0 & 0 & frac12 & 0 & frac12 \
      hline
      textCtextA & 0 & 0 & 0 & 1 & 0 \
      hline
      textCtextB & 0 & 0 & 0 & 0 & 1 \
      hline
      textCtextC & 0 & 0 & frac12 & frac12 & 0 \
      hline
      textCtextD & 0 & 0 & 0 & 1 & 0 \
      hline
      textCtextE & 0 & 0 & 0 & 0 & 0 \
      hline
      textDtextA & 1 & 0 & 0 & 0 & 0 \
      hline
      textDtextB & frac12 & 0 & 0 & 0 & frac12 \
      hline
      textDtextC & 0 & 1 & 0 & 0 & 0 \
      hline
      textDtextD & 0 & 0 & frac13 & frac13 & frac13 \
      hline
      textDtextE & 0 & 0 & 0 & 0 & 0 \
      hline
      textEtextA & 0 & 0 & 0 & 0 & 0 \
      hline
      textEtextB & 0 & 0 & 0 & 0 & 0 \
      hline
      textEtextC & 1 & 0 & 0 & 0 & 0 \
      hline
      textEtextD & 0 & frac12 & 0 & frac12 & 0 \
      hline
      textEtextE & 0 & 0 & 0 & frac12 & frac12 \
      hline
      endarray$




      transitionProb[1][x] // Grid[#, Dividers -> All] & // TeXForm



      $beginarray
      hline
      text & textA & textB & textC & textD & textE \
      hline
      textA & frac59 & 0 & frac19 & frac29 & frac19 \
      hline
      textB & frac13 & 0 & 0 & 0 & frac23 \
      hline
      textC & frac15 & frac15 & frac25 & frac15 & 0 \
      hline
      textD & frac18 & frac14 & frac18 & frac38 & frac18 \
      hline
      textE & 0 & 0 & frac15 & frac25 & frac25 \
      hline
      endarray$




      Original answer:



      states = DeleteDuplicates[x];
      ordering = Ordering[states];
      data = ArrayComponents@x ;
      estproc = EstimatedProcess[data, DiscreteMarkovProcess[Length@states]];
      tuples = Tuples[Range[5][[ordering]], 2];
      table = Row@states[[##]], ## & @@
      Table[Probability[p[3] == s [Conditioned] p[1] == # && p[2] == #2,
      p [Distributed] estproc], s, Range[Length @ states]] & @@@ tuples ;

      TeXForm @ Grid[Prepend[table, Prepend[states[[ordering]], ""]], Dividers -> All]



      $beginarray
      hline
      text & textA & textB & textC & textD & textE \
      hline
      textAA & frac59 & frac19 & frac29 & frac19 & 0 \
      hline
      textAB & 0 & 0 & 0 & 0 & 0 \
      hline
      textAC & frac15 & 0 & frac15 & frac25 & frac15 \
      hline
      textAD & frac18 & frac18 & frac38 & frac18 & frac14 \
      hline
      textAE & 0 & frac25 & frac25 & frac15 & 0 \
      hline
      textBA & 0 & 0 & 0 & 0 & 0 \
      hline
      textBB & 0 & 0 & 0 & 0 & 0 \
      hline
      textBC & 0 & 0 & 0 & 0 & 0 \
      hline
      textBD & 0 & 0 & 0 & 0 & 0 \
      hline
      textBE & 0 & 0 & 0 & 0 & 0 \
      hline
      textCA & frac59 & frac19 & frac29 & frac19 & 0 \
      hline
      textCB & frac13 & frac23 & 0 & 0 & 0 \
      hline
      textCC & frac15 & 0 & frac15 & frac25 & frac15 \
      hline
      textCD & frac18 & frac18 & frac38 & frac18 & frac14 \
      hline
      textCE & 0 & 0 & 0 & 0 & 0 \
      hline
      textDA & frac59 & frac19 & frac29 & frac19 & 0 \
      hline
      textDB & frac13 & frac23 & 0 & 0 & 0 \
      hline
      textDC & frac15 & 0 & frac15 & frac25 & frac15 \
      hline
      textDD & frac18 & frac18 & frac38 & frac18 & frac14 \
      hline
      textDE & 0 & frac25 & frac25 & frac15 & 0 \
      hline
      textEA & 0 & 0 & 0 & 0 & 0 \
      hline
      textEB & 0 & 0 & 0 & 0 & 0 \
      hline
      textEC & frac15 & 0 & frac15 & frac25 & frac15 \
      hline
      textED & frac18 & frac18 & frac38 & frac18 & frac14 \
      hline
      textEE & 0 & frac25 & frac25 & frac15 & 0 \
      hline
      endarray$








      share|improve this answer














      share|improve this answer



      share|improve this answer








      edited Aug 8 at 13:34

























      answered Aug 8 at 10:11









      kglr

      157k8182379




      157k8182379











      • Thanks a lot for this, however as a check I tried 'Partition[x, 3, 1] // Counts' which shows that I have 3 AAA instead of 5, this happens all over the table my worry is that the state ordering and data which you defined are not matched, for example in ordering E has value of 5, while in data it is 2.
        – William
        Aug 8 at 10:28










      • @William, re ordering of states, that's why we sort them using ordering so that 2 corresponds to E. Re the discrepancy between Partition[x, 3, 1] // Counts and the Prob[A|AA] in the table above, i think it is because table is based on the TransitionMatrix of estproc and estproc is based on one-step transitions (Partition[x,2,1]).
        – kglr
        Aug 8 at 10:39










      • So basically DiscreteMarkovProcess doesn't have a memory, because in second order the transition is remembering the two steps behind that's why it is AA,A and not for example A,A,A, interesting.
        – William
        Aug 8 at 10:51










      • @William, please see the update.
        – kglr
        Aug 8 at 12:26










      • thank you @kglr, it indeed works fine and is shorter yet if you change the variable in Partition[data, 3, 1] for example to Partition[data, 2, 1] or else, the final table will not come out appropriate. I think hector's answer is good because you can change the order
        – William
        Aug 8 at 12:37
















      • Thanks a lot for this, however as a check I tried 'Partition[x, 3, 1] // Counts' which shows that I have 3 AAA instead of 5, this happens all over the table my worry is that the state ordering and data which you defined are not matched, for example in ordering E has value of 5, while in data it is 2.
        – William
        Aug 8 at 10:28










      • @William, re ordering of states, that's why we sort them using ordering so that 2 corresponds to E. Re the discrepancy between Partition[x, 3, 1] // Counts and the Prob[A|AA] in the table above, i think it is because table is based on the TransitionMatrix of estproc and estproc is based on one-step transitions (Partition[x,2,1]).
        – kglr
        Aug 8 at 10:39










      • So basically DiscreteMarkovProcess doesn't have a memory, because in second order the transition is remembering the two steps behind that's why it is AA,A and not for example A,A,A, interesting.
        – William
        Aug 8 at 10:51










      • @William, please see the update.
        – kglr
        Aug 8 at 12:26










      • thank you @kglr, it indeed works fine and is shorter yet if you change the variable in Partition[data, 3, 1] for example to Partition[data, 2, 1] or else, the final table will not come out appropriate. I think hector's answer is good because you can change the order
        – William
        Aug 8 at 12:37















      Thanks a lot for this, however as a check I tried 'Partition[x, 3, 1] // Counts' which shows that I have 3 AAA instead of 5, this happens all over the table my worry is that the state ordering and data which you defined are not matched, for example in ordering E has value of 5, while in data it is 2.
      – William
      Aug 8 at 10:28




      Thanks a lot for this, however as a check I tried 'Partition[x, 3, 1] // Counts' which shows that I have 3 AAA instead of 5, this happens all over the table my worry is that the state ordering and data which you defined are not matched, for example in ordering E has value of 5, while in data it is 2.
      – William
      Aug 8 at 10:28












      @William, re ordering of states, that's why we sort them using ordering so that 2 corresponds to E. Re the discrepancy between Partition[x, 3, 1] // Counts and the Prob[A|AA] in the table above, i think it is because table is based on the TransitionMatrix of estproc and estproc is based on one-step transitions (Partition[x,2,1]).
      – kglr
      Aug 8 at 10:39




      @William, re ordering of states, that's why we sort them using ordering so that 2 corresponds to E. Re the discrepancy between Partition[x, 3, 1] // Counts and the Prob[A|AA] in the table above, i think it is because table is based on the TransitionMatrix of estproc and estproc is based on one-step transitions (Partition[x,2,1]).
      – kglr
      Aug 8 at 10:39












      So basically DiscreteMarkovProcess doesn't have a memory, because in second order the transition is remembering the two steps behind that's why it is AA,A and not for example A,A,A, interesting.
      – William
      Aug 8 at 10:51




      So basically DiscreteMarkovProcess doesn't have a memory, because in second order the transition is remembering the two steps behind that's why it is AA,A and not for example A,A,A, interesting.
      – William
      Aug 8 at 10:51












      @William, please see the update.
      – kglr
      Aug 8 at 12:26




      @William, please see the update.
      – kglr
      Aug 8 at 12:26












      thank you @kglr, it indeed works fine and is shorter yet if you change the variable in Partition[data, 3, 1] for example to Partition[data, 2, 1] or else, the final table will not come out appropriate. I think hector's answer is good because you can change the order
      – William
      Aug 8 at 12:37




      thank you @kglr, it indeed works fine and is shorter yet if you change the variable in Partition[data, 3, 1] for example to Partition[data, 2, 1] or else, the final table will not come out appropriate. I think hector's answer is good because you can change the order
      – William
      Aug 8 at 12:37










      up vote
      2
      down vote













      As a variant of my answer to the linked question, the following should work correctly and efficiently.



      Some random data to work with:



      x = RandomChoice[Alphabet["English", "IndexCharacters"], 1000000];


      Creating the probability tensor P:



      n = 2;
      data = Flatten[ToCharacterCode[x]] - (ToCharacterCode["A"][[1]] - 1); // AbsoluteTiming // First
      A = With[spopt = SystemOptions["SparseArrayOptions"],
      Internal`WithLocalSettings[
      (*switch to additive assembly*)
      SetSystemOptions["SparseArrayOptions" -> "TreatRepeatedEntries" -> Total],

      (*assemble matrix*)
      SparseArray[Partition[data, n + 1, 1] -> 1, ConstantArray[Max[data], n + 1] ],

      (*reset "SparseArrayOptions" to previous value*)
      SetSystemOptions[spopt]]]; // AbsoluteTiming // First
      P = #/N[Total[Abs[#], n + 1] /. 0 -> 1] &@Flatten[A, n - 1];



      0.717521



      0.184357




      The row labels of P should be



      Tuples[Sort[DeleteDuplicates[x]], n]





      share|improve this answer


























        up vote
        2
        down vote













        As a variant of my answer to the linked question, the following should work correctly and efficiently.



        Some random data to work with:



        x = RandomChoice[Alphabet["English", "IndexCharacters"], 1000000];


        Creating the probability tensor P:



        n = 2;
        data = Flatten[ToCharacterCode[x]] - (ToCharacterCode["A"][[1]] - 1); // AbsoluteTiming // First
        A = With[spopt = SystemOptions["SparseArrayOptions"],
        Internal`WithLocalSettings[
        (*switch to additive assembly*)
        SetSystemOptions["SparseArrayOptions" -> "TreatRepeatedEntries" -> Total],

        (*assemble matrix*)
        SparseArray[Partition[data, n + 1, 1] -> 1, ConstantArray[Max[data], n + 1] ],

        (*reset "SparseArrayOptions" to previous value*)
        SetSystemOptions[spopt]]]; // AbsoluteTiming // First
        P = #/N[Total[Abs[#], n + 1] /. 0 -> 1] &@Flatten[A, n - 1];



        0.717521



        0.184357




        The row labels of P should be



        Tuples[Sort[DeleteDuplicates[x]], n]





        share|improve this answer
























          up vote
          2
          down vote










          up vote
          2
          down vote









          As a variant of my answer to the linked question, the following should work correctly and efficiently.



          Some random data to work with:



          x = RandomChoice[Alphabet["English", "IndexCharacters"], 1000000];


          Creating the probability tensor P:



          n = 2;
          data = Flatten[ToCharacterCode[x]] - (ToCharacterCode["A"][[1]] - 1); // AbsoluteTiming // First
          A = With[spopt = SystemOptions["SparseArrayOptions"],
          Internal`WithLocalSettings[
          (*switch to additive assembly*)
          SetSystemOptions["SparseArrayOptions" -> "TreatRepeatedEntries" -> Total],

          (*assemble matrix*)
          SparseArray[Partition[data, n + 1, 1] -> 1, ConstantArray[Max[data], n + 1] ],

          (*reset "SparseArrayOptions" to previous value*)
          SetSystemOptions[spopt]]]; // AbsoluteTiming // First
          P = #/N[Total[Abs[#], n + 1] /. 0 -> 1] &@Flatten[A, n - 1];



          0.717521



          0.184357




          The row labels of P should be



          Tuples[Sort[DeleteDuplicates[x]], n]





          share|improve this answer














          As a variant of my answer to the linked question, the following should work correctly and efficiently.



          Some random data to work with:



          x = RandomChoice[Alphabet["English", "IndexCharacters"], 1000000];


          Creating the probability tensor P:



          n = 2;
          data = Flatten[ToCharacterCode[x]] - (ToCharacterCode["A"][[1]] - 1); // AbsoluteTiming // First
          A = With[spopt = SystemOptions["SparseArrayOptions"],
          Internal`WithLocalSettings[
          (*switch to additive assembly*)
          SetSystemOptions["SparseArrayOptions" -> "TreatRepeatedEntries" -> Total],

          (*assemble matrix*)
          SparseArray[Partition[data, n + 1, 1] -> 1, ConstantArray[Max[data], n + 1] ],

          (*reset "SparseArrayOptions" to previous value*)
          SetSystemOptions[spopt]]]; // AbsoluteTiming // First
          P = #/N[Total[Abs[#], n + 1] /. 0 -> 1] &@Flatten[A, n - 1];



          0.717521



          0.184357




          The row labels of P should be



          Tuples[Sort[DeleteDuplicates[x]], n]






          share|improve this answer














          share|improve this answer



          share|improve this answer








          edited Aug 8 at 21:13

























          answered Aug 8 at 19:05









          Henrik Schumacher

          35.9k249102




          35.9k249102




















              up vote
              1
              down vote













              You can use CrossTensorate from the package CrossTabulate.m, which I used and referenced in my answer of the previous question.



              The making of contingency tensors with that function is discussed in this blog post: "Contingency tables creation examples".



              In general, though, I would say it is better to use Tries with Frequencies or nested associations.



              tmat3 = CrossTensorate[Count == 1 + 2 + 3, Partition[x, 3, 1]];

              tmat4 = CrossTensorate[Count == 1 + 2 + 3 + 4, Partition[x, 4, 1]];

              tmat3["XTABTensor"] = #/(Total[#, Length[Dimensions[#]]] /. 0 -> 1) &@tmat3["XTABTensor"];
              tmat4["XTABTensor"] = #/(Total[#, Length[Dimensions[#]]] /. 0 -> 1) &@tmat4["XTABTensor"];

              Grid["tmat3", "tmat4", MatrixForm[tmat3], MatrixForm[tmat4]]


              enter image description here



              ArrayRules[tmat3["XTABTensor"]]

              (* 1, 1, 1 -> 3/5, 1, 1, 5 -> 1/5, 1, 5, 4 ->
              1, 1, 4, 2 -> 1/2, 1, 4, 1 -> 1/2, 1, 1, 3 -> 1/
              5, 1, 3, 3 -> 1, 2, 5, 5 -> 1/2, 2, 1, 4 -> 1, 2, 5, 3 -> 1/
              2, 3, 2, 5 -> 1, 3, 1, 4 -> 1, 3, 3, 3 -> 1/2, 3, 3, 4 -> 1/
              2, 3, 4, 4 -> 1, 4, 4, 4 -> 1/3, 4, 4, 3 -> 1/3, 4, 3, 2 ->
              1, 4, 2, 1 -> 1/2, 4, 2, 5 -> 1/2, 4, 1, 1 -> 1, 4, 4, 5 ->
              1/3, 5, 4, 4 -> 1/2, 5, 5, 5 -> 1/2, 5, 5, 4 -> 1/
              2, 5, 4, 2 -> 1/2, 5, 3, 1 -> 1, _, _, _ -> 0 *)





              share|improve this answer


























                up vote
                1
                down vote













                You can use CrossTensorate from the package CrossTabulate.m, which I used and referenced in my answer of the previous question.



                The making of contingency tensors with that function is discussed in this blog post: "Contingency tables creation examples".



                In general, though, I would say it is better to use Tries with Frequencies or nested associations.



                tmat3 = CrossTensorate[Count == 1 + 2 + 3, Partition[x, 3, 1]];

                tmat4 = CrossTensorate[Count == 1 + 2 + 3 + 4, Partition[x, 4, 1]];

                tmat3["XTABTensor"] = #/(Total[#, Length[Dimensions[#]]] /. 0 -> 1) &@tmat3["XTABTensor"];
                tmat4["XTABTensor"] = #/(Total[#, Length[Dimensions[#]]] /. 0 -> 1) &@tmat4["XTABTensor"];

                Grid["tmat3", "tmat4", MatrixForm[tmat3], MatrixForm[tmat4]]


                enter image description here



                ArrayRules[tmat3["XTABTensor"]]

                (* 1, 1, 1 -> 3/5, 1, 1, 5 -> 1/5, 1, 5, 4 ->
                1, 1, 4, 2 -> 1/2, 1, 4, 1 -> 1/2, 1, 1, 3 -> 1/
                5, 1, 3, 3 -> 1, 2, 5, 5 -> 1/2, 2, 1, 4 -> 1, 2, 5, 3 -> 1/
                2, 3, 2, 5 -> 1, 3, 1, 4 -> 1, 3, 3, 3 -> 1/2, 3, 3, 4 -> 1/
                2, 3, 4, 4 -> 1, 4, 4, 4 -> 1/3, 4, 4, 3 -> 1/3, 4, 3, 2 ->
                1, 4, 2, 1 -> 1/2, 4, 2, 5 -> 1/2, 4, 1, 1 -> 1, 4, 4, 5 ->
                1/3, 5, 4, 4 -> 1/2, 5, 5, 5 -> 1/2, 5, 5, 4 -> 1/
                2, 5, 4, 2 -> 1/2, 5, 3, 1 -> 1, _, _, _ -> 0 *)





                share|improve this answer
























                  up vote
                  1
                  down vote










                  up vote
                  1
                  down vote









                  You can use CrossTensorate from the package CrossTabulate.m, which I used and referenced in my answer of the previous question.



                  The making of contingency tensors with that function is discussed in this blog post: "Contingency tables creation examples".



                  In general, though, I would say it is better to use Tries with Frequencies or nested associations.



                  tmat3 = CrossTensorate[Count == 1 + 2 + 3, Partition[x, 3, 1]];

                  tmat4 = CrossTensorate[Count == 1 + 2 + 3 + 4, Partition[x, 4, 1]];

                  tmat3["XTABTensor"] = #/(Total[#, Length[Dimensions[#]]] /. 0 -> 1) &@tmat3["XTABTensor"];
                  tmat4["XTABTensor"] = #/(Total[#, Length[Dimensions[#]]] /. 0 -> 1) &@tmat4["XTABTensor"];

                  Grid["tmat3", "tmat4", MatrixForm[tmat3], MatrixForm[tmat4]]


                  enter image description here



                  ArrayRules[tmat3["XTABTensor"]]

                  (* 1, 1, 1 -> 3/5, 1, 1, 5 -> 1/5, 1, 5, 4 ->
                  1, 1, 4, 2 -> 1/2, 1, 4, 1 -> 1/2, 1, 1, 3 -> 1/
                  5, 1, 3, 3 -> 1, 2, 5, 5 -> 1/2, 2, 1, 4 -> 1, 2, 5, 3 -> 1/
                  2, 3, 2, 5 -> 1, 3, 1, 4 -> 1, 3, 3, 3 -> 1/2, 3, 3, 4 -> 1/
                  2, 3, 4, 4 -> 1, 4, 4, 4 -> 1/3, 4, 4, 3 -> 1/3, 4, 3, 2 ->
                  1, 4, 2, 1 -> 1/2, 4, 2, 5 -> 1/2, 4, 1, 1 -> 1, 4, 4, 5 ->
                  1/3, 5, 4, 4 -> 1/2, 5, 5, 5 -> 1/2, 5, 5, 4 -> 1/
                  2, 5, 4, 2 -> 1/2, 5, 3, 1 -> 1, _, _, _ -> 0 *)





                  share|improve this answer














                  You can use CrossTensorate from the package CrossTabulate.m, which I used and referenced in my answer of the previous question.



                  The making of contingency tensors with that function is discussed in this blog post: "Contingency tables creation examples".



                  In general, though, I would say it is better to use Tries with Frequencies or nested associations.



                  tmat3 = CrossTensorate[Count == 1 + 2 + 3, Partition[x, 3, 1]];

                  tmat4 = CrossTensorate[Count == 1 + 2 + 3 + 4, Partition[x, 4, 1]];

                  tmat3["XTABTensor"] = #/(Total[#, Length[Dimensions[#]]] /. 0 -> 1) &@tmat3["XTABTensor"];
                  tmat4["XTABTensor"] = #/(Total[#, Length[Dimensions[#]]] /. 0 -> 1) &@tmat4["XTABTensor"];

                  Grid["tmat3", "tmat4", MatrixForm[tmat3], MatrixForm[tmat4]]


                  enter image description here



                  ArrayRules[tmat3["XTABTensor"]]

                  (* 1, 1, 1 -> 3/5, 1, 1, 5 -> 1/5, 1, 5, 4 ->
                  1, 1, 4, 2 -> 1/2, 1, 4, 1 -> 1/2, 1, 1, 3 -> 1/
                  5, 1, 3, 3 -> 1, 2, 5, 5 -> 1/2, 2, 1, 4 -> 1, 2, 5, 3 -> 1/
                  2, 3, 2, 5 -> 1, 3, 1, 4 -> 1, 3, 3, 3 -> 1/2, 3, 3, 4 -> 1/
                  2, 3, 4, 4 -> 1, 4, 4, 4 -> 1/3, 4, 4, 3 -> 1/3, 4, 3, 2 ->
                  1, 4, 2, 1 -> 1/2, 4, 2, 5 -> 1/2, 4, 1, 1 -> 1, 4, 4, 5 ->
                  1/3, 5, 4, 4 -> 1/2, 5, 5, 5 -> 1/2, 5, 5, 4 -> 1/
                  2, 5, 4, 2 -> 1/2, 5, 3, 1 -> 1, _, _, _ -> 0 *)






                  share|improve this answer














                  share|improve this answer



                  share|improve this answer








                  edited Aug 8 at 21:03

























                  answered Aug 8 at 20:56









                  Anton Antonov

                  21.4k162107




                  21.4k162107



























                       

                      draft saved


                      draft discarded















































                       


                      draft saved


                      draft discarded














                      StackExchange.ready(
                      function ()
                      StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmathematica.stackexchange.com%2fquestions%2f179684%2fconstructing-higher-order-transition-probability-matrix%23new-answer', 'question_page');

                      );

                      Post as a guest













































































                      Comments

                      Popular posts from this blog

                      List of Gilmore Girls characters

                      What does second last employer means? [closed]

                      One-line joke