Create all combinations of letter substitution in string
I have a string "ECET" and I would like to create all the possible strings where I substitute one or more letters (all but the first) with "X".
So in this case my result would be:
> result
[1] "EXET" "ECXT" "ECEX" "EXXT" "EXEX" "ECXX" "EXXX"
Any ideas as to how to approach the issue?
This is not just about creating the possible combinations/permutations of "X", but also about how to combine them with the existing string.
r combinations
Not really, because I don't only need the combinations as such. I checked the question. (Or at least I think :) )
- User2321, Sep 6 at 9:32
@Salman, that's really only one part of the question.
- Axeman, Sep 6 at 9:32
edited Sep 6 at 9:38
asked Sep 6 at 9:29
User2321
1,168719
7 Answers
Accepted answer (score 13)
Using the FUN argument of combn:
a <- "ECET"
fun <- function(n, string) {
  combn(nchar(string), n, function(x) {
    s <- strsplit(string, '')[[1]]
    s[x] <- 'X'
    paste(s, collapse = '')
  })
}
lapply(seq_len(nchar(a)), fun, string = a)
[[1]]
[1] "XCET" "EXET" "ECXT" "ECEX"
[[2]]
[1] "XXET" "XCXT" "XCEX" "EXXT" "EXEX" "ECXX"
[[3]]
[1] "XXXT" "XXEX" "XCXX" "EXXX"
[[4]]
[1] "XXXX"
Use unlist to get a single vector. Faster solutions are probably available.
To leave your first character unchanged:
paste0(
  substring(a, 1, 1),
  unlist(lapply(seq_len(nchar(a) - 1), fun, string = substring(a, 2)))
)
[1] "EXET" "ECXT" "ECEX" "EXXT" "EXEX" "ECXX" "EXXX"
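The two steps above (enumerate every combination, then protect the first character) can be folded into a single function. The following is only a minimal sketch under stated assumptions, not part of the original answer: the name substitute_all and its arguments repl and keep are hypothetical. It enumerates indices into an explicit vector of replaceable positions, because combn(x, m) treats a single positive integer x as seq_len(x).

```r
# Hypothetical helper (name and arguments are illustrative, not from the answer).
# Replaces every non-empty subset of the characters after the first `keep`
# characters of `string` with `repl`.
substitute_all <- function(string, repl = "X", keep = 1) {
  chars <- strsplit(string, "")[[1]]
  pos <- seq_along(chars)
  pos <- pos[pos > keep]                      # positions allowed to change
  out <- lapply(seq_along(pos), function(n) {
    # combn() enumerates index sets into `pos`, smallest subsets first
    combn(length(pos), n, function(i) {
      s <- chars
      s[pos[i]] <- repl
      paste(s, collapse = "")
    })
  })
  as.character(unlist(out))                   # character(0) if nothing can change
}

substitute_all("ECET")
# "EXET" "ECXT" "ECEX" "EXXT" "EXEX" "ECXX" "EXXX"
```

Passing keep = 0 would also allow the first character to be replaced, which corresponds to the unlisted output of the first call in the answer.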
Answer (score 7)
Here's a recursive solution:
x <- "ECET"
f <- function(x, pos = 2) {
  if (pos <= nchar(x))
    c(f(x, pos + 1), f(`substr<-`(x, pos, pos, "X"), pos + 1))
  else x
}
f(x)[-1]
# [1] "ECEX" "ECXT" "ECXX" "EXET" "EXEX" "EXXT" "EXXX"
Or using expand.grid:
do.call(paste0, expand.grid(c(substr(x, 1, 1), lapply(strsplit(x, "")[[1]][-1], c, "X"))))[-1]
# [1] "EXET" "ECXT" "EXXT" "ECEX" "EXEX" "ECXX" "EXXX"
Or using combn / Reduce / substr<-:
combs <- unlist(lapply(seq(nchar(x) - 1), combn, x = seq(nchar(x))[-1], simplify = FALSE), FALSE)
sapply(combs, Reduce, f = function(x, y) `substr<-`(x, y, y, "X"), init = x)
# [1] "EXET" "ECXT" "ECEX" "EXXT" "EXEX" "ECXX" "EXXX"
Second solution explained:
pairs0 <- lapply(strsplit(x, "")[[1]][-1], c, "X") # pairs of original letter + "X"
pairs1 <- c(substr(x, 1, 1), pairs0)               # including 1st letter (without "X")
do.call(paste0, expand.grid(pairs1))[-1]           # expand into data.frame, paste, drop the unchanged string
Answer (score 6)
Kind of for the sake of adding another option, using binary logic.
Assuming your string is always 4 characters long:
input <- "ECET"
invec <- strsplit(input, '')[[1]]
sapply(1:7, function(x) {
  z <- invec
  z[rev(as.logical(intToBits(x))[1:4])] <- "X"
  paste0(z, collapse = '')
})
[1] "ECEX" "ECXT" "ECXX" "EXET" "EXEX" "EXXT" "EXXX"
If the string can be longer, you can compute the upper bound as a power of 2; something like this should do:
input <- "ECETC"
pow <- nchar(input)
invec <- strsplit(input, '')[[1]]
sapply(1:(2^(pow-1) - 1), function(x) {
  z <- invec
  z[rev(as.logical(intToBits(x))[1:(pow)])] <- "X"
  paste0(z, collapse = '')
})
[1] "ECETX" "ECEXC" "ECEXX" "ECXTC" "ECXTX" "ECXXC" "ECXXX" "EXETC" "EXETX" "EXEXC" "EXEXX" "EXXTC" "EXXTX" "EXXXC"
[15] "EXXXX"
The idea is to count the possible alterations: for "ECET" there are 3 replaceable positions, each either kept or replaced, so 2^3 patterns, minus 1 because we don't want the unchanged string: 7.
intToBits returns the binary representation of an integer, least significant bit first. For 5:
> intToBits(5)
[1] 01 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
R integers are 32 bits, but we just want a logical vector matching our string length, so we keep only the first nchar(input) bits.
Then we convert to logical and reverse these 4 boolean values. Because the counter never reaches the 4th bit (value 8 for 4 chars), that bit is never TRUE, so the first character is never replaced:
> intToBits(5)
[1] 01 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> tmp <- as.logical(intToBits(5)[1:4])
> tmp
[1] TRUE FALSE TRUE FALSE
> rev(tmp)
[1] FALSE TRUE FALSE TRUE
To avoid overwriting our original vector, we copy it into z and then replace the positions in z selected by this logical vector.
For a nice output, we return paste0 with collapse = '' to recreate a single string, so that sapply yields a character vector.
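The bit-mask explanation above can be condensed into one function that works for any string length. This is a rough sketch rather than the answer's own code: the name mask_substitute and the repl argument are hypothetical. Instead of relying on the top bit never being set, it takes only as many bits as there are replaceable positions and prepends an explicit FALSE for the first character. It assumes at most 31 replaceable positions, since intToBits yields 32 bits.

```r
# Hypothetical helper name; a sketch of the bit-mask idea for any string length.
mask_substitute <- function(input, repl = "X") {
  chars <- strsplit(input, "")[[1]]
  n <- length(chars) - 1                  # replaceable positions (all but the first)
  if (n < 1) return(character(0))
  vapply(seq_len(2^n - 1), function(k) {
    # bits of k, least significant first, one per replaceable position
    bits <- as.logical(intToBits(k))[seq_len(n)]
    z <- chars
    # reverse so the lowest bit maps to the last character;
    # the leading FALSE protects the first character
    z[c(FALSE, rev(bits))] <- repl
    paste0(z, collapse = "")
  }, character(1))
}

mask_substitute("ECET")
# "ECEX" "ECXT" "ECXX" "EXET" "EXEX" "EXXT" "EXXX"
```

Using vapply with character(1) instead of sapply is a small design choice: it guarantees a character vector even when the input yields a single result.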
Answer (score 3)
Another version with combn, using purrr:
s <- "ECET"
f <- function(x, y) { substr(x, y, y) <- "X"; x }
g <- function(x) purrr::reduce(x, f, .init = s)
unlist(purrr::map(1:(nchar(s)-1), function(x) combn(2:nchar(s), x, g)))
#[1] "EXET" "ECXT" "ECEX" "EXXT" "EXEX" "ECXX" "EXXX"
or without purrr:
s <- "ECET"
f <- function(x, y) { substr(x, y, y) <- "X"; x }
g <- function(x) Reduce(f, x, s)
unlist(lapply(1:(nchar(s)-1), function(x) combn(2:nchar(s), x, g)))
Answer (score 2)
Here is a base R solution, but I find it complicated, with 3 nested loops.
replaceChar <- function(x, char = "X") {
  n <- nchar(x)
  res <- NULL
  for (i in seq_len(n)) {
    cmb <- combn(n, i)
    r <- apply(cmb, 2, function(cc) {
      y <- x
      for (k in cc)
        substr(y, k, k) <- char
      y
    })
    res <- c(res, r)
  }
  res
}
x <- "ECET"
replaceChar(x)
replaceChar(x, "Y")
replaceChar(paste0(x, x))
Answer (score 1)
A vectorized method with boolean indexing:
permX <- function(text, replChar='X') {
  library(gtools)
  library(stringr)
  # get TRUE/FALSE permutations for nchar(text)
  idx <- permutations(2, nchar(text), c(TRUE, FALSE), repeats.allowed = TRUE)
  # we don't want the first character to be replaced
  idx <- idx[1:(nrow(idx)/2),]
  # split string into single chars
  chars <- str_split(text, '')
  # build a character matrix with nrow(df) == nrow(idx)
  df <- t(data.frame(rep(chars, nrow(idx))))
  # do replacing
  df[idx] <- replChar
  row.names(df) <- c()
  return(df)
}
permX('ECET')
     [,1] [,2] [,3] [,4]
[1,] "E"  "C"  "E"  "T"
[2,] "E"  "C"  "E"  "X"
[3,] "E"  "C"  "X"  "T"
[4,] "E"  "C"  "X"  "X"
[5,] "E"  "X"  "E"  "T"
[6,] "E"  "X"  "E"  "X"
[7,] "E"  "X"  "X"  "T"
[8,] "E"  "X"  "X"  "X"
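The function above returns a character matrix whose first row is still the unchanged string. A rough sketch of getting from such a matrix to the vector of strings the question asks for is shown here. To stay self-contained, it builds the TRUE/FALSE patterns with base R's expand.grid rather than gtools::permutations; that is a swapped-in technique, not what the answer uses, so the row order differs. Variable names are illustrative, and the input is assumed to have at least two characters.

```r
text <- "ECET"
chars <- strsplit(text, "")[[1]]
n <- length(chars)

# One row per TRUE/FALSE pattern over positions 2..n (assumes n >= 2).
idx <- as.matrix(expand.grid(rep(list(c(FALSE, TRUE)), n - 1)))
# Prepend an all-FALSE column so the first character is never replaced.
idx <- cbind(FALSE, idx)

# Repeat the characters row-wise, then replace wherever the mask is TRUE.
m <- matrix(chars, nrow = nrow(idx), ncol = n, byrow = TRUE)
m[idx] <- "X"

# Collapse each row into one string and drop the unchanged first row.
res <- apply(m, 1, paste, collapse = "")[-1]
res
# "EXET" "ECXT" "EXXT" "ECEX" "EXEX" "ECXX" "EXXX"
```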
Answer (score 1)
One more simple solution:
# expand.grid to get all combinations of the input vectors; the result is a data frame
m <- expand.grid( c('E'),
                  c('C','X'),
                  c('E','X'),
                  c('T','X') )
# then, optionally, use apply to paste the columns of each row together
apply(m, 1, paste0, collapse='')[-1]
[1] "EXET" "ECXT" "EXXT" "ECEX" "EXEX" "ECXX" "EXXX"
New contributor
Would be a complete answer if the building of m was done from a string and not from manual input (but mostly that would be Moody's second option).
- Tensibai, Sep 7 at 13:24
Moody's second option as a one-line solution is truly excellent. But it's very terse, with a lot packed in. I think it's worth also showing this way, as it's clearer what's happening at each step. The problem was simple enough that it didn't require coding to put the input into expand.grid().
- krads, Sep 7 at 14:48
I assume the question just takes one example of 4 letters (maybe some kind of biological sequence) and wishes to apply that to a large number afterwards, so showing how to build the various vectors in m would be better, in my opinion.
- Tensibai, Sep 7 at 15:05
I think it's useful to show an intuitive solution even if it's not general. I've updated my answer to make my 2nd solution more understandable :)
- Moody_Mudskipper, Sep 7 at 22:54
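Following up on the comments above, the choice vectors passed to expand.grid can be derived from the string instead of being typed by hand. This is a minimal sketch, assuming a hypothetical function name grid_substitute and argument repl; it is essentially Moody's second option spelled out step by step, not a new method.

```r
# Hypothetical helper: builds the expand.grid() inputs from the string itself.
grid_substitute <- function(string, repl = "X") {
  chars <- strsplit(string, "")[[1]]
  # The first letter stands alone; every later letter is paired with `repl`.
  choices <- c(list(chars[1]), lapply(chars[-1], c, repl))
  # One row per combination, first column varying fastest.
  m <- expand.grid(choices, stringsAsFactors = FALSE)
  # Paste each row into one string; row 1 is the unchanged input, so drop it.
  unname(apply(m, 1, paste0, collapse = "")[-1])
}

grid_substitute("ECET")
# "EXET" "ECXT" "EXXT" "ECEX" "EXEX" "ECXX" "EXXX"
```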
edited Sep 6 at 12:18
answered Sep 6 at 9:41
Axeman
17.5k43954
edited Sep 7 at 22:46
answered Sep 6 at 10:49
Moody_Mudskipper
17.7k32154
answered Sep 6 at 13:29
Tensibai
13.6k12946
edited Sep 6 at 10:50
answered Sep 6 at 10:39
Nicolas2
761119
edited Sep 6 at 10:35
answered Sep 6 at 9:52
Rui Barradas
12.1k31628
edited Sep 6 at 10:08
answered Sep 6 at 9:59
psychOle
673612
edited yesterday
answered Sep 7 at 11:04
krads
Would be a complete answer if the building of m was done from a string and not from manual input (but mostly that would be Moody's second option).
– Tensibai
Sep 7 at 13:24
Moody's second option as a one-line solution is truly excellent. But it's very terse with a lot packed in. I think it's worth also showing this way as it's clearer what's happening at each step. The problem was simple enough that it didn't require coding to put the input into expand.grid().
– krads
Sep 7 at 14:48
I assume the question just takes one example of 4 letters (maybe some kind of biological sequence) and wishes to apply that to a large number afterwards, so showing how to build the various vectors in m would be better, in my opinion.
– Tensibai
Sep 7 at 15:05
I think it's useful to show an intuitive solution even if it's not general. I've updated my answer to make my 2nd solution more understandable :)
– Moody_Mudskipper
Sep 7 at 22:54
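For what it's worth, the generalisation discussed in these comments (building the expand.grid() input from the string rather than typing it by hand) could look roughly like the sketch below. The function name substitute_all and its repl argument are made up for illustration and do not come from any of the answers; it assumes the first letter is always kept and that the unchanged input should be dropped from the result.

```r
# Hypothetical helper (not from the original answers): builds the
# expand.grid() input from the string instead of typing it by hand.
substitute_all <- function(x, repl = "X") {
  chars <- strsplit(x, "")[[1]]
  # The first letter stays fixed; every other position may keep its
  # letter or take `repl`. unique() avoids duplicated results when a
  # letter already equals `repl`.
  choices <- c(list(chars[1]),
               lapply(chars[-1], function(ch) unique(c(ch, repl))))
  # One row per combination; keep the columns as character, not factor
  grid <- expand.grid(choices, stringsAsFactors = FALSE)
  out <- apply(grid, 1, paste0, collapse = "")
  # Drop the unchanged input string
  out[out != x]
}

substitute_all("ECET")
```

The number of results roughly doubles with every extra letter (2^(n-1) - 1 for a string of n distinct letters), so this is only practical for fairly short strings.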