Replace words in an unstructured text file using a for loop
I have a VERY unstructured text file that I read with readLines. I want to replace certain strings with other strings that are stored in a variable (called "new" below).
In the example below, I want the manipulated text to contain each of the terms "one", "two", "three" and "four" exactly once, in place of the "change" strings. However, as you can see, sub replaces only the first match in each element, whereas I need the code to work through the matches in order and ignore the fact that the text is split into separate quoted strings.
See the example code and data below.
#text to be changed
text <- c("TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT change",
          "TEXT TEXT TEXT change TEXT TEXT TEXT TEXT TEXT change",
          "TEXT TEXT TEXT change TEXT TEXT TEXT TEXT")
#Variable containing input for text
new <- c("one", "two", "three", "four")
#For loop that I want to include
for (i in 1:length(new)) {
  text <- sub(pattern = "change", replacement = new[i], x = text)
}
text
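To make the problem concrete, here is a minimal sketch of what that loop does on a shortened, made-up input (the vector x is an assumption for illustration, not the real data). Because sub() is vectorised and replaces only the first match in each element, the first iteration writes "one" into every element, and later iterations only reach whatever matches are left over.

```r
# Shortened, made-up input with one, two and one occurrence of "change"
x <- c("a change", "b change c change", "d change")
new <- c("one", "two", "three", "four")

for (i in seq_along(new)) {
  # sub() replaces the first remaining "change" in EVERY element of x
  x <- sub(pattern = "change", replacement = new[i], x = x)
}

# x is now c("a one", "b one c two", "d one"):
# "one" was used three times, "three" and "four" were never used
x
```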
Tags: r
asked 1 hour ago by Gorp; edited 22 mins ago by Jaap
Do you need text <- c("TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT one", "TEXT TEXT TEXT two TEXT TEXT TEXT TEXT TEXT three", "TEXT TEXT TEXT four TEXT TEXT TEXT TEXT") as the result?
– Vladimir Volokhonsky, 1 hour ago
3 Answers
How about this? The logic is to hammer away at each string until it has no more change left. On every "hit" (where change is found), move along the new vector.
text <- c("TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT change",
          "TEXT TEXT TEXT change TEXT TEXT TEXT TEXT TEXT change",
          "TEXT TEXT TEXT change TEXT TEXT TEXT TEXT")
#Variable containing input for text
new <- c("one", "two", "three", "four")
# Index of the next unused element of new
new.i <- 1
for (i in 1:length(text)) {
  # Keep replacing in element i until no "change" is left in it
  while (grepl(pattern = "change", text[i])) {
    text[i] <- sub(pattern = "change", replacement = new[new.i], x = text[i])
    new.i <- new.i + 1
  }
}
text
[1] "TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT one"
[2] "TEXT TEXT TEXT two TEXT TEXT TEXT TEXT TEXT three"
[3] "TEXT TEXT TEXT four TEXT TEXT TEXT TEXT"
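One thing the loop above does not cover is what happens when new runs out before the matches do. As a hedged sketch (the function name replace_in_order and the choice to stop with an error are my assumptions, not part of the answer), the same idea can be wrapped in a function with a guard. It also passes fixed = TRUE so that the pattern is treated as a literal string rather than a regular expression.

```r
# Hypothetical helper: same "hammer away" idea, but it stops with an error
# instead of writing NA into the text when `new` is exhausted.
replace_in_order <- function(text, pattern, new) {
  new.i <- 1
  for (i in seq_along(text)) {
    # fixed = TRUE: match `pattern` literally, not as a regex
    while (grepl(pattern, text[i], fixed = TRUE)) {
      if (new.i > length(new)) {
        stop("ran out of replacements at element ", i)
      }
      text[i] <- sub(pattern, new[new.i], text[i], fixed = TRUE)
      new.i <- new.i + 1
    }
  }
  text
}

# Three matches, three replacements
res <- replace_in_order(c("x change", "change y change"),
                        "change", c("one", "two", "three"))
# res is c("x one", "two y three")
```

Note that this sketch assumes no replacement contains the pattern itself; otherwise the while loop would find the freshly inserted text again.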
answered 1 hour ago by Roman Luštrik
Here is another solution using gregexpr() and regmatches():
#text to be changed
text <- c("TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT change",
"TEXT TEXT TEXT change TEXT TEXT TEXT TEXT TEXT change",
"TEXT TEXT TEXT change TEXT TEXT TEXT TEXT")
#Variable containing input for text
new <- c("one", "two", "three", "four")
# Alter the structure of text: join all lines into one newline-separated string
altered_text <- paste(text, collapse = "\n")
# So we can use gregexpr and regmatches to get what you want
matches <- gregexpr("change", altered_text)
regmatches(altered_text, matches) <- list(new)
# And here's the result
cat(altered_text)
#> TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT one
#> TEXT TEXT TEXT two TEXT TEXT TEXT TEXT TEXT three
#> TEXT TEXT TEXT four TEXT TEXT TEXT TEXT
# Or, putting the text back to its old structure
# (one element for each line)
unlist(strsplit(altered_text, "\n"))
#> [1] "TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT one"
#> [2] "TEXT TEXT TEXT two TEXT TEXT TEXT TEXT TEXT three"
#> [3] "TEXT TEXT TEXT four TEXT TEXT TEXT TEXT"
Created on 2018-10-16 by the reprex package (v0.2.1)
We can do this since gregexpr() can find all the matches in the text for "change"; from help("gregexpr"):
regexpr returns an integer vector of the same length as text giving
the starting position of the first match....
gregexpr returns a list of the same length as text each element of
which is of the same form as the return value for regexpr, except that
the starting positions of every (disjoint) match are given.
(emphasis added).
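To illustrate the difference between the two functions quoted above, here is a small sketch on a made-up string (the string x and the use of fixed = TRUE are my assumptions for the example):

```r
x <- "a change b change"

# gregexpr() returns a list with one element per element of x;
# that element holds the start position of every match
m <- gregexpr("change", x, fixed = TRUE)
starts <- as.integer(m[[1]])             # positions 3 and 12
widths <- attr(m[[1]], "match.length")   # each match is 6 characters long

# regexpr() reports only the first match
first <- as.integer(regexpr("change", x, fixed = TRUE))  # position 3
```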
Then regmatches() can be used to either extract the matches found by gregexpr() or replace them; from help("regmatches"):
Usage
regmatches(x, m, invert = FALSE)
regmatches(x, m, invert = FALSE) <- value
...
value
an object with suitable replacement values for the matched or
non-matched substrings (see Details).
...
Details
The replacement function can be used for replacing the matched or
non-matched substrings. For vector match data, if invert is FALSE,
value should be a character vector with length the number of matched
elements in m. Otherwise, it should be a list of character vectors
with the same length as m, each as long as the number of replacements
needed.
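Following the Details quoted above, a possible variant (a sketch, not the approach posted here) skips the collapse step and hands regmatches<- a list with one character vector per element of text. It assumes that the total number of matches equals length(new); the shortened text vector and the name groups are made up for the example.

```r
text <- c("TEXT change", "TEXT change TEXT change", "change TEXT")
new <- c("one", "two", "three", "four")

m <- gregexpr("change", text, fixed = TRUE)
# Number of matches in each element of text (here 1, 2, 1)
counts <- lengths(regmatches(text, m))
# Deal out `new` in order: one character vector per element of text
groups <- factor(rep(seq_along(text), counts), levels = seq_along(text))
regmatches(text, m) <- split(new, groups)

# text is now c("TEXT one", "TEXT two TEXT three", "four TEXT")
text
```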
answered 59 mins ago by duckmayr; edited 50 mins ago
Another approach using strsplit:
tl <- lapply(text, function(s) strsplit(s, split = " ")[[1]])
df <- stack(setNames(tl, seq_along(tl)))
ix <- df$values == "change"
df[ix, "values"] <- new
tapply(df$values, df$ind, paste, collapse = " ")
which gives:
1
"TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT one"
2
"TEXT TEXT TEXT two TEXT TEXT TEXT TEXT TEXT three"
3
"TEXT TEXT TEXT four TEXT TEXT TEXT TEXT"
Additionally, you could wrap the tapply call in unname:
unname(tapply(df$values, df$ind, paste, collapse = " "))
which gives:
[1] "TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT one"
[2] "TEXT TEXT TEXT two TEXT TEXT TEXT TEXT TEXT three"
[3] "TEXT TEXT TEXT four TEXT TEXT TEXT TEXT"
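For readers unfamiliar with stack(), here is a small sketch of the intermediate data frame on a made-up input (the vector x is an assumption for illustration). It also shows a limitation worth keeping in mind: the comparison with == only selects tokens that are exactly "change", so a token with attached punctuation is not replaced.

```r
x <- c("a change", "change, b change")
tl <- strsplit(x, split = " ", fixed = TRUE)
df <- stack(setNames(tl, seq_along(tl)))

# One row per token; `ind` records which element of x the token came from
tokens <- as.character(df$values)  # "a" "change" "change," "b" "change"
origin <- as.character(df$ind)     # "1" "1" "2" "2" "2"

# Exact comparison: "change," (with the comma) is NOT selected
ix <- tokens == "change"           # FALSE TRUE FALSE FALSE TRUE
```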
If you want to use the elements of new only once, you could update the code to:
newnew <- new[1:3]
ix <- df$values == "change"
df[ix, "values"][1:length(newnew)] <- newnew
unname(tapply(df$values, df$ind, paste, collapse = " "))
You could alter this further to also handle the situation where there are more replacements than positions to be replaced (occurrences of the pattern, change in the example):
newnew2 <- c(new, "five")
tl <- lapply(text, function(s) strsplit(s, split = " ")[[1]])
df <- stack(setNames(tl, seq_along(tl)))
ix <- df$values == "change"
df[ix, "values"][1:pmin(sum(ix),length(newnew2))] <- newnew2[1:pmin(sum(ix),length(newnew2))]
unname(tapply(df$values, df$ind, paste, collapse = " "))
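The truncation logic in the last snippet can be seen in isolation on a plain token vector. This is only a sketch; the helper name fill_matches and the sample tokens are hypothetical.

```r
# Hypothetical helper: replace as many matches as there are replacements,
# and ignore surplus replacements
fill_matches <- function(tokens, target, replacements) {
  pos <- which(tokens == target)
  n <- min(length(pos), length(replacements))
  tokens[pos[seq_len(n)]] <- replacements[seq_len(n)]
  tokens
}

tokens <- c("a", "change", "b", "change")

# Fewer replacements than matches: the second "change" is left alone
too_few <- fill_matches(tokens, "change", "one")
# too_few is c("a", "one", "b", "change")

# More replacements than matches: "three" is simply not used
too_many <- fill_matches(tokens, "change", c("one", "two", "three"))
# too_many is c("a", "one", "b", "two")
```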
answered 53 mins ago by Jaap; edited 22 mins ago