Replace words in an unstructured text file in R using a for loop

I have a very unstructured text file that I read with readLines. I want to replace certain strings with other strings that are stored in a variable (called "new" below).

In the example below, I want the resulting text to contain each of the terms "one", "two", "three" and "four" exactly once, in place of the "change" strings. However, as you can see, sub replaces the first match in every element on each pass of the loop. I need the code to ignore the fact that the text is split into separate quoted strings.

See the example code and data below.



#text to be changed
text <- c("TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT change",
          "TEXT TEXT TEXT change TEXT TEXT TEXT TEXT TEXT change",
          "TEXT TEXT TEXT change TEXT TEXT TEXT TEXT")

#Variable containing input for text
new <- c("one", "two", "three", "four")

#For loop that I want to include
for (i in 1:length(new)) {
  text <- sub(pattern = "change", replacement = new[i], x = text)
}

text
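
A quick way to see what goes wrong is to trace the loop on a copy of the data. The sketch below assumes the same text and new as in the question; the name traced is introduced here only for illustration. Because sub() is vectorised over x, each pass replaces the first remaining "change" in every element at once.

```r
text <- c("TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT change",
          "TEXT TEXT TEXT change TEXT TEXT TEXT TEXT TEXT change",
          "TEXT TEXT TEXT change TEXT TEXT TEXT TEXT")
new <- c("one", "two", "three", "four")

traced <- text  # hypothetical copy, so that text itself stays untouched
for (i in seq_along(new)) {
  # sub() replaces only the first match per element, but does so in all elements
  traced <- sub("change", new[i], traced, fixed = TRUE)
  cat("after", new[i], ":\n")
  print(traced)
}
```

In this trace "one" lands in all three elements, "two" fills the second match of the second element, and "three" and "four" are never used because no "change" is left.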















r
















asked 54 mins ago









Gorp












  • Do you need text <- c("TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT one", "TEXT TEXT TEXT two TEXT TEXT TEXT TEXT TEXT three", "TEXT TEXT TEXT four TEXT TEXT TEXT TEXT") as the result?
    – Vladimir Volokhonsky
    43 mins ago




























3 Answers





































































How about this? The logic is to hammer away at each string until it has no more "change" left. On every "hit" (where "change" is found), move one step along the new vector.

text <- c("TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT change",
          "TEXT TEXT TEXT change TEXT TEXT TEXT TEXT TEXT change",
          "TEXT TEXT TEXT change TEXT TEXT TEXT TEXT")

#Variable containing input for text
new <- c("one", "two", "three", "four")
new.i <- 1  # position of the next unused element of new

for (i in 1:length(text)) {
  # keep replacing the first remaining "change" in this element
  while (grepl(pattern = "change", text[i])) {
    text[i] <- sub(pattern = "change", replacement = new[new.i], x = text[i])
    new.i <- new.i + 1
  }
}

text

[1] "TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT one"
[2] "TEXT TEXT TEXT two TEXT TEXT TEXT TEXT TEXT three"
[3] "TEXT TEXT TEXT four TEXT TEXT TEXT TEXT"
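
The same idea can be wrapped in a function. This is only a sketch: the name replace_sequentially and the extra guard on new.i are assumptions added here, not part of the answer above. The guard stops the loop once new has run out, so any leftover "change" is simply kept as it is.

```r
# Hypothetical helper: replace successive occurrences of `pattern` in `text`
# with successive elements of `new`, walking through `text` element by element.
replace_sequentially <- function(text, pattern, new) {
  new.i <- 1
  for (i in seq_along(text)) {
    # stop when `new` is used up or this element has no match left
    while (new.i <= length(new) && grepl(pattern, text[i], fixed = TRUE)) {
      text[i] <- sub(pattern, new[new.i], text[i], fixed = TRUE)
      new.i <- new.i + 1
    }
  }
  text
}

text <- c("TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT change",
          "TEXT TEXT TEXT change TEXT TEXT TEXT TEXT TEXT change",
          "TEXT TEXT TEXT change TEXT TEXT TEXT TEXT")

full  <- replace_sequentially(text, "change", c("one", "two", "three", "four"))
short <- replace_sequentially(text, "change", c("one", "two"))
full
short
```

With only two replacements, the second "change" of the second element and the one in the third element are left in place. fixed = TRUE makes the pattern a literal string rather than a regular expression, which seems appropriate for a plain word like "change".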
















answered 38 mins ago

Roman Luštrik




































Here is another solution using gregexpr() and regmatches():

#text to be changed
text <- c("TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT change",
          "TEXT TEXT TEXT change TEXT TEXT TEXT TEXT TEXT change",
          "TEXT TEXT TEXT change TEXT TEXT TEXT TEXT")

#Variable containing input for text
new <- c("one", "two", "three", "four")

# Alter the structure of text
altered_text <- paste(text, collapse = "\n")

# So we can use gregexpr and regmatches to get what you want
matches <- gregexpr("change", altered_text)
regmatches(altered_text, matches) <- list(new)

# And here's the result
cat(altered_text)
#> TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT one
#> TEXT TEXT TEXT two TEXT TEXT TEXT TEXT TEXT three
#> TEXT TEXT TEXT four TEXT TEXT TEXT TEXT

# Or, putting the text back to its old structure
# (one element for each line)
unlist(strsplit(altered_text, "\n"))
#> [1] "TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT one"
#> [2] "TEXT TEXT TEXT two TEXT TEXT TEXT TEXT TEXT three"
#> [3] "TEXT TEXT TEXT four TEXT TEXT TEXT TEXT"

Created on 2018-10-16 by the reprex package (v0.2.1)



We can do this since gregexpr() can find all the matches in the text for "change"; from help("gregexpr"):

    regexpr returns an integer vector of the same length as text giving
    the starting position of the first match....

    gregexpr returns a list of the same length as text each element of
    which is of the same form as the return value for regexpr, except that
    the starting positions of every (disjoint) match are given.

(emphasis added).
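
The difference between the two functions is easy to check on a short string. The string x and the variable names below are made up for this illustration only.

```r
x <- "a change b change"

first    <- regexpr("change", x)   # first match only
all_hits <- gregexpr("change", x)  # list with one element per string in x

as.integer(first)
#> [1] 3
as.integer(all_hits[[1]])
#> [1]  3 12
attr(all_hits[[1]], "match.length")
#> [1] 6 6
```

regexpr() reports only position 3, while gregexpr() reports both starting positions together with the length of each match.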



Then regmatches() can be used to either extract the matches found by gregexpr() or replace them; from help("regmatches"):

    Usage

    regmatches(x, m, invert = FALSE)
    regmatches(x, m, invert = FALSE) <- value

    ...

    value
        an object with suitable replacement values for the matched or
        non-matched substrings (see Details).

    ...

    Details

    The replacement function can be used for replacing the matched or
    non-matched substrings. For vector match data, if invert is FALSE,
    value should be a character vector with length the number of matched
    elements in m. Otherwise, it should be a list of character vectors
    with the same length as m, each as long as the number of replacements
    needed.
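
As a small illustration of the two shapes of value described in the Details, here is a sketch on a made-up vector x (the names x, y and z are assumptions for this example). Vector match data comes from regexpr(), list match data from gregexpr().

```r
x <- c("a change", "b change change")

# Vector match data (regexpr): value is a character vector,
# one replacement per element that has a match
y <- x
regmatches(y, regexpr("change", y)) <- c("one", "two")
y
#> [1] "a one"        "b two change"

# List match data (gregexpr): value is a list with one character
# vector per element, each as long as the number of matches there
z <- x
regmatches(z, gregexpr("change", z)) <- list("one", c("two", "three"))
z
#> [1] "a one"       "b two three"
```

Collapsing the text into a single string first, as in the answer, means the list needs only one element, which is why list(new) is enough there.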
















edited 22 mins ago

answered 31 mins ago

duckmayr

































Another approach using strsplit:

tl <- lapply(text, function(s) strsplit(s, split = " ")[[1]])
df <- stack(setNames(tl, seq_along(tl)))

ix <- df$values == "change"
df[ix, "values"] <- new
tapply(df$values, df$ind, paste, collapse = " ")

which gives:

 1 
"TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT one"
2
"TEXT TEXT TEXT two TEXT TEXT TEXT TEXT TEXT three"
3
"TEXT TEXT TEXT four TEXT TEXT TEXT TEXT"

Additionally, you could wrap the tapply call in unname to drop the names:

unname(tapply(df$values, df$ind, paste, collapse = " "))

which gives:

[1] "TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT one" 
[2] "TEXT TEXT TEXT two TEXT TEXT TEXT TEXT TEXT three"
[3] "TEXT TEXT TEXT four TEXT TEXT TEXT TEXT"
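
For comparison, the same split, replace and paste steps can be written without the intermediate data frame. This is a sketch of a variant, not the answer's own code: the names tokens, flat, hits, group and result are introduced here, and the stopifnot() guard is an added assumption that the number of "change" words equals length(new).

```r
text <- c("TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT change",
          "TEXT TEXT TEXT change TEXT TEXT TEXT TEXT TEXT change",
          "TEXT TEXT TEXT change TEXT TEXT TEXT TEXT")
new <- c("one", "two", "three", "four")

tokens <- strsplit(text, " ", fixed = TRUE)  # one vector of words per element
flat   <- unlist(tokens)                     # all words in reading order
hits   <- flat == "change"                   # whole-word matches only

# Hypothetical guard: stop early if the counts do not line up
stopifnot(sum(hits) == length(new))
flat[hits] <- new

# Reassemble the words into one string per original element
group  <- rep(seq_along(tokens), lengths(tokens))
result <- unname(vapply(split(flat, group), paste, character(1), collapse = " "))
result
```

Like the data frame version, this only matches a word that is exactly "change", so "change," or "changes" would be left alone, and empty strings in text would need separate handling.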
























  • @zx8754 thx, I missed that; updated with a new solution now
    – Jaap
    2 mins ago














                      up vote
                      1
                      down vote













                      Another approach using strsplit:



                      tl <- lapply(text, function(s) strsplit(s, split = " ")[[1]])
                      df <- stack(setNames(tl, seq_along(tl)))

                      ix <- df$values == "change"
                      df[ix, "values"] <- new
                      tapply(df$values, df$ind, paste, collapse = " ")


                      which gives:




                       1 
                      "TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT one"
                      2
                      "TEXT TEXT TEXT two TEXT TEXT TEXT TEXT TEXT three"
                      3
                      "TEXT TEXT TEXT four TEXT TEXT TEXT TEXT"



                      Additionally you could wrap the tapply call in unname:



                       unname(tapply(df$values, df$ind, paste, collapse = " "))


                      which gives:




                      [1] "TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT TEXT one" 
                      [2] "TEXT TEXT TEXT two TEXT TEXT TEXT TEXT TEXT three"
                      [3] "TEXT TEXT TEXT four TEXT TEXT TEXT TEXT"






                      edited 7 mins ago

























                      answered 25 mins ago









                      Jaap

























                       













































































