Delete all French stopwords

I have a list of French stopwords:



frenchStopWords = {"alors", "au", "aucuns", "aussi", "autre", "avant", "avec", "avoir",
"bon", "car", "ce", "cela", "ces", "ceux", "chaque", "ci", "comme",
"comment", "dans", "des", "du", "dedans", "dehors", "depuis",
"devrait", "doit", "donc", "dos", "début", "elle", "elles", "en",
"encore", "essai", "est", "et", "eu", "fait", "faites", "fois",
"font", "hors", "ici", "il", "ils", "je", "juste", "la", "le", "les",
"leur", "là", "ma", "maintenant", "mais", "mes", "mine", "moins",
"mon", "mot", "même", "ni", "nommés", "notre", "nous", "ou", "où",
"par", "parce", "pas", "peut", "peu", "plupart", "pour", "pourquoi",
"quand", "que", "quel", "quelle", "quelles", "quels", "qui", "sa",
"sans", "ses", "seulement", "si", "sien", "son", "sont", "sous",
"soyez", "sujet", "sur", "ta", "tandis", "tellement", "tels", "tes",
"ton", "tous", "tout", "trop", "très", "tu", "voient", "vont",
"votre", "vous", "vu", "ça", "étaient", "état", "étions", "été",
"être"};


And some French text here:



text = {"et", "bien", "bonjour,", "vous", "avez", "déjà", "suivi,",
"peut-être", "le", "cours", "electronique", "et,", "voilà,", "ça",
"c\[CloseCurlyQuote]est", "le", "cours", "electronique", "ii,",
"c\[CloseCurlyQuote]est", "une", "suite", "logique", "du", "premier",
"cours,", "dans", "le", "premier", "cours", "dans",
"l\[CloseCurlyQuote]électronique", ",", "étudié,", "tout", "ce", "qui",
"était", "lié", "aux", "fonctions,", "électroniques", "basées",
"sur", "les", "amplificateurs,", "opérationnels.", "dans", "ce",
"cours", "là,,", "va", "aborder", "le", "transistor", "bipolaire,",
"et", "les", "fonctions", "analogiques", "de", "base.,", "donc",
"va", "partir", "avec", "toutes,", "les", "fonctions", "de", "base",
"depuis", "l\[CloseCurlyQuote]analyse,", "du", "transistor",
"jusqu\[CloseCurlyQuote]à", "ce", "qu\[CloseCurlyQuote],", "arrive",
"avec", "des", "fonctions", "un", "peu,", "plus", "complexes,", "du",
"style", "analyser", "un,", "régulateur", "série", "et", "terminer",
"avec,", "l\[CloseCurlyQuote]analyse", "des", "circuits", "tels",
"que,", "les", "amplificateurs", "de", "puissance,", "et", "les",
"amplificateurs", "audio.", "donc", "si", "regarde", "ce", "cours,,",
"va", "se", "rendre", "compte", "que", "la", "suite,",
"d\[CloseCurlyQuote]electronique", "c\[CloseCurlyQuote]est",
"electronique", "ii", ".,", "je", "vais", "aller", "dans", "ce",
"cours,", "et", "voir", "ce", "qui", "va", "se", "passer.,", "je",
"vais", "aller", "là-dedans,", "et", "voir", "comment",
"c\[CloseCurlyQuote]est", "structuré", "ce", "cours.,", "donc,",
"exactement", "de", "la", "même", "manière", "que", "electronique",
",", "va", "trouver,", "les", "semaines.,", "il", "y",
"l\[CloseCurlyQuote]équivalent", "de", "8", "semaines,",
"d\[CloseCurlyQuote]études,", "dont", "deux", "examens.,", "un",
"examen"};


Now I want to delete all stopwords. I tried:



DeleteCases[text,#]&/@frenchStopWords


...but this is wrong. I know that I need to use an "OR" operator, but I don't know how to implement it without running a for loop. Thanks.
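To make the problem concrete, here is a minimal sketch on a toy list. The names `words` and `stops` are illustrative stand-ins for `text` and `frenchStopWords`, not part of the original data. Mapping `DeleteCases` over the stopwords returns one list per stopword, each with only that single word removed, rather than one list with all of them removed:

```mathematica
(* Toy data; words and stops are hypothetical names used only for illustration *)
words = {"et", "bonjour", "le", "cours", "et"};
stops = {"et", "le"};

(* One result list per stopword: each list has only that stopword deleted *)
DeleteCases[words, #] & /@ stops
(* {{"bonjour", "le", "cours"}, {"et", "bonjour", "cours", "et"}} *)
```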










  • 1
    Complement[text, frenchStopWords]
    – Carl Lange
    4 hours ago

  • 1
    @james: In addition to the nice comment above, see "New Proposal" at mathematica.stackexchange.com/questions/18100/…. Too bad reference.wolfram.com/language/ref/DeleteStopwords.html does not have a language option.
    – Moo
    3 hours ago















string-manipulation natural-language






asked 4 hours ago by james; edited 3 hours ago by gwr







1 Answer

I think there is an important lesson here that many people overlook:



DeleteCases[text, Alternatives@@frenchStopWords]


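Complement, suggested in the comments, may look equivalent, but it does not accept patterns, whereas DeleteCases does, so DeleteCases covers more general cases. Note also that Complement returns a sorted list with duplicates removed, so the original word order is lost. This is why it is useful to remember how Alternatives can be used in DeleteCases.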
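As a rough illustration of the difference, here is a small sketch on toy data. The names `words`, `stops` and `stopPattern` are hypothetical and chosen only for this example. The last expression is one possible way to use the pattern support, assuming you also want to drop stopwords that carry trailing punctuation (such as "et,"):

```mathematica
(* Toy data; words and stops are hypothetical names used only for illustration *)
words = {"voici", "le", "cours", "et,", "le", "cours"};
stops = {"et", "le"};

(* Alternatives @@ stops builds the pattern "et" | "le" *)
DeleteCases[words, Alternatives @@ stops]
(* {"voici", "cours", "et,", "cours"} : order and duplicates are kept *)

(* Complement removes the same words but sorts and de-duplicates the result *)
Complement[words, stops]
(* {"cours", "et,", "voici"} *)

(* Pattern-based variant: also delete stopwords followed by punctuation *)
stopPattern = Alternatives @@ stops;
DeleteCases[words,
 s_String /; StringMatchQ[s, stopPattern ~~ (PunctuationCharacter ...)]]
(* {"voici", "cours", "cours"} *)
```

Whether punctuation should be handled in the pattern or stripped from the tokens beforehand depends on how the text was tokenized.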






answered 31 mins ago by Vitaliy Kaurov; edited 23 mins ago



























             
