Are European Union parallel multilingual texts ideal for machine learning of machine translation?

Are European Union parallel multilingual texts (regulations, directives, and especially the debates of the European Parliament) ideal for machine learning of machine translation, e.g. with neural networks? My guess is that they are ideal, but I have not seen them used in actual research papers. If they are not ideal, why not?

I am specifically interested in grammar induction as a by-product of learning machine translation, à la https://arxiv.org/abs/1805.10850.

      computational-linguistics translation machine-translation computer-science






edited 16 mins ago
asked 2 hours ago
TomR




















          1 Answer
Europarl is a classic corpus in machine translation research: it is used at WMT, the field's main conference, and by some of the top people in the field.

It would be useful for training a translation system specifically for the European Parliament domain.

But Europarl, like any domain-specific corpus, is not ideal for training a production-strength open-domain machine translation system.

How many times do top queries like "how r u" or "ai eu se te pego" occur in the corpus? To say nothing of "laham taz-ziemel" or "gradient descent".
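
One rough way to check this kind of coverage is to count how often each query phrase appears in one side of the corpus. The snippet below is only a minimal sketch under stated assumptions: `count_phrases` is a hypothetical helper, and `toy_corpus` is a handful of invented sentences standing in for a real corpus file, not actual Europarl data.

```python
import re


def count_phrases(lines, phrases):
    """Count case-insensitive, whole-phrase occurrences of each phrase."""
    # Lookarounds keep "how r u" from matching inside "somehow r u".
    patterns = {
        p: re.compile(r"(?<!\w)" + re.escape(p.lower()) + r"(?!\w)")
        for p in phrases
    }
    counts = {p: 0 for p in phrases}
    for line in lines:
        text = line.lower()
        for phrase, pattern in patterns.items():
            counts[phrase] += len(pattern.findall(text))
    return counts


# Invented stand-in sentences (assumption); replace with lines read
# from the monolingual side of whatever corpus you want to inspect.
toy_corpus = [
    "The committee will vote on the directive tomorrow.",
    "Members debated the regulation on fisheries.",
    "The directive was adopted by a large majority.",
]

queries = ["how r u", "gradient descent", "the directive"]
print(count_phrases(toy_corpus, queries))
```

On a real corpus you would pass an open file object instead of `toy_corpus`, since the function only iterates over lines. Raw counts like these say nothing about translation quality, but zero hits for common queries is a quick sign of domain mismatch.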






answered 48 mins ago
A. M. Bittlingmayer



























                     
