Are European Union parallel multilingual texts ideal for machine learning of machine translation?

Are the European Union's parallel multilingual texts (regulations, directives, and especially the debates of the European Parliament) ideal for training machine translation systems, e.g. with neural networks? My guess is that they are, but I have not seen them used in actual research papers. If they are not ideal, why not?

I am specifically interested in grammar induction as a by-product of learning machine translation, à la https://arxiv.org/abs/1805.10850.

edited 16 mins ago

asked 2 hours ago

TomR

26917

computational-linguistics translation machine-translation computer-science

1 Answer

Europarl is a classic corpus in research papers: it is used at WMT, the main conference in the field, and by some of the top people in the field.

It would be useful for training a translation system specifically for the European Parliament domain.

But Europarl, like any domain-specific corpus, is not ideal for training a production-strength open-domain machine translation system.

How many times do top queries like "how r u" or "ai eu se te pego" occur in the corpus? To say nothing of "laham taz-ziemel" or "gradient descent".
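One rough way to check this kind of coverage gap is to count how often a set of test phrases appears in the corpus. The sketch below is only illustrative: `phrase_counts` is a hypothetical helper, the matching is naive (lowercased, whitespace-tokenized), and `toy_corpus` is a made-up stand-in, not real Europarl data.

```python
from collections import Counter


def phrase_counts(lines, phrases):
    """Count case-insensitive occurrences of each phrase in an iterable of lines.

    Matching is on whitespace-separated tokens, so "the regulation" will not
    match inside "the regulations".
    """
    counts = Counter({p: 0 for p in phrases})
    targets = {p: p.lower().split() for p in phrases}
    for line in lines:
        tokens = line.lower().split()
        for phrase, target in targets.items():
            n = len(target)
            # Slide a window of the phrase's length over the token list.
            counts[phrase] += sum(
                tokens[i:i + n] == target for i in range(len(tokens) - n + 1)
            )
    return counts


# Hypothetical stand-in for a few lines of parliamentary text.
toy_corpus = [
    "The Commission shall adopt the regulation .",
    "I would like to thank the rapporteur for the report .",
    "The regulation enters into force next year .",
]

counts = phrase_counts(toy_corpus, ["the regulation", "how r u", "gradient descent"])
for phrase, n in counts.items():
    print(f"{phrase!r}: {n}")
```

To run it on a real corpus, pass an open file handle (which iterates over lines) instead of `toy_corpus`. Phrases that are common in open-domain usage but score zero or near zero here are the ones a model trained only on this corpus is unlikely to translate well.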

answered 48 mins ago

A. M. Bittlingmayer

4,362921
