Are all finitely recursive context free languages parseable with a regexp?

up vote
1
down vote

favorite

Let's say I have a context free language. It can be recognised by a pushdown automaton. Chances are it can't be parsed with a regular expression, as regular expressions are not as powerful as pushdown automata.

Now, let's put an additional constraint on the language: the maximum recursion amount must be finite.

Because the stack size has an upper bound in this case, my understanding is that there are finite number of stack configurations reachable. This means I could number them 0, 1, 2, 3, ..., N. So, I should be able to create a deterministic finite automaton (DFA) with states 0, 1, 2, 3, ..., N that recognises the same language that the pushdown automaton recognises.

Now, if I'm able to create an equivalent DFA, doesn't it mean that there exists a regular expression that can parse the context-free language with maximum recursion amount?

So, my theory is that all context-free languages that have a maximum recursion amount can be parsed with regular expressions. Is this theory correct? Of course, the theory says nothing about the complexity of the regular expression, it just says such a regular expression should exist.

So, in other words: if your stack memory is limited, a regexp can do the job of a HTML/XML parser!

In principle, isn't it true that computers with finite memory are actually DFAs and not Turning machines?

asked 2 hours ago

juhist

1204

add a commentÂ |Â

up vote
1
down vote

favorite

Now, let's put an additional constraint on the language: the maximum recursion amount must be finite.

Now, if I'm able to create an equivalent DFA, doesn't it mean that there exists a regular expression that can parse the context-free language with maximum recursion amount?

So, in other words: if your stack memory is limited, a regexp can do the job of a HTML/XML parser!

In principle, isn't it true that computers with finite memory are actually DFAs and not Turning machines?

asked 2 hours ago

juhist

1204

add a commentÂ |Â

up vote
1
down vote

favorite

Now, let's put an additional constraint on the language: the maximum recursion amount must be finite.

Now, if I'm able to create an equivalent DFA, doesn't it mean that there exists a regular expression that can parse the context-free language with maximum recursion amount?

So, in other words: if your stack memory is limited, a regexp can do the job of a HTML/XML parser!

In principle, isn't it true that computers with finite memory are actually DFAs and not Turning machines?

asked 2 hours ago

juhist

1204

Now, let's put an additional constraint on the language: the maximum recursion amount must be finite.

Now, if I'm able to create an equivalent DFA, doesn't it mean that there exists a regular expression that can parse the context-free language with maximum recursion amount?

So, in other words: if your stack memory is limited, a regexp can do the job of a HTML/XML parser!

In principle, isn't it true that computers with finite memory are actually DFAs and not Turning machines?

formal-languages finite-automata pushdown-automata

asked 2 hours ago

juhist

1204

asked 2 hours ago

juhist

1204

asked 2 hours ago

juhist

1204

asked 2 hours ago

juhist

1204

asked 2 hours ago

juhist

1204

add a commentÂ |Â

2 Answers
2

active

oldest

votes

up vote
2
down vote

We can take it even further: if we put a limit on the size of the HTML/XML, say 1PB, then there is only a finite number of them, so we can trivially parse them in $O(1)$ using a giant look-up table. However, that doesn't seem to tell us much about the complexity of parsing HTML/XML in practice.

The issue at stake here is modeling. A good model abstracts away the salient points of a real world situation in a way that makes the situation amenable to mathematical inquiry. Modeling HTML/XMLs as instances of arbitrarily recursing language forms a better model in practice than your suggestion or my suggestion.

answered 1 hour ago

Yuval Filmus

182k12171331

add a commentÂ |Â

up vote
1
down vote

This is true: a computer is, in theory, a finite-state machine. So it can only recognise regular expressions.

But suppose the computer isn't yet fully-built. Now, we can implement the full push-down automaton. When the PDA reaches the limit of the stack as currently built, it just waits until the next bit of the stack is bolted on. That's totally within the mathematical model of a PDA; nothing in the model says that the automaton can't take coffee breaks.

PDAs don't actually recognise infinite strings; they recognise finite strings of unbounded length. Every one of those strings will be recognised using a finite stack. So we never need to wait an infinite amount of time for the stack to be built out to the required depth.

Now you might have a theory about the maximum size of the universe. If that's true, then there is necessarily a limit to the length of the input to the PDA, and then it could have used a regular language instead. But your theory might be wrong, or it might be invalidated by some future event. We won't really know until we get there. Meanwhile, our arbitrarily expandable PDA can continue recognising the arbitrarily-long input until the end of the input is reached, at which point it will render judgement.

answered 51 mins ago

rici

3,963718

add a commentÂ |Â

Your Answer

StackExchange.ifUsing("editor", function ()
return StackExchange.using("mathjaxEditing", function ()
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\$","\$"]]);
);
);
, "mathjax-editing");

StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "419"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
convertImagesToLinks: false,
noModals: false,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);

);

draft saved

draft discarded

StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcs.stackexchange.com%2fquestions%2f97645%2fare-all-finitely-recursive-context-free-languages-parseable-with-a-regexp%23new-answer', 'question_page');

);

Post as a guest

Name

2 Answers
2

active

oldest

votes

2 Answers
2

active

oldest

votes

up vote
2
down vote

answered 1 hour ago

Yuval Filmus

182k12171331

add a commentÂ |Â

up vote
2
down vote

answered 1 hour ago

Yuval Filmus

182k12171331

add a commentÂ |Â

up vote
2
down vote

answered 1 hour ago

Yuval Filmus

182k12171331

answered 1 hour ago

Yuval Filmus

182k12171331

answered 1 hour ago

Yuval Filmus

182k12171331

answered 1 hour ago

Yuval Filmus

182k12171331

answered 1 hour ago

Yuval Filmus

182k12171331

add a commentÂ |Â

up vote
1
down vote

This is true: a computer is, in theory, a finite-state machine. So it can only recognise regular expressions.

answered 51 mins ago

rici

3,963718

add a commentÂ |Â

up vote
1
down vote

This is true: a computer is, in theory, a finite-state machine. So it can only recognise regular expressions.

answered 51 mins ago

rici

3,963718

add a commentÂ |Â

up vote
1
down vote

This is true: a computer is, in theory, a finite-state machine. So it can only recognise regular expressions.

answered 51 mins ago

rici

3,963718

This is true: a computer is, in theory, a finite-state machine. So it can only recognise regular expressions.

answered 51 mins ago

rici

3,963718

answered 51 mins ago

rici

3,963718

answered 51 mins ago

rici

3,963718

answered 51 mins ago

rici

3,963718

add a commentÂ |Â

draft saved

draft discarded

draft saved

draft discarded

Post as a guest

Name

Search This Blog

Iyfjky