Are all finitely recursive context free languages parseable with a regexp?
Clash Royale CLAN TAG#URR8PPP
up vote
1
down vote
favorite
Let's say I have a context free language. It can be recognised by a pushdown automaton. Chances are it can't be parsed with a regular expression, as regular expressions are not as powerful as pushdown automata.
Now, let's put an additional constraint on the language: the maximum recursion amount must be finite.
Because the stack size has an upper bound in this case, my understanding is that there are finite number of stack configurations reachable. This means I could number them 0, 1, 2, 3, ..., N. So, I should be able to create a deterministic finite automaton (DFA) with states 0, 1, 2, 3, ..., N that recognises the same language that the pushdown automaton recognises.
Now, if I'm able to create an equivalent DFA, doesn't it mean that there exists a regular expression that can parse the context-free language with maximum recursion amount?
So, my theory is that all context-free languages that have a maximum recursion amount can be parsed with regular expressions. Is this theory correct? Of course, the theory says nothing about the complexity of the regular expression, it just says such a regular expression should exist.
So, in other words: if your stack memory is limited, a regexp can do the job of a HTML/XML parser!
In principle, isn't it true that computers with finite memory are actually DFAs and not Turning machines?
formal-languages finite-automata pushdown-automata
add a comment |Â
up vote
1
down vote
favorite
Let's say I have a context free language. It can be recognised by a pushdown automaton. Chances are it can't be parsed with a regular expression, as regular expressions are not as powerful as pushdown automata.
Now, let's put an additional constraint on the language: the maximum recursion amount must be finite.
Because the stack size has an upper bound in this case, my understanding is that there are finite number of stack configurations reachable. This means I could number them 0, 1, 2, 3, ..., N. So, I should be able to create a deterministic finite automaton (DFA) with states 0, 1, 2, 3, ..., N that recognises the same language that the pushdown automaton recognises.
Now, if I'm able to create an equivalent DFA, doesn't it mean that there exists a regular expression that can parse the context-free language with maximum recursion amount?
So, my theory is that all context-free languages that have a maximum recursion amount can be parsed with regular expressions. Is this theory correct? Of course, the theory says nothing about the complexity of the regular expression, it just says such a regular expression should exist.
So, in other words: if your stack memory is limited, a regexp can do the job of a HTML/XML parser!
In principle, isn't it true that computers with finite memory are actually DFAs and not Turning machines?
formal-languages finite-automata pushdown-automata
add a comment |Â
up vote
1
down vote
favorite
up vote
1
down vote
favorite
Let's say I have a context free language. It can be recognised by a pushdown automaton. Chances are it can't be parsed with a regular expression, as regular expressions are not as powerful as pushdown automata.
Now, let's put an additional constraint on the language: the maximum recursion amount must be finite.
Because the stack size has an upper bound in this case, my understanding is that there are finite number of stack configurations reachable. This means I could number them 0, 1, 2, 3, ..., N. So, I should be able to create a deterministic finite automaton (DFA) with states 0, 1, 2, 3, ..., N that recognises the same language that the pushdown automaton recognises.
Now, if I'm able to create an equivalent DFA, doesn't it mean that there exists a regular expression that can parse the context-free language with maximum recursion amount?
So, my theory is that all context-free languages that have a maximum recursion amount can be parsed with regular expressions. Is this theory correct? Of course, the theory says nothing about the complexity of the regular expression, it just says such a regular expression should exist.
So, in other words: if your stack memory is limited, a regexp can do the job of a HTML/XML parser!
In principle, isn't it true that computers with finite memory are actually DFAs and not Turning machines?
formal-languages finite-automata pushdown-automata
Let's say I have a context free language. It can be recognised by a pushdown automaton. Chances are it can't be parsed with a regular expression, as regular expressions are not as powerful as pushdown automata.
Now, let's put an additional constraint on the language: the maximum recursion amount must be finite.
Because the stack size has an upper bound in this case, my understanding is that there are finite number of stack configurations reachable. This means I could number them 0, 1, 2, 3, ..., N. So, I should be able to create a deterministic finite automaton (DFA) with states 0, 1, 2, 3, ..., N that recognises the same language that the pushdown automaton recognises.
Now, if I'm able to create an equivalent DFA, doesn't it mean that there exists a regular expression that can parse the context-free language with maximum recursion amount?
So, my theory is that all context-free languages that have a maximum recursion amount can be parsed with regular expressions. Is this theory correct? Of course, the theory says nothing about the complexity of the regular expression, it just says such a regular expression should exist.
So, in other words: if your stack memory is limited, a regexp can do the job of a HTML/XML parser!
In principle, isn't it true that computers with finite memory are actually DFAs and not Turning machines?
formal-languages finite-automata pushdown-automata
formal-languages finite-automata pushdown-automata
asked 2 hours ago
juhist
1204
1204
add a comment |Â
add a comment |Â
2 Answers
2
active
oldest
votes
up vote
2
down vote
We can take it even further: if we put a limit on the size of the HTML/XML, say 1PB, then there is only a finite number of them, so we can trivially parse them in $O(1)$ using a giant look-up table. However, that doesn't seem to tell us much about the complexity of parsing HTML/XML in practice.
The issue at stake here is modeling. A good model abstracts away the salient points of a real world situation in a way that makes the situation amenable to mathematical inquiry. Modeling HTML/XMLs as instances of arbitrarily recursing language forms a better model in practice than your suggestion or my suggestion.
add a comment |Â
up vote
1
down vote
This is true: a computer is, in theory, a finite-state machine. So it can only recognise regular expressions.
But suppose the computer isn't yet fully-built. Now, we can implement the full push-down automaton. When the PDA reaches the limit of the stack as currently built, it just waits until the next bit of the stack is bolted on. That's totally within the mathematical model of a PDA; nothing in the model says that the automaton can't take coffee breaks.
PDAs don't actually recognise infinite strings; they recognise finite strings of unbounded length. Every one of those strings will be recognised using a finite stack. So we never need to wait an infinite amount of time for the stack to be built out to the required depth.
Now you might have a theory about the maximum size of the universe. If that's true, then there is necessarily a limit to the length of the input to the PDA, and then it could have used a regular language instead. But your theory might be wrong, or it might be invalidated by some future event. We won't really know until we get there. Meanwhile, our arbitrarily expandable PDA can continue recognising the arbitrarily-long input until the end of the input is reached, at which point it will render judgement.
add a comment |Â
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
up vote
2
down vote
We can take it even further: if we put a limit on the size of the HTML/XML, say 1PB, then there is only a finite number of them, so we can trivially parse them in $O(1)$ using a giant look-up table. However, that doesn't seem to tell us much about the complexity of parsing HTML/XML in practice.
The issue at stake here is modeling. A good model abstracts away the salient points of a real world situation in a way that makes the situation amenable to mathematical inquiry. Modeling HTML/XMLs as instances of arbitrarily recursing language forms a better model in practice than your suggestion or my suggestion.
add a comment |Â
up vote
2
down vote
We can take it even further: if we put a limit on the size of the HTML/XML, say 1PB, then there is only a finite number of them, so we can trivially parse them in $O(1)$ using a giant look-up table. However, that doesn't seem to tell us much about the complexity of parsing HTML/XML in practice.
The issue at stake here is modeling. A good model abstracts away the salient points of a real world situation in a way that makes the situation amenable to mathematical inquiry. Modeling HTML/XMLs as instances of arbitrarily recursing language forms a better model in practice than your suggestion or my suggestion.
add a comment |Â
up vote
2
down vote
up vote
2
down vote
We can take it even further: if we put a limit on the size of the HTML/XML, say 1PB, then there is only a finite number of them, so we can trivially parse them in $O(1)$ using a giant look-up table. However, that doesn't seem to tell us much about the complexity of parsing HTML/XML in practice.
The issue at stake here is modeling. A good model abstracts away the salient points of a real world situation in a way that makes the situation amenable to mathematical inquiry. Modeling HTML/XMLs as instances of arbitrarily recursing language forms a better model in practice than your suggestion or my suggestion.
We can take it even further: if we put a limit on the size of the HTML/XML, say 1PB, then there is only a finite number of them, so we can trivially parse them in $O(1)$ using a giant look-up table. However, that doesn't seem to tell us much about the complexity of parsing HTML/XML in practice.
The issue at stake here is modeling. A good model abstracts away the salient points of a real world situation in a way that makes the situation amenable to mathematical inquiry. Modeling HTML/XMLs as instances of arbitrarily recursing language forms a better model in practice than your suggestion or my suggestion.
answered 1 hour ago
Yuval Filmus
182k12171331
182k12171331
add a comment |Â
add a comment |Â
up vote
1
down vote
This is true: a computer is, in theory, a finite-state machine. So it can only recognise regular expressions.
But suppose the computer isn't yet fully-built. Now, we can implement the full push-down automaton. When the PDA reaches the limit of the stack as currently built, it just waits until the next bit of the stack is bolted on. That's totally within the mathematical model of a PDA; nothing in the model says that the automaton can't take coffee breaks.
PDAs don't actually recognise infinite strings; they recognise finite strings of unbounded length. Every one of those strings will be recognised using a finite stack. So we never need to wait an infinite amount of time for the stack to be built out to the required depth.
Now you might have a theory about the maximum size of the universe. If that's true, then there is necessarily a limit to the length of the input to the PDA, and then it could have used a regular language instead. But your theory might be wrong, or it might be invalidated by some future event. We won't really know until we get there. Meanwhile, our arbitrarily expandable PDA can continue recognising the arbitrarily-long input until the end of the input is reached, at which point it will render judgement.
add a comment |Â
up vote
1
down vote
This is true: a computer is, in theory, a finite-state machine. So it can only recognise regular expressions.
But suppose the computer isn't yet fully-built. Now, we can implement the full push-down automaton. When the PDA reaches the limit of the stack as currently built, it just waits until the next bit of the stack is bolted on. That's totally within the mathematical model of a PDA; nothing in the model says that the automaton can't take coffee breaks.
PDAs don't actually recognise infinite strings; they recognise finite strings of unbounded length. Every one of those strings will be recognised using a finite stack. So we never need to wait an infinite amount of time for the stack to be built out to the required depth.
Now you might have a theory about the maximum size of the universe. If that's true, then there is necessarily a limit to the length of the input to the PDA, and then it could have used a regular language instead. But your theory might be wrong, or it might be invalidated by some future event. We won't really know until we get there. Meanwhile, our arbitrarily expandable PDA can continue recognising the arbitrarily-long input until the end of the input is reached, at which point it will render judgement.
add a comment |Â
up vote
1
down vote
up vote
1
down vote
This is true: a computer is, in theory, a finite-state machine. So it can only recognise regular expressions.
But suppose the computer isn't yet fully-built. Now, we can implement the full push-down automaton. When the PDA reaches the limit of the stack as currently built, it just waits until the next bit of the stack is bolted on. That's totally within the mathematical model of a PDA; nothing in the model says that the automaton can't take coffee breaks.
PDAs don't actually recognise infinite strings; they recognise finite strings of unbounded length. Every one of those strings will be recognised using a finite stack. So we never need to wait an infinite amount of time for the stack to be built out to the required depth.
Now you might have a theory about the maximum size of the universe. If that's true, then there is necessarily a limit to the length of the input to the PDA, and then it could have used a regular language instead. But your theory might be wrong, or it might be invalidated by some future event. We won't really know until we get there. Meanwhile, our arbitrarily expandable PDA can continue recognising the arbitrarily-long input until the end of the input is reached, at which point it will render judgement.
This is true: a computer is, in theory, a finite-state machine. So it can only recognise regular expressions.
But suppose the computer isn't yet fully-built. Now, we can implement the full push-down automaton. When the PDA reaches the limit of the stack as currently built, it just waits until the next bit of the stack is bolted on. That's totally within the mathematical model of a PDA; nothing in the model says that the automaton can't take coffee breaks.
PDAs don't actually recognise infinite strings; they recognise finite strings of unbounded length. Every one of those strings will be recognised using a finite stack. So we never need to wait an infinite amount of time for the stack to be built out to the required depth.
Now you might have a theory about the maximum size of the universe. If that's true, then there is necessarily a limit to the length of the input to the PDA, and then it could have used a regular language instead. But your theory might be wrong, or it might be invalidated by some future event. We won't really know until we get there. Meanwhile, our arbitrarily expandable PDA can continue recognising the arbitrarily-long input until the end of the input is reached, at which point it will render judgement.
answered 51 mins ago
rici
3,963718
3,963718
add a comment |Â
add a comment |Â
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcs.stackexchange.com%2fquestions%2f97645%2fare-all-finitely-recursive-context-free-languages-parseable-with-a-regexp%23new-answer', 'question_page');
);
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password