Are all finitely recursive context free languages parseable with a regexp?

The name of the pictureThe name of the pictureThe name of the pictureClash Royale CLAN TAG#URR8PPP











up vote
1
down vote

favorite












Let's say I have a context free language. It can be recognised by a pushdown automaton. Chances are it can't be parsed with a regular expression, as regular expressions are not as powerful as pushdown automata.



Now, let's put an additional constraint on the language: the maximum recursion amount must be finite.



Because the stack size has an upper bound in this case, my understanding is that there are finite number of stack configurations reachable. This means I could number them 0, 1, 2, 3, ..., N. So, I should be able to create a deterministic finite automaton (DFA) with states 0, 1, 2, 3, ..., N that recognises the same language that the pushdown automaton recognises.



Now, if I'm able to create an equivalent DFA, doesn't it mean that there exists a regular expression that can parse the context-free language with maximum recursion amount?



So, my theory is that all context-free languages that have a maximum recursion amount can be parsed with regular expressions. Is this theory correct? Of course, the theory says nothing about the complexity of the regular expression, it just says such a regular expression should exist.



So, in other words: if your stack memory is limited, a regexp can do the job of a HTML/XML parser!



In principle, isn't it true that computers with finite memory are actually DFAs and not Turning machines?










share|cite|improve this question

























    up vote
    1
    down vote

    favorite












    Let's say I have a context free language. It can be recognised by a pushdown automaton. Chances are it can't be parsed with a regular expression, as regular expressions are not as powerful as pushdown automata.



    Now, let's put an additional constraint on the language: the maximum recursion amount must be finite.



    Because the stack size has an upper bound in this case, my understanding is that there are finite number of stack configurations reachable. This means I could number them 0, 1, 2, 3, ..., N. So, I should be able to create a deterministic finite automaton (DFA) with states 0, 1, 2, 3, ..., N that recognises the same language that the pushdown automaton recognises.



    Now, if I'm able to create an equivalent DFA, doesn't it mean that there exists a regular expression that can parse the context-free language with maximum recursion amount?



    So, my theory is that all context-free languages that have a maximum recursion amount can be parsed with regular expressions. Is this theory correct? Of course, the theory says nothing about the complexity of the regular expression, it just says such a regular expression should exist.



    So, in other words: if your stack memory is limited, a regexp can do the job of a HTML/XML parser!



    In principle, isn't it true that computers with finite memory are actually DFAs and not Turning machines?










    share|cite|improve this question























      up vote
      1
      down vote

      favorite









      up vote
      1
      down vote

      favorite











      Let's say I have a context free language. It can be recognised by a pushdown automaton. Chances are it can't be parsed with a regular expression, as regular expressions are not as powerful as pushdown automata.



      Now, let's put an additional constraint on the language: the maximum recursion amount must be finite.



      Because the stack size has an upper bound in this case, my understanding is that there are finite number of stack configurations reachable. This means I could number them 0, 1, 2, 3, ..., N. So, I should be able to create a deterministic finite automaton (DFA) with states 0, 1, 2, 3, ..., N that recognises the same language that the pushdown automaton recognises.



      Now, if I'm able to create an equivalent DFA, doesn't it mean that there exists a regular expression that can parse the context-free language with maximum recursion amount?



      So, my theory is that all context-free languages that have a maximum recursion amount can be parsed with regular expressions. Is this theory correct? Of course, the theory says nothing about the complexity of the regular expression, it just says such a regular expression should exist.



      So, in other words: if your stack memory is limited, a regexp can do the job of a HTML/XML parser!



      In principle, isn't it true that computers with finite memory are actually DFAs and not Turning machines?










      share|cite|improve this question













      Let's say I have a context free language. It can be recognised by a pushdown automaton. Chances are it can't be parsed with a regular expression, as regular expressions are not as powerful as pushdown automata.



      Now, let's put an additional constraint on the language: the maximum recursion amount must be finite.



      Because the stack size has an upper bound in this case, my understanding is that there are finite number of stack configurations reachable. This means I could number them 0, 1, 2, 3, ..., N. So, I should be able to create a deterministic finite automaton (DFA) with states 0, 1, 2, 3, ..., N that recognises the same language that the pushdown automaton recognises.



      Now, if I'm able to create an equivalent DFA, doesn't it mean that there exists a regular expression that can parse the context-free language with maximum recursion amount?



      So, my theory is that all context-free languages that have a maximum recursion amount can be parsed with regular expressions. Is this theory correct? Of course, the theory says nothing about the complexity of the regular expression, it just says such a regular expression should exist.



      So, in other words: if your stack memory is limited, a regexp can do the job of a HTML/XML parser!



      In principle, isn't it true that computers with finite memory are actually DFAs and not Turning machines?







      formal-languages finite-automata pushdown-automata






      share|cite|improve this question













      share|cite|improve this question











      share|cite|improve this question




      share|cite|improve this question










      asked 2 hours ago









      juhist

      1204




      1204




















          2 Answers
          2






          active

          oldest

          votes

















          up vote
          2
          down vote













          We can take it even further: if we put a limit on the size of the HTML/XML, say 1PB, then there is only a finite number of them, so we can trivially parse them in $O(1)$ using a giant look-up table. However, that doesn't seem to tell us much about the complexity of parsing HTML/XML in practice.



          The issue at stake here is modeling. A good model abstracts away the salient points of a real world situation in a way that makes the situation amenable to mathematical inquiry. Modeling HTML/XMLs as instances of arbitrarily recursing language forms a better model in practice than your suggestion or my suggestion.






          share|cite|improve this answer



























            up vote
            1
            down vote













            This is true: a computer is, in theory, a finite-state machine. So it can only recognise regular expressions.



            But suppose the computer isn't yet fully-built. Now, we can implement the full push-down automaton. When the PDA reaches the limit of the stack as currently built, it just waits until the next bit of the stack is bolted on. That's totally within the mathematical model of a PDA; nothing in the model says that the automaton can't take coffee breaks.



            PDAs don't actually recognise infinite strings; they recognise finite strings of unbounded length. Every one of those strings will be recognised using a finite stack. So we never need to wait an infinite amount of time for the stack to be built out to the required depth.



            Now you might have a theory about the maximum size of the universe. If that's true, then there is necessarily a limit to the length of the input to the PDA, and then it could have used a regular language instead. But your theory might be wrong, or it might be invalidated by some future event. We won't really know until we get there. Meanwhile, our arbitrarily expandable PDA can continue recognising the arbitrarily-long input until the end of the input is reached, at which point it will render judgement.






            share|cite|improve this answer




















              Your Answer




              StackExchange.ifUsing("editor", function ()
              return StackExchange.using("mathjaxEditing", function ()
              StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
              StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\\(","\\)"]]);
              );
              );
              , "mathjax-editing");

              StackExchange.ready(function()
              var channelOptions =
              tags: "".split(" "),
              id: "419"
              ;
              initTagRenderer("".split(" "), "".split(" "), channelOptions);

              StackExchange.using("externalEditor", function()
              // Have to fire editor after snippets, if snippets enabled
              if (StackExchange.settings.snippets.snippetsEnabled)
              StackExchange.using("snippets", function()
              createEditor();
              );

              else
              createEditor();

              );

              function createEditor()
              StackExchange.prepareEditor(
              heartbeatType: 'answer',
              convertImagesToLinks: false,
              noModals: false,
              showLowRepImageUploadWarning: true,
              reputationToPostImages: null,
              bindNavPrevention: true,
              postfix: "",
              onDemand: true,
              discardSelector: ".discard-answer"
              ,immediatelyShowMarkdownHelp:true
              );



              );













               

              draft saved


              draft discarded


















              StackExchange.ready(
              function ()
              StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcs.stackexchange.com%2fquestions%2f97645%2fare-all-finitely-recursive-context-free-languages-parseable-with-a-regexp%23new-answer', 'question_page');

              );

              Post as a guest






























              2 Answers
              2






              active

              oldest

              votes








              2 Answers
              2






              active

              oldest

              votes









              active

              oldest

              votes






              active

              oldest

              votes








              up vote
              2
              down vote













              We can take it even further: if we put a limit on the size of the HTML/XML, say 1PB, then there is only a finite number of them, so we can trivially parse them in $O(1)$ using a giant look-up table. However, that doesn't seem to tell us much about the complexity of parsing HTML/XML in practice.



              The issue at stake here is modeling. A good model abstracts away the salient points of a real world situation in a way that makes the situation amenable to mathematical inquiry. Modeling HTML/XMLs as instances of arbitrarily recursing language forms a better model in practice than your suggestion or my suggestion.






              share|cite|improve this answer
























                up vote
                2
                down vote













                We can take it even further: if we put a limit on the size of the HTML/XML, say 1PB, then there is only a finite number of them, so we can trivially parse them in $O(1)$ using a giant look-up table. However, that doesn't seem to tell us much about the complexity of parsing HTML/XML in practice.



                The issue at stake here is modeling. A good model abstracts away the salient points of a real world situation in a way that makes the situation amenable to mathematical inquiry. Modeling HTML/XMLs as instances of arbitrarily recursing language forms a better model in practice than your suggestion or my suggestion.






                share|cite|improve this answer






















                  up vote
                  2
                  down vote










                  up vote
                  2
                  down vote









                  We can take it even further: if we put a limit on the size of the HTML/XML, say 1PB, then there is only a finite number of them, so we can trivially parse them in $O(1)$ using a giant look-up table. However, that doesn't seem to tell us much about the complexity of parsing HTML/XML in practice.



                  The issue at stake here is modeling. A good model abstracts away the salient points of a real world situation in a way that makes the situation amenable to mathematical inquiry. Modeling HTML/XMLs as instances of arbitrarily recursing language forms a better model in practice than your suggestion or my suggestion.






                  share|cite|improve this answer












                  We can take it even further: if we put a limit on the size of the HTML/XML, say 1PB, then there is only a finite number of them, so we can trivially parse them in $O(1)$ using a giant look-up table. However, that doesn't seem to tell us much about the complexity of parsing HTML/XML in practice.



                  The issue at stake here is modeling. A good model abstracts away the salient points of a real world situation in a way that makes the situation amenable to mathematical inquiry. Modeling HTML/XMLs as instances of arbitrarily recursing language forms a better model in practice than your suggestion or my suggestion.







                  share|cite|improve this answer












                  share|cite|improve this answer



                  share|cite|improve this answer










                  answered 1 hour ago









                  Yuval Filmus

                  182k12171331




                  182k12171331




















                      up vote
                      1
                      down vote













                      This is true: a computer is, in theory, a finite-state machine. So it can only recognise regular expressions.



                      But suppose the computer isn't yet fully-built. Now, we can implement the full push-down automaton. When the PDA reaches the limit of the stack as currently built, it just waits until the next bit of the stack is bolted on. That's totally within the mathematical model of a PDA; nothing in the model says that the automaton can't take coffee breaks.



                      PDAs don't actually recognise infinite strings; they recognise finite strings of unbounded length. Every one of those strings will be recognised using a finite stack. So we never need to wait an infinite amount of time for the stack to be built out to the required depth.



                      Now you might have a theory about the maximum size of the universe. If that's true, then there is necessarily a limit to the length of the input to the PDA, and then it could have used a regular language instead. But your theory might be wrong, or it might be invalidated by some future event. We won't really know until we get there. Meanwhile, our arbitrarily expandable PDA can continue recognising the arbitrarily-long input until the end of the input is reached, at which point it will render judgement.






                      share|cite|improve this answer
























                        up vote
                        1
                        down vote













                        This is true: a computer is, in theory, a finite-state machine. So it can only recognise regular expressions.



                        But suppose the computer isn't yet fully-built. Now, we can implement the full push-down automaton. When the PDA reaches the limit of the stack as currently built, it just waits until the next bit of the stack is bolted on. That's totally within the mathematical model of a PDA; nothing in the model says that the automaton can't take coffee breaks.



                        PDAs don't actually recognise infinite strings; they recognise finite strings of unbounded length. Every one of those strings will be recognised using a finite stack. So we never need to wait an infinite amount of time for the stack to be built out to the required depth.



                        Now you might have a theory about the maximum size of the universe. If that's true, then there is necessarily a limit to the length of the input to the PDA, and then it could have used a regular language instead. But your theory might be wrong, or it might be invalidated by some future event. We won't really know until we get there. Meanwhile, our arbitrarily expandable PDA can continue recognising the arbitrarily-long input until the end of the input is reached, at which point it will render judgement.






                        share|cite|improve this answer






















                          up vote
                          1
                          down vote










                          up vote
                          1
                          down vote









                          This is true: a computer is, in theory, a finite-state machine. So it can only recognise regular expressions.



                          But suppose the computer isn't yet fully-built. Now, we can implement the full push-down automaton. When the PDA reaches the limit of the stack as currently built, it just waits until the next bit of the stack is bolted on. That's totally within the mathematical model of a PDA; nothing in the model says that the automaton can't take coffee breaks.



                          PDAs don't actually recognise infinite strings; they recognise finite strings of unbounded length. Every one of those strings will be recognised using a finite stack. So we never need to wait an infinite amount of time for the stack to be built out to the required depth.



                          Now you might have a theory about the maximum size of the universe. If that's true, then there is necessarily a limit to the length of the input to the PDA, and then it could have used a regular language instead. But your theory might be wrong, or it might be invalidated by some future event. We won't really know until we get there. Meanwhile, our arbitrarily expandable PDA can continue recognising the arbitrarily-long input until the end of the input is reached, at which point it will render judgement.






                          share|cite|improve this answer












                          This is true: a computer is, in theory, a finite-state machine. So it can only recognise regular expressions.



                          But suppose the computer isn't yet fully-built. Now, we can implement the full push-down automaton. When the PDA reaches the limit of the stack as currently built, it just waits until the next bit of the stack is bolted on. That's totally within the mathematical model of a PDA; nothing in the model says that the automaton can't take coffee breaks.



                          PDAs don't actually recognise infinite strings; they recognise finite strings of unbounded length. Every one of those strings will be recognised using a finite stack. So we never need to wait an infinite amount of time for the stack to be built out to the required depth.



                          Now you might have a theory about the maximum size of the universe. If that's true, then there is necessarily a limit to the length of the input to the PDA, and then it could have used a regular language instead. But your theory might be wrong, or it might be invalidated by some future event. We won't really know until we get there. Meanwhile, our arbitrarily expandable PDA can continue recognising the arbitrarily-long input until the end of the input is reached, at which point it will render judgement.







                          share|cite|improve this answer












                          share|cite|improve this answer



                          share|cite|improve this answer










                          answered 51 mins ago









                          rici

                          3,963718




                          3,963718



























                               

                              draft saved


                              draft discarded















































                               


                              draft saved


                              draft discarded














                              StackExchange.ready(
                              function ()
                              StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcs.stackexchange.com%2fquestions%2f97645%2fare-all-finitely-recursive-context-free-languages-parseable-with-a-regexp%23new-answer', 'question_page');

                              );

                              Post as a guest













































































                              Comments

                              Popular posts from this blog

                              Long meetings (6-7 hours a day): Being “babysat” by supervisor

                              Is the Concept of Multiple Fantasy Races Scientifically Flawed? [closed]

                              Confectionery