Parser for pure LaTeX
Is it possible to write a parser for pure LaTeX (no plain TeX, no TeX primitives) that supports all of LaTeX without relying on a TeX engine?
I know of an iOS app that creates and typesets LaTeX documents, and I don't think it uses a TeX engine.
Does such a parser exist?
parsing
Probably yes. Maybe no. What exactly do you want/need?
– Johannes_B, 3 hours ago
Could you elaborate on how this differs from your other question? At the least, that one had much more detail, and several worthwhile comments that could help you better phrase your question.
– Teepeemm, 3 hours ago
Although this question is about parsing TeX code at its lowest level (the way I interpreted it, at least), I think that the tex-core tag is for the inner workings of TeX itself, and you specifically ruled that out, so I would remove it. Either way, I think it's a good question, but it requires more details. Do you want to parse only proper LaTeX code, or does plain TeX need to be included? Are packages considered? Other formats? Please explain your question better or it will probably be closed as "too broad".
– Phelype Oleinik, 3 hours ago
A clearly written Turing complete LaTeX parser in a high level language would be lovely to see.
– Anush, 2 hours ago
What is pure LaTeX in your view? Are commands like \def (a TeX command, but used in many LaTeX documents) out of scope? And what should the output of the parser be?
– TeXnician, 2 hours ago
asked 3 hours ago by John webner, edited 2 hours ago
2 Answers
If you write a parser, you can define the subset of LaTeX that you support. (There isn't really a useful definition of "pure LaTeX with no primitives".)
For instance, MathJax has a parser for a subset of LaTeX math markup, written in JavaScript, and LaTeXML has a parser for almost complete TeX, written in Perl, which does not involve running TeX itself. LaTeXML's parser is perhaps the closest to what you ask, as far as I understand the question. https://github.com/brucemiller/LaTeXML
Here is an example that uses only commands defined in core LaTeX (the shortvrb package is part of the base LaTeX2e release, so it is as fundamental a part of LaTeX as, say, \section, which is defined in the article class from the same base release files):

    \documentclass{article}
    \usepackage{shortvrb}
    \begin{document}
    \MakeShortVerb{\*}
    {\bfseries *}* some text
    \DeleteShortVerb{\*}
    {\bfseries *}* some text
    \end{document}

Note that it is not possible to statically assign any tokenisation to *}*: in the first case the asterisks delimit short verbatim text, so the brace between them is typeset as a literal character; in the second case it produces two character tokens ** (the first one being bold), with the brace closing the group.
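A minimal sketch of this state dependence, in Python. The `tokenize` function and its token names are hypothetical and only mimic the effect of \MakeShortVerb and \DeleteShortVerb on `*`; this is not how TeX, LaTeXML, or any real parser is implemented.

```python
import re

# Matches \MakeShortVerb{\*} or \DeleteShortVerb{\*} (only the '*' case is handled).
TOGGLE = re.compile(r"\\(MakeShortVerb|DeleteShortVerb)\{\\\*\}")


def tokenize(source):
    """Return a list of (kind, text) pairs for a tiny LaTeX-like fragment."""
    tokens = []
    short_verb = False  # becomes True after \MakeShortVerb{\*}
    i = 0
    while i < len(source):
        toggle = TOGGLE.match(source, i)
        if toggle:
            # The meaning of '*' from here on depends on this earlier command.
            short_verb = toggle.group(1) == "MakeShortVerb"
            i = toggle.end()
            continue
        ch = source[i]
        if ch == "*" and short_verb:
            # Everything up to the next '*' is literal text, braces included.
            end = source.find("*", i + 1)
            if end == -1:
                raise ValueError("unterminated short verbatim")
            tokens.append(("verbatim", source[i + 1:end]))
            i = end + 1
            continue
        if ch.isspace():
            i += 1
            continue
        if ch == "{":
            kind = "begin_group"
        elif ch == "}":
            kind = "end_group"
        else:
            kind = "char"
        tokens.append((kind, ch))
        i += 1
    return tokens


print(tokenize(r"\MakeShortVerb{\*}*}*"))
print(tokenize(r"\MakeShortVerb{\*}\DeleteShortVerb{\*}*}*"))
```

Under these assumptions, the same three characters come out as a single verbatim token while the short verb is active, and as a character, a group close, and a character once it has been deleted. A tokenizer that looks at *}* in isolation cannot tell which of the two applies.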
It would be reasonable to produce a LaTeX parser for a subset of the language that excludes this kind of construct, but you need to define that subset. It isn't enough to say "not plain TeX or primitives": there are plain TeX constructs that can be parsed easily, and there are LaTeX constructs that cannot be parsed in general without access to a full TeX typesetting system.
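One way to pin down such a subset is to let the grammar itself be the definition. The Python sketch below is a toy under stated assumptions: the subset (control words made of letters, brace groups, and plain text) and the `parse` function are hypothetical, and every brace group directly after a command is simply treated as one of its arguments. Anything outside the subset is rejected rather than guessed at.

```python
import re

# Tokens of the assumed subset: \word, '{', '}', or a run of ordinary text.
TOKEN = re.compile(r"\\[A-Za-z]+|[{}]|[^\\{}]+")


def parse(source):
    """Parse the assumed subset into a nested list of tuples."""
    tokens = TOKEN.findall(source)
    # findall silently skips characters it cannot match (e.g. '\*'),
    # so compare against the input to detect unsupported constructs.
    if "".join(tokens) != source:
        raise ValueError("input is outside the supported subset")
    pos = 0

    def parse_nodes(inside_group):
        nonlocal pos
        nodes = []
        while pos < len(tokens):
            tok = tokens[pos]
            if tok == "}":
                if not inside_group:
                    raise ValueError("unmatched }")
                pos += 1
                return nodes
            pos += 1
            if tok == "{":
                nodes.append(("group", parse_nodes(True)))
            elif tok.startswith("\\"):
                # Simplification: all directly following groups are arguments.
                args = []
                while pos < len(tokens) and tokens[pos] == "{":
                    pos += 1
                    args.append(parse_nodes(True))
                nodes.append(("command", tok[1:], args))
            else:
                nodes.append(("text", tok))
        if inside_group:
            raise ValueError("unmatched {")
        return nodes

    return parse_nodes(False)


print(parse(r"\section{Intro} Some \emph{text}."))
```

With this definition, \MakeShortVerb{\*} is rejected outright because \* is not a control word in the assumed subset, which is the kind of explicit boundary the paragraph above asks for.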
answered 2 hours ago by David Carlisle, edited 17 mins ago
I think document conversion software such as pandoc, and other tools on the internet, already do this. Generally speaking, these converters parse only a subset of the commands. In addition, regular expressions can be used to extract particular commands of interest.
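As a rough illustration of the regular-expression approach, here is a Python sketch that pulls out section titles. The `section_titles` helper is hypothetical, and the pattern deliberately handles only titles without nested braces.

```python
import re

# \section, \subsection, \subsubsection, with an optional star,
# followed by a brace group that contains no further braces.
SECTION = re.compile(r"\\(?:sub)*section\*?\{([^{}]*)\}")


def section_titles(source):
    """Return the titles of all sectioning commands the pattern can match."""
    return SECTION.findall(source)


print(section_titles(r"\section{Intro} text \subsection*{Details}"))
```

A title such as \section{The \emph{big} idea} is skipped by this pattern because of the nested group, which is typical of how far regular expressions go before a real parser is needed.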
answered 3 hours ago by GrandFleet
I know that parsers like pandoc can only parse a subset of the commands. E.g. pandoc cannot parse \emph{text\section{text}text}. I am looking for a full pure LaTeX parser.
– John webner, 2 hours ago
@Johnwebner Tbh, you shouldn't use such markup, but of course a parser might want to spit out something…
– TeXnician, 2 hours ago
I think you are better off using TeX then. I doubt there can be a full parser without implementing TeX or a subset of it.
– GrandFleet, 2 hours ago