à ¿ (long s) and hyphenation
I'm using UTF-8, so (provided one uses the right font) it is no problem to implement the different types of s found in older German texts (s, ſ, ß). Unfortunately the hyphenation breaks, because LaTeX does not know that ſ has to be treated just like "s".
MWE:
\documentclass{article}
\usepackage[ngerman]{babel}
\begin{document}
XXX Gesellschaft Gesellschaft Gesellschaft Gesellschaft Gesellschaft Gesellschaft Gesellschaft Gesellschaft Gesellschaft

XXX Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft

\hyphenation{Ge-ſell-ſchaft}

XXX Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft
\end{document}
As one can see at the end of the line, I have to add the correct hyphenation manually. Any idea how I can solve this?
xetex luatex unicode hyphenation german
Comments:
You could make this work automatically in LuaTeX or XeTeX, but in pdfTeX the best you could do is define ſ to always allow a hyphenation after it; it cannot take part in \patterns or \hyphenation. – David Carlisle, Aug 6 at 14:45
Your choices would be essentially to (a) use \hyphenation, as you have done, for all necessary words; or (b) copy the hyphenation patterns, adding patterns for long s to match those for s, and rebuild the XeLaTeX format (LuaLaTeX does not need to be rebuilt); or (c) use s in the original markup and then set up font features (perhaps...) so that some s get typeset using the long form. Which would you prefer? (The last probably depends on the font you use.) – David Carlisle, Aug 6 at 15:29
I use XeLaTeX, so copying the patterns is definitely the way to go. I'll look it up; right now I'm not sure how it works. (c) is a creative solution but does not work for me, as I have parts where I want the long s and parts where I don't, because I have to stay true to the source. – Martin Mueller, Aug 6 at 15:37
Oh, or if you are using LuaTeX, then (d) implement a Lua hyphenation callback that hyphenates using s, then switches to long s and re-inserts the hyphenation points. – David Carlisle, Aug 6 at 15:37
If the list of problematic patterns is short, in LuaTeX you can also add them with \babelpatterns in the document itself. – Javier Bezos, Aug 6 at 17:38
Question asked Aug 6 at 14:38 by Martin Mueller; edited Aug 6 at 15:20 by David Carlisle
2 Answers

Answer 1 (accepted)
For LuaTeX, here is an implementation of David Carlisle's idea of creating a hyphenate callback. It works by replacing every ſ with a marked s before hyphenation and then restoring the original characters after hyphenation:
\documentclass{article}
\usepackage[ngerman]{babel}
\usepackage{luacode}
\begin{luacode*}
-- attribute that remembers which glyphs were originally a long s
local sattr = luatexbase.new_attribute("longsattr")
local disc = node.id('disc')

-- replace every long s (U+017F = 383) by a plain s (115) and mark it
local function long_to_s(head, tail)
  for n in node.traverse(head) do
    if n == tail then break end
    if n.id == disc then
      -- also process the lists stored inside discretionaries
      long_to_s(n.pre)
      long_to_s(n.post)
      long_to_s(n.replace)
    end
    if n.char == 383 then
      n.char = 115
      node.set_attribute(n, sattr, 383)
    end
  end
end

-- restore the original character of every marked glyph
local function s_to_long(head, tail)
  for n in node.traverse(head) do
    if n == tail then break end
    if n.id == disc then
      s_to_long(n.pre)
      s_to_long(n.post)
      s_to_long(n.replace)
    end
    local a = node.get_attribute(n, sattr)
    if a then
      n.char = a
      node.unset_attribute(n, sattr)
    end
  end
end

-- hyphenate as if every long s were a plain s
local function myhyph(head, tail)
  long_to_s(head, tail)
  lang.hyphenate(head, tail)
  s_to_long(head, tail)
end

luatexbase.add_to_callback("hyphenate", myhyph, "hyphenate with modified s")
\end{luacode*}
\begin{document}
XXX Gesellschaft Gesellschaft Gesellschaft Gesellschaft Gesellschaft Gesellschaft Gesellschaft Gesellschaft Gesellschaft

XXX Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft
\end{document}
LuaTeX also allows you to manipulate the hyphenation patterns during a run, so you can also use the following (an automated version of David Carlisle's choice (b)):
\documentclass{article}
\usepackage[ngerman]{babel}
\usepackage{luacode}
\begin{luacode*}
-- language object for the current language
local l = lang.new(tex.language)
-- feed the current patterns back with every s replaced by ſ;
-- the extra parentheses drop the replacement count that gsub also returns
l:patterns((l:patterns():gsub('s', 'ſ')))
\end{luacode*}
\begin{document}
XXX Gesellschaft Gesellschaft Gesellschaft Gesellschaft Gesellschaft Gesellschaft Gesellschaft Gesellschaft Gesellschaft

XXX Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft
\end{document}
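To make the substitution in this second example more concrete, here is a small sketch of what gsub does to a pattern string. The sample string is invented for illustration and is not taken from the real German patterns; texio.write_nl is only used to print the converted string to the terminal and the log.

```latex
% Sketch only: compile with LuaLaTeX.
\documentclass{article}
\usepackage{luacode}
\begin{luacode*}
-- made-up sample patterns (an assumption, for illustration only)
local sample = "e1se l1sch"
-- gsub returns the new string and the number of replacements;
-- assigning to a single variable keeps only the string
local converted = sample:gsub("s", "ſ")
texio.write_nl("converted: " .. converted)
\end{luacode*}
\begin{document}
See the log file for the converted patterns.
\end{document}
```

Because every s in a pattern is replaced, a pattern containing two s turns into one containing two ſ. A word that mixes round s and long s within the reach of such a pattern would then match neither form. The callback approach in the first example does not have that restriction, since it hyphenates each word as if every ſ were an s.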
Comments:
+1 thanks :-).. – David Carlisle, Aug 6 at 17:44
I accept this answer as it seems a clever solution (even though, for other reasons, I stick to XeLaTeX). – Martin Mueller, Aug 7 at 8:37

Answer 2
A simple way to do this is to choose a font that supports ſ as an OpenType character variant, e.g., EB Garamond. Then you can just select that variant when you need it.
(Re-reading the comments above, I see this corresponds to option (c) from David Carlisle, which you said wasn't suitable, but this MWE shows you can have both kinds of s with this method.)
Update showing iſt and ſelbes.
\documentclass{article}
\usepackage[ngerman]{babel}
\babelfont{rm}{EB Garamond}
\begin{document}
XXX Gesellschaft Gesellschaft Gesellschaft Gesellschaft Gesellschaft Gesellschaft Gesellschaft Gesellschaft Gesellschaft

\addfontfeature{CharacterVariant=1}
XXX Gesellschaft Gesellschaft Gesellschaft Gesellschaft Gesellschaft Gesellschaft Gesellschaft Gesellschaft Gesellschaft

ist selbes
\end{document}
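For texts where only some passages use the long s, the feature can be kept local to a group. The sketch below wraps this in a helper macro; \withlongs is a hypothetical name chosen for this example, and CharacterVariant=1 is simply taken over from the MWE above. Since the input still contains an ordinary s, the hyphenation patterns should see ordinary words, but which s the font actually turns into ſ is decided by the font, not by this macro.

```latex
% Sketch only: compile with XeLaTeX or LuaLaTeX; EB Garamond must be installed.
\documentclass{article}
\usepackage[ngerman]{babel}
\babelfont{rm}{EB Garamond}
% \withlongs is a hypothetical helper, not part of any package.
% The inner group keeps the font feature local to the argument.
\newcommand{\withlongs}[1]{{\addfontfeature{CharacterVariant=1}#1}}
\begin{document}
Gesellschaft (round s) and \withlongs{Gesellschaft} (long s chosen by the font).
\end{document}
```

Words for which the font's substitution is wrong would still need the ſ typed explicitly, which brings back the hyphenation problem from the question.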
Comments:
You're right about the font. Though this is not exactly David Carlisle's option (c), and it is not a proper way to do it, because although all ſ are s, not all s are ſ. The rule is, I think, that at the beginning of a word and in between letters it's an ſ, unless a long vowel comes right before it. So it's »iſt« and »ſelbes«. – Martin Mueller, Aug 7 at 8:36
@MartinMueller, EB Garamond does not change it at the end of a word, but I think it changes it everywhere else. I'm not a German speaker though, so I do not know what it should be. – David Purton, Aug 7 at 8:44
Thank you for pointing that out. I've installed EB Garamond, but unfortunately the rules for ſ are far more complex than I knew (see Wikipedia), and your method does not produce the desired result. E.g. lesbisch should be lesbiſch but is rendered leſbiſch. – Martin Mueller, Aug 7 at 12:20
@MartinMueller, that's a pity! The EB Garamond manual does warn that this feature isn't all that clever. – David Purton, Aug 7 at 12:30

Answer 1: answered Aug 6 at 17:32 by Marcel Krüger; edited Aug 7 at 6:48
Answer 2: answered Aug 7 at 5:45 by David Purton; edited Aug 7 at 10:13
You're right about the font. Though dis is not exactly David Carlisles option c and is no proper way to do it, because though all à ¿ are s, not all s are à ¿. The rule is, i think, that at the beginning of a word and in between letters it's an à ¿ â if not a long vocal is right before it. So it's »ià ¿t« and »à ¿elbes«.
â Martin Mueller
Aug 7 at 8:36
@MartinMueller, EB Garamond does not change it at the end of a word, but I think it changes it everywhere else. I'm not a German speaker though, so I do not know what it should be.
â David Purton
Aug 7 at 8:44
Thank you for pointing that out. I've installed the EB Garamond but unfortunately the rules of à ¿ are far more complex than I knew (see wikipedia) and using your method does not produce the desired result. E.g. lesbisch should be lesbià ¿ch but is rendered leà ¿bià ¿ch
â Martin Mueller
Aug 7 at 12:20
@MartinMueller, that's a pity! The EB Garamond manual does warn that this feature isn't all that clever.
â David Purton
Aug 7 at 12:30
You could make this work automatically in LuaTeX or XeTeX, but in pdfTeX the best you could do is define ſ to always allow a hyphenation point after it; it cannot take part in \patterns or \hyphenation.
– David Carlisle
Aug 6 at 14:45
Your choices would essentially be to (a) use \hyphenation, as you have done, for all necessary words; or (b) copy the hyphenation patterns, adding patterns for long s to match those for s, and rebuild the XeLaTeX format (LuaLaTeX does not need to be rebuilt); or (c) use s in the original markup and then set up font features (perhaps...) so that some s get typeset using the long form. Which would you prefer? (The last probably depends on the font you use.)
– David Carlisle
Aug 6 at 15:29
I use XeLaTeX, so copying the patterns is definitely the way to go. I'll look it up; right now I'm not sure how it works. (c) is a creative solution, but it does not work for me: I have parts where I want the long s and parts where I don't, as I have to stay true to the source.
– Martin Mueller
Aug 6 at 15:37
Oh, or if you are using LuaTeX, then (d) implement a Lua hyphenation callback that hyphenates using s, then switches to long s and re-inserts the hyphenation points.
– David Carlisle
Aug 6 at 15:37
If the list of problematic patterns is short, in LuaTeX you can also add them with \babelpatterns in the document itself.
– Javier Bezos
Aug 6 at 17:38
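A minimal sketch of the \babelpatterns suggestion, assuming LuaLaTeX and a font that contains ſ. The single pattern below is a hypothetical one written only for the word from the question; it has not been checked against the existing ngerman patterns, which may still allow or forbid other break points in the same word.

```latex
% Compile with LuaLaTeX: \babelpatterns is only available under luatex.
\documentclass{article}
\usepackage{fontspec}
\usepackage[ngerman]{babel}

% Hypothetical pattern (an assumption, not taken from the thread):
% the odd digit 1 marks a permitted break, giving Ge-ſell-ſchaft.
\babelpatterns[ngerman]{ge1ſell1ſchaft}

\begin{document}
XXX Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft
Geſellſchaft Geſellſchaft Geſellſchaft Geſellſchaft
\end{document}
```

Unlike a \hyphenation exception, a pattern matches inside longer words as well, so compounds and inflected forms containing this string should pick up the same break points. For a long list of words, copying the s patterns wholesale (option (b) above) is probably less work.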