Monday, March 30, 2015

Cygwin, groff, Cyrillic (Russian) fonts.

The aim is to get a pretty-printed PDF, created with groff + Ghostscript in the Cygwin environment.

What we have for certain:
  • Cygwin, more precisely Cygwin64 under Windows 8.1
  • groff 1.22.2, as shipped with Cygwin
  • Ghostscript 9.15 (solely to make PDF from PostScript), as shipped with Cygwin
Here are the problems step by step:

I. For groff to produce an acceptable document with Cyrillic characters, its input has to be in a single-byte encoding, for example KOI8-R for Russian text. As the standard preparation step, the groff input also has to be run through preconv. So here we go:
$ cat $some_file | iconv -f UTF-8 -t KOI8-R | preconv -e KOI8-R | groff $params
The first problem was that preconv did not support KOI8-R and reported something like "unsupported encoding" (I do not remember the exact message).
It turned out that preconv had been built without iconv support (you can see this by running "preconv -v"). That happened because the iconv-devel package was not installed in Cygwin when groff was installed.
So the first problem was solved by installing iconv-devel and then reinstalling groff.
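
A quick way to check this before going further is to look at the version banner. This is only a sketch: it assumes the banner printed by "preconv -v" contains the words "with iconv support" or "without iconv support", and preconv_iconv_status is a helper name invented here, not something that comes with groff.

```sh
# preconv_iconv_status BANNER: classify a preconv version banner.
# Assumption: the banner says "with iconv support" or "without iconv support".
# (preconv_iconv_status is a hypothetical helper, not a groff tool.)
preconv_iconv_status() {
    case "$1" in
        *"without iconv support"*) echo "missing" ;;
        *"with iconv support"*)    echo "present" ;;
        *)                         echo "unknown" ;;
    esac
}

# Typical use once preconv is installed:
#   preconv_iconv_status "$(preconv -v 2>&1)"
```

If it prints "missing", KOI8-R will most likely be rejected, and the rebuild described above is needed.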

After that,

$ iconv -f UTF-8 -t koi8-r examples/example.tr | preconv -e KOI8-R > output.preconv

should work quite well.


II. Next we need to pipe the preconv output into groff and fill in the $params from the example above.

$ iconv -f UTF-8 -t koi8-r examples/example.tr | preconv -e KOI8-R | groff -Tps -mru > output.ps

For this example we do not need as many parameters as a real case would. -Tps tells groff to generate PostScript output, and -mru loads the ru.tmac macro file.

In most cases groff will complain that it cannot find ru.tmac (troff: fatal error: can't find macro file ru). That is because it does not ship with one =). But I do have one! Once you have found the tmac you need, all that is left is to put it in a proper directory, for example /usr/share/groff/site-tmac/
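
The copy step can be wrapped in a tiny helper so that a wrong source path is caught early. A minimal sketch: install_tmac is a name made up for this post, and the default directory is simply the one mentioned above (adjust it if your groff looks elsewhere).

```sh
# install_tmac FILE [DIR]: copy a macro file into a groff macro directory.
# (install_tmac is a hypothetical helper; DIR defaults to the site-tmac
# directory mentioned in the text.)
install_tmac() {
    src=$1
    dest=${2:-/usr/share/groff/site-tmac}
    if [ ! -f "$src" ]; then
        echo "no such macro file: $src" >&2
        return 1
    fi
    mkdir -p "$dest" && cp "$src" "$dest/"
}

# Typical use:
#   install_tmac ./ru.tmac
```

If you would rather not touch system directories, groff also has a -M option that adds a directory to the macro search path for a single run.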



III. OK, let's look at the result now (iconv -f UTF-8 -t koi8-r examples/example.tr | preconv -e KOI8-R | groff -Tps -mru > output.ps):

:33: warning: can't find special character `u0424'
:33: warning: can't find special character `u0430'
:33: warning: can't find special character `u043C'
[ and so on ]

This means that groff does not know the Russian characters, which is a pity. More formally, groff has no metrics for Russian fonts. But I do have them! I have metrics for the Times fonts, and they need to go into /usr/share/groff/current/font/devps, replacing the existing files. My metrics are in the attached devps.tar.
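
When there are hundreds of such warnings, it helps to reduce them to a list of distinct missing glyphs, both before and after replacing the metrics. A small sketch, assuming the warning text looks like the lines quoted above; missing_glyphs is a helper name invented here.

```sh
# missing_glyphs: read groff warnings on stdin and print each missing
# glyph name (u0424, u0430, ...) once, sorted.
# Assumption: warnings have the form "can't find special character `uXXXX'"
# (a plain ' as the opening quote is accepted too).
missing_glyphs() {
    sed -n "s/.*can't find special character [\`']\(u[0-9A-Fa-f]*\)'.*/\1/p" | sort -u
}

# Typical use: send groff's stderr into the filter and drop the PostScript.
#   iconv -f UTF-8 -t koi8-r examples/example.tr | preconv -e KOI8-R \
#     | groff -Tps -mru 2>&1 >/dev/null | missing_glyphs
```

An empty result means groff found metrics for every character in the document.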

Now the command

$ iconv -f UTF-8 -t koi8-r examples/example.tr | preconv -e KOI8-R | groff -Tps -mru > output.ps

should finish silently (in UNIX that means everything went fine).


IV. Now we have output.ps, a PostScript file. Plenty of applications can deal with PostScript files: printing, viewing, converting. We will create a PDF file from output.ps using the native Ghostscript toolchain.

First let's try the basic ps2pdf utility (which is just a wrapper around gs with several parameters):

 # rename so the file names are easier to follow
 $ mv output.ps input.ps

 $ ps2pdf input.ps output.pdf

It finished with no messages (so it's OK? hm). Let's open the PDF in your favorite PDF viewer.
=( No Russian characters are visible.

I then spent a lot of time tracking down the root of the problem. Some details are worth sharing.

Ghostscript ships with a set of "builtin" fonts. Many popular fonts (Times, for example) are simply mapped to those builtin ones. The problem is that those fonts basically do not contain Cyrillic glyphs.
¡No problema! Just install the ghostscript-fonts-std package and standard Cyrillic fonts become available to Ghostscript. But this works well on "good" platforms and not so well in Cygwin.
Moreover, just installing the fonts is not enough: it only works if you delete the "builtin" font files from the Ghostscript distribution. That is not a very good way (it will affect your karma).

Here is how it works on Debian, for example: when ghostscript-fonts-std is installed, the package also creates the file /etc/fontmap.d/ghostscript/10gsfonts.conf, which contains Fontmap entries for the fonts in /usr/share/fonts/Type1/ (the path may differ slightly, check for yourself).
During installation the package runs update-gsfontmap, which copies the content of all /etc/fontmap.d/*.conf files into a single /var/lib/ghostscript/Fontmap, and Ghostscript in turn looks up this file (it is not the only file Ghostscript looks up, but it is one of the first, and it is better to let Ghostscript find what it needs right away).
The problem is that in Cygwin 10gsfonts.conf is not created. Moreover, Cygwin has no update-gsfontmap utility. A pity.

To solve the issue I created a correct Fontmap for the files in /usr/share/fonts/Type1/ and put it in that directory. I attach this Fontmap file here, but it contains absolute paths, so it may need correcting.
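
If the attached Fontmap does not match your paths, its basic entries can be regenerated. The following is only a sketch of one way to do it, not how my file was made: it assumes each Type 1 font file carries a cleartext "/FontName /Something" line, it relies on GNU grep options (-a, -m, -o), and fontmap_entries is a helper name invented here.

```sh
# fontmap_entries DIR: print one Fontmap line, "/FontName (path) ;",
# for every .pfb/.pfa file in DIR.
# Assumptions: the font name appears in cleartext as "/FontName /Name";
# GNU grep is available. (fontmap_entries is a hypothetical helper.)
fontmap_entries() {
    for f in "$1"/*.pfb "$1"/*.pfa; do
        [ -f "$f" ] || continue
        name=$(grep -a -m 1 -o '/FontName[[:space:]]*/[^[:space:]]*' "$f" | sed 's|.*/||')
        if [ -n "$name" ]; then
            printf '/%s (%s) ;\n' "$name" "$f"
        fi
    done
}

# Typical use:
#   fontmap_entries /usr/share/fonts/Type1 > Fontmap
```

This only produces the entries that map a font name to a file. The aliases that point popular names such as Times-Roman at one of the installed fonts have the form "/Alias /RealFontName ;" and still have to be added by hand, because the right target depends on which fonts you actually have.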

After that,

$ ps2pdf input.ps output.pdf

works like a charm. The PDF is finally cooked.


Mission complete. The world is saved.

Some files I mentioned:
Cyrillic font metrics
Fontmap
example.tr (example file for groff)
input.ps (resulting PostScript file)
output.pdf (resulting PDF)


