Digitalizing Electronic Article
From TorontoMathWiki
Q: Why would one need to digitalize an electronic article (a computer file)? Isn't it already digital? A: It is digital, but perhaps not digital enough, and many versions of it may exist.
In what follows I describe different scenarios and what one can do in each. Victor 16:24, 26 April 2010 (UTC)
Old TeX file
The best thing is to revitalize it:
- assuming it was written in LaTeX 2.09 (\documentstyle), convert it to LaTeX2e (\documentclass) and use \usepackage{foo} instead of \input{pfoo.tex}
- add the line \usepackage{hyperref} to the preamble if it is missing
- process the file and fix any errors
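The first two steps above can be partly automated. The following is a minimal Python sketch, not a complete converter: the function name modernize_preamble and the CLASS_OPTIONS set are assumptions made for illustration, and any option not in that set is simply treated as a style file to be loaded with \usepackage. The check for an existing hyperref line is deliberately crude, and the result still has to be processed and fixed by hand.

```python
import re

# Common LaTeX 2.09 options that remain class options in LaTeX2e.
# Assumption for this sketch: every other option names a style file.
CLASS_OPTIONS = {"10pt", "11pt", "12pt", "twoside", "twocolumn",
                 "titlepage", "draft", "leqno", "fleqn"}

# Matches \documentstyle{cls} and \documentstyle[opt1,opt2]{cls}.
DOCUMENTSTYLE = re.compile(r"\\documentstyle(?:\[([^\]]*)\])?\{([^}]*)\}")


def modernize_preamble(source: str) -> str:
    """Rewrite the first \\documentstyle line as \\documentclass plus
    \\usepackage lines, adding hyperref if the source never mentions it."""

    def repl(match):
        raw = match.group(1) or ""
        options = [o.strip() for o in raw.split(",") if o.strip()]
        class_opts = [o for o in options if o in CLASS_OPTIONS]
        packages = [o for o in options if o not in CLASS_OPTIONS]

        head = "\\documentclass"
        if class_opts:
            head += "[" + ",".join(class_opts) + "]"
        head += "{" + match.group(2) + "}"

        lines = [head] + ["\\usepackage{" + p + "}" for p in packages]
        # Crude check: only add hyperref if the word appears nowhere in the file.
        if "hyperref" not in source:
            lines.append("\\usepackage{hyperref}")
        return "\n".join(lines)

    # Only the first occurrence is rewritten; a file has one preamble.
    return DOCUMENTSTYLE.sub(repl, source, count=1)


if __name__ == "__main__":
    old = "\\documentstyle[12pt,amssymb]{article}\n\\begin{document}\nText\n\\end{document}\n"
    print(modernize_preamble(old))
```

Lines such as \input{pfoo.tex} are left untouched by this sketch and still need to be replaced manually.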
Old DVI file
The old TeX file would definitely be better, but a DVI file is the second-best option. Convert it to PS or PDF using a modern converter.
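As an illustration, the conversion can be scripted. This is a minimal Python sketch that assumes dvips and dvipdfmx (both distributed with TeX Live) are the converters in use; the helper names dvi_commands and convert_dvi are made up for this example, and other converters would work equally well.

```python
import shutil
import subprocess
from pathlib import Path


def dvi_commands(dvi_path: str) -> dict:
    """Return command lines that convert a DVI file to PS and to PDF.

    The output files are written next to the input, with the suffix changed.
    """
    dvi = Path(dvi_path)
    return {
        "ps": ["dvips", "-o", str(dvi.with_suffix(".ps")), str(dvi)],
        "pdf": ["dvipdfmx", "-o", str(dvi.with_suffix(".pdf")), str(dvi)],
    }


def convert_dvi(dvi_path: str, target: str = "pdf") -> None:
    """Run the converter for the requested target ("ps" or "pdf")."""
    cmd = dvi_commands(dvi_path)[target]
    if shutil.which(cmd[0]) is None:
        raise RuntimeError(cmd[0] + " was not found on PATH")
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Print the commands rather than run them, so the sketch is safe to try.
    for name, cmd in dvi_commands("paper.dvi").items():
        print(name, ":", " ".join(cmd))
```

Calling convert_dvi("paper.dvi") would then produce paper.pdf in the same directory, provided the fonts the DVI file refers to are available.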
Old/bad PS file
The PS file looks fine in gs, but after conversion to PDF it looks bad: "Find" does not find words that are obviously there, and one either cannot select and copy text, or can but pasting produces gibberish.
There are many reasons why a PS file could be bad (an old dvips is an obvious suspect), and there are different recipes that may work:
- Before converting to PDF, run pkfix (part of TeX Live), which replaces bitmap fonts with PostScript fonts
- Use Adobe Distiller (comes with Adobe Acrobat) to convert PS to PDF
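The first recipe can be sketched as a two-step pipeline. The Python below is only an illustration: the helper names are hypothetical, and it uses Ghostscript's ps2pdf for the final conversion, which is a substitute for Adobe Distiller rather than the tool named above. pkfix is called in its basic form, pkfix input.ps output.ps, and it can only help when the PS file was produced by dvips.

```python
import shutil
import subprocess
from pathlib import Path


def repair_ps_commands(ps_path: str) -> list:
    """Return the two command lines: pkfix first, then ps2pdf on its output."""
    ps = Path(ps_path)
    fixed = ps.with_name(ps.stem + "-fixed.ps")  # intermediate file
    return [
        ["pkfix", str(ps), str(fixed)],
        ["ps2pdf", str(fixed), str(ps.with_suffix(".pdf"))],
    ]


def repair_ps(ps_path: str) -> None:
    """Run both steps, stopping at the first tool that is missing or fails."""
    for cmd in repair_ps_commands(ps_path):
        if shutil.which(cmd[0]) is None:
            raise RuntimeError(cmd[0] + " was not found on PATH")
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Print the commands rather than run them, so the sketch is safe to try.
    for cmd in repair_ps_commands("old.ps"):
        print(" ".join(cmd))
```

If the resulting PDF is still not searchable, the second recipe remains worth trying.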
Old/Bad PDF file
- Copying, printing, or searching has been deliberately disabled
  - Opening the file with non-Adobe software that does not honour Adobe's restrictions, such as Apple Preview, and using "Save As" removes these restrictions
- The file has a text layer, but it is garbled
  - Adobe Acrobat refuses to perform OCR in this case
  - Fortunately, other OCR software is more sensible: it extracts the page images and then OCRs them