Creating Proceedings from Postscript Files
As Publications Chair for SOCS-3 I had to assemble the
proceedings. As raw input I had a series of postscript files, with
the pages unnumbered. I had to produce a single document containing all
the files, with numbered pages and a table of contents. I was too lazy to
add an author index, although it wouldn't have been that hard.
This task may be easy if you have the right tools, but I tried to
do it using only freely available software (not only free, but also at
no cost). It took a few attempts before I got a decent result, so this
page shares my experience.
The basic trick was to create a LaTeX document which includes
the pages of all the papers as figures. For this purpose I had to
slice each document into pages and to transform each page into
encapsulated postscript. Here are the steps I went through; the
solution may be suboptimal, but it works.
- First make sure that all documents are in conforming postscript.
Some would crash ghostscript, some would crash other postscript
utilities. For some very stubborn documents generated with Microsoft
tools I had to ask the authors to supply me with a better version, as the
document would even crash the printer. Apparently Windows 95's
Postscript is better than Windows NT's.
- Make sure you have the latest versions of the psutils tools, and
of ghostscript and dvips. Running with older tools didn't give usable
results.
- If you have some Adobe tools and non-conforming postscript, try to
distill it into .pdf and back to postscript using the distill and
acroexch tools (available at CMU on Solaris).
- Otherwise there are the ps2pdf and pdf2ps tools, but they are likely
to rely on ghostscript, so they are no more robust than ghostscript itself.
- Then I created the following LaTeX document (I gave it a .txt
suffix, so you can see it in Netscape). It uses the macro \pageimage
to include page images. Notice the \epsfxsize command, which is used
to constrain the image to the proper size.
- Then I created a text file called order
(the name is hardwired in the script too) listing all the papers.
They appear in the order they should be included in the proceedings.
Each paper is given by its file name only. The file may also contain
comments starting with # and empty lines; there is also a keyword
'stop', which I used to debug my script.
- The main job is done by the following Perl script. This script does the
following operations for each paper in the 'order' file:
- It cleans the postscript using the ps2ps program. This will also
crop the pages to the bounding box.
- It uses psselect to cut each document into individual pages.
- It transforms each page into an .epsi file. This is a little
wasteful, because epsi is not only encapsulated postscript, but
also contains a 'preview' of the file in another format (usually
.tiff). But the other methods I tried for generating .eps failed,
because apparently they rasterize the image and lose resolution.
- It will write for each page a LaTeX line to include the page as a
figure.
- At the end of each document it generates a LaTeX macro
which records the page number where the document begins.
- The script generates two auxiliary LaTeX files,
'pagenumbers.tex' and 'allpapers.tex', which are included by the
main LaTeX file.
- In the end, just run the script once and LaTeX once to generate
the final document.
- Using dvips you can get the postscript booklet.
- Caveat: if some of the documents contain pages which are narrower
than usual (e.g. a single column), they will be enlarged a lot, because
each page is cropped and then scaled to a fixed width by \epsfxsize. For
such pages I had to manually create a new \halfpageimage macro which
includes somewhat narrower figures.
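
The per-paper loop described in the steps above can be sketched roughly as follows. This is not the original Perl script, only a minimal Python illustration under several assumptions: pages are counted by looking for DSC '%%Page:' comments (which only works on conforming postscript), ps2epsi is assumed to be the tool that produces the .epsi files, \pageimage is assumed to take the file name as its single argument, and \paperstart is a hypothetical macro name standing in for whatever the real script writes to 'pagenumbers.tex'. The intermediate file names are made up as well. The sketch only plans the commands and the LaTeX lines; it does not run any external program.

```python
import os


def read_order(lines):
    """Return the paper file names listed in the lines of an 'order' file.

    '#' starts a comment, empty lines are skipped, and a line reading
    'stop' ends processing early (the debugging keyword).
    """
    papers = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        if line == "stop":
            break
        papers.append(line)
    return papers


def count_pages(ps_text):
    """Count pages via DSC '%%Page:' comments (assumes conforming input)."""
    return sum(1 for l in ps_text.splitlines() if l.startswith("%%Page:"))


def plan_paper(name, n_pages, first_page):
    """Plan the work for one paper; nothing is executed here.

    Returns (commands, page_lines, start_line):
      commands   - argument lists for ps2ps, psselect and ps2epsi
      page_lines - one \\pageimage line per page (assumed one-argument macro)
      start_line - hypothetical \\paperstart entry with the first page number
    """
    stem = os.path.splitext(name)[0]
    clean = stem + "-clean.ps"  # made-up intermediate file name
    commands = [["ps2ps", name, clean]]
    page_lines = []
    for p in range(1, n_pages + 1):
        page_ps = f"{stem}-{p}.ps"
        page_epsi = f"{stem}-{p}.epsi"
        commands.append(["psselect", f"-p{p}", clean, page_ps])
        commands.append(["ps2epsi", page_ps, page_epsi])
        page_lines.append(f"\\pageimage{{{page_epsi}}}")
    start_line = f"\\paperstart{{{stem}}}{{{first_page}}}"
    return commands, page_lines, start_line


def build(order_lines, page_counts):
    """Plan all papers; page_counts maps each file name to its page count.

    Returns (commands, allpapers, pagenumbers), where the last two are the
    lines destined for 'allpapers.tex' and 'pagenumbers.tex'.
    """
    commands, allpapers, pagenumbers = [], [], []
    next_page = 1
    for name in read_order(order_lines):
        n = page_counts[name]
        cmds, page_lines, start_line = plan_paper(name, n, next_page)
        commands.extend(cmds)
        allpapers.extend(page_lines)
        pagenumbers.append(start_line)
        next_page += n
    return commands, allpapers, pagenumbers
```

To actually produce the files, each planned command could be passed to subprocess.run, and the two lists of lines written to 'allpapers.tex' and 'pagenumbers.tex'. Keeping the planning separate from the execution makes it easy to inspect what would be run before touching a stubborn document.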