$Header: /home/cvsroot/dvipdfmx/README,v 1.6 2002/10/29 07:45:20 chofchof Exp $

The dvipdfmx Project
====================

Last modified: October 28, 2002


Copyright (C) 2002 by Jin-Hwan Cho and Shunsaku Hirata,
the dvipdfmx project team <dvipdfmx@project.ktug.or.kr>

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.


CONTENTS
--------

1. INTRODUCTION

2. DOWNLOAD

3. INSTALLATION

4. FEATURES

   4.1. Double-byte character encodings
   4.2. CIDFonts
   4.3. Stylistic variants
   4.4. No-embedding option
   4.5. Advanced typographic features
   4.6. Vertical writing
   4.7. CMap embedding
   4.8. New special commands

5. LIMITATION

6. REFERENCES


1. INTRODUCTION
   ------------

The dvipdfmx (formerly dvipdfm-cjk) project provides an eXtended version
of dvipdfm, a DVI to PDF translator developed by Mark A. Wicks.

The primary goal of this project is to support multi-byte character encodings
and large character sets for East Asian languages by using CID-keyed font
technology. The secondary goal is to support as many features of pdfTeX,
developed by Han The Thanh, as possible.

This project combines the dvipdfm-jpn project by Shunsaku Hirata with its
modified version, dvipdfm-kor, by Jin-Hwan Cho.


2. DOWNLOAD
   --------

The current snapshot of the dvipdfmx project is available at:

  http://project.ktug.or.kr/dvipdfmx/snapshot/

The CVS repository for this project can be checked out through anonymous
(pserver) CVS with the following commands. When prompted for the password
for anonymous, simply press the Enter key.

  cvs -d:pserver:anonymous@cvs.ktug.or.kr:/home/cvsroot login
  cvs -d:pserver:anonymous@cvs.ktug.or.kr:/home/cvsroot co dvipdfmx


3. INSTALLATION
   ------------

The kpathsea library is required to compile and install dvipdfmx on UNIX
or Linux platforms. It is a part of common TeX distributions, for example,
teTeX. See `INSTALL' for more details.

In addition to the resources needed by the original dvipdfm, dvipdfmx
requires the following resources.

1) CMap resource files under the directory `${TEXMF}/dvipdfm/CMap'.
   Alternatively, specify the directory containing CMap resource files
   in the variable CMAPINPUTS in `${TEXMF}/web2c/texmf.cnf'.

   The directory `data/CMap' contains a few CMap files written by
   the dvipdfmx project team.
   
   Adobe's `CMaps for PDF 1.4 CJK Fonts' are available at:

     http://partners.adobe.com/asn/developer/technotes/acrobatpdf.html

   Some standard CMap files for CJK languages are also available at:

     ftp://ftp.oreilly.com/pub/examples/nutshell/cjkv/adobe/

2) SubFont Definition files (.sfd) under `${TEXMF}/ttf2pk' or
   `${TEXMF}/ttf2tfm', as specified in `${TEXMF}/web2c/texmf.cnf',
   to use the subfont feature required for CJK and HLaTeX packages.

3) An appropriate font mapping file for CID-keyed fonts, for example,
   `${TEXMF}/dvipdfm/config/cid-x.map'. Do not forget that the name of
   the font mapping file should be recorded in the configuration file
   `${TEXMF}/dvipdfm/config/dvipdfmx.cfg'.

   See the document `FONTMAP' for the format of the font mapping
   files for CID-keyed fonts.

4) OpenType fonts (.otf) under `${TEXMF}/dvipdfm/opentype' to use as
   Type 2 CIDFonts, because the kpathsea library does not support
   'kpse_opentype_format' yet. At present, dvipdfmx uses
   'kpse_program_binary_format' for this kind of font.


4. FEATURES
   --------

4.1. Double-byte character encodings

Double-byte character encodings are processed in TeX either directly
with double-byte codes (e.g., ASCII pTeX and Omega) or by the
subfont approach with single-byte codes (e.g., CJK and HLaTeX packages).

In the case of the DVI file generated by the subfont approach, dvipdfmx
converts single-byte codes to double-byte codes according to the subfont
definition record given in the font mapping file.

The double-byte character codes are then converted to CID (Character ID)
numbers according to the CMap record given in the font mapping file;
for example, EUC-H and EUC-V CMaps give the mapping from character codes
in the ASCII/JIS X 0208 character set encoded with EUC-JP encoding to
CID numbers in the Adobe-Japan1 character collection.
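
The two conversion steps described above can be sketched as follows. This
is only a minimal Python illustration, not the code used by dvipdfmx: the
subfont table, the single CMap range, the CID values, and the function
names are all hypothetical, and real subfont definitions need not cover
contiguous code ranges.

```python
# Hypothetical subfont definition: subfont suffix -> first double-byte
# code covered by that subfont (256 single-byte slots per subfont).
SUBFONT_BASE = {"01": 0xB000, "02": 0xB100}

# Hypothetical CMap fragment: (first code, last code, first CID).
CMAP_RANGES = [(0xB0A1, 0xB0FE, 1125)]


def to_double_byte(suffix: str, single_byte: int) -> int:
    """Step 1: map a single-byte code in a subfont to a double-byte code."""
    if not 0 <= single_byte <= 0xFF:
        raise ValueError("single-byte code out of range")
    return SUBFONT_BASE[suffix] + single_byte


def to_cid(code: int) -> int:
    """Step 2: map a double-byte code to a CID using contiguous ranges."""
    for first, last, first_cid in CMAP_RANGES:
        if first <= code <= last:
            return first_cid + (code - first)
    return 0  # CID 0 conventionally stands for the undefined glyph


code = to_double_byte("01", 0xA3)  # single-byte 0xA3 in subfont "01"
cid = to_cid(code)                 # CID written to the PDF file
```

Because the CID is computed before the PDF file is written, the font in
the PDF file only needs the Identity CMap, as explained next.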

Therefore, each CID-keyed font in the generated PDF file consists of
the Identity CMap and an appropriate CIDFont. Although this conversion
is not strictly necessary, it greatly simplifies the procedure and maximizes
flexibility: multiple encodings can be applied to a single font, and
non-standard, customized encodings can be used rather easily in the DVI file.

At present, the Omega level-0/level-1 font metric (OFM) format and the JFM
format, which is an extended TeX font metric format used by ASCII pTeX,
are supported for double-byte TeX font metric files.


4.2. CIDFonts

There are three kinds of CIDFonts supported by dvipdfmx.

1) Pre-defined CIDFonts

   The following CIDFonts, which cannot be embedded, are pre-defined in
   `source/cid_basefont.h'. Those fonts are available at:
 
   http://www.adobe.com/products/acrobat/acrrasianfontpack.html

   for use with Acrobat Reader.

   --------------------------------------------
   Character collection     Pre-defined CIDFont
   --------------------------------------------
   Adobe-Japan1-2           HeiseiMin-W3
                            HeiseiKakuGo-W5
   Adobe-Korea1-0           HYGoThic-Medium
                            HYSMyeongJo-Medium
   Adobe-CNS1-0             MHei-Medium
                            MSung-Light
   Adobe-GB1-2              STSong-Light
   --------------------------------------------

2) OpenType CIDFonts (CIDFontType0)

   PostScript CID-keyed fonts are supported only in the CFF OpenType format
   (OTF). OpenType CIDFonts can be embedded if editable, installable, or
   preview & print embedding is allowed. Those fonts are always subsetted,
   i.e., only the glyph data actually used are embedded.

3) TrueType CIDFonts (CIDFontType2)

   TrueType fonts (TTC or TTF) can be embedded as CIDFontType2 CIDFonts in
   a PDF file, provided that editable, installable, or preview & print
   embedding is granted. Those fonts are always subsetted, i.e., only
   the glyph data actually used are embedded.

   To embed TrueType glyph data, an appropriate CMap resource is required,
   which defines a mapping from CID numbers to character codes used
   in the TrueType cmap (character to glyph index mapping) table.

   For example, in order to embed a Japanese TrueType font with Unicode
   encoding, the CMap resource `Adobe-Japan1-UCS2' is required, which
   maps CID numbers in the Adobe-Japan1 character collection to Unicode
   values. As this example indicates, the required CMap is named
   REGISTRY-ORDERING-ENCODING, where REGISTRY-ORDERING is the name of
   the character collection and ENCODING is the name of the encoding
   that the TrueType cmap table uses.

   Almost all TrueType cmap tables commonly used in MS-Windows are
   supported (Mac encodings may also work), except for:

     Symbol (Platform ID 3, Encoding ID 0)
     Johab  (Platform ID 3, Encoding ID 6)
     UCS4   (Platform ID 3, Encoding ID 10)

   Here is a list of the required CMaps for each encoding:

   ---------------------------------------------------------------------
   cmap Encoding  PID  EID  ENCODING   Language   Required CMap
   ---------------------------------------------------------------------
   Unicode        3    1    UCS2       Chinese_S  Adobe-GB1-UCS2
                                       Chinese_T  Adobe-CNS1-UCS2
                                       Japanese   Adobe-Japan1-UCS2
                                       Korean     Adobe-Korea1-UCS2
   ---------------------------------------------------------------------
   PRC            3    3    GBK-EUC    Chinese_S  Adobe-GB1-GBK-EUC
   (Mac)          1   25    GBpc-EUC              Adobe-GB1-GBpc-EUC
   ---------------------------------------------------------------------
   Big5           3    4    ETen-B5    Chinese_T  Adobe-CNS1-ETen-B5
   (Mac)          1    2    B5pc                  Adobe-CNS1-B5pc
   ---------------------------------------------------------------------
   ShiftJIS       3    2    90ms-RKSJ  Japanese   Adobe-Japan1-90ms-RKSJ
   (Mac)          1    1    90pv-RKSJ             Adobe-Japan1-90pv-RKSJ
   ---------------------------------------------------------------------
   Wansung        3    5    KSCms-UHC  Korean     Adobe-Korea1-KSCms-UHC
   (Mac)          1    3    KSCpc-EUC             Adobe-Korea1-KSCpc-EUC
   ---------------------------------------------------------------------
   PID: Platform ID, EID: Encoding ID

   All CMaps listed above can be found in the directory `data/CMap'.

   To list all cmap tables in a TrueType font file, use ttfdump or ftdump,
   which are freely available from the FreeType FTP site.
   Microsoft also distributes a TrueType dump program, TTFDUMP.exe, at:

     http://www.microsoft.com/typography/tools/tools.htm
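
The table of required CMaps in item 3) above can be read as a lookup from
(Platform ID, Encoding ID) to a CMap name of the form
REGISTRY-ORDERING-ENCODING. The following Python sketch only restates
that table; the names CMAP_ENCODINGS and required_cmap are hypothetical
and do not correspond to anything in the dvipdfmx sources.

```python
# (Platform ID, Encoding ID) -> (ENCODING, character collection or None).
# None means the collection depends on the language of the font, as in
# the Unicode row of the table.
CMAP_ENCODINGS = {
    (3, 1): ("UCS2", None),
    (3, 3): ("GBK-EUC", "Adobe-GB1"),
    (1, 25): ("GBpc-EUC", "Adobe-GB1"),
    (3, 4): ("ETen-B5", "Adobe-CNS1"),
    (1, 2): ("B5pc", "Adobe-CNS1"),
    (3, 2): ("90ms-RKSJ", "Adobe-Japan1"),
    (1, 1): ("90pv-RKSJ", "Adobe-Japan1"),
    (3, 5): ("KSCms-UHC", "Adobe-Korea1"),
    (1, 3): ("KSCpc-EUC", "Adobe-Korea1"),
}


def required_cmap(pid, eid, collection=None):
    """Return the name of the CMap needed for a TrueType cmap table."""
    try:
        encoding, fixed = CMAP_ENCODINGS[(pid, eid)]
    except KeyError:
        raise ValueError(f"unsupported cmap table: PID {pid}, EID {eid}") from None
    registry_ordering = fixed or collection
    if registry_ordering is None:
        raise ValueError("a character collection is needed for this encoding")
    return f"{registry_ordering}-{encoding}"


name = required_cmap(3, 1, "Adobe-Japan1")  # Unicode cmap, Japanese font
```

In this sketch the unsupported tables listed earlier (Symbol, Johab, and
UCS4) are simply absent from the dictionary, so asking for them raises an
error.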


4.3. Stylistic variants (Bold, Italic, BoldItalic)

It is possible to use a bold or italic style even if there is no font
data for that style by appending a comma and the style name (one of Bold,
Italic, or BoldItalic) to the font name in the font mapping file.

Unfortunately, the availability of this feature depends heavily on the
implementation of PDF viewers. For example, in the popular PDF viewers
Adobe Acrobat Reader and GNU Ghostscript, this feature is available
only for non-embedded fonts.

Note that these variants automatically enable the no-embedding option,
with a warning message.

At present this feature is implemented in dvipdfmx only for CIDFonts.


4.4. No-embedding option

It is possible to block the embedding of glyph data by placing the
character `!' in front of the font name in the font mapping file.

This feature reduces the size of the final PDF file, but the PDF file
may not be displayed correctly on other systems where the appropriate
fonts are not installed.

Use of this option is not recommended for fonts that contain unusual
characters.

At present this feature is implemented in dvipdfmx only for CIDFonts.
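
The font name conventions of sections 4.3 and 4.4 (a leading `!' and a
trailing comma plus style name) can be illustrated with a small Python
sketch. It looks only at the font name field; the names FontSpec and
parse_font_name are hypothetical, and the full format of the font mapping
file is described in `FONTMAP', not here.

```python
from typing import NamedTuple

STYLES = ("Bold", "Italic", "BoldItalic")


class FontSpec(NamedTuple):
    name: str    # font name without `!' and without the style suffix
    style: str   # "" when no stylistic variant is requested
    embed: bool  # False when glyph data must not be embedded


def parse_font_name(field: str) -> FontSpec:
    """Split a font name field into name, style, and embedding flag."""
    embed = True
    if field.startswith("!"):
        embed = False
        field = field[1:]
    name, style = field, ""
    base, sep, suffix = field.rpartition(",")
    if sep and suffix in STYLES:
        name, style = base, suffix
        embed = False  # a stylistic variant implies the no-embedding option
    return FontSpec(name, style, embed)


spec = parse_font_name("HeiseiMin-W3,Bold")
```

Here `spec.embed' is false even though no `!' was given, which mirrors the
rule that stylistic variants turn on the no-embedding option.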


4.5. Advanced typographic (OT Layout/AAT) features

Experimental support for single glyph substitution is available for
selecting vertical versions of glyphs in OpenType fonts. AAT tables
are not supported yet.


4.6. Vertical writing

The vertical writing mode is supported only for ASCII pTeX.

Here is a note on ASCII pTeX's implementation of vertical writing:

ASCII pTeX does the same vertical text positioning as PDF/PostScript
interpreters do: the baseline (or centerline) runs from the top of the
page to the bottom of the page when the writing mode is `vertical'.
In ASCII pTeX, however, the writing mode associated with the font (JFM) and
the text direction are taken into account separately, as opposed to the PDF
and PostScript languages.

There are two flavors of the Japanese TeX font metric format: horizontal and
vertical. The two formats are exactly the same except that they have
different IDs (11 for horizontal, 9 for vertical) and that the glyph metrics
in a vertical JFM file actually represent vertical glyph metrics.
When the font is a vertical font, `depth' (`height') is interpreted as the
distance from the baseline to the leftmost (rightmost) side of the glyph's
bounding box. The pen position advances in the vertical direction (downward)
by the amount of the character `width'.


          |------- W -------|             |-- D ---|-- H ---|
           -----------------   -           --------O--------  -
          |        *        |  |          |        *        | |
          | *************** |  |          | *************** | |
          |    *      *     |             |    *   |  *     |
          |     *    *      |  H          |     *  | *      | W
          |      * *        |             |      * *        |
          |      * *        |  |          |      * *        | |
          |    *     *      |  |          |    *   | *      | |
 baseline O--*---------**---+  -          |  *     |   **   | |
           -----------------               --------+--------  -
              horizontal                        vertical


    W: width (horizontal/vertical advance)
    H: height
    D: depth
    O: glyph's origin

Fig.1. Interpretation of glyph metrics for horizontal and vertical fonts in
       ASCII pTeX's JFM format.
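
The distinction between the two JFM flavors can be sketched in Python as
follows. The sketch assumes that the ID is stored as a big-endian 16-bit
value at the very beginning of the JFM file; that layout, the function
names, and the choice of a y-axis pointing upward are assumptions made for
the illustration.

```python
import struct

JFM_ID_HORIZONTAL = 11
JFM_ID_VERTICAL = 9


def jfm_direction(data: bytes) -> str:
    """Classify JFM data by its ID (assumed: first two bytes, big-endian)."""
    if len(data) < 2:
        raise ValueError("JFM data too short")
    (jfm_id,) = struct.unpack(">H", data[:2])
    if jfm_id == JFM_ID_HORIZONTAL:
        return "horizontal"
    if jfm_id == JFM_ID_VERTICAL:
        return "vertical"
    raise ValueError(f"not a JFM ID: {jfm_id}")


def pen_advance(direction: str, width: int) -> tuple:
    """Return the (dx, dy) pen movement for one glyph, y pointing upward."""
    if direction == "horizontal":
        return (width, 0)   # move right by the character width
    return (0, -width)      # vertical font: move downward by `width'


step = pen_advance(jfm_direction(b"\x00\x09"), 1000)
```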


The DVI command `dir' (opcode 255) changes the writing direction. The dir
command takes a single argument (an unsigned byte). The DVI byte sequence
0xff 0x00 changes the current writing direction to horizontal mode, and
likewise, 0xff 0x01 changes it to vertical mode. When a horizontal font is
selected in the vertical direction mode, all characters are rotated 90 degrees
clockwise (around the glyph's origin). This transformation is also applied
to DVI `rule's (boxes) and embedded figures.
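
As a rough illustration of the paragraph above, the following Python
sketch decodes a `dir' command and rotates a point 90 degrees clockwise
about the origin. The function names are hypothetical, and the rotation
assumes a coordinate system whose y-axis points upward; this is not the
code used by dvipdfmx.

```python
DIR_OPCODE = 0xFF  # DVI opcode 255


def decode_dir(data: bytes, pos: int):
    """Decode a dir command at data[pos]; return (mode, next position)."""
    if pos + 1 >= len(data) or data[pos] != DIR_OPCODE:
        raise ValueError("no complete dir command at this position")
    arg = data[pos + 1]  # single unsigned-byte argument
    if arg == 0:
        mode = "horizontal"
    elif arg == 1:
        mode = "vertical"
    else:
        raise ValueError(f"unknown dir argument: {arg}")
    return mode, pos + 2


def rotate_clockwise(x, y):
    """Rotate (x, y) by 90 degrees clockwise about the origin (y up)."""
    return (y, -x)


mode, next_pos = decode_dir(bytes([0xFF, 0x01]), 0)
```

With this convention a point one unit to the right of the glyph's origin,
(1, 0), ends up one unit below it, (0, -1), which matches text that
proceeds downward.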

Put differently, the h-direction is not necessarily tied to the
x-direction of the device coordinate space; instead, it coincides with
the direction in which text proceeds in each line.


4.7. CMap embedding

Embedded ToUnicode CMap is not supported yet.


4.8. New special commands

\special{pdf: tounicode <CMap resource file>}

\special{pdf: literal [direct|reverse]}


5. LIMITATION
   ----------

See `BUGS'.


6. REFERENCES
   ----------

CID-keyed fonts are a core technology for supporting CJK (Chinese, Japanese,
and Korean) languages and other languages that require a large number of
characters in PDF. See Adobe's technical notes for a detailed description
of CID-keyed fonts:

  - Technical Note #5092: CID-Keyed Font Technology Overview
  - Technical Specification #5014: Adobe CMap and CIDFont Files Specification
  - Technical Note #5099: Building CMap Files for CID-Keyed Fonts

Those documents are available at:

  http://partners.adobe.com/asn/developer/technotes/main.html

The OpenType specification is available at:

  http://www.microsoft.com/typography/otspec/default.htm

