In this chapter...
...we introduce you to the concept of XML and show how to create parameter files structured using XML. XML code that is specific to AppendPDF Pro is shown in dark green text and followed by the notation: (AppendPDF Pro only).
Here we use AppendPDF and AppendPDF Pro as examples, but the concepts apply to all of our products that use XML.
What is XML?
Extensible Markup Language (XML) is a simple, flexible text format for describing text and data. This description takes the form of tags, descriptive mark-up that surrounds the data, such as:
<tag> data </tag>
Tags are clear and readable descriptions of the data they contain. A combination of tags and data is called an element. Elements are “containers”: an element may be a “parent” that contains other “child” elements, which describes the dependent structure of the data. For example:
<parent> <child> data </child> </parent>
The advantages of using XML for parameter files are:
- XML is clearly legible: what data belongs where is more readily apparent.
- XML is more flexible: since data is clearly identified by tags, there are fewer stringent rules about “illegal” characters and spacing.
- XML can be validated against a Document Type Definition (DTD) to make it easier to catch errors.
DTDs and validation
A Document Type Definition defines the tags and their inter-relationships, also referred to as structure, for a given application.
Each of our applications that use XML files comes with a DTD for validation. If you include the DTD by copying and pasting it at the beginning of the parameter file, the file will be validated when the application is run.
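As a minimal sketch of what a DTD placed at the beginning of a file looks like, the generic <parent> and <child> elements from earlier in this chapter can be declared in an internal DTD inside a DOCTYPE declaration. The element names here are only the illustrative ones used above, not elements of any product's parameter file; the DTD shipped with each application is the one to copy.

```xml
<?xml version="1.0"?>
<!-- Illustrative only: "parent" and "child" are the generic example names,
     not elements from a product DTD. -->
<!DOCTYPE parent [
  <!ELEMENT parent (child)>
  <!ELEMENT child (#PCDATA)>
]>
<parent>
  <child> data </child>
</parent>
```

A validating parser would reject this file if, for example, <child> were missing or an undeclared element appeared inside <parent>.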
Understanding Our XML Files
We use XML to describe the parameters that AppendPDF and AppendPDF Pro use to append documents together. Each description of XML elements has three sections:
- A tree view of the elements, which describes the elements and shows how they are related.
- The DTD element declarations, for advanced users.
- The XML code itself.
We break up the explanations into the different subsections of the XML code. Below we show how the code is described, and how it is broken up.
The element table
The element table contains the following information:
- Element — The element tag
- R — Whether the element is required (X) or optional (blank). If the child of an optional element is marked required, it is only required if the parent is used.
- C — The cardinality of the element as expressed in the DTD. The cardinality is defined as the number of elements in a given set. The table below describes the meaning of cardinality of elements.
- Content — what information the element contains, i.e., what it specifies. Note that an element can either contain data or be empty. If it is empty, its mere presence specifies the information needed.
Note: An empty element can be entered either as a beginning tag and an end tag together or as a single tag with a slash "/" after the tag name, as shown below:
<tag></tag> or <tag/>
Cardinality of elements
| C | Cardinality |
| --- | --- |
| [blank] | Default: One and only one instance of the element |
| + | One or more instances of the element |
| ? | Zero or one instance of the element |
| * | Zero or more instances of the element |
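The four cardinality markers can be seen together in a small sketch. The element names below (order, id, item, note, tag) are hypothetical and chosen only for illustration; they are not part of any parameter file DTD.

```xml
<?xml version="1.0"?>
<!DOCTYPE order [
  <!-- Hypothetical elements, for illustration only:
       id    (blank) exactly one
       item  +       one or more
       note  ?       zero or one
       tag   *       zero or more -->
  <!ELEMENT order (id, item+, note?, tag*)>
  <!ELEMENT id (#PCDATA)>
  <!ELEMENT item (#PCDATA)>
  <!ELEMENT note (#PCDATA)>
  <!ELEMENT tag (#PCDATA)>
]>
<order>
  <id>42</id>
  <item>first</item>
  <item>second</item>
  <!-- note is omitted, which "?" allows -->
  <tag>a</tag>
</order>
```

Removing both <item> elements, or adding a second <id>, would make the document invalid against this DTD.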
The Tree View
The tree structure
All of the tables in this chapter that describe the structure of elements use notations and indents to indicate specific things about individual elements:
- The element tree structure shows the topmost element expanded, with its child elements below it.
- The Level column indicates how many levels below the topmost parent element in the table the current child element resides.
- The notation "+" indicates that the element is collapsed and is a parent that contains child elements within its structure.
- Indents are used to assist in understanding the structure levels of the elements in the table.
- When there are elements that come before the current element in the <appendparam> structure, and they are not being shown, there will be a ⇔ symbol before the element tag name in the table.
- Some parent elements contain several options. When only one of the options can be used at a time, the word "OR" appears before the element tag name in the table.
The following table describes the XML structure of the <sourcepdfs> element. Notice that the element <TOCEntry> is dark green. This indicates that the <TOCEntry> element can only be used with AppendPDF Pro.
Contents of the <sourcepdfs> element
| Element | Level | Pro | R | C | Content |
| --- | --- | --- | --- | --- | --- |
| <appendparam> version="1.0" | Top | — | X | — | Topmost element, contains entire parameter spec |
| <sourcepdfs> | 2 | — | X | — | Input files to be appended |
| <inputpdf> | 3 | — | X | + | Input PDF file |
| <file> | 4 | — | X | — | Text: Name and path of input file |
| <startpage> | 4 | — | — | * | Text: Start page of range to extract |
| <endpage> | 4 | — | — | * | Text: End page of range to extract |
| <TOCEntry> | 4 | X | — | * | Text: Table of Contents entry (AppendPDF Pro only) |
| + <inputpdf> | 3 | — | — | + | Additional PDF Files |
DTD Elements
<!ELEMENT sourcepdfs (inputpdf+)>
<!ELEMENT inputpdf (file, (startpage | endpage | TOCEntry)*)>
<!ELEMENT file (#PCDATA)>
<!ELEMENT startpage (#PCDATA)>
<!ELEMENT endpage (#PCDATA)>
<!ELEMENT TOCEntry (#PCDATA)>
XML Code Sample
Specifying an input file.
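The following is a minimal sketch of a <sourcepdfs> element built from the table and DTD declarations above. The file path and page numbers are hypothetical placeholders chosen for illustration; they are assumptions, not values taken from the product's own sample files.

```xml
<sourcepdfs>
  <inputpdf>
    <!-- Hypothetical path: replace with the name and path of your input PDF -->
    <file>/path/to/input.pdf</file>
    <!-- Illustrative page range; both elements are optional -->
    <startpage>2</startpage>
    <endpage>10</endpage>
  </inputpdf>
</sourcepdfs>
```

Because <inputpdf> has cardinality "+", additional <inputpdf> elements can follow the first one inside <sourcepdfs>.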
Elements for <sourcepdfs>
Inputpdf
Specifies the details of an input file.
File
Specifies the path and filename of an input PDF file.
Startpage, Endpage
Specify the start page and end page of a range of pages to extract and append to the output document. The example above will extract pages X through 10 of the input document.
TOCEntry
Specifies text to use in the Table of Contents to identify this range of pages. Used only if you specify that AppendPDF Pro build a TOC; see Specifying the Table of Contents <TOC> (AppendPDF Pro only).
Note: If you do not specify a TOCEntry for the inputpdf and a <TOC> element is specified, the Title of the inputpdf will appear in the Table of Contents and as the bookmark text. If the file does not have a Title set, the filename of the inputpdf will be used (AppendPDF Pro only).
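As a sketch of the behavior described in the note, the fragment below lists two input files: one with a TOCEntry and one without. The file paths and entry text are hypothetical, and the fragment assumes a <TOC> element is specified elsewhere in the parameter file (AppendPDF Pro only).

```xml
<sourcepdfs>
  <inputpdf>
    <!-- Hypothetical path and entry text -->
    <file>/path/to/chapter1.pdf</file>
    <TOCEntry>Chapter 1: Getting Started</TOCEntry>
  </inputpdf>
  <inputpdf>
    <!-- No TOCEntry: per the note, the document Title is used,
         or the filename if no Title is set -->
    <file>/path/to/chapter2.pdf</file>
  </inputpdf>
</sourcepdfs>
```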
Contents of the <extras> element
The following table shows the <extras> element. Several of its child elements contain options of which only one can be used at a time, so the word "OR" appears before each of those option names.
| Element | Level | R | C | Content |
| --- | --- | --- | --- | --- |
| <appendparam> version="1.0" | Top | X | — | Topmost element, contains entire parameter spec |
| ⇔ <extras> | 2 | — | * | Specifies how a document should open |
| <openmode> | 3 | — | * | Specifies navigation mode in which file should open |
| OR <showbookmarks> | 4 | — | — | Empty: Specifies that bookmarks should show |
| OR <showthumbnails> | 4 | — | — | Empty: Specifies that thumbnails should show |
| OR <shownone> | 4 | — | — | Empty: Specifies only the document should show |
| OR <fullscreen> | 4 | — | — | Empty: File opens in full screen mode |
| <opentopage> | 3 | — | * | Text: Specifies page at which to open |
| <stampfile> | 3 | — | * | Text: Name and path of stampfile to use (AppendPDF Pro only) |
| <viewmode> | 3 | — | * | Specifies the view that the file should open in |
| OR <actualsize> | 4 | — | — | Empty: File opens actual size |
| OR <fitheight> | 4 | — | — | Empty: File opens so window fits full height |
| OR <fitpage> | 4 | — | — | Empty: File opens so window fits full page |
| OR <fitvisible> | 4 | — | — | Empty: File opens so visible area fits the window |
| OR <fitwidth> | 4 | — | — | Empty: File opens to fit full width of window |
| <bookmarkmode> | 3 | — | * | Specifies the initial state of bookmarks |
| OR <openbookmarks> | 4 | — | — | Empty: Show all bookmarks |
| OR <closebookmarks> | 4 | — | — | Empty: Collapse all bookmarks |
| OR <openlevel> | 4 | — | — | Specifies number of bookmark levels shown |
| <layoutmode> | 3 | — | * | Specifies the initial display layout mode |
| OR <single> | 4 | — | — | Empty: One page at a time |
| OR <onecolumn> | 4 | — | — | Empty: Pages in a continuous vertical column |
| OR <twocolleft> | 4 | — | — | Empty: Two pages side by side, first page left |
| OR <twocolright> | 4 | — | — | Empty: Two pages side by side, first page right |
| <displaymode> | 3 | — | * | Specifies the initial window display mode |
| OR <hidetoolbar> | 4 | — | + | Empty: File opens with tool bar hidden |
| OR <hidemenubar> | 4 | — | + | Empty: File opens with menu bar hidden |
| OR <hidewinui> | 4 | — | + | Empty: File opens without window controls |
| OR <fitwin> | 4 | — | + | Empty: Window opens resized to fit the first page |
| OR <centerwin> | 4 | — | + | Empty: Centers document window on screen |
| OR <showtitle> | 4 | — | + | Empty: Displays document title on title bar |
DTD Elements
<!ELEMENT extras (stampfile | opentopage | openmode | viewmode | bookmarkmode | layoutmode | displaymode)*>
<!ELEMENT stampfile (#PCDATA)>
<!ELEMENT opentopage (#PCDATA)>
<!ELEMENT openmode (showbookmarks | showthumbnails | shownone | fullscreen)>
<!ELEMENT showbookmarks EMPTY>
<!ELEMENT showthumbnails EMPTY>
<!ELEMENT shownone EMPTY>
<!ELEMENT fullscreen EMPTY>
<!ELEMENT viewmode (actualsize | fitheight | fitpage | fitvisible | fitwidth)>
<!ELEMENT actualsize EMPTY>
<!ELEMENT fitheight EMPTY>
<!ELEMENT fitpage EMPTY>
<!ELEMENT fitvisible EMPTY>
<!ELEMENT fitwidth EMPTY>
<!ELEMENT bookmarkmode (openbookmarks | closebookmarks | openlevel)>
<!ELEMENT openbookmarks EMPTY>
<!ELEMENT closebookmarks EMPTY>
<!ELEMENT openlevel (#PCDATA)>
<!ELEMENT layoutmode (single | onecolumn | twocolleft | twocolright)>
<!ELEMENT single EMPTY>
<!ELEMENT onecolumn EMPTY>
<!ELEMENT twocolleft EMPTY>
<!ELEMENT twocolright EMPTY>
<!ELEMENT displaymode (hidetoolbar | hidemenubar | hidewinui | fitwin | centerwin | showtitle)+>
<!ELEMENT hidetoolbar EMPTY>
<!ELEMENT hidemenubar EMPTY>
<!ELEMENT hidewinui EMPTY>
<!ELEMENT fitwin EMPTY>
<!ELEMENT centerwin EMPTY>
<!ELEMENT showtitle EMPTY>
XML Code Sample
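The following is a minimal sketch of an <extras> element assembled from the table and DTD declarations above. The particular options and values chosen are illustrative assumptions, not a recommended configuration or a copy of the product's own sample.

```xml
<extras>
  <!-- One navigation option only: the openmode children are alternatives -->
  <openmode><showbookmarks/></openmode>
  <!-- Illustrative page number -->
  <opentopage>1</opentopage>
  <viewmode><fitpage/></viewmode>
  <!-- Illustrative value: number of bookmark levels shown -->
  <bookmarkmode><openlevel>2</openlevel></bookmarkmode>
  <layoutmode><single/></layoutmode>
  <!-- displaymode is declared with "+", so several options may be combined -->
  <displaymode>
    <hidetoolbar/>
    <centerwin/>
  </displaymode>
</extras>
```

Each empty option is written in the short <tag/> form; the equivalent <tag></tag> form would work as well.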
Building XML Parameter Files
Now that you understand our notation, use the information to either:
- Build an XML parameter file from scratch, or
- Edit one of the sample XML files to suit your needs.