
PDF Document Management Software, Services & Support



Objects and Semantics


by Duff Johnson

Monday, December 20, 2010

Let's step back a moment and get metaphysical about electronic content. Can we do that?

Whatever else they might possess, electronic documents always possess two things: objects and semantics.

  • Document “objects” are the physical characters, images, lines, bullets and other features that consume ink or toner when printed.
  • Document “semantics” define the logical relationships between the aforementioned objects. Most users never think about such things, but they use document semantics to read nonetheless.

Figure: A screenshot of an HTML rendition of part of the current page. Text objects are in black type; semantic markup is in contrasting red type.

If you can read this, you already know that placing specific characters in a specific sequence forms a “word”. Organize words into lines and group the lines together – conventionally-abled users will see “paragraphs”.

Change the typeface and size of a specific line of text, and most users will look there for a “heading”.

Organize some words or numbers into a grid and users will see headings, rows and columns, and will thus understand it as tabular information (a “table”). 

Unless Facebook, Twitter and the briefest of emails are your only outlets for the written word, you create “semantic structures” to organize your “objects” more or less every time you write. And if you can read, you use semantics to organize the objects you see.

Web content managers think about these things in more concrete terms, as HTML semantics. Most people don't think about semantics much at all, which is a key reason why so much electronic content is inaccessible to users with disabilities.

However the concept is expressed in your preferred authoring software, semantics are chosen by the author. In HTML and PDF, semantics are expressed with “tags”.

If you place <p> and </p> tags around a stream of characters, that means “paragraph”.

Change the paragraph tags to <h2> and </h2> and you've made a “heading”.

Tables are denoted with tags such as <table>, <tr> (for “table row”), <th> (for “table header”), and so on. Other tags allow the author to denote lists, images, links and more.
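To make the split between objects and semantics concrete, here is a minimal Python sketch that walks a small HTML fragment and sorts what it finds into two piles: the text runs (the “objects”) and the tag names (the “semantics”). It uses only the standard library's `html.parser` module. The class name, the function name and the sample markup are illustrative assumptions, not part of any real product or of this page's actual source.

```python
from html.parser import HTMLParser


class ObjectSemanticsSplitter(HTMLParser):
    """Collect text runs ("objects") separately from tag names ("semantics")."""

    def __init__(self):
        super().__init__()
        self.objects = []    # visible text runs, in document order
        self.semantics = []  # opening tag names, in document order

    def handle_starttag(self, tag, attrs):
        # Every opening tag is a semantic statement made by the author.
        self.semantics.append(tag)

    def handle_data(self, data):
        # Ignore whitespace-only runs between tags.
        text = data.strip()
        if text:
            self.objects.append(text)


def split_objects_and_semantics(html):
    """Return (objects, semantics) for an HTML fragment."""
    parser = ObjectSemanticsSplitter()
    parser.feed(html)
    parser.close()
    return parser.objects, parser.semantics


# Hypothetical sample markup, invented for this illustration.
SAMPLE = """
<h2>Objects and Semantics</h2>
<p>Electronic documents always possess two things.</p>
<table>
  <tr><th>Tag</th><th>Meaning</th></tr>
  <tr><td>p</td><td>paragraph</td></tr>
</table>
"""

objects, semantics = split_objects_and_semantics(SAMPLE)
print(objects)
print(semantics)
```

Strip the second list away and the first still prints the same words, but nothing says which of them is a heading, a paragraph or a table cell. That missing information is what the tags carry.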

PDF borrowed all these concepts from HTML over ten years ago.

Once semantics are understood, the way they apply to PDF files becomes easier to understand and appreciate, and the files themselves become easier to manipulate. Content navigation, text extraction and search-engine optimization (SEO) get easier. Conformance with accessibility standards such as the Web Content Accessibility Guidelines (WCAG) 2.0 and the forthcoming ISO 14289-1 (PDF/Universal Accessibility) becomes possible.
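As a rough illustration of why semantics make navigation easier, the sketch below builds an outline from heading tags alone. It works on HTML, using Python's standard `html.parser` module; reading the equivalent structure from a tagged PDF would need a PDF library and is not shown here. The class name, function name and sample markup are assumptions made up for this example.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingOutline(HTMLParser):
    """Build a list of (level, text) pairs from heading tags."""

    def __init__(self):
        super().__init__()
        self.outline = []   # collected (level, text) pairs
        self._level = None  # level of the heading being read, if any
        self._parts = []    # text fragments of the current heading

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self._level = int(tag[1])
            self._parts = []

    def handle_data(self, data):
        # Only keep text that sits inside a heading.
        if self._level is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS and self._level is not None:
            # Collapse internal whitespace before storing the heading text.
            text = " ".join("".join(self._parts).split())
            self.outline.append((self._level, text))
            self._level = None


def extract_outline(html):
    """Return the heading outline of an HTML fragment."""
    parser = HeadingOutline()
    parser.feed(html)
    parser.close()
    return parser.outline


# Hypothetical sample markup, invented for this illustration.
SAMPLE_PAGE = """
<h1>Talking PDF</h1>
<p>An introduction that a reader may want to skip.</p>
<h2>Objects and Semantics</h2>
<p>Body text that is not part of the outline.</p>
<h2>Learn more</h2>
"""

outline = extract_outline(SAMPLE_PAGE)
for level, text in outline:
    print("  " * (level - 1) + text)
```

The same text with every heading marked up as a plain paragraph would yield an empty outline, which is roughly the situation assistive technology faces with untagged content.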

Learn more...

Appligent Document Solutions provides PDF tagging services and educational and training resources in the Section 508 Center for PDF.

