REVIEW: Acrobat Capture 3.0x: Adobe Fires Back, Nearly Misses

Thursday, August 24, 2000

by Duff Johnson

From its start in 1994, Acrobat Capture 1.01 was a visionary product. In those earliest days of the commercial Internet, the decision to develop Acrobat Capture offered the promise of unifying paper and electronic source materials in a single, compact, standardized, multi-platform environment. This lofty aim helped propel and fulfill the PDF concept, making possible the modern reality: PDF as the globally accepted "full-fidelity" electronic document format.

The only document-imaging product in the Adobe lineup, Capture is aimed at the "production" volume imaging market. Unique among OCR software developers, Adobe recognized that fully automated processes and text-accuracy-only correction sub-systems would leave conversions to PDF/Normal files -- i.e., files made from original electronic sources -- hopelessly out of reach.

The original model for Capture was a faux client-server arrangement: a central processing core for automated functions and a workstation application for value-added (text accuracy, page layout and image enhancement) work. Capture 3.0x expands on this premise with beefed-up automated processing and a true client-server arrangement for the value-added labor.

Gone is the simple "flowchart" dialog. Capture 3.0x sports a busy new interface (screengrab 1) with a Windows Explorer-esque workflow builder mated to a raft of processing information. There is a new Zone tool, simple and reasonably functional, and a superb new QuickFix text correction tool. Capture Reviewer got a major upgrade as well, although the changes are subtle compared to the overall application.

The Engine

One impressive new feature in Capture 3.0x is true load-balanced distributed processing for the OCR and other automated engines. Add a processor to the Clusters, and watch throughput go up as other processors come on-line. It's stable too, with far fewer crashes than the 2.01 software.

Exception handling has been beefed up, but is still weak compared to full-fledged volume imaging systems such as Input Accel and Ascent Capture. Capture includes a scanner-control front-end, but while aimed at the volume scanning market, the software is still best used to process existing scanned, enhanced, rotated and quality-controlled images.

A moving target throughout the development process, OCR in the Capture 3.0 engine promised a substantial improvement over the aged engine in Capture 2.01, although we have yet to see this promise fulfilled in the beta 3.0x code. Tuning issues with the 3.01 engine aside, the new Capture should be substantially more accurate overall, particularly on clean documents and small, clear type.

The OCR issue we encounter more than any other, however, is a frequent refusal to capture individual text blocks, particularly those located near images. This problem resulted in a very serious flaw in the 3.0 release - large text areas going without OCR at all. Adding a template zone stage to the workflow helps, but does not fully correct the problem. We chose not to use the 3.0 code for Searchable Image production. An additional concern is the font recognition in the new engine, which we evaluate (so far) as worse than Capture 2.01 in terms of font size and spacing.

New Tools: The Zone Tool

Long-awaited, the new Zone tool (screengrab 2) does what every Zone tool does, and not much more, which is a pity in a PDF-oriented product like Capture. After all, we're not just talking OCR here! We would like to see the ability to control the output of each image zone, as is offered with PageGenie, not to mention a capable table-recognition tool, as in Scansoft's powerful TextBridge software. Additionally, the engine tends to crash on pages with a large number of zones - as always, your mileage may vary!

It frequently seems necessary to include a full-page "text zone" step in each workflow to force the OCR on bitmap regions that appear as images to the software. Missing this step can result in a hard-to-detect failure to OCR in Searchable Image pages - another problem Adobe says they will attempt to correct in the release version of Capture 3.01. We still get mixed results on selected pages, and I suspect that this type of failure to OCR will remain a significant annoyance, as it is with most OCR software.

New Tools: QuickFix

This new tool has a unique interface that brings rudimentary sorting to the process of OCR error correction. The concept allows for some pretty sophisticated text-correction routines. QuickFix (screengrab 3) lets you sort suspects, so you can focus correction on terms to be added to a dictionary, or on alphanumeric words of low confidence. It's primarily useful for Searchable Image text correction.
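As a rough illustration of the sorting idea (not Adobe's implementation, which we have not seen), the sketch below ranks hypothetical OCR suspects so that low-confidence alphanumeric words surface first, and separately collects recurring suspects as dictionary candidates. The `Suspect` record, its fields, the confidence scale and the thresholds are all assumptions made for this example.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Suspect:
    """Hypothetical record for one word the OCR engine flagged as doubtful."""
    text: str
    confidence: float  # assumed scale: 0.0 (no confidence) to 1.0 (certain)
    page: int


def is_alphanumeric_mix(word: str) -> bool:
    """True when a word mixes letters and digits, e.g. 'Acr0bat'."""
    return any(c.isalpha() for c in word) and any(c.isdigit() for c in word)


def low_confidence_alphanumerics(suspects, threshold=0.5):
    """Mixed letter/digit suspects below the threshold, least confident first."""
    picked = [s for s in suspects
              if is_alphanumeric_mix(s.text) and s.confidence < threshold]
    return sorted(picked, key=lambda s: s.confidence)


def dictionary_candidates(suspects, min_count=2):
    """Suspect spellings that recur often enough to consider for a dictionary."""
    counts = Counter(s.text for s in suspects)
    return [word for word, n in counts.most_common() if n >= min_count]


# Made-up sample data, for illustration only.
suspects = [
    Suspect("Acr0bat", 0.31, 1),
    Suspect("PageGenie", 0.62, 1),
    Suspect("PageGenie", 0.58, 2),
    Suspect("3O1", 0.12, 2),
    Suspect("tbe", 0.40, 3),
]

for s in low_confidence_alphanumerics(suspects):
    print(f"page {s.page}: {s.text} ({s.confidence:.2f})")
print("dictionary candidates:", dictionary_candidates(suspects))
```

Sorting this way lets an operator clear the likeliest character-substitution errors in one pass, while a recurring proper noun is handled once by adding it to the dictionary.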

Improved Tools: Capture Reviewer 3.0

Already a reasonably mature tool in 2.01, Reviewer 3.01 (screengrab 4) is the last word (so far) in layout and appearance management for PDF/Formatted Text and Graphics conversions. The main irritation is that screen repaints are slower, which hurts productivity.

Valuable new features include:

  • The ability to set text styles, although they apply only to the current document; style sets can't be saved.
  • A new align objects tool, which is VERY useful.
  • Efficiency enhancements in several tools. In order to insert a new line of text in 2.01, you had to insert a text box, then a text line, and then edit the text with the text tool. In 3.01, you simply draw a line with the text tool and add your text.
  • Table lines are easier to draw with the improved rectangle and line tools, and the lines are handled consistently in the PDF output.
  • You can now fill document information fields in the final PDF from within Reviewer.
  • You can insert more image types as pictures than in 2.01.
  • You can now add Web links and links that go to other pages in the same document. However, you can't edit link options. All links will be invisible, inverted, and go to full page view.

The Overall Experience

I saved this for last, because the reader should not be distracted from the many fantastic improvements to this ground-breaking third-generation software.

HOWEVER! Out of the box, we found Capture 3.0 virtually unusable. From botched OCR (a problem mostly cured with a forced Zone operation) to basic image quality in skew-adjusted images, we could not put Capture 3.0 in production at all. Our extensive testing of the forthcoming 3.01 code promises much better things to come.

A major advantage of Capture 3.0x is file management. The workflow design automatically moves your files from one step to the next. This saves a lot of the administrative overhead for high-volume production compared with Capture 2.01.

One sour note: As of the last beta version, Capture 3.0x is unusable for color output. Images are highly compressed and, in PDF/Formatted Text & Graphics output, there is gray shading surrounding text and images. We certainly hope Adobe addresses this issue, because we won't be Capturing in color unless they do!

Conclusion

Are there alternatives to Capture for converting paper to PDF? In a word, yes. PageGenie is capable of producing fine Searchable Image PDF files, and can optionally use PDFWriter to access the Adobe Libraries, if required for the project. Most other alternatives offer far better OCR than Capture 2.01, and most are faster and more stable as well.

The key advantages of Adobe's new Capture product lie in its scalability, stability and the degree to which it will reward the sophisticated user. Just as important is the fact that Adobe's Capture remains the only product with a post-OCR text/layout/images editor (Capture Reviewer) that makes true layout reproduction possible.
For high-volume processing or PDF/Formatted Text & Graphics conversions, you may come to appreciate this software.

Competitors

Since the mid-1997 release of Acrobat Capture 2.0 (and the must-have 2.01 maintenance update), options for paper-to-PDF conversion have multiplied like mushrooms.

Some other currently available applications include:

FineReader, by ABBYY Software House
Awesome OCR from Russia, an advanced desktop OCR engine with PDF/Searchable Image capability.

PageGenie, by Paravision Imaging, Inc.
Powerful and flexible, this station-based software is a strong competitor to Acrobat Capture.

Prime OCR, by Prime Recognition, Inc.
Another pricey client-server model, with an extraordinarily accurate voting OCR system.

TextBridge, by Scansoft, Inc.
Scansoft has released a new version since this review. (I reviewed this software earlier for Planet PDF.)

TypeReader, by ExperVision, Inc.
A basic OCR engine with the ability to write out PDF files.

This list is not intended to be comprehensive. Several full-scale imaging software solutions that include PDF creation are:

Resources from Document Solutions, Inc

  • "Paper to PDF" [PDF: 1.4 MB] explains document conversion basics.

Originally posted on planetpdf.com


