Manual Reference Pages  -  DJVUTXT (1)

NAME

djvutxt - Extract the hidden text from DjVu documents.

CONTENTS

Synopsis
Description
Options
Remarks
Credits
See Also

SYNOPSIS

djvutxt [options] inputdjvufile [outputtxtfile]
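
The synopsis can be illustrated with a small Python sketch that assembles a djvutxt command line from its parts. The helper name build_djvutxt_command and its parameters are hypothetical and exist only for this illustration; the flags are the ones listed under OPTIONS on this page. The function only builds the argument list and does not run the program.

```python
from typing import List, Optional


def build_djvutxt_command(
    input_file: str,
    output_file: Optional[str] = None,
    pages: Optional[str] = None,
    detail: Optional[str] = None,
    escape: bool = False,
) -> List[str]:
    """Build an argument list following: djvutxt [options] inputdjvufile [outputtxtfile].

    Hypothetical helper for illustration; it is not part of DjVuLibre.
    """
    cmd = ["djvutxt"]
    if pages:
        cmd.append(f"--page={pages}")      # page specification, e.g. "1-10"
    if detail:
        cmd.append(f"--detail={detail}")   # e.g. "page", "line", "word", "char"
    if escape:
        cmd.append("--escape")
    cmd.append(input_file)                 # the input file is required
    if output_file:
        cmd.append(output_file)            # omitted: text goes to standard output
    return cmd
```

The resulting list could be passed to subprocess.run on a system where djvutxt is installed.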

DESCRIPTION

Program djvutxt decodes the hidden text layer of the DjVu document inputdjvufile and writes it to the file outputtxtfile or, when no output file is given, to the standard output. The hidden text layer is usually generated with the help of optical character recognition (OCR) software.

Without options --detail and --escape, this program simply outputs the UTF-8 text. Option --detail causes the output of S-expressions describing the text and its location. Option --escape uses C-style escape sequences to represent nonprintable or non-ASCII characters.
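
As an illustration of what escaped output looks like to a consumer, the following Python sketch converts escaped text back to plain UTF-8. It assumes that each escape is a backslash followed by three octal digits and stands for a single byte of the UTF-8 encoding; that is an assumption of this sketch, not something this page states. The function name decode_escaped is hypothetical.

```python
import re

# Assumed escape shape: a backslash followed by three octal digits (one byte).
_OCTAL_ESCAPE = re.compile(rb"\\([0-3][0-7]{2})")


def decode_escaped(text: str) -> str:
    """Turn ASCII text containing octal escapes back into a UTF-8 string.

    Hypothetical helper; assumes one escape per byte of the UTF-8 encoding.
    """
    raw = text.encode("ascii")
    # Replace every escape with the byte whose octal value it spells out.
    unescaped = _OCTAL_ESCAPE.sub(lambda m: bytes([int(m.group(1), 8)]), raw)
    return unescaped.decode("utf-8")
```

Because the backslash itself is also escaped, a decoded backslash is never mistaken for the start of another escape: the substitution scans the input once, from left to right.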

OPTIONS

--page=pagespec
  Specify which pages should be processed. When this option is not specified, the text of all pages of the document is concatenated into the output file. The page specification pagespec contains one or more comma-separated page ranges. A page range is either a page number, or two page numbers separated by a dash. For instance, specification 1-10 outputs pages 1 to 10, and specification 1,3,99999-4 outputs pages 1 and 3, followed by all the document pages in reverse order up to page 4.
--detail=keyword
  This option causes djvutxt to output S-expressions specifying the position of the text in the page. See the manual page djvused(1) for a description of the output format. Argument keyword specifies the maximum level of detail for which text location is reported. The recognized values are: page, column, region, para, line, word, and char. All other values are interpreted as char.
--escape
  Output escape sequences of the form \ooo for all non-ASCII or nonprintable UTF-8 characters and for the backslash character.
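
The page specification rules described for --page can be modeled with a short Python sketch. It is an illustrative model, not the parser used by djvutxt: the function name expand_pagespec is hypothetical, and the sketch assumes that page numbers beyond the end of the document are clamped to the last page, which is consistent with the 99999-4 example above.

```python
from typing import List


def expand_pagespec(pagespec: str, page_count: int) -> List[int]:
    """Expand a spec such as "1,3,99999-4" into an ordered list of page numbers.

    Hypothetical model of the documented rules; out-of-range page numbers
    are clamped to the range 1..page_count (an assumption of this sketch).
    """
    pages: List[int] = []
    for part in pagespec.split(","):
        first, dash, last = part.strip().partition("-")
        start = min(max(int(first), 1), page_count)
        # A lone page number is a range of length one.
        end = min(max(int(last), 1), page_count) if dash else start
        # A range whose first page is larger than its last runs in reverse.
        step = 1 if start <= end else -1
        pages.extend(range(start, end + step, step))
    return pages
```

For a six-page document, the specification 1,3,99999-4 expands to pages 1 and 3, followed by pages 6, 5 and 4.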

REMARKS

Use program djvused(1) for more control over the text layer.

CREDITS

This program was initially written by Andrei Erofeev <andrew_erofeev@yahoo.com> and was then improved by Bill Riemers <docbill@sourceforge.net> and many others. It was then rewritten to use the ddjvuapi by Leon Bottou <leonb@sourceforge.net>.

SEE ALSO

djvu(1), djvused(1)



DjVuLibre-3.5 DJVUTXT (1) 10/11/2001
