PdfToTextConverter Class

The PDF to Text converter class
Inheritance Hierarchy
System.Object
  ExpertPdf.PdfToTextConverter

Namespace:  ExpertPdf
Assembly:  eppdftotext (in eppdftotext.dll) Version: 7.0
Syntax
public class PdfToTextConverter

The PdfToTextConverter type exposes the following members.

Constructors
  Name / Description
Public method: PdfToTextConverter
  PDF to Text converter constructor.
Properties
  Name / Description
Public property: AddHtmlMetaTags
  When this property is true, the extracted text is wrapped in an HTML document whose meta information is taken from the PDF document description. The default value is false.
Public property: ClipText
  When enabled, hidden text in the PDF document is not returned.
Public property: DocumentInformation
  PDF document information. This property is populated after the requested operation (ConvertToText) has finished.
Public property: EndPageNumber
  The page number where the text extraction ends. The default value is 0, which means that text is extracted from the StartPageNumber page through the end of the PDF document.
Public property: HtmlCharset
  The charset meta tag added to the generated HTML document. This property has an effect only when the AddHtmlMetaTags property is true. The default value is UTF-8.
Public property: Layout
  Gets or sets the layout of the output text. The default value is OriginalLayout.
Public property: LicenseKey
  The license key string received after purchasing the product, or a demo license key for the demo version of the converter. This property must be set before calling the text extraction method. When this property is null, the converter runs in demo mode.
Public property: MarkPageBreaks
  When this property is true, a special character defined by the PAGE_BREAK_MARK property is inserted after the text extracted from each page. The default value is false.
Public static property: PAGE_BREAK_MARK
  Gets the page break mark character used when the MarkPageBreaks property is true.
Public property: PdfToolsFullPath
  Gets or sets the full path (including the file name) of the PDF tools engine helper file.
Public property: PdfToolsTimeout
  Timeout, in seconds, for the current operation. The default value is 600 seconds.
Public property: StartPageNumber
  The page number where the text extraction starts. The default value is 1, which means that the text extraction starts from the first page.
Public property: UserPassword
  The ASCII password used to open the PDF document for reading. The default value is null, which means that no password is used to open the PDF document.
Public property: xDf
  Internal use only.
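
The properties above are typically set before calling the extraction method. The following is a minimal sketch, not an official sample: it assumes a parameterless constructor, that StartPageNumber and EndPageNumber are integers, that MarkPageBreaks is a Boolean, and that ConvertToText(String) returns the extracted text as a string. The path "input.pdf" is a placeholder.

```csharp
using System;
using ExpertPdf; // namespace as listed on this page

class ConvertExample
{
    static void Main()
    {
        // Assumption: the constructor takes no arguments.
        var converter = new PdfToTextConverter();

        // Per the LicenseKey description, null means demo mode.
        converter.LicenseKey = null;

        // Restrict extraction to pages 2 through 5 (page numbers start at 1).
        converter.StartPageNumber = 2;
        converter.EndPageNumber = 5;

        // Ask the converter to insert PAGE_BREAK_MARK after each page's text.
        converter.MarkPageBreaks = true;

        // "input.pdf" is a placeholder file path.
        string text = converter.ConvertToText("input.pdf");

        // ToString() is used so the split works whether the mark is
        // exposed as a char or as a string (its type is not stated here).
        string mark = PdfToTextConverter.PAGE_BREAK_MARK.ToString();
        string[] chunks = text.Split(new[] { mark }, StringSplitOptions.None);

        Console.WriteLine("Extracted {0} chunk(s) of text.", chunks.Length);
    }
}
```

Because the mark is inserted after the text of each page, the last chunk produced by the split may be empty.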
Methods
  Name / Description
Public method: ConvertToText(Stream)
  Converts a PDF document stream to a string.
Public method: ConvertToText(String)
  Converts a PDF file to a string.
Public method: ExtractTextAtCoords(Stream, Int32, Double, Double, Double, Double)
  Extracts the text found at the specified coordinates on the specified page of a PDF document stream.
Public method: ExtractTextAtCoords(String, Int32, Double, Double, Double, Double)
  Extracts the text found at the specified coordinates on the specified page of a PDF file.
Public method: ExtractTextPositions(Stream, String, Boolean, Boolean)
  Searches for the given text in a PDF document supplied as a Stream.
Public method: ExtractTextPositions(String, String, Boolean, Boolean)
  Searches for the given text in a PDF file.
Public method: GetPageCount(Stream)
  Gets the number of pages in the specified PDF stream.
Public method: GetPageCount(String)
  Gets the number of pages in the specified PDF file.
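
As an illustration of how GetPageCount and the Stream overload of ConvertToText might be combined, the sketch below extracts a document one page at a time. It rests on several assumptions not confirmed by this page: that GetPageCount returns an int, that ConvertToText returns a string, and that setting StartPageNumber and EndPageNumber to the same value extracts exactly that page. The path "input.pdf" is a placeholder.

```csharp
using System;
using System.IO;
using ExpertPdf; // namespace as listed on this page

class PerPageExample
{
    static void Main()
    {
        // Assumption: the constructor takes no arguments.
        var converter = new PdfToTextConverter();

        // Assumption: GetPageCount returns the page count as an int.
        int pageCount = converter.GetPageCount("input.pdf");

        for (int page = 1; page <= pageCount; page++)
        {
            // Assumption: equal start and end page numbers select one page.
            converter.StartPageNumber = page;
            converter.EndPageNumber = page;

            // Reopen the file for each call so nothing depends on where a
            // previous call left the stream position.
            using (FileStream pdf = File.OpenRead("input.pdf"))
            {
                string text = converter.ConvertToText(pdf);
                Console.WriteLine("Page {0}: {1} characters", page, text.Length);
            }
        }
    }
}
```

Reopening the file on every iteration is the conservative choice here; whether a single stream can be reused across calls is not stated on this page.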
See Also