Page.ConvertToHtml Method (TextExtractionOptions, Boolean)

Apitron PDF Kit help
Apitron.PDF.Kit library for .NET
Converts the page to HTML by combining the content returned by ExtractText(TextExtractionOptions, Boolean) and ExtractDrawings(Resolution). The default resolution of 72 dpi is used.

Namespace:  Apitron.PDF.Kit.FixedLayout
Assembly:  Apitron.PDF.Kit (in Apitron.PDF.Kit.dll) Version: 2.0.37.0 (2.0.37.0)
Syntax

public string ConvertToHtml(
	TextExtractionOptions options = TextExtractionOptions.HtmlPage,
	bool processBiDiText = false
)

Parameters

options (Optional)
Type: Apitron.PDF.Kit.Extraction.TextExtractionOptions
Text extraction options. Only HtmlFragment and HtmlPage are allowed.
processBiDiText (Optional)
Type: System.Boolean
If true, the text extraction algorithm tries to detect and reorder bi-directional (BiDi) strings.

Return Value

Type: String
A string that contains the resulting HTML. All images are embedded using data URIs.
Examples

C#
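using System.IO;
using Apitron.PDF.Kit;
using Apitron.PDF.Kit.Extraction;
using Apitron.PDF.Kit.FixedLayout;

// Open the source PDF and convert every page to a standalone HTML file.
using (Stream stream = new FileStream("document.pdf", FileMode.Open))
{
    FixedDocument document = new FixedDocument(stream);
    for (int i = 0; i < document.Pages.Count; i++)
    {
        Page page = document.Pages[i];
        // Request a complete HTML page and enable bi-directional text processing.
        string str = page.ConvertToHtml(TextExtractionOptions.HtmlPage, true);

        // Write each page to its own file: page0.html, page1.html, ...
        FileStream fileStream = new FileInfo(string.Format("page{0}.html", i)).Create();
        using (StreamWriter writer = new StreamWriter(fileStream))
        {
            writer.Write(str);
        }
    }
}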
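
The options parameter also accepts HtmlFragment. The sketch below shows one way fragments from several pages might be combined into a single HTML document. HtmlFragmentJoiner and its Combine method are hypothetical helpers, not part of Apitron.PDF.Kit. The sketch assumes that HtmlFragment output is body-level markup without its own html, head, or body wrapper, which this page does not confirm.

```csharp
using System.Collections.Generic;
using System.Net;
using System.Text;

// Hypothetical helper (not part of Apitron.PDF.Kit): wraps per-page HTML
// fragments in one document. Assumes each fragment is body-level markup.
public static class HtmlFragmentJoiner
{
    public static string Combine(string title, IEnumerable<string> fragments)
    {
        var html = new StringBuilder();
        html.Append("<!DOCTYPE html>\n<html>\n<head>\n<meta charset=\"utf-8\">\n<title>");
        html.Append(WebUtility.HtmlEncode(title)); // escape &, <, > in the title
        html.Append("</title>\n</head>\n<body>\n");

        int pageNumber = 1;
        foreach (string fragment in fragments)
        {
            // One container per source page; fragments are inserted unescaped.
            html.Append("<div class=\"page\" id=\"page").Append(pageNumber).Append("\">\n");
            html.Append(fragment).Append("\n</div>\n");
            pageNumber++;
        }

        html.Append("</body>\n</html>\n");
        return html.ToString();
    }
}
```

Under the same assumption, the fragments could be collected by calling page.ConvertToHtml(TextExtractionOptions.HtmlFragment, false) for each page in document.Pages and passing the resulting strings to Combine.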