ByteScout Data Extraction Suite – VB.NET – Convert PDF to XML with images using PDF Extractor SDK

How to convert PDF to XML with images in VB.NET using PDF Extractor SDK from ByteScout Data Extraction Suite

This tutorial shows how to convert a PDF document to XML with images in VB.NET using PDF Extractor SDK.

ByteScout Data Extraction Suite is a bundle of three SDKs for extracting data from PDFs, scans, images, and spreadsheets: PDF Extractor SDK, Data Extraction SDK, and Barcode Reader SDK. Its PDF Extractor SDK can convert a PDF to XML with images and can be called from VB.NET. We provide thousands of ready-made source code samples for use in your own projects.

The code sample below is meant for quickly adding PDF-to-XML conversion with images to a VB.NET application. It loads sample1.pdf and saves the XML twice: once with images written as PNG files to an external images folder, and once with images embedded in the XML as Base64-encoded strings. Copy the code into your VB.NET application, adjust the file names, and check whether the sample meets the requirements of your project.

To try other VB.NET source code samples, download the free trial version of ByteScout Data Extraction Suite from our website.

Available versions: an on-demand REST Web API, and an on-premise offline SDK for Windows with a 60-day free trial.

Module1.vb

Imports Bytescout.PDFExtractor

Namespace PDF2XML

    Class Program

        Shared Sub Main(ByVal args As String())

            ' Create Bytescout.PDFExtractor.XMLExtractor instance
            Dim extractor As New XMLExtractor()
            extractor.RegistrationName = "demo"
            extractor.RegistrationKey = "demo"

            ' Load sample PDF document
            extractor.LoadDocumentFromFile("sample1.pdf")

            ' Uncomment this line to get rid of empty nodes in XML
            'extractor.PreserveFormattingOnTextExtraction = False

            ' Set output image format
            extractor.ImageFormat = OutputImageFormat.PNG

            ' Save images to external files
            extractor.SaveImages = ImageHandling.OuterFile
            extractor.ImageFolder = "images" ' Folder for external images
            extractor.SaveXMLToFile("result_with_external_images.xml")

            ' Embed images into XML as Base64 encoded string
            extractor.SaveImages = ImageHandling.Embed
            extractor.SaveXMLToFile("result_with_embedded_images.xml")

            ' Cleanup
            extractor.Dispose()

        End Sub

    End Class

End Namespace
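
The second SaveXMLToFile call in the sample embeds images in the XML as Base64-encoded strings. Reading such an image back might look like the minimal sketch below, which uses only standard .NET classes. It assumes the images sit in elements named `image`; that element name, the `DecodeImage` helper, and the in-memory sample document are hypothetical and not taken from the SDK's output, so inspect result_with_embedded_images.xml for the actual structure before adapting it.

```vbnet
Imports System
Imports System.Xml.Linq

Module EmbeddedImageDemo

    ' Hypothetical helper: decodes a Base64 string into raw image bytes.
    Function DecodeImage(ByVal base64Text As String) As Byte()
        Return Convert.FromBase64String(base64Text.Trim())
    End Function

    Sub Main()
        ' Stand-in for an embedded image: the 8-byte PNG file signature.
        Dim signature As Byte() = {&H89, &H50, &H4E, &H47, &HD, &HA, &H1A, &HA}

        ' Hypothetical XML fragment built in memory.
        ' The element names the SDK really produces may differ.
        Dim doc As New XDocument(
            New XElement("document",
                New XElement("image", Convert.ToBase64String(signature))))

        ' "image" is an assumed element name; check the generated XML for the real one.
        For Each node As XElement In doc.Descendants("image")
            Dim bytes As Byte() = DecodeImage(node.Value)
            Console.WriteLine("Decoded {0} bytes", bytes.Length)
            ' To write the image to disk:
            ' System.IO.File.WriteAllBytes("image.png", bytes)
        Next
    End Sub

End Module
```

To run this against a real result file, replace the in-memory document with `XDocument.Load("result_with_embedded_images.xml")` and change the element name to match what the file contains.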