ByteScout Data Extraction Suite – VB.NET – Index PDF files with PDF Extractor SDK


How to index PDF files with PDF Extractor SDK in VB.NET using ByteScout Data Extraction Suite

Learn to index PDF files with PDF Extractor SDK in VB.NET with this step-by-step tutorial

This guide shows how to index PDF files with PDF Extractor SDK in VB.NET, using a ready-made source code sample. ByteScout Data Extraction Suite is a bundle of three SDKs for extracting data from PDFs, scans, images, and spreadsheets: PDF Extractor SDK, Data Extraction SDK, and Barcode Reader SDK.

The code sample below lists every PDF file in a folder and prints each file's metadata (page count, author, title, producer, subject, and creation date) followed by the first two lines of text from its first page. Copy the VB.NET code into your application and adjust the folder path to suit your project.

A free trial version of ByteScout Data Extraction Suite is available on our website. It includes documentation and source code samples.

On-demand version: REST Web API
On-premise offline SDK for Windows: 60-day free trial

Program.vb

Imports System
Imports System.IO
Imports Bytescout.PDFExtractor

Class Program
    Friend Shared Sub Main(ByVal args As String())

        ' Create Bytescout.PDFExtractor.InfoExtractor instance
        Dim infoExtractor As New InfoExtractor()
        infoExtractor.RegistrationName = "demo"
        infoExtractor.RegistrationKey = "demo"

        ' Create Bytescout.PDFExtractor.TextExtractor instance
        Dim textExtractor As New TextExtractor()
        textExtractor.RegistrationName = "demo"
        textExtractor.RegistrationKey = "demo"

        Try
            ' Index every PDF file in the folder (the path is relative to the working directory)
            For Each file As String In Directory.GetFiles("..\..\..\..", "*.pdf")
                ' Read the document metadata
                infoExtractor.LoadDocumentFromFile(file)

                Console.WriteLine("File Name:      " & Path.GetFileName(file))
                Console.WriteLine("Page Count:     " & infoExtractor.GetPageCount())
                Console.WriteLine("Author:         " & infoExtractor.Author)
                Console.WriteLine("Title:          " & infoExtractor.Title)
                Console.WriteLine("Producer:       " & infoExtractor.Producer)
                Console.WriteLine("Subject:        " & infoExtractor.Subject)
                Console.WriteLine("CreationDate:   " & infoExtractor.CreationDate)
                Console.WriteLine("Text (2 lines): ")

                ' Print the first two lines of text from the first page (index 0)
                textExtractor.LoadDocumentFromFile(file)
                Using stringReader As New StringReader(textExtractor.GetTextFromPage(0))
                    Console.WriteLine(stringReader.ReadLine())
                    Console.WriteLine(stringReader.ReadLine())
                End Using
                Console.WriteLine()
            Next
        Finally
            ' Cleanup: release the extractors even if a file fails to load
            infoExtractor.Dispose()
            textExtractor.Dispose()
        End Try

        Console.WriteLine()
        Console.WriteLine("Press any key to continue...")
        Console.ReadKey()
    End Sub
End Class
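
The sample above always prints two lines from the first page, even if they are blank or the page holds fewer than two lines. The following is a minimal sketch of a preview helper, assuming you would rather show up to N non-empty lines. `PreviewDemo` and `FirstLines` are hypothetical names and not part of PDF Extractor SDK; the helper uses only standard .NET classes, and the sample string stands in for the text returned by `GetTextFromPage`.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO

Module PreviewDemo
    ' Hypothetical helper (not an SDK call): returns up to maxLines
    ' non-empty lines from the given text, in their original order.
    Function FirstLines(ByVal text As String, ByVal maxLines As Integer) As List(Of String)
        Dim result As New List(Of String)()
        If String.IsNullOrEmpty(text) Then Return result

        Using reader As New StringReader(text)
            Dim line As String = reader.ReadLine()
            ' ReadLine returns Nothing once the end of the text is reached
            While line IsNot Nothing AndAlso result.Count < maxLines
                If line.Trim().Length > 0 Then result.Add(line)
                line = reader.ReadLine()
            End While
        End Using

        Return result
    End Function

    Sub Main()
        ' Stand-in for page text; in the sample above this would come from
        ' textExtractor.GetTextFromPage(0)
        Dim sample As String = vbCrLf & "Invoice 42" & vbCrLf & vbCrLf & "Total: 10" & vbCrLf & "Footer"

        ' Prints "Invoice 42" and then "Total: 10"
        For Each line As String In FirstLines(sample, 2)
            Console.WriteLine(line)
        Next
    End Sub
End Module
```

Inside the indexing loop, the two `Console.WriteLine(stringReader.ReadLine())` calls could be replaced by a loop over `FirstLines(textExtractor.GetTextFromPage(0), 2)`, so that leading blank lines do not take up the preview.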