ByteScout Data Extraction Suite – VB.NET – Check if OCR is required for a PDF with PDF Extractor SDK

Check if OCR is required for a PDF with PDF Extractor SDK in VB.NET

This page provides a VB.NET code sample that checks whether a PDF needs OCR before its text is extracted, using PDF Extractor SDK from ByteScout Data Extraction Suite. The suite bundles three SDKs for extracting data from PDFs, scans, images, and spreadsheets: PDF Extractor SDK, Data Extraction SDK, and Barcode Reader SDK.

To try this functionality, copy the code below into your application with a code editor, then compile and run it. The sample goes through each file in the InputFiles folder, asks the SDK whether OCR is recommended for the first page of the document, enables OCR if it is, and prints the extracted text. Up-to-date, detailed documentation and tutorials are installed along with ByteScout Data Extraction Suite if you'd like to learn more about the topic and the details of the API.

A free trial version of ByteScout Data Extraction Suite is available on our website. It includes source code samples to help you with your VB.NET application.

Program.vb

Imports System
Imports Bytescout.PDFExtractor

Module Program

    Sub Main()

        Try

            ' Loop through all PDF files in the InputFiles directory and check whether OCR is required.
            ' The "*.pdf" filter keeps non-PDF files from being passed to the extractor.
            For Each filePath As String In System.IO.Directory.GetFiles("InputFiles", "*.pdf")
                _CheckOCRRequired(filePath)
            Next

        Catch ex As Exception
            Console.WriteLine("Error: " & ex.Message)
        End Try

        Console.WriteLine("Press enter key to exit...")
        Console.ReadLine()

    End Sub

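    ''' <summary>
    ''' Checks whether OCR is recommended for the first page of a PDF,
    ''' enables OCR if it is, and prints the extracted text.
    ''' </summary>
    ''' <param name="filePath">Path to the PDF file to check.</param>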
    Private Sub _CheckOCRRequired(ByVal filePath As String)

        ' Create the text extractor (released automatically at End Using)
        Using extractor As TextExtractor = New TextExtractor()

            extractor.RegistrationKey = "demo"
            extractor.RegistrationName = "demo"

            ' Load document
            extractor.LoadDocumentFromFile(filePath)
            Console.WriteLine("{1}*******************{1}{1}FilePath: {0}", filePath, vbLf)

            ' Page indexes are zero-based, so this sample checks only the first page
            Dim pageIndex As Integer = 0

            ' Ask the SDK whether OCR is recommended for this page
            If extractor.IsOCRRecommendedForPage(pageIndex) Then

                Console.WriteLine("{0}OCR Recommended: True", vbLf)

                ' Enable Optical Character Recognition (OCR)
                ' in Auto mode (the SDK decides automatically whether OCR is needed)
                extractor.OCRMode = OCRMode.Auto

                ' Set the location of OCR language data files
                extractor.OCRLanguageDataFolder = "c:\Program Files\Bytescout PDF Extractor SDK\ocrdata\"

                ' Set OCR language: "eng" for English, "deu" for German, "fra" for French,
                ' "spa" for Spanish, etc., according to the files in the "ocrdata" folder.
                ' Find more language files at https://github.com/bytescout/ocrdata
                extractor.OCRLanguage = "eng"

                ' Set PDF document rendering resolution
                extractor.OCRResolution = 300

            Else
                Console.WriteLine("{0}OCR Recommended: False", vbLf)
            End If

            ' Read all text
            Dim allExtractedText = extractor.GetText()
            Console.WriteLine("{1}Extracted Text:{1}{0}{1}{1}", allExtractedText, vbLf)

        End Using

    End Sub

End Module
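
The sample above checks only the first page (pageIndex = 0). A document that mixes text pages with scanned pages may need a check on every page. The following is a minimal sketch of that idea, not part of the original sample: the module name OcrCheck and the helper PagesNeedingOcr are hypothetical, and the per-page check is passed in as a delegate so the sketch compiles and runs without the SDK installed.

```vbnet
Imports System
Imports System.Collections.Generic

Module OcrCheck

    ''' <summary>
    ''' Returns the zero-based indexes of the pages for which the supplied
    ''' check reports that OCR is recommended. Hypothetical helper, not an SDK method.
    ''' </summary>
    Function PagesNeedingOcr(ByVal pageCount As Integer,
                             ByVal isOcrRecommended As Func(Of Integer, Boolean)) As List(Of Integer)
        Dim pages As New List(Of Integer)()
        For pageIndex As Integer = 0 To pageCount - 1
            If isOcrRecommended(pageIndex) Then
                pages.Add(pageIndex)
            End If
        Next
        Return pages
    End Function

    Sub Main()
        ' Stand-in check for demonstration only: pretend pages 1 and 3 are scanned images.
        Dim scannedPages As New HashSet(Of Integer)(New Integer() {1, 3})
        Dim pages As List(Of Integer) = PagesNeedingOcr(4, Function(i) scannedPages.Contains(i))

        For Each pageIndex As Integer In pages
            Console.WriteLine("OCR recommended for page index {0}", pageIndex)
        Next
    End Sub

End Module
```

With the SDK, the stand-in check would presumably be replaced by AddressOf extractor.IsOCRRecommendedForPage, and the page count by a call such as extractor.GetPageCount(). The latter method is an assumption here, so verify it against the documentation installed with the SDK. If the returned list is non-empty, the OCR settings from the sample above can be applied before calling GetText().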