ByteScout Premium Suite – VBScript – OCR Analyser for PDF with PDF Extractor SDK

OCR analyser for PDF with PDF Extractor SDK in VBScript with ByteScout Premium Suite

Build an OCR analyser for PDF with PDF Extractor SDK in VBScript

Step-by-step instructions on how to build an OCR analyser for PDF with PDF Extractor SDK in VBScript

This page provides a VBScript code sample that shows how to run the OCR analyser on a PDF with PDF Extractor SDK, which is part of ByteScout Premium Suite. ByteScout Premium Suite is a set of 12 SDK products from ByteScout, including tools and components for PDF, barcodes, spreadsheets, and screen video recording.

The VBScript code below, together with the ByteScout Premium Suite APIs, is a quick way to learn how to perform OCR analysis on a PDF with PDF Extractor SDK. Follow the step-by-step instructions and copy and paste the VBScript code into your editor. Application development involves several stages, so even if the code works as is, test it with your own data and in your production environment.

A free trial version of ByteScout Premium Suite is available on our website. It includes all of these source code samples to help you with your VBScript implementation.

On-demand (REST Web API) version:
Web API (on-demand version)

On-premise offline SDK for Windows:
60 Day Free Trial (on-premise)

PerformOcrAnalysisAndExtractText.vbs

' This example demonstrates the use of OCR Analyser - a tooling class for analysis of scanned documents
' in PDF or raster image formats to find the best parameters for Optical Character Recognition (OCR),
' i.e. those that provide the highest recognition quality.


' Input document
inputDocument = "..\..\sample_ocr.pdf"

' Document page index
pageIndex = 0

' Location of OCR language data files
ocrLanguageDataFolder = "c:\Program Files\Bytescout PDF Extractor SDK\ocrdata_best\"

' OCR language
ocrLanguage = "eng" ' "eng" for English, "deu" for German, "fra" for French, "spa" for Spanish, etc., according to the files in the "ocrdata" folder
' Find more language files at https://github.com/bytescout/ocrdata


' Create OCRAnalyzer object and activate it with your registration information
Set ocrAnalyzer = CreateObject("Bytescout.PDFExtractor.OCRAnalyzer")
ocrAnalyzer.RegistrationName = "demo"
ocrAnalyzer.RegistrationKey = "demo"

' Load document to OCRAnalyzer
ocrAnalyzer.LoadDocumentFromFile(inputDocument)

' Setup OCRAnalyzer
ocrAnalyzer.OCRLanguage = ocrLanguage
ocrAnalyzer.OCRLanguageDataFolder = ocrLanguageDataFolder

WScript.Echo "Starting the OCR analysis. Click OK and wait; this may take a while."

' Perform analysis and get results
Set analysisResults = ocrAnalyzer.AnalyzeByOCRConfidence(pageIndex)


' Now extract page text using detected OCR parameters

outputDocument = ".\result.txt"

' Create TextExtractor object
Set textExtractor = CreateObject("Bytescout.PDFExtractor.TextExtractor")
textExtractor.RegistrationName = "demo"
textExtractor.RegistrationKey = "demo"

' Load document to TextExtractor
textExtractor.LoadDocumentFromFile(inputDocument)

' Setup TextExtractor
textExtractor.OCRMode = 1 ' OCRMode.Auto
textExtractor.OCRLanguageDataFolder = ocrLanguageDataFolder
textExtractor.OCRLanguage = ocrLanguage

' Apply analysis results to the TextExtractor instance
' (the parentheses around each argument pass it by value)
ocrAnalyzer.ApplyResults (analysisResults), (textExtractor)

' Save extracted text to file
textExtractor.SaveTextToFile(outputDocument)


WScript.Echo "Extracted text saved to " & outputDocument


Set textExtractor = Nothing
Set ocrAnalyzer = Nothing
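
Because the sample depends on a relative input path and an installed language data folder, it can help to check both before starting the long-running analysis. The following is a minimal sketch of such a pre-flight check, not part of the ByteScout sample. It uses only the standard Scripting.FileSystemObject. The ".traineddata" file name pattern is an assumption about how the language files are named, so adjust it to match the files in your OCR data folder.

```vbscript
' Pre-flight check (sketch): verify that the inputs exist before running OCR analysis.
' The paths and language code are the same values used in the sample above.
Dim fso, inputDocument, ocrLanguageDataFolder, ocrLanguage, languageFile

inputDocument = "..\..\sample_ocr.pdf"
ocrLanguageDataFolder = "c:\Program Files\Bytescout PDF Extractor SDK\ocrdata_best\"
ocrLanguage = "eng"

Set fso = CreateObject("Scripting.FileSystemObject")

' The relative path is resolved against the current working directory,
' so report the absolute path to make a wrong location easy to spot.
If Not fso.FileExists(inputDocument) Then
    WScript.Echo "Input document not found: " & fso.GetAbsolutePathName(inputDocument)
    WScript.Quit 1
End If

If Not fso.FolderExists(ocrLanguageDataFolder) Then
    WScript.Echo "OCR language data folder not found: " & ocrLanguageDataFolder
    WScript.Quit 1
End If

' ASSUMPTION: language files are named "<language>.traineddata".
languageFile = fso.BuildPath(ocrLanguageDataFolder, ocrLanguage & ".traineddata")
If Not fso.FileExists(languageFile) Then
    WScript.Echo "Language data file not found: " & languageFile
    WScript.Quit 1
End If

WScript.Echo "Input document and OCR language data were found."

Set fso = Nothing
```

Run it with cscript.exe from the same folder as PerformOcrAnalysisAndExtractText.vbs so the relative path resolves the same way. The script exits with code 1 on the first missing item.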