ByteScout Text Recognition SDK – PowerShell – Extract Text From Areas

How to extract text from areas in PowerShell with ByteScout Text Recognition SDK

This PowerShell sample shows how to extract text from selected areas of a page.

The source code samples are grouped by programming language and by the functions they use. ByteScout Text Recognition SDK is a software development kit for automatic text recognition (OCR) from PDF documents and images. It recognizes English and non-English languages, and it can limit recognition to specific areas of a page, which is what this PowerShell sample does.

PowerShell code samples help speed up development when you use ByteScout Text Recognition SDK. Copy and paste the code, add the assembly reference if needed, and you are ready to run it. You can reuse the sample in one or many applications.
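In PowerShell, adding the reference means loading the SDK assembly with `Add-Type -Path`. Here is a minimal sketch of checking for the file first, so that a missing DLL produces a clear message. `Find-FirstExistingPath` is a hypothetical helper, not part of the SDK, and the install path in the usage comment is copied from the sample and may differ on your machine.

```powershell
# Hypothetical helper (not part of the SDK): return the absolute path of the
# first candidate file that exists, or throw a clear error if none exists.
function Find-FirstExistingPath {
    param(
        [Parameter(Mandatory)]
        [string[]] $Candidates
    )
    foreach ($candidate in $Candidates) {
        if (Test-Path -LiteralPath $candidate -PathType Leaf) {
            # Resolve to an absolute path so later .NET calls do not depend
            # on the current directory.
            return (Resolve-Path -LiteralPath $candidate).Path
        }
    }
    throw "None of the candidate files exist: $($Candidates -join ', ')"
}

# Usage (the path is an assumption taken from the sample):
# $dll = Find-FirstExistingPath @(
#     "c:\Program Files\ByteScout Text Recognition SDK\net40\ByteScout.TextRecognition.dll"
# )
# Add-Type -Path $dll
```

The same helper could be applied to the input document, since .NET methods resolve relative paths against the process working directory rather than the PowerShell location.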

A trial version of ByteScout Text Recognition SDK can be downloaded for free from our website. It includes source code samples for PowerShell and other programming languages.

On-demand (REST Web API) version:
Web API (on-demand version)

On-premise offline SDK for Windows:
60 Day Free Trial (on-premise)

ExtractFromAreas.ps1

# Add reference to ByteScout.TextRecognition.dll assembly
Add-Type -Path "c:\Program Files\ByteScout Text Recognition SDK\net40\ByteScout.TextRecognition.dll"

$InputDocument = "..\..\areas-sample.pdf"
$PageIndex = 0
$OutputDocument = ".\result.txt"

# Create and activate TextRecognizer instance
$textRecognizer = New-Object ByteScout.TextRecognition.TextRecognizer
$textRecognizer.RegistrationName = "demo"
$textRecognizer.RegistrationKey = "demo"

try {
    # Load document (image or PDF)
    $textRecognizer.LoadDocument($InputDocument)

    # Set the location of OCR language data files
    $textRecognizer.OCRLanguageDataFolder = "c:\Program Files\ByteScout Text Recognition SDK\ocrdata_best\"

    # Set OCR language.
    # "eng" for English, "deu" for German, "fra" for French, "spa" for Spanish, etc., according to the files in the OCR language data folder
    # Find more language files at https://github.com/bytescout/ocrdata
    $textRecognizer.OCRLanguage = "eng"


    # Get page size (in pixels). Size of PDF document is computed from PDF Points 
    # and the rendering resolution specified by `textRecognizer.PDFRenderingResolution` (default 300 DPI)
    $pageSize = $textRecognizer.GetPageSize($PageIndex)

    # Add area of interest as a rectangle at the top-right corner of the page
    $textRecognizer.RecognitionAreas.Add($pageSize.Width / 2, 0, $pageSize.Width / 2, 300)
    # Add area of interest as a rectangle at the bottom-left corner of the page,
    # and indicate that it should be rotated by 90 degrees
    $textRecognizer.RecognitionAreas.Add(0, $pageSize.Height / 2, 300, $pageSize.Height / 2, [ByteScout.TextRecognition.AreaRotation]::Rotate90FlipNone)

    # Now, you can get recognized text for further analysis as a list of objects 
    # containing coordinates, object kind, confidence.
    $ocrObjectList = $textRecognizer.GetOCRObjects($PageIndex)
    foreach ($ocrObject in $ocrObjectList) {
        Write-Host $($ocrObject.ToString())
    }

    # ... or you can save recognized text pieces to file
    $textRecognizer.KeepTextFormatting = $false # save without formatting
    $textRecognizer.SaveText($OutputDocument, $PageIndex, $PageIndex)

    # Open the result file in default associated application (for demo purposes)
    Invoke-Item $OutputDocument
}
catch {
    # Display exception
    Write-Host $_.Exception.Message
}
finally {
    # Release the recognizer's resources whether or not an error occurred
    $textRecognizer.Dispose()
}
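
The script comments note that the pixel size of a PDF page is derived from PDF points and the rendering resolution. A PDF point is 1/72 inch, so the arithmetic can be sketched as below. `ConvertTo-Pixel` and `Get-CornerArea` are hypothetical helpers for illustration, not SDK functions; the exact rounding the SDK applies is an assumption, and the rectangle layout (x, y, width, height) is inferred from the comments in the sample.

```powershell
# Convert a length in PDF points (1 point = 1/72 inch) to pixels at a given DPI.
# Rounding to the nearest pixel is an assumption; the SDK may round differently.
function ConvertTo-Pixel {
    param([double] $Points, [int] $Dpi = 300)
    return [int][math]::Round($Points / 72 * $Dpi)
}

# Hypothetical helper: build an integer rectangle for one corner region,
# mirroring the two areas used in the sample script.
function Get-CornerArea {
    param(
        [int] $PageWidth,
        [int] $PageHeight,
        [ValidateSet('TopRight', 'BottomLeft')]
        [string] $Corner,
        [int] $Thickness = 300
    )
    # Floor the halves so odd page sizes never yield fractional coordinates
    $halfW = [int][math]::Floor($PageWidth / 2.0)
    $halfH = [int][math]::Floor($PageHeight / 2.0)
    switch ($Corner) {
        'TopRight' {
            [pscustomobject]@{ X = $halfW; Y = 0; Width = $PageWidth - $halfW; Height = $Thickness }
        }
        'BottomLeft' {
            [pscustomobject]@{ X = 0; Y = $halfH; Width = $Thickness; Height = $PageHeight - $halfH }
        }
    }
}

# US Letter is 612 x 792 points, which is 2550 x 3300 pixels at 300 DPI
$width  = ConvertTo-Pixel 612
$height = ConvertTo-Pixel 792
$area   = Get-CornerArea -PageWidth $width -PageHeight $height -Corner TopRight
```

Computing the rectangle as integers up front avoids passing fractional values such as `1240.5` when the page width or height is odd.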

run.bat

@echo off

powershell -NoProfile -ExecutionPolicy Bypass -Command "& .\ExtractFromAreas.ps1"
echo Script finished with errorlevel=%errorlevel%

pause
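
run.bat prints the error level, but ExtractFromAreas.ps1 handles exceptions by printing the message and then finishing normally, so the reported value will typically be 0 even when recognition fails. Here is a minimal sketch of one way to surface failures to the batch file. `Invoke-Step` is a hypothetical wrapper, not part of the SDK.

```powershell
# Hypothetical wrapper: run a script block and map success or failure
# to an exit code (0 = success, 1 = a terminating error was caught).
function Invoke-Step {
    param(
        [Parameter(Mandatory)]
        [scriptblock] $Action
    )
    try {
        # Discard the block's output so only the exit code is returned
        & $Action | Out-Null
        return 0
    }
    catch {
        # Display the exception, as the sample script does
        Write-Host $_.Exception.Message
        return 1
    }
}

# At the end of a script launched from a batch file:
# exit (Invoke-Step { $textRecognizer.LoadDocument($InputDocument) })
```

With an explicit `exit 1`, the `echo Script finished with errorlevel=%errorlevel%` line in run.bat would report a non-zero value on failure.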