ByteScout Text Recognition SDK - PowerShell - Image Preprocessing Filters - ByteScout
Announcement
Our ByteScout SDK products are sunsetting as we focus on expanding new solutions.
Learn More Open modal
Close modal
Announcement Important Update
ByteScout SDK Sunsetting Notice
Our ByteScout SDK products are sunsetting as we focus on our new & improved solutions. Thank you for being part of our journey, and we look forward to supporting you in this next chapter!

ByteScout Text Recognition SDK – PowerShell – Image Preprocessing Filters

  • Home
  • /
  • Articles
  • /
  • ByteScout Text Recognition SDK – PowerShell – Image Preprocessing Filters

ByteScout Text Recognition SDK – PowerShell – Image Preprocessing Filters

ImagePreprocessingFilters.ps1

# Add reference to ByteScout.TextRecognition.dll assembly
Add-Type -Path "c:\Program Files\ByteScout Text Recognition SDK\net40\ByteScout.TextRecognition.dll"

$InputDocument = "..\..\skewed.png"
$OutputDocument = ".\result.txt"

# Create and activate TextRecognizer instance
$textRecognizer = New-Object ByteScout.TextRecognition.TextRecognizer
$textRecognizer.RegistrationName = "demo"
$textRecognizer.RegistrationKey = "demo"

try {
    # Load document (image or PDF)
    $textRecognizer.LoadDocument($InputDocument)

    # Set location of "tessdata" folder containing language data files
    $textRecognizer.OCRLanguageDataFolder = "c:\Program Files\ByteScout Text Recognition SDK\tessdata\"

    # Set OCR language.
    # "eng" for english, "deu" for German, "fra" for French, "spa" for Spanish etc - according to files in "tessdata" folder
    # Find more language files at https:#github.com/tesseract-ocr/tessdata/tree/3.04.00
    $textRecognizer.OCRLanguage = "eng"


    # Add deskew filter that automatically rotates the image to make the text horizontal.
    # Note, it analyzes the left edge of scanned text. Any dark artifacts may prevent 
    # the correct angle detection.
    $textRecognizer.ImagePreprocessingFilters.AddDeskew()

    # Other filters that may be useful to improve recognition
    # (note, the filters are applied in the order they were added):

    # Improve image contrast.
    #$textRecognizer.ImagePreprocessingFilters.AddContrast()

    # Apply gamma correction.
    #$textRecognizer.ImagePreprocessingFilters.AddGammaCorrection()

    # Apply median filter. Helps to remove noise.
    #$textRecognizer.ImagePreprocessingFilters.AddMedian()

    # Apply dilate filter. Helps to cure symbols erosion.
    #$textRecognizer.ImagePreprocessingFilters.AddDilate()

    # Lines removers. Removing borders of some tables may improve the recognition.
    #$textRecognizer.ImagePreprocessingFilters.AddHorizontalLinesRemover()
    #$textRecognizer.ImagePreprocessingFilters.AddVerticalLinesRemover()


    # Recognize text from all pages and save it to file
    $textRecognizer.SaveText($OutputDocument)

    # Open the result file in default associated application (for demo purposes)
    & $OutputDocument
}
catch {
    # Display exception
    Write-Host $_.Exception.Message
}

$textRecognizer.Dispose()

run.bat

@echo off

powershell -NoProfile -ExecutionPolicy Bypass -Command "& .\ImagePreprocessingFilters.ps1"
echo Script finished with errorlevel=%errorlevel%

pause

  Click here to get your Free Trial version of the SDK

Tutorials:

prev
next