ByteScout PDF Extractor SDK - C# - Check If OCR Is Required for PDF - ByteScout
Our ByteScout SDK products are sunsetting as we focus on our new & improved solutions. Thank you for being part of our journey, and we look forward to supporting you in this next chapter!


How to check if OCR is required for a PDF in C# with ByteScout PDF Extractor SDK

Every ByteScout tool includes example C# source code, which you can find online or in the folder where the ByteScout product is installed. Checking whether OCR is required for a PDF in C# can be implemented with ByteScout PDF Extractor SDK, a software development kit designed to help developers extract data from unstructured documents such as PDFs, TIFFs, scans, images, and scanned or electronic forms. The library uses OCR, computer vision and AI to provide features such as table detection, automatic table structure extraction, data restoration, and data restructuring and reconstruction. It accepts PDF, TIFF, PNG and JPG files as input and can output data as CSV, XML or JSON. It also includes a set of utilities such as a PDF splitter, a PDF merger and a searchable PDF maker.

The sample below shows how to make your application check whether OCR is required for a PDF in C# using ByteScout PDF Extractor SDK. Copy the code into your C# project, add a reference to ByteScout PDF Extractor SDK if you have not added one yet, and you are ready to go. More detailed documentation and tutorials are installed along with ByteScout PDF Extractor SDK if you would like to dive deeper into the topic and the details of the API.

A free trial version of ByteScout PDF Extractor SDK is available on our website. Source code samples are included to help you with your C# application.

On-demand (REST Web API) version:
 Web API (on-demand version)

On-premise offline SDK for Windows:
 60 Day Free Trial (on-premise)

Program.cs

using Bytescout.PDFExtractor;
using System;

namespace CheckIfOCRIsRequired
{
    class Program
    {
        static void Main(string[] args)
        {
            try
            {
                // Loop through all files in directory and check whether OCR operation is required
                foreach (string filePath in System.IO.Directory.GetFiles("InputFiles"))
                {
                    _CheckOCRRequired(filePath);
                }
            }
            catch (Exception ex)
            {
                Console.WriteLine("Error: " + ex.Message);
            }

            Console.WriteLine("Press enter key to exit...");
            Console.ReadLine();
        }

        /// <summary>
        /// Check whether OCR Operation is required
        /// </summary>
        /// <param name="filePath"></param>
        private static void _CheckOCRRequired(string filePath)
        {
            // Read all file content...
            using (TextExtractor extractor = new TextExtractor())
            {
                extractor.RegistrationKey = "demo";
                extractor.RegistrationName = "demo";

                // Load document
                extractor.LoadDocumentFromFile(filePath);

                Console.WriteLine("\n*******************\n\nFilePath: {0}", filePath);

                int pageIndex = 0;

                // Identify OCR operation is recommended for page
                if (extractor.IsOCRRecommendedForPage(pageIndex))
                {
                    Console.WriteLine("\nOCR Recommended: True");

                    // Enable Optical Character Recognition (OCR)
                    // in .Auto mode (SDK automatically checks if needs to use OCR or not)
                    extractor.OCRMode = OCRMode.Auto;

                    // Set the location of language data files
                    extractor.OCRLanguageDataFolder = @"c:\Program Files\Bytescout PDF Extractor SDK\ocrdata\";

                    // Set OCR language
                    // "eng" for english, "deu" for German, "fra" for French, "spa" for Spanish etc - according to files in "ocrdata" folder
                    // Find more language files at https://github.com/bytescout/ocrdata
                    extractor.OCRLanguage = "eng";

                    // Set PDF document rendering resolution
                    extractor.OCRResolution = 300;
                }
                else
                {
                    Console.WriteLine("\nOCR Recommended: False");
                }

                // Read all text
                var allExtractedText = extractor.GetText();

                Console.WriteLine("\nExtracted Text:\n{0}\n\n", allExtractedText);
            }
        }
    }
}
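
The sample above checks only the page at index 0, so a document whose first page is digital text but whose later pages are scans would be reported as not needing OCR. The following is a minimal sketch of one way to run the same check across every page. The names `OcrCheck` and `PagesNeedingOcr` are hypothetical and are not part of the SDK. The helper takes the page count and the per-page check as parameters so that it runs without the SDK installed; how you obtain the page count from the extractor is an assumption to verify against the SDK documentation.

```csharp
using System;
using System.Collections.Generic;

static class OcrCheck
{
    // Hypothetical helper, not part of ByteScout PDF Extractor SDK.
    // Returns the zero-based indexes of all pages for which the
    // supplied per-page check returns true.
    public static List<int> PagesNeedingOcr(int pageCount, Func<int, bool> isOcrRecommended)
    {
        var pages = new List<int>();
        for (int pageIndex = 0; pageIndex < pageCount; pageIndex++)
        {
            if (isOcrRecommended(pageIndex))
            {
                pages.Add(pageIndex);
            }
        }
        return pages;
    }

    static void Main()
    {
        // Stand-in predicate for demonstration only: pretends that pages 1 and 3
        // are scanned images. With the SDK you would pass
        // extractor.IsOCRRecommendedForPage instead, and take the page count
        // from the extractor (assumed here to be available, for example
        // through a GetPageCount() method).
        Func<int, bool> stubCheck = pageIndex => pageIndex == 1 || pageIndex == 3;

        List<int> pages = PagesNeedingOcr(4, stubCheck);

        Console.WriteLine("Pages needing OCR: " + string.Join(", ", pages));
        // prints "Pages needing OCR: 1, 3"
    }
}
```

If the returned list is non-empty, the OCR settings from the sample (OCRMode, OCRLanguageDataFolder, OCRLanguage, OCRResolution) can be applied before calling GetText(). Passing the check in as a delegate keeps the loop independent of the SDK, which also makes it easy to test.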

ON-PREMISE OFFLINE SDK

60 Day Free Trial or Visit ByteScout PDF Extractor SDK Home Page

Explore ByteScout PDF Extractor SDK Documentation

Explore Samples

Sign Up for ByteScout PDF Extractor SDK Online Training

ON-DEMAND REST WEB API

Get Your API Key

Explore Web API Docs

Explore Web API Samples
