ByteScout PDF Extractor SDK - C# - Index PDF Files - ByteScout

How to index PDF files in C# using ByteScout PDF Extractor SDK

This tutorial shows how to index PDF files in C#.

This source code sample shows how to index PDF files in a C# application with ByteScout PDF Extractor SDK. The SDK helps developers extract data from unstructured documents, PDFs, images, and scanned and electronic forms. It includes AI functions such as automatic table detection, automatic table extraction and restructuring, and text recognition and text restoration from PDF and scanned documents. It also includes PDF to CSV, PDF to XML, PDF to JSON, and PDF to searchable PDF functions, as well as methods for low-level data extraction.

You can save time on writing and testing code by taking the C# sample below and using it in your application. The sample lists the PDF files in a folder and prints each file's metadata (page count, author, title, producer, subject, creation date) together with the first two lines of text from its first page. Copy and paste the code, add a reference to the SDK if needed, and you are all set.

A free trial version of ByteScout PDF Extractor SDK is available on the ByteScout website. Documentation and source code samples are included.

Try ByteScout PDF Extractor SDK today: get the 60-day free trial (on-premise version) or sign up for the Web API (on-demand version).

using System;
using System.IO;
using Bytescout.PDFExtractor;

namespace IndexPDFFiles
{
    class Program
    {
        static void Main(string[] args)
        {
            // Create Bytescout.PDFExtractor.InfoExtractor instance (document metadata)
            InfoExtractor infoExtractor = new InfoExtractor();
            infoExtractor.RegistrationName = "demo";
            infoExtractor.RegistrationKey = "demo";

            // Create Bytescout.PDFExtractor.TextExtractor instance (document text)
            TextExtractor textExtractor = new TextExtractor();
            textExtractor.RegistrationName = "demo";
            textExtractor.RegistrationKey = "demo";

            // List all PDF files in directory
            foreach (string file in Directory.GetFiles(@"..\..\..\..", "*.pdf"))
            {
                infoExtractor.LoadDocumentFromFile(file);

                Console.WriteLine("File Name: " + Path.GetFileName(file));
                Console.WriteLine("Page Count: " + infoExtractor.GetPageCount());
                Console.WriteLine("Author: " + infoExtractor.Author);
                Console.WriteLine("Title: " + infoExtractor.Title);
                Console.WriteLine("Producer: " + infoExtractor.Producer);
                Console.WriteLine("Subject: " + infoExtractor.Subject);
                Console.WriteLine("CreationDate: " + infoExtractor.CreationDate);
                Console.WriteLine("Text (first 2 lines): ");

                // Load a couple of lines from each document
                textExtractor.LoadDocumentFromFile(file);
                using (StringReader stringReader = new StringReader(textExtractor.GetTextFromPage(0)))
                {
                    Console.WriteLine(stringReader.ReadLine());
                    Console.WriteLine(stringReader.ReadLine());
                }

                Console.WriteLine();
            }

            // Cleanup
            infoExtractor.Dispose();
            textExtractor.Dispose();

            Console.WriteLine();
            Console.WriteLine("Press any key to continue...");
            Console.ReadLine();
        }
    }
}
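
The sample above only prints each file's metadata and first lines to the console. If the goal is to search the indexed files afterwards, one possible next step is to collect the extracted text into a small in-memory inverted index that maps each word to the files containing it. The sketch below is an illustration under stated assumptions: the class name PdfTextIndex and its Add and Search methods are hypothetical helpers written for this example and are not part of ByteScout PDF Extractor SDK. It uses only the .NET base class library, so it compiles without the SDK.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Hypothetical helper for illustration; not part of ByteScout PDF Extractor SDK.
public sealed class PdfTextIndex
{
    // Maps each word (case-insensitive) to the sorted set of file names containing it.
    private readonly Dictionary<string, SortedSet<string>> postings =
        new Dictionary<string, SortedSet<string>>(StringComparer.OrdinalIgnoreCase);

    // Adds every word of the extracted text to the index under the given file name.
    public void Add(string fileName, string text)
    {
        if (fileName == null) throw new ArgumentNullException(nameof(fileName));
        if (string.IsNullOrEmpty(text)) return; // nothing to index, e.g. no extractable text

        foreach (string word in Tokenize(text))
        {
            SortedSet<string> files;
            if (!postings.TryGetValue(word, out files))
            {
                files = new SortedSet<string>(StringComparer.Ordinal);
                postings[word] = files;
            }
            files.Add(fileName);
        }
    }

    // Returns the file names that contain the word, in ordinal order (empty if none).
    public List<string> Search(string word)
    {
        SortedSet<string> files;
        if (!string.IsNullOrEmpty(word) && postings.TryGetValue(word, out files))
        {
            return files.ToList();
        }
        return new List<string>();
    }

    // Splits text into runs of letters and digits; everything else is a separator.
    private static IEnumerable<string> Tokenize(string text)
    {
        StringBuilder word = new StringBuilder();
        foreach (char c in text)
        {
            if (char.IsLetterOrDigit(c))
            {
                word.Append(c);
            }
            else if (word.Length > 0)
            {
                yield return word.ToString();
                word.Clear();
            }
        }
        if (word.Length > 0)
        {
            yield return word.ToString();
        }
    }
}
```

Inside the foreach loop of the sample, a call such as index.Add(Path.GetFileName(file), textExtractor.GetTextFromPage(0)) would feed the first page of each document into the index, and index.Search("invoice") would then return the matching file names. Indexing only the first page mirrors the sample and is a simplification. A sorted set is used per word so that search results come back in a stable order.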
