ByteScout PDF Suite – C# – Extract Text From PDF By Pages with PDF Extractor SDK

How to extract text from PDF by pages with PDF Extractor SDK in C# using ByteScout PDF Suite: a step-by-step tutorial

Every ByteScout tool includes simple C# source code examples, which you can get here or in the folder where the ByteScout product is installed. ByteScout PDF Suite can extract text from a PDF page by page using PDF Extractor SDK, and it can be called from C#. ByteScout PDF Suite is a bundle of 6 SDK products for working with PDF, from generating rich PDF reports to extracting data from PDF documents and converting them to HTML. The bundle includes PDF (Generator) SDK, PDF Renderer SDK, PDF Extractor SDK, PDF to HTML SDK, PDF Viewer SDK and PDF Generator SDK for Javascript.

Want to learn quickly? The ByteScout PDF Suite API for C#, together with the guidelines and the code below, will help you learn how to extract text from a PDF by pages with PDF Extractor SDK. Follow the instructions from start to finish and copy the C# code into your project. The sample is ready to use as written.

A trial version of ByteScout PDF Suite is available for free. Source code samples are included to help you with your C# app.

On-demand (REST Web API) version:
Web API (on-demand version)

On-premise offline SDK for Windows:
60 Day Free Trial (on-premise)

Program.cs

using System;
using Bytescout.PDFExtractor;
using System.Diagnostics;

namespace ExtractTextByPages
{
	class Program
	{
		static void Main(string[] args)
		{
			// Create Bytescout.PDFExtractor.TextExtractor instance
			TextExtractor extractor = new TextExtractor();
			extractor.RegistrationName = "demo";
			extractor.RegistrationKey = "demo";

			// Load sample PDF document
			extractor.LoadDocumentFromFile(@".\sample2.pdf");

			// Get page count
			int pageCount = extractor.GetPageCount();

			for (int i = 0; i < pageCount; i++)
			{
				string fileName = "page" + i + ".txt";
				
				// Save extracted page text to file
				extractor.SavePageTextToFile(i, fileName);
			}

			// Cleanup
			extractor.Dispose();

			// Open first output file (page index 0) in default associated application
			ProcessStartInfo processStartInfo = new ProcessStartInfo(@".\page0.txt");
			processStartInfo.UseShellExecute = true;
			Process.Start(processStartInfo);
		}
	}
}
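
The sample names its output files by zero-based page index, so page0.txt holds the first page of the document. If you would rather have file names that match the page numbers a reader sees, and that sort in page order, the names can be computed up front. The following is a minimal sketch under that assumption; `PageFileNames` and `Build` are hypothetical helper names for illustration and are not part of the ByteScout SDK.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper (not part of the ByteScout SDK).
static class PageFileNames
{
    // Builds one output path per page. Pages are numbered from 1 and
    // zero-padded to a common width so the names sort in page order.
    public static List<string> Build(string outputDir, int pageCount)
    {
        if (pageCount < 0)
            throw new ArgumentOutOfRangeException(nameof(pageCount));

        // Number of digits needed for the largest page number.
        int width = Math.Max(1, pageCount.ToString().Length);

        var names = new List<string>(pageCount);
        for (int i = 0; i < pageCount; i++)
        {
            string number = (i + 1).ToString().PadLeft(width, '0');
            names.Add(Path.Combine(outputDir, "page" + number + ".txt"));
        }
        return names;
    }
}
```

With a list built this way, the loop in Program.cs could pass names[i] as the second argument of SavePageTextToFile instead of building the name inline, and the file to open afterwards would be names[0], provided the document has at least one page.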