ByteScout PDF Extractor SDK – C# – Extract Text By Columns from PDF

How to extract text by columns from PDF in C# using ByteScout PDF Extractor SDK

Write code in C# to extract text by columns from PDF with this step-by-step tutorial

Extracting text by columns from a PDF is easy to implement in C# with the source code below. ByteScout PDF Extractor SDK is a software development kit designed to help developers extract data from unstructured documents such as PDFs, TIFFs, scans, images, and scanned or electronic forms. The library uses OCR, computer vision and AI to provide features such as table detection, automatic table structure extraction, data restoration, and data restructuring and reconstruction. It accepts PDF, TIFF, PNG and JPG files as input and can output data as CSV, XML or JSON. It also includes utilities such as a PDF splitter, a PDF merger and a searchable PDF maker.

SDK samples like the one below show how to add column-by-column text extraction from PDF to a C# application using ByteScout PDF Extractor SDK. To use the sample, copy the C# code below into your project in your code editor, then compile and run the application.

A free trial version of ByteScout PDF Extractor SDK is available on our website. The SDK supports C# and other programming languages.

On-demand (REST Web API) version: Web API
On-premise offline SDK for Windows: 60-day free trial

Program.cs

using System;
using Bytescout.PDFExtractor;
using System.Diagnostics;

namespace ExtractTextByColumns
{
    class Program
    {
        static void Main(string[] args)
        {
            // Create Bytescout.PDFExtractor.TextExtractor instance
            TextExtractor extractor = new TextExtractor();
            extractor.RegistrationName = "demo";
            extractor.RegistrationKey = "demo";

            // Load sample PDF document
            extractor.LoadDocumentFromFile(@".\columns.pdf");

            // Extract text by columns (useful if PDF document is designed in column layout like a newspaper)
            extractor.ExtractColumnByColumn = true;

            // Save extracted text to file
            extractor.SaveTextToFile(@".\result.txt");

            // Cleanup
            extractor.Dispose();

            // Open result file in default associated application
            ProcessStartInfo processStartInfo = new ProcessStartInfo(@".\result.txt");
            processStartInfo.UseShellExecute = true;
            Process.Start(processStartInfo);
        }
    }
}
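
The sample above calls Dispose only if every preceding call succeeds. As a minimal sketch of one way to make it more robust, the variant below wraps the same SDK calls in a reusable method, checks that the input file exists, and releases the extractor in a finally block. The class name ColumnTextExport, the method name ExtractColumns, and the choice to write the result next to the input file are assumptions made for illustration and are not part of the SDK; only the TextExtractor members already used in Program.cs are called.

```csharp
using System;
using System.IO;
using Bytescout.PDFExtractor;

namespace ExtractTextByColumns
{
    // Hypothetical helper class, not part of ByteScout PDF Extractor SDK.
    static class ColumnTextExport
    {
        // Extracts text column by column from pdfPath and returns the path
        // of the text file it wrote (same name as the input, .txt extension).
        public static string ExtractColumns(string pdfPath)
        {
            if (!File.Exists(pdfPath))
                throw new FileNotFoundException("Input PDF not found.", pdfPath);

            // Assumption: the result is saved next to the input file.
            string resultPath = Path.ChangeExtension(pdfPath, ".txt");

            TextExtractor extractor = new TextExtractor();
            try
            {
                extractor.RegistrationName = "demo";
                extractor.RegistrationKey = "demo";

                // Same sequence of calls as in Program.cs
                extractor.LoadDocumentFromFile(pdfPath);
                extractor.ExtractColumnByColumn = true;
                extractor.SaveTextToFile(resultPath);
            }
            finally
            {
                // Release the extractor even if loading or saving throws
                extractor.Dispose();
            }

            return resultPath;
        }
    }
}
```

From Main, this could be called as string resultPath = ColumnTextExport.ExtractColumns(@".\columns.pdf"); and the returned path passed to ProcessStartInfo in place of the hard-coded file name.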