ByteScout PDF Extractor SDK - C# - Remove Empty Pages from PDF - ByteScout



How to remove empty pages from PDF in C# using ByteScout PDF Extractor SDK

This C# tutorial shows how to remove empty pages from a PDF document.

The sample source code below shows how to remove empty pages from a PDF in C#. The sample detects pages with no extractable text, splits the document into separate parts that exclude those pages, then merges the parts back into a single document.

ByteScout PDF Extractor SDK helps developers extract data from unstructured documents, PDFs, images, and scanned and electronic forms. It includes AI functions such as automatic table detection, automatic table extraction and restructuring, and text recognition and restoration from PDFs and scanned documents. It also includes PDF to CSV, PDF to XML, PDF to JSON, and PDF to searchable PDF functions, as well as methods for low-level data extraction.

SDK samples like the one below show how to quickly add empty-page removal to your C# application with ByteScout PDF Extractor SDK. Follow the instructions from scratch and copy the C# code. You can use these C# samples in one or many applications.

A trial version of ByteScout PDF Extractor SDK is available for free. Source code samples are included to help you with your C# app.

Try it today: Get 60 Day Free Trial or sign up for Web API

using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using Bytescout.PDFExtractor;

namespace RemoveEmptyPagesExample
{
    /// <summary>
    /// The example demonstrates detection of empty pages, splitting the document into separate
    /// pages excluding empty ones, then combining the parts back into a single document.
    /// </summary>
    class Program
    {
        static string InputFile = @".\sample.pdf";
        static string OutputFile = @".\result.pdf";
        static string TempFolder = @".\temp";

        static void Main(string[] args)
        {
            // Create and set up Bytescout.PDFExtractor.TextExtractor instance
            TextExtractor extractor = new TextExtractor("demo", "demo");

            // Load PDF document
            extractor.LoadDocumentFromFile(InputFile);

            // List to keep non-empty page numbers
            List<string> nonEmptyPages = new List<string>();

            // Iterate through pages (page indexes are zero-based)
            for (int pageIndex = 0; pageIndex < extractor.GetPageCount(); pageIndex++)
            {
                // Extract page text
                string pageText = extractor.GetTextFromPage(pageIndex);

                // If extracted text is not empty, keep the one-based page number
                if (pageText.Length > 0)
                    nonEmptyPages.Add((pageIndex + 1).ToString());
            }

            // Cleanup
            extractor.Dispose();

            // Form comma-separated list of page numbers to split ("1,3,5")
            string ranges = string.Join(",", nonEmptyPages);

            // Create Bytescout.PDFExtractor.DocumentSplitter instance
            DocumentSplitter splitter = new DocumentSplitter("demo", "demo");
            splitter.OptimizeSplittedDocuments = true;

            // Split document by non-empty pages into temp folder
            string[] parts = splitter.Split(InputFile, ranges, TempFolder);

            // Cleanup
            splitter.Dispose();

            // Create Bytescout.PDFExtractor.DocumentMerger instance
            DocumentMerger merger = new DocumentMerger("demo", "demo");

            // Merge parts
            merger.Merge(parts, OutputFile);

            // Cleanup
            merger.Dispose();

            // Delete temp folder
            Directory.Delete(TempFolder, true);

            // Open result document in default associated application (for demo purposes)
            ProcessStartInfo processStartInfo = new ProcessStartInfo(OutputFile);
            processStartInfo.UseShellExecute = true;
            Process.Start(processStartInfo);
        }
    }
}
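The sample above keeps a page whenever its extracted text has a non-zero length, so a page whose text consists only of spaces or line breaks would be kept. If such pages should also count as empty, the page list could be built with a stricter check. The following is a minimal sketch under that assumption; it does not call the SDK at all, and the names `PageRanges`, `BuildNonEmptyRanges`, and `Demo` are hypothetical helpers introduced only for illustration. It also checks for the case where no page qualifies, since the sample does not show how an empty range string is handled.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of ByteScout PDF Extractor SDK.
static class PageRanges
{
    // Returns a comma-separated list of one-based page numbers ("1,3,5")
    // for pages whose text is neither empty nor whitespace-only.
    public static string BuildNonEmptyRanges(IList<string> pageTexts)
    {
        List<string> pages = new List<string>();
        for (int i = 0; i < pageTexts.Count; i++)
        {
            // Assumption: whitespace-only pages are treated as empty.
            if (!string.IsNullOrWhiteSpace(pageTexts[i]))
                pages.Add((i + 1).ToString());
        }
        return string.Join(",", pages);
    }
}

static class Demo
{
    static void Main()
    {
        // Stand-in for the text extracted from each page of a document.
        string[] texts = { "Hello", "", "  \n", "World" };
        string ranges = PageRanges.BuildNonEmptyRanges(texts);

        if (ranges.Length == 0)
            Console.WriteLine("No non-empty pages; nothing to split.");
        else
            Console.WriteLine(ranges); // prints "1,4"
    }
}
```

In the sample, the strings passed to such a helper would be the per-page results of `GetTextFromPage`. Either check relies on extracted text only, so a page that contains just an image and no text layer would most likely also be classified as empty.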


<?xml version="1.0" encoding="utf-8"?>
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>netcoreapp2.0</TargetFramework>
    <EnableDefaultCompileItems>false</EnableDefaultCompileItems>
    <GenerateAssemblyCompanyAttribute>false</GenerateAssemblyCompanyAttribute>
    <GenerateAssemblyConfigurationAttribute>false</GenerateAssemblyConfigurationAttribute>
    <GenerateAssemblyFileVersionAttribute>false</GenerateAssemblyFileVersionAttribute>
    <GenerateAssemblyInformationalVersionAttribute>false</GenerateAssemblyInformationalVersionAttribute>
    <GenerateAssemblyProductAttribute>false</GenerateAssemblyProductAttribute>
    <GenerateAssemblyTitleAttribute>false</GenerateAssemblyTitleAttribute>
    <GenerateAssemblyVersionAttribute>false</GenerateAssemblyVersionAttribute>
    <GenerateAssemblyCopyrightAttribute>false</GenerateAssemblyCopyrightAttribute>
    <GenerateAssemblyTrademarkAttribute>false</GenerateAssemblyTrademarkAttribute>
    <GenerateAssemblyCultureAttribute>false</GenerateAssemblyCultureAttribute>
    <GenerateAssemblyDescriptionAttribute>false</GenerateAssemblyDescriptionAttribute>
  </PropertyGroup>
  <ItemGroup>
    <Compile Include="Program.cs" />
    <Compile Include="Properties\AssemblyInfo.cs" />
    <None Include="sample.pdf">
      <CopyToOutputDirectory>Always</CopyToOutputDirectory>
    </None>
  </ItemGroup>
  <ItemGroup>
    <PackageReference Include="Microsoft.Windows.Compatibility" Version="2.0.0" />
  </ItemGroup>
  <ItemGroup>
    <Reference Include="Bytescout.PDFExtractor">
      <SpecificVersion>False</SpecificVersion>
      <HintPath>c:\Program Files\Bytescout PDF Extractor SDK\netcoreapp2.0\Bytescout.PDFExtractor.dll</HintPath>
    </Reference>
    <Reference Include="Bytescout.PDFExtractor.OCRExtension">
      <SpecificVersion>False</SpecificVersion>
      <HintPath>c:\Program Files\Bytescout PDF Extractor SDK\netcoreapp2.0\Bytescout.PDFExtractor.OCRExtension.dll</HintPath>
    </Reference>
  </ItemGroup>
</Project>


<?xml version="1.0" encoding="utf-8"?>
<Project ToolsVersion="15.0" xmlns="">
  <Import Project="$(MSBuildExtensionsPath)\$(MSBuildToolsVersion)\Microsoft.Common.props" Condition="Exists('$(MSBuildExtensionsPath)\$(MSBuildToolsVersion)\Microsoft.Common.props')" />
  <PropertyGroup>
    <Configuration Condition=" '$(Configuration)' == '' ">Debug</Configuration>
    <Platform Condition=" '$(Platform)' == '' ">AnyCPU</Platform>
    <ProjectGuid>{E59DB9C3-278F-4055-B111-492FE74A54F7}</ProjectGuid>
    <OutputType>Exe</OutputType>
    <RootNamespace>RemoveEmptyPagesExample</RootNamespace>
    <AssemblyName>RemoveEmptyPagesExample</AssemblyName>
    <TargetFrameworkVersion>v4.0</TargetFrameworkVersion>
    <FileAlignment>512</FileAlignment>
  </PropertyGroup>
  <PropertyGroup Condition=" '$(Configuration)|$(Platform)' == 'Debug|AnyCPU' ">
    <PlatformTarget>AnyCPU</PlatformTarget>
    <DebugSymbols>true</DebugSymbols>
    <DebugType>full</DebugType>
    <Optimize>false</Optimize>
    <OutputPath>bin\Debug\</OutputPath>
    <DefineConstants>DEBUG;TRACE</DefineConstants>
    <ErrorReport>prompt</ErrorReport>
    <WarningLevel>4</WarningLevel>
  </PropertyGroup>
  <PropertyGroup Condition=" '$(Configuration)|$(Platform)' == 'Release|AnyCPU' ">
    <PlatformTarget>AnyCPU</PlatformTarget>
    <DebugType>pdbonly</DebugType>
    <Optimize>true</Optimize>
    <OutputPath>bin\Release\</OutputPath>
    <DefineConstants>TRACE</DefineConstants>
    <ErrorReport>prompt</ErrorReport>
    <WarningLevel>4</WarningLevel>
  </PropertyGroup>
  <ItemGroup>
    <Reference Include="Bytescout.PDFExtractor, Version=, Culture=neutral, PublicKeyToken=f7dd1bd9d40a50eb, processorArchitecture=MSIL">
      <SpecificVersion>False</SpecificVersion>
      <HintPath>C:\Program Files\Bytescout PDF Extractor SDK\net4.00\Bytescout.PDFExtractor.dll</HintPath>
    </Reference>
    <Reference Include="System" />
    <Reference Include="System.Core" />
    <Reference Include="System.Xml.Linq" />
    <Reference Include="System.Data" />
    <Reference Include="System.Xml" />
  </ItemGroup>
  <ItemGroup>
    <Compile Include="Program.cs" />
    <Compile Include="Properties\AssemblyInfo.cs" />
  </ItemGroup>
  <ItemGroup>
    <Content Include="sample.pdf">
      <CopyToOutputDirectory>Always</CopyToOutputDirectory>
    </Content>
  </ItemGroup>
  <Import Project="$(MSBuildToolsPath)\Microsoft.CSharp.targets" />
</Project>



Get 60 Day Free Trial or Visit ByteScout PDF Extractor SDK page

Explore ByteScout PDF Extractor SDK documentation


Sign Up for free Web API key

Explore Web API Documentation