ByteScout Document Parser SDK - C# - Extract line items from tables on multiple pages - ByteScout

ByteScout Document Parser SDK – C# – Extract line items from tables on multiple pages

  • Home
  • /
  • Articles
  • /
  • ByteScout Document Parser SDK – C# – Extract line items from tables on multiple pages

How to extract line items from tables on multiple pages in C# and ByteScout Document Parser SDK

ByteScout Document Parser SDK: the robost offline data extraction platform for template based data extraction and processing. Supports high load with millions of documents as input. Templates can be quickly created and updated with no special technical knowledge required.

On-demand (REST Web API) version:
 Web API (on-demand version)

On-premise offline SDK for Windows:
 60 Day Free Trial (on-premise)

ExtractLineItemFromTableOnMultiplePages.csproj
      
<?xml version="1.0" encoding="utf-8"?> <Project ToolsVersion="15.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003"> <Import Project="$(MSBuildExtensionsPath)\$(MSBuildToolsVersion)\Microsoft.Common.props" Condition="Exists('$(MSBuildExtensionsPath)\$(MSBuildToolsVersion)\Microsoft.Common.props')" /> <PropertyGroup> <Configuration Condition=" '$(Configuration)' == '' ">Debug</Configuration> <Platform Condition=" '$(Platform)' == '' ">AnyCPU</Platform> <ProjectGuid>{A73776C6-D2B2-4E37-B852-06C6454D1B5B}</ProjectGuid> <OutputType>Exe</OutputType> <RootNamespace>ExtractLineItemFromTableOnMultiplePages</RootNamespace> <AssemblyName>ExtractLineItemFromTableOnMultiplePages</AssemblyName> <TargetFrameworkVersion>v4.0</TargetFrameworkVersion> <FileAlignment>512</FileAlignment> </PropertyGroup> <PropertyGroup Condition=" '$(Configuration)|$(Platform)' == 'Debug|AnyCPU' "> <PlatformTarget>AnyCPU</PlatformTarget> <DebugSymbols>true</DebugSymbols> <DebugType>full</DebugType> <Optimize>false</Optimize> <OutputPath>bin\Debug\</OutputPath> <DefineConstants>DEBUG;TRACE</DefineConstants> <ErrorReport>prompt</ErrorReport> <WarningLevel>4</WarningLevel> </PropertyGroup> <PropertyGroup Condition=" '$(Configuration)|$(Platform)' == 'Release|AnyCPU' "> <PlatformTarget>AnyCPU</PlatformTarget> <DebugType>pdbonly</DebugType> <Optimize>true</Optimize> <OutputPath>bin\Release\</OutputPath> <DefineConstants>TRACE</DefineConstants> <ErrorReport>prompt</ErrorReport> <WarningLevel>4</WarningLevel> </PropertyGroup> <ItemGroup> <Reference Include="ByteScout.DocumentParser, Version=1.0.0.100, Culture=neutral, PublicKeyToken=f7dd1bd9d40a50eb, processorArchitecture=MSIL"> <SpecificVersion>False</SpecificVersion> <HintPath>c:\Program Files\ByteScout Document Parser SDK\net40\ByteScout.DocumentParser.dll</HintPath> </Reference> <Reference Include="System" /> <Reference Include="System.Core" /> </ItemGroup> <ItemGroup> <Compile Include="Program.cs" /> </ItemGroup> <ItemGroup> <None Include="..\..\MultiPageTable.pdf"> <Link>MultiPageTable.pdf</Link> <CopyToOutputDirectory>Always</CopyToOutputDirectory> </None> <None Include="..\..\_Sample Templates\MultiPageTable-template1.yml"> <Link>MultiPageTable-template1.yml</Link> <CopyToOutputDirectory>Always</CopyToOutputDirectory> </None> <None Include="..\..\_Sample Templates\MultiPageTable-template2.yml"> <Link>MultiPageTable-template2.yml</Link> <CopyToOutputDirectory>Always</CopyToOutputDirectory> </None> </ItemGroup> <Import Project="$(MSBuildToolsPath)\Microsoft.CSharp.targets" /> </Project>

ON-PREMISE OFFLINE SDK

60 Day Free Trial or Visit ByteScout Document Parser SDK Home Page

Explore ByteScout Document Parser SDK Documentation

Explore Samples

Sign Up for ByteScout Document Parser SDK Online Training

ON-DEMAND REST WEB API

Get Your API Key

Explore Web API Docs

Explore Web API Samples

ExtractLineItemFromTableOnMultiplePages.sln
      
Microsoft Visual Studio Solution File, Format Version 12.00 # Visual Studio 15 VisualStudioVersion = 15.0.27703.2018 MinimumVisualStudioVersion = 10.0.40219.1 Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "ExtractLineItemFromTableOnMultiplePages", "ExtractLineItemFromTableOnMultiplePages.csproj", "{A73776C6-D2B2-4E37-B852-06C6454D1B5B}" EndProject Global GlobalSection(SolutionConfigurationPlatforms) = preSolution Debug|Any CPU = Debug|Any CPU Release|Any CPU = Release|Any CPU EndGlobalSection GlobalSection(ProjectConfigurationPlatforms) = postSolution {A73776C6-D2B2-4E37-B852-06C6454D1B5B}.Debug|Any CPU.ActiveCfg = Debug|Any CPU {A73776C6-D2B2-4E37-B852-06C6454D1B5B}.Debug|Any CPU.Build.0 = Debug|Any CPU {A73776C6-D2B2-4E37-B852-06C6454D1B5B}.Release|Any CPU.ActiveCfg = Release|Any CPU {A73776C6-D2B2-4E37-B852-06C6454D1B5B}.Release|Any CPU.Build.0 = Release|Any CPU EndGlobalSection GlobalSection(SolutionProperties) = preSolution HideSolutionNode = FALSE EndGlobalSection GlobalSection(ExtensibilityGlobals) = postSolution SolutionGuid = {7E6DAA79-020B-421A-844A-5FE05EFC9B15} EndGlobalSection EndGlobal

ON-PREMISE OFFLINE SDK

60 Day Free Trial or Visit ByteScout Document Parser SDK Home Page

Explore ByteScout Document Parser SDK Documentation

Explore Samples

Sign Up for ByteScout Document Parser SDK Online Training

ON-DEMAND REST WEB API

Get Your API Key

Explore Web API Docs

Explore Web API Samples

Program.cs
      
using System; using ByteScout.DocumentParser; // This example demonstrates extracting line items from tables on multiple pages with two different approaches. // See comments in the code of templates. namespace ExtractLineItemFromTableOnMultiplePages { class Program { static void Main(string[] args) { string inputDocument = @".\MultiPageTable.pdf"; string template1 = @".\MultiPageTable-template1.yml"; string template2 = @".\MultiPageTable-template2.yml"; // Process using template-1 using (DocumentParser documentParser = new DocumentParser("demo", "demo")) { Console.WriteLine({code}quot;Loading template 1..."); documentParser.AddTemplate(template1); Console.WriteLine({code}quot;Template 1 loaded."); Console.WriteLine(); Console.WriteLine({code}quot;Parsing \"{inputDocument}\"..."); Console.WriteLine(); // Parse document data in JSON format documentParser.ParseDocument(inputDocument, "result1.json", OutputFormat.JSON); Console.WriteLine("Parsing results saved to `result1.json`."); Console.WriteLine(); } // Process using template-2 using (DocumentParser documentParser = new DocumentParser("demo", "demo")) { Console.WriteLine({code}quot;Loading template 2..."); documentParser.AddTemplate(template2); Console.WriteLine({code}quot;Template 2 loaded."); Console.WriteLine(); Console.WriteLine({code}quot;Parsing \"{inputDocument}\"..."); Console.WriteLine(); // Parse document data in JSON format documentParser.ParseDocument(inputDocument, "result2.json", OutputFormat.JSON); Console.WriteLine("Parsing results saved to `result2.json`."); Console.WriteLine(); } Console.WriteLine(); Console.WriteLine("Press any key to continue..."); Console.ReadLine(); } } }

ON-PREMISE OFFLINE SDK

60 Day Free Trial or Visit ByteScout Document Parser SDK Home Page

Explore ByteScout Document Parser SDK Documentation

Explore Samples

Sign Up for ByteScout Document Parser SDK Online Training

ON-DEMAND REST WEB API

Get Your API Key

Explore Web API Docs

Explore Web API Samples

VIDEO

ON-PREMISE OFFLINE SDK

60 Day Free Trial or Visit ByteScout Document Parser SDK Home Page

Explore ByteScout Document Parser SDK Documentation

Explore Samples

Sign Up for ByteScout Document Parser SDK Online Training

ON-DEMAND REST WEB API

Get Your API Key

Explore Web API Docs

Explore Web API Samples

Tutorials:

prev
next