ByteScout PDF Extractor SDK - VB.NET - Find Table in PDF And Extract As XML - ByteScout



How to find a table in a PDF and extract it as XML in VB.NET using ByteScout PDF Extractor SDK

This tutorial shows how to find a table in a PDF and extract it as XML in VB.NET.

On this page you will find code samples for programming in VB.NET. Developers of any level can write the code to find a table in a PDF and extract it as XML in VB.NET using ByteScout PDF Extractor SDK.

ByteScout PDF Extractor SDK is a Software Development Kit (SDK) designed to help developers extract data from unstructured documents such as PDF and TIFF files, scans, images, and scanned and electronic forms. The library is powered by OCR, computer vision and AI, and provides table detection, automatic table structure extraction, data restoration, and data restructuring and reconstruction. It supports PDF, TIFF, PNG and JPG images as input and can output CSV, XML and JSON formatted data. It also includes a set of utilities such as a PDF splitter, a PDF merger and a searchable PDF maker.

The ByteScout PDF Extractor SDK API for VB.NET, together with the instructions and code below, will help you quickly learn how to find a table in a PDF and extract it as XML. To implement this functionality, copy the VB.NET code below into your application's project in your code editor, then compile and run the application. Detailed tutorials and documentation are installed along with ByteScout PDF Extractor SDK if you’d like to dive deeper into the topic and the details of the API.

A free trial version of ByteScout PDF Extractor SDK is available on our website. VB.NET and other programming languages are supported.

Try it today: Get 60 Day Free Trial or sign up for Web API

FindTableAndExtractAsXml.VS2005.vbproj
      
<Project DefaultTargets="Build" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <PropertyGroup>
    <Configuration Condition=" '$(Configuration)' == '' ">Debug</Configuration>
    <Platform Condition=" '$(Platform)' == '' ">AnyCPU</Platform>
    <ProductVersion>8.0.50727</ProductVersion>
    <SchemaVersion>2.0</SchemaVersion>
    <ProjectGuid>{34509168-5D95-4323-8808-2A10FDE4E9A9}</ProjectGuid>
    <OutputType>Exe</OutputType>
    <AppDesignerFolder>Properties</AppDesignerFolder>
    <RootNamespace>FindTableAndExtractAsXml</RootNamespace>
    <AssemblyName>FindTableAndExtractAsXml</AssemblyName>
  </PropertyGroup>
  <PropertyGroup Condition=" '$(Configuration)|$(Platform)' == 'Debug|AnyCPU' ">
    <DebugSymbols>true</DebugSymbols>
    <DebugType>full</DebugType>
    <Optimize>false</Optimize>
    <OutputPath>bin\Debug\</OutputPath>
    <DefineConstants>DEBUG,TRACE</DefineConstants>
    <ErrorReport>prompt</ErrorReport>
    <WarningLevel>4</WarningLevel>
  </PropertyGroup>
  <PropertyGroup Condition=" '$(Configuration)|$(Platform)' == 'Release|AnyCPU' ">
    <DebugType>pdbonly</DebugType>
    <Optimize>true</Optimize>
    <OutputPath>bin\Release\</OutputPath>
    <DefineConstants>TRACE</DefineConstants>
    <ErrorReport>prompt</ErrorReport>
    <WarningLevel>4</WarningLevel>
  </PropertyGroup>
  <Import Project="$(MSBuildBinPath)\Microsoft.VisualBasic.Targets" />
  <ItemGroup>
    <Import Include="Microsoft.VisualBasic" />
    <Import Include="System" />
    <Reference Include="Bytescout.PDFExtractor, Version=1.0.0.12, Culture=neutral, processorArchitecture=MSIL">
      <SpecificVersion>False</SpecificVersion>
    </Reference>
    <Reference Include="System" />
    <Reference Include="System.Data" />
    <Reference Include="System.Drawing" />
    <Reference Include="System.Xml" />
    <Compile Include="Program.vb" />
    <Compile Include="Properties\AssemblyInfo.vb" />
  </ItemGroup>
  <ItemGroup>
    <Content Include="..\..\sample3.pdf">
      <Link>sample3.pdf</Link>
      <CopyToOutputDirectory>Always</CopyToOutputDirectory>
    </Content>
  </ItemGroup>
</Project>


FindTableAndExtractAsXml.VS2008.vbproj
      
<Project DefaultTargets="Build" xmlns="http://schemas.microsoft.com/developer/msbuild/2003" ToolsVersion="3.5">
  <PropertyGroup>
    <Configuration Condition=" '$(Configuration)' == '' ">Debug</Configuration>
    <Platform Condition=" '$(Platform)' == '' ">AnyCPU</Platform>
    <ProductVersion>9.0.21022</ProductVersion>
    <SchemaVersion>2.0</SchemaVersion>
    <ProjectGuid>{34509168-5D95-4323-8808-2A10FDE4E9A9}</ProjectGuid>
    <OutputType>Exe</OutputType>
    <AppDesignerFolder>Properties</AppDesignerFolder>
    <RootNamespace>FindTableAndExtractAsXml</RootNamespace>
    <AssemblyName>FindTableAndExtractAsXml</AssemblyName>
    <OldToolsVersion>2.0</OldToolsVersion>
    <TargetFrameworkVersion>v3.5</TargetFrameworkVersion>
  </PropertyGroup>
  <PropertyGroup Condition=" '$(Configuration)|$(Platform)' == 'Debug|AnyCPU' ">
    <DebugSymbols>true</DebugSymbols>
    <DebugType>full</DebugType>
    <Optimize>false</Optimize>
    <OutputPath>bin\Debug\</OutputPath>
    <DefineConstants>DEBUG,TRACE</DefineConstants>
    <ErrorReport>prompt</ErrorReport>
    <WarningLevel>4</WarningLevel>
  </PropertyGroup>
  <PropertyGroup Condition=" '$(Configuration)|$(Platform)' == 'Release|AnyCPU' ">
    <DebugType>pdbonly</DebugType>
    <Optimize>true</Optimize>
    <OutputPath>bin\Release\</OutputPath>
    <DefineConstants>TRACE</DefineConstants>
    <ErrorReport>prompt</ErrorReport>
    <WarningLevel>4</WarningLevel>
  </PropertyGroup>
  <Import Project="$(MSBuildToolsPath)\Microsoft.VisualBasic.Targets" />
  <ItemGroup>
    <Import Include="Microsoft.VisualBasic" />
    <Import Include="System" />
    <Reference Include="Bytescout.PDFExtractor, Version=1.0.0.12, Culture=neutral, processorArchitecture=MSIL">
      <SpecificVersion>False</SpecificVersion>
    </Reference>
    <Reference Include="System" />
    <Reference Include="System.Data" />
    <Reference Include="System.Drawing" />
    <Reference Include="System.Xml" />
    <Compile Include="Program.vb" />
    <Compile Include="Properties\AssemblyInfo.vb" />
  </ItemGroup>
  <ItemGroup>
    <Content Include="..\..\sample3.pdf">
      <Link>sample3.pdf</Link>
      <CopyToOutputDirectory>Always</CopyToOutputDirectory>
    </Content>
  </ItemGroup>
</Project>


FindTableAndExtractAsXml.VS2010.vbproj
      
<?xml version="1.0" encoding="utf-8"?>
<Project DefaultTargets="Build" xmlns="http://schemas.microsoft.com/developer/msbuild/2003" ToolsVersion="4.0">
  <PropertyGroup>
    <Configuration Condition=" '$(Configuration)' == '' ">Debug</Configuration>
    <Platform Condition=" '$(Platform)' == '' ">AnyCPU</Platform>
    <SchemaVersion>2.0</SchemaVersion>
    <ProjectGuid>{34509168-5D95-4323-8808-2A10FDE4E9A9}</ProjectGuid>
    <OutputType>Exe</OutputType>
    <AppDesignerFolder>Properties</AppDesignerFolder>
    <RootNamespace>FindTableAndExtractAsXml</RootNamespace>
    <AssemblyName>FindTableAndExtractAsXml</AssemblyName>
    <OldToolsVersion>3.5</OldToolsVersion>
    <TargetFrameworkVersion>v4.0</TargetFrameworkVersion>
  </PropertyGroup>
  <PropertyGroup Condition=" '$(Configuration)|$(Platform)' == 'Debug|AnyCPU' ">
    <DebugSymbols>true</DebugSymbols>
    <DebugType>full</DebugType>
    <Optimize>false</Optimize>
    <OutputPath>bin\Debug\</OutputPath>
    <DefineConstants>DEBUG,TRACE</DefineConstants>
    <ErrorReport>prompt</ErrorReport>
    <WarningLevel>4</WarningLevel>
  </PropertyGroup>
  <PropertyGroup Condition=" '$(Configuration)|$(Platform)' == 'Release|AnyCPU' ">
    <DebugType>pdbonly</DebugType>
    <Optimize>true</Optimize>
    <OutputPath>bin\Release\</OutputPath>
    <DefineConstants>TRACE</DefineConstants>
    <ErrorReport>prompt</ErrorReport>
    <WarningLevel>4</WarningLevel>
  </PropertyGroup>
  <PropertyGroup>
    <StartupObject>Sub Main</StartupObject>
  </PropertyGroup>
  <Import Project="$(MSBuildToolsPath)\Microsoft.VisualBasic.Targets" />
  <ItemGroup>
    <Import Include="Microsoft.VisualBasic" />
    <Import Include="System" />
    <Reference Include="Bytescout.PDFExtractor, Version=1.0.0.12, Culture=neutral, processorArchitecture=MSIL">
      <SpecificVersion>False</SpecificVersion>
    </Reference>
    <Reference Include="System" />
    <Reference Include="System.Data" />
    <Reference Include="System.Drawing" />
    <Reference Include="System.Xml" />
    <Compile Include="Program.vb" />
    <Compile Include="Properties\AssemblyInfo.vb" />
  </ItemGroup>
  <ItemGroup>
    <Content Include="..\..\sample3.pdf">
      <Link>sample3.pdf</Link>
      <CopyToOutputDirectory>Always</CopyToOutputDirectory>
    </Content>
  </ItemGroup>
</Project>


Program.vb
      
Imports Bytescout.PDFExtractor

Class Program

    Friend Shared Sub Main(args As String())

        ' Create Bytescout.PDFExtractor.XMLExtractor instance
        Dim xmlExtractor As New XMLExtractor()
        xmlExtractor.RegistrationName = "demo"
        xmlExtractor.RegistrationKey = "demo"

        ' Create Bytescout.PDFExtractor.TableDetector instance
        Dim tableDetector As New TableDetector()
        tableDetector.RegistrationName = "demo"
        tableDetector.RegistrationKey = "demo"

        ' We should define what kind of tables we should detect.
        ' So we set min required number of columns to 3 ...
        tableDetector.DetectionMinNumberOfColumns = 3
        ' ... and we set min required number of rows to 3
        tableDetector.DetectionMinNumberOfRows = 3

        ' Load sample PDF document
        xmlExtractor.LoadDocumentFromFile(".\sample3.pdf")
        tableDetector.LoadDocumentFromFile(".\sample3.pdf")

        ' Get page count
        Dim pageCount As Integer = tableDetector.GetPageCount()

        For i As Integer = 0 To pageCount - 1

            Dim t As Integer = 1

            ' Find first table and continue if found
            If (tableDetector.FindTable(i)) Then
                Do
                    ' Set extraction area for XML extractor to rectangle received from the table detector
                    xmlExtractor.SetExtractionArea(tableDetector.FoundTableLocation)

                    ' Export the table to XML file
                    xmlExtractor.SavePageXMLToFile(i, "page-" + i.ToString() + "-table-" + t.ToString() + ".xml")

                    t = t + 1
                Loop While tableDetector.FindNextTable()
            End If

        Next

        ' Cleanup
        xmlExtractor.Dispose()
        tableDetector.Dispose()

        ' Open first output file in default associated application (for demo purposes)
        System.Diagnostics.Process.Start("page-0-table-1.xml")

    End Sub

End Class
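
After the sample runs, it can be useful to check which table files were actually written before opening one of them. The sketch below is not part of the ByteScout sample and does not call the SDK. It uses only standard .NET classes (System.IO and System.Xml), assumes the output files are in the current working directory and follow the page-N-table-M.xml naming pattern used in Program.vb, and makes no assumption about the XML schema the SDK produces. The module name InspectOutput is hypothetical.

```vb
Imports System
Imports System.IO
Imports System.Xml

' Hypothetical helper, separate from the ByteScout sample:
' lists the XML files written by Program.vb and summarizes each one.
Module InspectOutput

    Sub Main()
        ' Naming pattern taken from Program.vb: page-<pageIndex>-table-<tableNumber>.xml
        Dim files As String() = Directory.GetFiles(".", "page-*-table-*.xml")
        Array.Sort(files)

        If files.Length = 0 Then
            Console.WriteLine("No table files found in the current directory.")
            Return
        End If

        For Each fileName As String In files
            ' Load the file as generic XML; no particular schema is assumed
            Dim doc As New XmlDocument()
            doc.Load(fileName)

            ' Count every element in the document, including the root
            Dim elementCount As Integer = doc.SelectNodes("//*").Count

            Console.WriteLine("{0}: root <{1}>, {2} elements", _
                              fileName, doc.DocumentElement.Name, elementCount)
        Next
    End Sub

End Module
```

Run it from the directory where the sample wrote its files, which is typically the build output folder such as bin\Debug. Note that page indexes in the file names are zero-based, while table numbers start at 1, matching the loop in Program.vb.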


MORE INFORMATION

Get 60 Day Free Trial or Visit ByteScout PDF Extractor SDK page

Explore ByteScout PDF Extractor SDK documentation

WEB API VERSION

Sign Up for free Web API key

Explore Web API Documentation
