ByteScout PDF Extractor SDK - VB.NET - Extract Text By Pages from PDF - ByteScout

How to extract text by pages from PDF in VB.NET using ByteScout PDF Extractor SDK

Write code in VB.NET to extract text by pages from PDF with this step-by-step tutorial

These coding tutorials are designed to help you test the SDK's features without having to write your own code. ByteScout PDF Extractor SDK can extract text by pages from PDF, and it can be used from VB.NET. ByteScout PDF Extractor SDK is a Software Development Kit (SDK) designed to help developers extract data from unstructured documents such as PDF files, TIFF files, scans, images, and scanned and electronic forms. The library is powered by OCR, computer vision and AI to provide functionality such as table detection, automatic table structure extraction, data restoration, data restructuring and reconstruction. It supports PDF, TIFF, PNG and JPG images as input and can output CSV, XML and JSON formatted data. It also includes a full set of utilities such as a PDF splitter, a PDF merger and a searchable PDF maker.

The code snippet below for ByteScout PDF Extractor SDK works best when you need to quickly extract text by pages from PDF in your VB.NET application. To implement the functionality, copy and paste the VB.NET code below into your code editor, then compile and run your application. Detailed tutorials and documentation are installed along with ByteScout PDF Extractor SDK if you'd like to dive deeper into the topic and the details of the API.

Our website provides a free trial version of ByteScout PDF Extractor SDK. It also includes documentation and source code samples.

Try it today: Get 60 Day Free Trial or sign up for Web API

ExtractTextByPages.VS2005.vbproj
      
<Project DefaultTargets="Build" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <PropertyGroup>
    <Configuration Condition=" '$(Configuration)' == '' ">Debug</Configuration>
    <Platform Condition=" '$(Platform)' == '' ">AnyCPU</Platform>
    <ProductVersion>8.0.50727</ProductVersion>
    <SchemaVersion>2.0</SchemaVersion>
    <ProjectGuid>{34509168-5D95-4323-8808-2A10FDE4E9A9}</ProjectGuid>
    <OutputType>Exe</OutputType>
    <AppDesignerFolder>Properties</AppDesignerFolder>
    <RootNamespace>ExtractTextByPages</RootNamespace>
    <AssemblyName>ExtractTextByPages</AssemblyName>
  </PropertyGroup>
  <PropertyGroup Condition=" '$(Configuration)|$(Platform)' == 'Debug|AnyCPU' ">
    <DebugSymbols>true</DebugSymbols>
    <DebugType>full</DebugType>
    <Optimize>false</Optimize>
    <OutputPath>bin\Debug\</OutputPath>
    <DefineConstants>DEBUG,TRACE</DefineConstants>
    <ErrorReport>prompt</ErrorReport>
    <WarningLevel>4</WarningLevel>
  </PropertyGroup>
  <PropertyGroup Condition=" '$(Configuration)|$(Platform)' == 'Release|AnyCPU' ">
    <DebugType>pdbonly</DebugType>
    <Optimize>true</Optimize>
    <OutputPath>bin\Release\</OutputPath>
    <DefineConstants>TRACE</DefineConstants>
    <ErrorReport>prompt</ErrorReport>
    <WarningLevel>4</WarningLevel>
  </PropertyGroup>
  <Import Project="$(MSBuildBinPath)\Microsoft.VisualBasic.Targets" />
  <ItemGroup>
    <Import Include="Microsoft.VisualBasic" />
    <Import Include="System" />
    <Reference Include="Bytescout.PDFExtractor, Version=1.0.0.12, Culture=neutral, processorArchitecture=MSIL">
      <SpecificVersion>False</SpecificVersion>
    </Reference>
    <Reference Include="System" />
    <Reference Include="System.Data" />
    <Reference Include="System.Xml" />
    <Compile Include="Program.vb" />
    <Compile Include="Properties\AssemblyInfo.vb" />
    <Content Include="..\..\sample2.pdf">
      <CopyToOutputDirectory>Always</CopyToOutputDirectory>
      <Link>sample2.pdf</Link>
    </Content>
  </ItemGroup>
  <ItemGroup>
  </ItemGroup>
</Project>

ExtractTextByPages.VS2008.vbproj
      
<Project DefaultTargets="Build" xmlns="http://schemas.microsoft.com/developer/msbuild/2003" ToolsVersion="3.5">
  <PropertyGroup>
    <Configuration Condition=" '$(Configuration)' == '' ">Debug</Configuration>
    <Platform Condition=" '$(Platform)' == '' ">AnyCPU</Platform>
    <ProductVersion>9.0.21022</ProductVersion>
    <SchemaVersion>2.0</SchemaVersion>
    <ProjectGuid>{34509168-5D95-4323-8808-2A10FDE4E9A9}</ProjectGuid>
    <OutputType>Exe</OutputType>
    <AppDesignerFolder>Properties</AppDesignerFolder>
    <RootNamespace>ExtractTextByPages</RootNamespace>
    <AssemblyName>ExtractTextByPages</AssemblyName>
    <OldToolsVersion>2.0</OldToolsVersion>
    <TargetFrameworkVersion>v3.5</TargetFrameworkVersion>
  </PropertyGroup>
  <PropertyGroup Condition=" '$(Configuration)|$(Platform)' == 'Debug|AnyCPU' ">
    <DebugSymbols>true</DebugSymbols>
    <DebugType>full</DebugType>
    <Optimize>false</Optimize>
    <OutputPath>bin\Debug\</OutputPath>
    <DefineConstants>DEBUG,TRACE</DefineConstants>
    <ErrorReport>prompt</ErrorReport>
    <WarningLevel>4</WarningLevel>
  </PropertyGroup>
  <PropertyGroup Condition=" '$(Configuration)|$(Platform)' == 'Release|AnyCPU' ">
    <DebugType>pdbonly</DebugType>
    <Optimize>true</Optimize>
    <OutputPath>bin\Release\</OutputPath>
    <DefineConstants>TRACE</DefineConstants>
    <ErrorReport>prompt</ErrorReport>
    <WarningLevel>4</WarningLevel>
  </PropertyGroup>
  <Import Project="$(MSBuildToolsPath)\Microsoft.VisualBasic.Targets" />
  <ItemGroup>
    <Import Include="Microsoft.VisualBasic" />
    <Import Include="System" />
    <Reference Include="Bytescout.PDFExtractor, Version=1.0.0.12, Culture=neutral, processorArchitecture=MSIL">
      <SpecificVersion>False</SpecificVersion>
    </Reference>
    <Reference Include="System" />
    <Reference Include="System.Data" />
    <Reference Include="System.Xml" />
    <Compile Include="Program.vb" />
    <Compile Include="Properties\AssemblyInfo.vb" />
    <Content Include="..\..\sample2.pdf">
      <CopyToOutputDirectory>Always</CopyToOutputDirectory>
      <Link>sample2.pdf</Link>
    </Content>
  </ItemGroup>
  <ItemGroup>
  </ItemGroup>
</Project>

ExtractTextByPages.VS2010.vbproj
      
<Project DefaultTargets="Build" xmlns="http://schemas.microsoft.com/developer/msbuild/2003" ToolsVersion="4.0">
  <PropertyGroup>
    <Configuration Condition=" '$(Configuration)' == '' ">Debug</Configuration>
    <Platform Condition=" '$(Platform)' == '' ">AnyCPU</Platform>
    <ProductVersion> </ProductVersion>
    <SchemaVersion>2.0</SchemaVersion>
    <ProjectGuid>{34509168-5D95-4323-8808-2A10FDE4E9A9}</ProjectGuid>
    <OutputType>Exe</OutputType>
    <AppDesignerFolder>Properties</AppDesignerFolder>
    <RootNamespace>ExtractTextByPages</RootNamespace>
    <AssemblyName>ExtractTextByPages</AssemblyName>
    <OldToolsVersion>3.5</OldToolsVersion>
    <TargetFrameworkVersion>v4.0</TargetFrameworkVersion>
  </PropertyGroup>
  <PropertyGroup Condition=" '$(Configuration)|$(Platform)' == 'Debug|AnyCPU' ">
    <DebugSymbols>true</DebugSymbols>
    <DebugType>full</DebugType>
    <Optimize>false</Optimize>
    <OutputPath>bin\Debug\</OutputPath>
    <DefineConstants>DEBUG,TRACE</DefineConstants>
    <ErrorReport>prompt</ErrorReport>
    <WarningLevel>4</WarningLevel>
  </PropertyGroup>
  <PropertyGroup Condition=" '$(Configuration)|$(Platform)' == 'Release|AnyCPU' ">
    <DebugType>pdbonly</DebugType>
    <Optimize>true</Optimize>
    <OutputPath>bin\Release\</OutputPath>
    <DefineConstants>TRACE</DefineConstants>
    <ErrorReport>prompt</ErrorReport>
    <WarningLevel>4</WarningLevel>
  </PropertyGroup>
  <Import Project="$(MSBuildToolsPath)\Microsoft.VisualBasic.Targets" />
  <ItemGroup>
    <Import Include="Microsoft.VisualBasic" />
    <Import Include="System" />
    <Reference Include="Bytescout.PDFExtractor, Version=1.0.0.12, Culture=neutral, processorArchitecture=MSIL">
      <SpecificVersion>False</SpecificVersion>
    </Reference>
    <Reference Include="System" />
    <Reference Include="System.Data" />
    <Reference Include="System.Xml" />
    <Compile Include="Program.vb" />
    <Compile Include="Properties\AssemblyInfo.vb" />
    <Content Include="..\..\sample2.pdf">
      <CopyToOutputDirectory>Always</CopyToOutputDirectory>
      <Link>sample2.pdf</Link>
    </Content>
  </ItemGroup>
  <ItemGroup>
  </ItemGroup>
</Project>

Program.vb
      
Imports Bytescout.PDFExtractor

Class Program

    Friend Shared Sub Main(args As String())

        ' Create Bytescout.PDFExtractor.TextExtractor instance
        Dim extractor As New TextExtractor()
        extractor.RegistrationName = "demo"
        extractor.RegistrationKey = "demo"

        ' Load sample PDF document
        extractor.LoadDocumentFromFile(".\sample2.pdf")

        ' Get page count
        Dim pageCount As Integer = extractor.GetPageCount()

        For i As Integer = 0 To pageCount - 1
            Dim fileName As String = "page" & i & ".txt"

            ' Save extracted page text to file
            extractor.SavePageTextToFile(i, fileName)
        Next

        ' Cleanup
        extractor.Dispose()

        ' Open result file in default associated application (for demo purposes).
        ' Page indexes are zero-based, so page1.txt holds the text of the second page.
        System.Diagnostics.Process.Start(".\page1.txt")

    End Sub

End Class
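
The sample above disposes the extractor only if every call succeeds, and it opens page1.txt without checking that the file was written. The following is a minimal sketch of a more defensive variant, not the vendor's recommended code. It uses only the SDK members that already appear in Program.vb; the module name SafePageExport and the helper PageFileName are hypothetical names introduced for illustration, and the one-based, zero-padded file names are a design choice of this sketch. It is meant to replace Program.vb in the project rather than be added alongside it.

```vbnet
Imports System.IO
Imports Bytescout.PDFExtractor

Module SafePageExport

    ' Hypothetical helper (not part of the SDK): builds a one-based,
    ' zero-padded file name, e.g. "page001.txt" for page index 0.
    Function PageFileName(pageIndex As Integer) As String
        Return String.Format("page{0:D3}.txt", pageIndex + 1)
    End Function

    Sub Main()
        Dim extractor As New TextExtractor()
        Try
            extractor.RegistrationName = "demo"
            extractor.RegistrationKey = "demo"

            ' Load the sample PDF document
            extractor.LoadDocumentFromFile(".\sample2.pdf")

            Dim pageCount As Integer = extractor.GetPageCount()
            For i As Integer = 0 To pageCount - 1
                ' The SDK call still receives the zero-based page index;
                ' only the output file name is shifted to start at 1.
                extractor.SavePageTextToFile(i, PageFileName(i))
            Next

            Console.WriteLine("Saved {0} page file(s).", pageCount)
        Finally
            ' Release the extractor even if loading or saving throws.
            extractor.Dispose()
        End Try

        ' Open the first page's text only if it was actually written.
        Dim firstPage As String = PageFileName(0)
        If File.Exists(firstPage) Then
            System.Diagnostics.Process.Start(firstPage)
        End If
    End Sub

End Module
```

Zero-padded names keep the output files in page order when a folder listing is sorted alphabetically, and the File.Exists check avoids an exception when the document has no pages or saving failed.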

MORE INFORMATION

Get 60 Day Free Trial or Visit ByteScout PDF Extractor SDK page

Explore ByteScout PDF Extractor SDK documentation

WEB API VERSION

Sign Up for free Web API key

Explore Web API Documentation
