ByteScout Text Recognition SDK - VB.NET - Extract Text From Areas - ByteScout


How to extract text from areas in VB.NET with ByteScout Text Recognition SDK

This tutorial demonstrates how to extract text from selected areas of a page in VB.NET.

ByteScout tutorials explain the code for both VB.NET beginners and advanced programmers. ByteScout Text Recognition SDK uses OCR to extract text from scanned images and documents, and it can limit recognition to specific areas of a page. It can be used from VB.NET, supports English and non-Latin languages, and accepts PDF as input.

The ByteScout Text Recognition SDK API for VB.NET, together with the instructions and code below, will help you quickly learn how to extract text from areas. To try it, copy the VB.NET code below into your code editor, then compile and run your application. The sample performs the whole extraction: it loads a document, defines two areas of interest, and outputs the recognized text.

You can download a free trial version of ByteScout Text Recognition SDK from our website to try many other source code samples for VB.NET.

Try it today: Get 60 Day Free Trial or sign up for Web API

Module1.vb
      
Imports System.Drawing
Imports Bytescout.TextRecognition

Module Module1

    Sub Main()

        Dim inputDocument As String = ".\areas-sample.pdf"
        Dim pageIndex As Integer = 0
        Dim outputDocument As String = ".\result.txt"

        ' Create and activate TextRecognizer instance
        Using textRecognizer As TextRecognizer = New TextRecognizer("demo", "demo")

            Try
                ' Load document (image or PDF)
                textRecognizer.LoadDocument(inputDocument)

                ' Set the location of OCR language data files
                textRecognizer.OCRLanguageDataFolder = "c:\Program Files\ByteScout Text Recognition SDK\ocrdata_best\"

                ' Set OCR language.
                ' "eng" for English, "deu" for German, "fra" for French, "spa" for Spanish, etc. - according to files in "ocrdata" folder
                ' Find more language files at https://github.com/bytescout/ocrdata
                textRecognizer.OCRLanguage = "eng"

                ' Get page size (in pixels). Size of PDF document is computed from PDF Points
                ' and the rendering resolution specified by `textRecognizer.PDFRenderingResolution` (default 300 DPI)
                Dim pageSize As Size = textRecognizer.GetPageSize(pageIndex)

                ' Add area of interest as a rectangle at the top-right corner of the page
                textRecognizer.RecognitionAreas.Add(pageSize.Width / 2, 0, pageSize.Width / 2, 300)

                ' Add area of interest as a rectangle at the bottom-left corner of the page,
                ' and indicate it should be rotated at 90 deg
                textRecognizer.RecognitionAreas.Add(0, pageSize.Height / 2, 300, pageSize.Height / 2, AreaRotation.Rotate90FlipNone)

                ' Now you can get recognized text for further analysis as a list of objects
                ' containing coordinates, object kind, confidence.
                Dim ocrObjectList As OCRObjectList = textRecognizer.GetOCRObjects(pageIndex)

                For Each ocrObject As OCRObject In ocrObjectList
                    Console.WriteLine(ocrObject.ToString())
                Next

                ' ... or you can save recognized text pieces to file
                textRecognizer.KeepTextFormatting = False ' save without formatting
                textRecognizer.SaveText(outputDocument, pageIndex, pageIndex)

                ' Open the result file in default associated application (for demo purposes)
                Process.Start(outputDocument)

            Catch exception As Exception
                Console.WriteLine(exception)
            End Try

        End Using

    End Sub

End Module
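The comment in the sample says that the pixel size of a PDF page is computed from PDF points and the rendering resolution. The sketch below works through that arithmetic for the two areas used above. It does not call the SDK: the module `AreaMath`, the helper `PointsToPixels`, and the US Letter page size are assumptions made for illustration, and the SDK's own rounding may differ slightly.

```vbnet
Imports System

Module AreaMath

    ' Hypothetical helper (not part of the SDK): converts PDF points to pixels.
    ' 1 PDF point = 1/72 inch, so pixels = points * dpi / 72.
    Function PointsToPixels(ByVal points As Double, ByVal dpi As Double) As Integer
        Return CInt(Math.Round(points * dpi / 72.0))
    End Function

    Sub Main()
        ' 300 DPI is the default rendering resolution named in the sample's comment.
        Const dpi As Double = 300.0

        ' Assumed page for illustration: US Letter, 612 x 792 points.
        Dim pageWidth As Integer = PointsToPixels(612.0, dpi)
        Dim pageHeight As Integer = PointsToPixels(792.0, dpi)
        Console.WriteLine("Page: {0} x {1} px", pageWidth, pageHeight)

        ' Top-right area: starts at the horizontal middle, spans the right half, 300 px tall.
        ' The \ operator is integer division; / would produce a Double.
        Console.WriteLine("Top-right: x={0}, y={1}, w={2}, h={3}", pageWidth \ 2, 0, pageWidth \ 2, 300)

        ' Bottom-left area: 300 px wide, starts at the vertical middle, spans the lower half.
        Console.WriteLine("Bottom-left: x={0}, y={1}, w={2}, h={3}", 0, pageHeight \ 2, 300, pageHeight \ 2)
    End Sub

End Module
```

For the assumed page this prints a page size of 2550 x 3300 px, a top-right area of x=1275, y=0, w=1275, h=300, and a bottom-left area of x=0, y=1650, w=300, h=1650. Because the 300-pixel dimension is fixed, the physical size of each area changes if the rendering resolution is changed.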


TextRecognitionExample.sln
      
Microsoft Visual Studio Solution File, Format Version 11.00
# Visual Studio 2010
Project("{F184B08F-C81C-45F6-A57F-5ABD9991F28F}") = "TextRecognitionExample", "TextRecognitionExample.vbproj", "{E5339352-1EBE-4547-B281-88D9FEEF92D7}"
EndProject
Global
	GlobalSection(SolutionConfigurationPlatforms) = preSolution
		Debug|Any CPU = Debug|Any CPU
		Release|Any CPU = Release|Any CPU
	EndGlobalSection
	GlobalSection(ProjectConfigurationPlatforms) = postSolution
		{E5339352-1EBE-4547-B281-88D9FEEF92D7}.Debug|Any CPU.ActiveCfg = Debug|Any CPU
		{E5339352-1EBE-4547-B281-88D9FEEF92D7}.Debug|Any CPU.Build.0 = Debug|Any CPU
		{E5339352-1EBE-4547-B281-88D9FEEF92D7}.Release|Any CPU.ActiveCfg = Release|Any CPU
		{E5339352-1EBE-4547-B281-88D9FEEF92D7}.Release|Any CPU.Build.0 = Release|Any CPU
	EndGlobalSection
	GlobalSection(SolutionProperties) = preSolution
		HideSolutionNode = FALSE
	EndGlobalSection
EndGlobal


TextRecognitionExample.vbproj
      
<?xml version="1.0" encoding="utf-8"?>
<Project ToolsVersion="4.0" DefaultTargets="Build" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <PropertyGroup>
    <Configuration Condition=" '$(Configuration)' == '' ">Debug</Configuration>
    <Platform Condition=" '$(Platform)' == '' ">AnyCPU</Platform>
    <ProductVersion> </ProductVersion>
    <SchemaVersion> </SchemaVersion>
    <ProjectGuid>{E5339352-1EBE-4547-B281-88D9FEEF92D7}</ProjectGuid>
    <OutputType>Exe</OutputType>
    <StartupObject>TextRecognitionExample.Module1</StartupObject>
    <RootNamespace>TextRecognitionExample</RootNamespace>
    <AssemblyName>TextRecognitionExample</AssemblyName>
    <FileAlignment>512</FileAlignment>
    <MyType>Console</MyType>
    <TargetFrameworkVersion>v4.0</TargetFrameworkVersion>
    <TargetFrameworkProfile>Client</TargetFrameworkProfile>
  </PropertyGroup>
  <PropertyGroup>
    <OptionExplicit>On</OptionExplicit>
  </PropertyGroup>
  <PropertyGroup>
    <OptionCompare>Binary</OptionCompare>
  </PropertyGroup>
  <PropertyGroup>
    <OptionStrict>Off</OptionStrict>
  </PropertyGroup>
  <PropertyGroup>
    <OptionInfer>On</OptionInfer>
  </PropertyGroup>
  <PropertyGroup Condition="'$(Configuration)|$(Platform)' == 'Debug|AnyCPU'">
    <DebugSymbols>true</DebugSymbols>
    <DefineDebug>true</DefineDebug>
    <DefineTrace>true</DefineTrace>
    <OutputPath>bin\Debug\</OutputPath>
    <DocumentationFile>TextRecognitionExample.xml</DocumentationFile>
    <NoWarn>42016,41999,42017,42018,42019,42032,42036,42020,42021,42022</NoWarn>
    <DebugType>full</DebugType>
    <PlatformTarget>AnyCPU</PlatformTarget>
    <CodeAnalysisIgnoreBuiltInRuleSets>true</CodeAnalysisIgnoreBuiltInRuleSets>
    <CodeAnalysisIgnoreBuiltInRules>true</CodeAnalysisIgnoreBuiltInRules>
  </PropertyGroup>
  <PropertyGroup Condition="'$(Configuration)|$(Platform)' == 'Release|AnyCPU'">
    <DefineTrace>true</DefineTrace>
    <OutputPath>bin\Release\</OutputPath>
    <DocumentationFile>TextRecognitionExample.xml</DocumentationFile>
    <Optimize>true</Optimize>
    <NoWarn>42016,41999,42017,42018,42019,42032,42036,42020,42021,42022</NoWarn>
    <DebugType>pdbonly</DebugType>
    <PlatformTarget>AnyCPU</PlatformTarget>
    <CodeAnalysisIgnoreBuiltInRuleSets>true</CodeAnalysisIgnoreBuiltInRuleSets>
    <CodeAnalysisIgnoreBuiltInRules>true</CodeAnalysisIgnoreBuiltInRules>
    <CodeAnalysisFailOnMissingRules>true</CodeAnalysisFailOnMissingRules>
  </PropertyGroup>
  <ItemGroup>
    <Reference Include="ByteScout.TextRecognition">
      <HintPath>C:\Program Files\ByteScout Text Recognition SDK\net4.00\ByteScout.TextRecognition.dll</HintPath>
    </Reference>
    <Reference Include="System" />
    <Reference Include="System.Data" />
    <Reference Include="System.Deployment" />
    <Reference Include="System.Drawing" />
    <Reference Include="System.Xml" />
    <Reference Include="System.Core" />
    <Reference Include="System.Xml.Linq" />
    <Reference Include="System.Data.DataSetExtensions" />
  </ItemGroup>
  <ItemGroup>
    <Import Include="Microsoft.VisualBasic" />
    <Import Include="System" />
    <Import Include="System.Collections" />
    <Import Include="System.Collections.Generic" />
    <Import Include="System.Data" />
    <Import Include="System.Diagnostics" />
    <Import Include="System.Linq" />
    <Import Include="System.Xml.Linq" />
  </ItemGroup>
  <ItemGroup>
    <Compile Include="Module1.vb" />
    <Compile Include="My Project\AssemblyInfo.vb" />
    <Compile Include="My Project\Application.Designer.vb">
      <AutoGen>True</AutoGen>
      <DependentUpon>Application.myapp</DependentUpon>
    </Compile>
    <Compile Include="My Project\Resources.Designer.vb">
      <AutoGen>True</AutoGen>
      <DesignTime>True</DesignTime>
      <DependentUpon>Resources.resx</DependentUpon>
    </Compile>
    <Compile Include="My Project\Settings.Designer.vb">
      <AutoGen>True</AutoGen>
      <DependentUpon>Settings.settings</DependentUpon>
      <DesignTimeSharedInput>True</DesignTimeSharedInput>
    </Compile>
  </ItemGroup>
  <ItemGroup>
    <EmbeddedResource Include="My Project\Resources.resx">
      <Generator>VbMyResourcesResXFileCodeGenerator</Generator>
      <LastGenOutput>Resources.Designer.vb</LastGenOutput>
      <CustomToolNamespace>My.Resources</CustomToolNamespace>
      <SubType>Designer</SubType>
    </EmbeddedResource>
  </ItemGroup>
  <ItemGroup>
    <Content Include="areas-sample.pdf">
      <CopyToOutputDirectory>Always</CopyToOutputDirectory>
    </Content>
    <None Include="My Project\Application.myapp">
      <Generator>MyApplicationCodeGenerator</Generator>
      <LastGenOutput>Application.Designer.vb</LastGenOutput>
    </None>
    <None Include="My Project\Settings.settings">
      <Generator>SettingsSingleFileGenerator</Generator>
      <CustomToolNamespace>My</CustomToolNamespace>
      <LastGenOutput>Settings.Designer.vb</LastGenOutput>
    </None>
  </ItemGroup>
  <Import Project="$(MSBuildToolsPath)\Microsoft.VisualBasic.targets" />
  <!-- To modify your build process, add your task inside one of the targets below and uncomment it.
       Other similar extension points exist, see Microsoft.Common.targets.
  <Target Name="BeforeBuild">
  </Target>
  <Target Name="AfterBuild">
  </Target>
  -->
</Project>


MORE INFORMATION

Get 60 Day Free Trial or Visit ByteScout Text Recognition SDK page

Explore ByteScout Text Recognition SDK documentation

WEB API VERSION

Sign Up for free Web API key

Explore Web API Documentation
