ByteScout PDF Extractor SDK - VB.NET - Find US Address in PDF with Regex - ByteScout



How to find US address in PDF with regex in VB.NET with ByteScout PDF Extractor SDK


The sample source code below shows how to find a US address in a PDF with a regular expression in VB.NET. ByteScout PDF Extractor SDK helps developers extract data from unstructured documents, PDFs, images, and scanned and electronic forms. It includes AI functions such as automatic table detection, automatic table extraction and restructuring, text recognition, and text restoration from PDFs and scanned documents. It also includes PDF to CSV, PDF to XML, PDF to JSON, and PDF to searchable PDF functions, as well as methods for low-level data extraction.

This VB.NET sample for ByteScout PDF Extractor SDK shows the functions and options to call when searching a PDF for a US address with regex. Copy and paste the code, add references if needed, and you are all set. Because building a VB.NET application usually involves several stages of development, test the code with your own data and in your production environment even if it works as is.

A trial version of ByteScout PDF Extractor SDK can be downloaded for free from our website. It includes source code samples for VB.NET and other programming languages.

On-demand (REST Web API) version:
 Web API (on-demand version)

On-premise offline SDK for Windows:
 60 Day Free Trial (on-premise)

Program.vb
      
Imports Bytescout.PDFExtractor

Module Program

    Sub Main()

        Try
            ' Create Bytescout.PDFExtractor.TextExtractor instance
            Using extractor As TextExtractor = New TextExtractor()
                extractor.RegistrationName = "demo"
                extractor.RegistrationKey = "demo"

                ' Load sample PDF document
                extractor.LoadDocumentFromFile("samplePDF_Address.pdf")

                ' Enable the regular expression
                extractor.RegexSearch = True

                Dim pageCount As Integer = extractor.GetPageCount()

                ' Search through pages
                For i As Integer = 0 To pageCount - 1

                    ' Search Address
                    Dim regexPattern = "((\w+[ ,])+ ){2}([a-zA-Z]){2}[ , ] (\d+)"
                    ' See the complete regular expressions reference at https://msdn.microsoft.com/en-us/library/az24scfc(v=vs.110).aspx

                    ' Search each page for the pattern
                    If extractor.Find(i, regexPattern, False) Then
                        Do
                            ' Iterate through each element in the found text
                            For Each element As ISearchResultElement In extractor.FoundText.Elements
                                Console.WriteLine("Found Address: " & element.Text)
                            Next
                        Loop While extractor.FindNext()
                    End If
                Next
            End Using

        Catch ex As Exception
            Console.WriteLine("Error: " & ex.Message)
        End Try

        Console.WriteLine()
        Console.WriteLine("Press enter key to continue...")
        Console.ReadLine()

    End Sub

End Module
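
Before running the search against a PDF, it can help to see what the address pattern from Program.vb accepts on plain strings. The following is a minimal sketch that uses only System.Text.RegularExpressions and does not call the SDK. It assumes the SDK's regex search follows standard .NET regular expression semantics, which the sample's link to the .NET regex reference suggests but which is not confirmed here. The module name AddressPatternCheck, the function FindAddresses, and both sample strings are hypothetical and exist only for illustration.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

' Hypothetical helper module, not part of ByteScout PDF Extractor SDK.
Module AddressPatternCheck

    ' Same pattern string as in Program.vb.
    Const AddressPattern As String = "((\w+[ ,])+ ){2}([a-zA-Z]){2}[ , ] (\d+)"

    ' Returns every substring of the input that the pattern matches,
    ' using the standard .NET regex engine (an assumption about the SDK).
    Function FindAddresses(text As String) As List(Of String)
        Dim results As New List(Of String)()
        For Each m As Match In Regex.Matches(text, AddressPattern)
            results.Add(m.Value)
        Next
        Return results
    End Function

    Sub Main()
        ' Made-up sample lines that differ only in the state/ZIP separator.
        Dim samples As String() = {
            "123 Main Street, Springfield, IL, 62704",
            "123 Main Street, Springfield, IL 62704"
        }

        For Each sample As String In samples
            Dim found As List(Of String) = FindAddresses(sample)
            Console.WriteLine(sample & " -> " & found.Count.ToString() & " match(es)")
        Next
    End Sub

End Module
```

Under .NET regex semantics the first sample is matched in full and the second is not, because the pattern requires a space or comma followed by another space between the two-letter state code and the ZIP digits. Text extracted from a real PDF may space these tokens differently, so it is worth checking the pattern against your own documents and adjusting it if needed.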

ON-PREMISE OFFLINE SDK

60 Day Free Trial or Visit ByteScout PDF Extractor SDK Home Page

Explore ByteScout PDF Extractor SDK Documentation

Explore Samples

Sign Up for ByteScout PDF Extractor SDK Online Training

ON-DEMAND REST WEB API

Get Your API Key

Explore Web API Docs

Explore Web API Samples
