ByteScout PDF Extractor SDK – VB.NET – Find US Address (with Regex)


Program.vb

Imports Bytescout.PDFExtractor

Module Program

    Sub Main()

        Try
            ' Create Bytescout.PDFExtractor.TextExtractor instance
            Using extractor As TextExtractor = New TextExtractor()
                extractor.RegistrationName = "demo"
                extractor.RegistrationKey = "demo"

                ' Load sample PDF document
                extractor.LoadDocumentFromFile("samplePDF_Address.pdf")

                ' Enable regular expression search
                extractor.RegexSearch = True

                Dim pageCount As Integer = extractor.GetPageCount()

                ' Search through pages
                For i As Integer = 0 To pageCount - 1
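                    ' Address pattern: two runs of words (each word followed by a space or comma,
                    ' each run followed by a space), a two-letter state code, a comma or space,
                    ' one more space, then the digits of the ZIP code
                    Dim regexPattern As String = "((\w+[ ,])+ ){2}([a-zA-Z]){2}[ , ] (\d+)"
                    ' See the complete regular expressions reference at https://msdn.microsoft.com/en-us/library/az24scfc(v=vs.110).aspx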

                    ' Search the current page for the pattern (False = case-insensitive search)
                    If extractor.Find(i, regexPattern, False) Then

                        Do
                            ' Iterate through each element in the found text
                            For Each element As ISearchResultElement In extractor.FoundText.Elements
                                Console.WriteLine("Found Address: " & element.Text)
                            Next
                        Loop While extractor.FindNext()

                    End If
                Next
            End Using

        Catch ex As Exception
            Console.WriteLine("Error: " & ex.Message)
        End Try

        Console.WriteLine()
        Console.WriteLine("Press enter key to continue...")
        Console.ReadLine()

    End Sub

End Module
