ByteScout PDF Extractor SDK - VBScript - Find Text in PDF Using Regex - ByteScout

How to find text in PDF using regex in VBScript with ByteScout PDF Extractor SDK

Write code in VBScript to find text in PDF using regex with this step-by-step tutorial

The sample source code below shows how to find text in a PDF using a regular expression in VBScript. ByteScout PDF Extractor SDK helps developers extract data from unstructured documents, PDFs, images, and scanned and electronic forms. It includes AI functions such as automatic table detection, table extraction and restructuring, text recognition, and text restoration from PDFs and scanned documents. It also includes PDF to CSV, PDF to XML, PDF to JSON, and PDF to searchable PDF functions, as well as methods for low-level data extraction.

You can copy the VBScript code below into your own script or application and run it, which saves time on writing and testing. Detailed tutorials and documentation are installed along with ByteScout PDF Extractor SDK if you'd like to dive deeper into the topic and the details of the API.

A free trial version of the SDK is available for download from our website. It includes these programming tutorials along with source code samples.

On-demand (REST Web API) version:
 Web API (on-demand version)

On-premise offline SDK for Windows:
 60 Day Free Trial (on-premise)

FindTextUsingRegex.vbs
      
' Create Bytescout.PDFExtractor.TextExtractor object
Set extractor = CreateObject("Bytescout.PDFExtractor.TextExtractor")
extractor.RegistrationName = "demo"
extractor.RegistrationKey = "demo"

' Load sample PDF document
extractor.LoadDocumentFromFile("..\..\Invoice.pdf")

extractor.RegexSearch = True ' Turn on the regex search
pattern = "[0-9]{2}/[0-9]{2}/[0-9]{4}" ' Search dates in format 'mm/dd/yyyy'

' Get page count
pageCount = extractor.GetPageCount()

For i = 0 to pageCount - 1
    If extractor.Find(i, pattern, false) Then ' Parameters are: page index, string to find, case sensitivity
        Do
            extractedString = extractor.FoundText.Text
            MsgBox "Found match on page #" & CStr(i) & ": " & extractedString
            extractor.ResetExtractionArea()
        Loop While extractor.FindNext
    End If
Next

MsgBox "Done"

Set extractor = Nothing
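
Before running the pattern against a real PDF, it can help to check what it matches on plain text. The sketch below does that with VBScript's built-in RegExp object and does not use the SDK at all. The sampleText string is made up for illustration, and the SDK's own regex engine may differ in details from VBScript's RegExp, so treat this as a quick sanity check rather than a guarantee of identical behavior.

```vbscript
' Standalone check of the date pattern with VBScript's built-in RegExp object.
' No PDF or SDK is involved; sampleText is a made-up string for illustration.
Option Explicit

Dim re, sampleText, matches, m

sampleText = "Invoice date: 03/15/2021, due 04/14/2021, code 99/99/9999"

Set re = New RegExp
re.Pattern = "[0-9]{2}/[0-9]{2}/[0-9]{4}"  ' Same pattern as in FindTextUsingRegex.vbs
re.Global = True                            ' Return every match, not only the first

Set matches = re.Execute(sampleText)

WScript.Echo "Matches found: " & matches.Count
For Each m In matches
    ' FirstIndex is the zero-based position of the match in sampleText
    WScript.Echo m.Value & " at offset " & m.FirstIndex
Next
```

Save this as a .vbs file and run it with cscript to print the results to the console. Note that the pattern only checks the shape of the text (two digits, two digits, four digits), so a string such as 99/99/9999 also matches even though it is not a valid date.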

ON-PREMISE OFFLINE SDK

60 Day Free Trial or Visit ByteScout PDF Extractor SDK Home Page

Explore ByteScout PDF Extractor SDK Documentation

Explore Samples

Sign Up for ByteScout PDF Extractor SDK Online Training

ON-DEMAND REST WEB API

Get Your API Key

Explore Web API Docs

Explore Web API Samples
