ByteScout PDF Suite – VBScript – Find pdf table and extract as csv with pdf extractor sdk

Home
/
Articles
/
ByteScout PDF Suite – VBScript – Find pdf table and extract as csv with pdf extractor sdk

printable version:
ByteScout-PDF-Suite-VBScript-Find-pdf-table-and-extract-as-csv-with-pdf-extractor-sdk.pdf

How to find pdf table and extract as csv with pdf extractor sdk in VBScript and ByteScout PDF Suite

Learning is essential in computer world and the tutorial below will demonstrate how to find pdf table and extract as csv with pdf extractor sdk in VBScript

The sample source code below will teach you how to find pdf table and extract as csv with pdf extractor sdk in VBScript. What is ByteScout PDF Suite? It is the set that includes 6 SDK products to work with PDF from generating rich PDF reports to extracting data from PDF documents and converting them to HTML. This bundle includes PDF (Generator) SDK, PDF Renderer SDK, PDF Extractor SDK, PDF to HTML SDK, PDF Viewer SDK and PDF Generator SDK for Javascript. It can help you to find pdf table and extract as csv with pdf extractor sdk in your VBScript application.

Want to quickly learn? This fast application programming interfaces of ByteScout PDF Suite for VBScript plus the guidelines and the code below will help you quickly learn how to find pdf table and extract as csv with pdf extractor sdk. Follow the instructions from scratch to work and copy the VBScript code. Check VBScript sample code samples to see if they respond to your needs and requirements for the project.

ByteScout provides the free trial version of ByteScout PDF Suite along with the documentation and source code samples.

On-demand (REST Web API) version:
Web API (on-demand version)

On-premise offline SDK for Windows:
60 Day Free Trial (on-premise)

FindTableAndExtractAsCSV.vbs

      ' Create Bytescout.PDFExtractor.TextExtractor object
Set tableDetector= CreateObject("Bytescout.PDFExtractor.TableDetector")
tableDetector.RegistrationName = "demo"
tableDetector.RegistrationKey = "demo"

' Create Bytescout.PDFExtractor.CSVExtractor object
Set csvExtractor = CreateObject("Bytescout.PDFExtractor.CSVExtractor")
csvExtractor.RegistrationName = "demo"
csvExtractor.RegistrationKey = "demo"

' We should define what kind of tables we should detect.
' So we set min required number of columns to 3 ...
tableDetector.DetectionMinNumberOfColumns = 3
' ... and we set min required number of rows to 3
tableDetector.DetectionMinNumberOfRows = 3

' Set table detection mode to "bordered tables" - best for tables with closed solid borders.
tableDetector.ColumnDetectionMode = 3 ' 3 = ColumnDetectionMode.BorderedTables

' Load sample PDF document
tableDetector.LoadDocumentFromFile("..\..\sample3.pdf")
csvExtractor.LoadDocumentFromFile "..\..\sample3.pdf"

' Get page count
pageCount = tableDetector.GetPageCount()

' Iterate through pages
For i = 0 to pageCount - 1 
 
	t = 0
	' Find first table and continue if found
	If (tableDetector.FindTable(i)) Then

		Do
			' Set extraction area for CSV extractor to rectangle received from the table detector
			csvExtractor.SetExtractionArea _
				tableDetector.GetFoundTableRectangle_Left(), _
				tableDetector.GetFoundTableRectangle_Top(), _
				tableDetector.GetFoundTableRectangle_Width(), _
				tableDetector.GetFoundTableRectangle_Height()
			' Export the table to CSV file
			csvExtractor.SavePageCSVToFile i, "page-" & CStr(i) & "-table-" & CStr(t) & ".csv"
			t = t + 1
		Loop While tableDetector.FindNextTable()
		
	End If

Next

Set csvExtractor = Nothing
Set tableDetector = Nothing