Converting PDF to CSV in VB.NET after finding tables in the document is made easier with PDF Extractor SDK. The SDK is designed to streamline common development tasks, including extracting data from PDF to CSV, XLS, and XLSX.
PDF Extractor SDK is just one of several Software Development Kits (SDKs) developed by ByteScout. ByteScout SDKs can also be used to generate barcodes and decode them from images, PDFs, and scanned documents. PDF Extractor SDK itself is built with OCR and image recognition technology.
We prepared sample code for you to test PDF Extractor SDK functionality, in particular the conversion of PDF to CSV in VB.NET. Copy and paste this code into your application and run it. It saves you the time and effort of writing and testing the code yourself. This VB.NET code snippet is also available on our GitHub.
We also have a free ByteScout trial version that you can download from our website. It comes with source code samples and some helpful programming tutorials.
Imports System.IO
Imports Bytescout.PDFExtractor

Class Program

    Friend Shared Sub Main(args As String())

        ' Create Bytescout.PDFExtractor.CSVExtractor instance
        Dim csvExtractor As New CSVExtractor()
        csvExtractor.RegistrationName = "demo"
        csvExtractor.RegistrationKey = "demo"

        ' Create Bytescout.PDFExtractor.TableDetector instance
        Dim tableDetector As New TableDetector()
        tableDetector.RegistrationName = "demo"
        tableDetector.RegistrationKey = "demo"

        ' Define what kind of tables to detect:
        ' require at least 3 columns ...
        tableDetector.DetectionMinNumberOfColumns = 3
        ' ... and at least 3 rows
        tableDetector.DetectionMinNumberOfRows = 3
        ' Set table detection mode to "bordered tables" - best for tables with closed solid borders.
        tableDetector.ColumnDetectionMode = ColumnDetectionMode.BorderedTables

        ' Load the sample PDF document into both the extractor and the detector
        csvExtractor.LoadDocumentFromFile(".\sample3.pdf")
        tableDetector.LoadDocumentFromFile(".\sample3.pdf")

        ' Get page count
        Dim pageCount As Integer = tableDetector.GetPageCount()

        ' Iterate through pages (page indexes are zero-based)
        For i As Integer = 0 To pageCount - 1

            ' Table counter for the current page, used in the output file name
            Dim t As Integer = 1

            ' Find first table and continue if found
            If tableDetector.FindTable(i) Then
                Do
                    ' Set extraction area for CSV extractor to rectangle received from the table detector
                    csvExtractor.SetExtractionArea(tableDetector.FoundTableLocation)
                    ' Export the table to CSV file
                    csvExtractor.SavePageCSVToFile(i, "page-" & i.ToString() & "-table-" & t.ToString() & ".csv")
                    t += 1
                Loop While tableDetector.FindNextTable()
            End If

        Next

        ' Cleanup
        csvExtractor.Dispose()
        tableDetector.Dispose()

        ' Open first output file in default associated application (for demo purposes).
        ' The file exists only if a table was detected on the first page.
        Dim firstOutput As String = "page-0-table-1.csv"
        If File.Exists(firstOutput) Then
            System.Diagnostics.Process.Start(New System.Diagnostics.ProcessStartInfo(firstOutput) With {.UseShellExecute = True})
        End If

    End Sub

End Class