ByteScout PDF Extractor SDK – VB.NET – Find Table And Extract As CSV

ByteScout PDF Extractor SDK makes it easier to find tables in a PDF document and convert them to CSV in VB.NET. The SDK is designed to streamline common PDF data extraction tasks for developers, including extracting data from PDF to CSV, XLS, and XLSX.

PDF Extractor SDK is one of several Software Development Kits (SDKs) developed by ByteScout. ByteScout SDKs can also be used to generate barcodes and decode them from images, PDFs, and scanned documents. PDF Extractor SDK is built with OCR and image recognition technology.

We prepared sample code so you can test PDF Extractor SDK functionality, in particular converting tables in a PDF to CSV in VB.NET. Copy and paste this code into your application and run it; it saves the time and effort of writing and testing the code yourself. This VB.NET code snippet is also available from our GitHub.

A free ByteScout trial version is also available for download from our website. It comes with source code samples and helpful programming tutorials.

Program.vb

Imports Bytescout.PDFExtractor

Class Program
    Friend Shared Sub Main(args As String())

        ' Create Bytescout.PDFExtractor.CSVExtractor instance
        Dim csvExtractor As New CSVExtractor()
        csvExtractor.RegistrationName = "demo"
        csvExtractor.RegistrationKey = "demo"

        ' Create Bytescout.PDFExtractor.TableDetector instance
        Dim tableDetector As New TableDetector()
        tableDetector.RegistrationName = "demo"
        tableDetector.RegistrationKey = "demo"

        ' We should define what kind of tables we should detect.
        ' So we set min required number of columns to 3 ...
        tableDetector.DetectionMinNumberOfColumns = 3
        ' ... and we set min required number of rows to 3
        tableDetector.DetectionMinNumberOfRows = 3

        ' Set table detection mode to "bordered tables" - best for tables with closed solid borders.
        tableDetector.ColumnDetectionMode = ColumnDetectionMode.BorderedTables

        ' Load sample PDF document
        csvExtractor.LoadDocumentFromFile(".\sample3.pdf")
        tableDetector.LoadDocumentFromFile(".\sample3.pdf")

        ' Get page count
        Dim pageCount As Integer = tableDetector.GetPageCount()

        ' Iterate through pages (page indexes start at 0)
        For i As Integer = 0 To pageCount - 1
            ' Table counter for the current page, used in the output file name
            Dim t As Integer = 1
            ' Find first table and continue if found
            If (tableDetector.FindTable(i)) Then
                Do
                    ' Set extraction area for CSV extractor to rectangle received from the table detector
                    csvExtractor.SetExtractionArea(tableDetector.FoundTableLocation)
                    ' Export the table to CSV file
                    csvExtractor.SavePageCSVToFile(i, "page-" & i.ToString() & "-table-" & t.ToString() & ".csv")
                    t += 1
                Loop While tableDetector.FindNextTable()
            End If
        Next

        ' Cleanup
        csvExtractor.Dispose()
        tableDetector.Dispose()

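        ' Open first output file in default associated application (for demo purposes).
        ' The file exists only if a table was found on the first page, so check before opening it.
        Dim firstOutput As String = "page-0-table-1.csv"
        If System.IO.File.Exists(firstOutput) Then
            ' UseShellExecute = True lets the operating system pick the associated application
            Dim startInfo As New System.Diagnostics.ProcessStartInfo(firstOutput)
            startInfo.UseShellExecute = True
            System.Diagnostics.Process.Start(startInfo)
        End If

    End Sub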
End Class
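
To check the result without opening a spreadsheet application, the exported file can be read back with the standard .NET TextFieldParser class. This is only a minimal sketch and not part of the SDK sample: the module name CsvPreview is hypothetical, and it assumes the output file is comma-delimited and was written to the current directory under the name used in the sample above.

```vb
Imports System
Imports Microsoft.VisualBasic.FileIO

Module CsvPreview
    Sub Main()
        ' File name the sample above would produce for the first table on the first page
        Dim path As String = "page-0-table-1.csv"

        If Not System.IO.File.Exists(path) Then
            Console.WriteLine("Output file not found: " & path)
            Return
        End If

        Using parser As New TextFieldParser(path)
            parser.TextFieldType = FieldType.Delimited
            ' Assumption: fields are separated by commas and may be enclosed in quotes
            parser.SetDelimiters(",")
            parser.HasFieldsEnclosedInQuotes = True

            Dim rowNumber As Integer = 0
            While Not parser.EndOfData
                ' ReadFields returns the cells of the next row as a string array
                Dim fields As String() = parser.ReadFields()
                rowNumber += 1
                Console.WriteLine("Row " & rowNumber.ToString() & ": " & fields.Length.ToString() & " fields")
            End While
        End Using
    End Sub
End Module
```

If the reported field counts differ from row to row, the detected table area or the detection settings (minimum rows and columns, column detection mode) may need adjusting for your document.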
