ByteScout PDF Suite - VB.NET - Convert Scanned PDF to Excel - ByteScout

How to convert a scanned PDF to Excel in VB.NET with ByteScout PDF Suite

This tutorial shows VB.NET code that converts a scanned PDF to Excel

Converting a scanned PDF to Excel is straightforward in VB.NET with the source code below. ByteScout PDF Suite is a bundle of six SDK libraries for working with PDF, from generating rich PDF reports to extracting data from PDF documents and converting them to HTML. The bundle includes PDF (Generator) SDK, PDF Renderer SDK, PDF Extractor SDK, PDF to HTML SDK, PDF Viewer SDK and PDF Generator SDK for JavaScript. The sample on this page requires PDF Viewer SDK and PDF Extractor SDK: the viewer displays the scanned document and lets you select an area, and the extractor runs OCR and writes the result to CSV or XLSX.

Want to save time? Instead of writing and testing the conversion code yourself, take the VB.NET code below, paste it into your application, and follow the instructions. To see how it works with your own data, open one of your scanned PDF files in the sample and export it.

You can download a free trial version of ByteScout PDF Suite from our website to try many other source code samples for VB.NET.

On-demand (REST Web API) version:
 Web API (on-demand version)

On-premise offline SDK for Windows:
 60 Day Free Trial (on-premise)

Form1.Designer.vb
      
Partial Class Form1
    ''' <summary>
    ''' Required designer variable.
    ''' </summary>
    Private components As System.ComponentModel.IContainer = Nothing

    ''' <summary>
    ''' Clean up any resources being used.
    ''' </summary>
    ''' <param name="disposing">true if managed resources should be disposed; otherwise, false.</param>
    Protected Overrides Sub Dispose(disposing As Boolean)
        If disposing AndAlso (components IsNot Nothing) Then
            components.Dispose()
        End If
        MyBase.Dispose(disposing)
    End Sub

#Region "Windows Form Designer generated code"

    ''' <summary>
    ''' Required method for Designer support - do not modify
    ''' the contents of this method with the code editor.
    ''' </summary>
    Private Sub InitializeComponent()
        Dim resources As System.ComponentModel.ComponentResourceManager = New System.ComponentModel.ComponentResourceManager(GetType(Form1))
        Me.pdfViewerControl1 = New Bytescout.PDFViewer.PDFViewerControl()
        Me.toolStrip1 = New System.Windows.Forms.ToolStrip()
        Me.tsbOpen = New System.Windows.Forms.ToolStripButton()
        Me.ToolStripSeparator1 = New System.Windows.Forms.ToolStripSeparator()
        Me.tsbExportToCSV = New System.Windows.Forms.ToolStripButton()
        Me.tsbExportToXLSX = New System.Windows.Forms.ToolStripButton()
        Me.toolStrip1.SuspendLayout
        Me.SuspendLayout
        '
        'pdfViewerControl1
        '
        Me.pdfViewerControl1.BackColor = System.Drawing.SystemColors.ButtonShadow
        Me.pdfViewerControl1.Dock = System.Windows.Forms.DockStyle.Fill
        Me.pdfViewerControl1.Location = New System.Drawing.Point(0, 25)
        Me.pdfViewerControl1.MouseMode = Bytescout.PDFViewer.MouseMode.Selection
        Me.pdfViewerControl1.Name = "pdfViewerControl1"
        Me.pdfViewerControl1.RegistrationKey = Nothing
        Me.pdfViewerControl1.RegistrationName = Nothing
        Me.pdfViewerControl1.ResetRotationOnPageChange = False
        Me.pdfViewerControl1.Scale = 100
        Me.pdfViewerControl1.SelectionColor = System.Drawing.Color.Red
        Me.pdfViewerControl1.ShowImageObjects = True
        Me.pdfViewerControl1.ShowTextObjects = True
        Me.pdfViewerControl1.ShowVectorObjects = True
        Me.pdfViewerControl1.Size = New System.Drawing.Size(842, 514)
        Me.pdfViewerControl1.TabIndex = 0
        '
        'toolStrip1
        '
        Me.toolStrip1.Items.AddRange(New System.Windows.Forms.ToolStripItem() {Me.tsbOpen, Me.ToolStripSeparator1, Me.tsbExportToCSV, Me.tsbExportToXLSX})
        Me.toolStrip1.Location = New System.Drawing.Point(0, 0)
        Me.toolStrip1.Name = "toolStrip1"
        Me.toolStrip1.Size = New System.Drawing.Size(842, 25)
        Me.toolStrip1.TabIndex = 1
        Me.toolStrip1.Text = "toolStrip1"
        '
        'tsbOpen
        '
        Me.tsbOpen.Image = Global.Sample_UI_Application.My.Resources.Resources.folder_page
        Me.tsbOpen.ImageTransparentColor = System.Drawing.Color.Magenta
        Me.tsbOpen.Name = "tsbOpen"
        Me.tsbOpen.Size = New System.Drawing.Size(80, 22)
        Me.tsbOpen.Text = "&Open PDF"
        '
        'ToolStripSeparator1
        '
        Me.ToolStripSeparator1.Name = "ToolStripSeparator1"
        Me.ToolStripSeparator1.Size = New System.Drawing.Size(6, 25)
        '
        'tsbExportToCSV
        '
        Me.tsbExportToCSV.Image = CType(resources.GetObject("tsbExportToCSV.Image"), System.Drawing.Image)
        Me.tsbExportToCSV.ImageTransparentColor = System.Drawing.Color.Magenta
        Me.tsbExportToCSV.Name = "tsbExportToCSV"
        Me.tsbExportToCSV.Size = New System.Drawing.Size(100, 22)
        Me.tsbExportToCSV.Text = "Export To CSV"
        '
        'tsbExportToXLSX
        '
        Me.tsbExportToXLSX.Image = CType(resources.GetObject("tsbExportToXLSX.Image"), System.Drawing.Image)
        Me.tsbExportToXLSX.ImageTransparentColor = System.Drawing.Color.Magenta
        Me.tsbExportToXLSX.Name = "tsbExportToXLSX"
        Me.tsbExportToXLSX.Size = New System.Drawing.Size(105, 22)
        Me.tsbExportToXLSX.Text = "Export To XLSX"
        '
        'Form1
        '
        Me.AutoScaleDimensions = New System.Drawing.SizeF(6!, 13!)
        Me.AutoScaleMode = System.Windows.Forms.AutoScaleMode.Font
        Me.ClientSize = New System.Drawing.Size(842, 539)
        Me.Controls.Add(Me.pdfViewerControl1)
        Me.Controls.Add(Me.toolStrip1)
        Me.Name = "Form1"
        Me.StartPosition = System.Windows.Forms.FormStartPosition.CenterScreen
        Me.Text = "Form1"
        Me.toolStrip1.ResumeLayout(False)
        Me.toolStrip1.PerformLayout
        Me.ResumeLayout(False)
        Me.PerformLayout
    End Sub

#End Region

    Private pdfViewerControl1 As Bytescout.PDFViewer.PDFViewerControl
    Private toolStrip1 As System.Windows.Forms.ToolStrip
    Private WithEvents tsbOpen As System.Windows.Forms.ToolStripButton
    Friend WithEvents ToolStripSeparator1 As Windows.Forms.ToolStripSeparator
    Friend WithEvents tsbExportToCSV As Windows.Forms.ToolStripButton
    Friend WithEvents tsbExportToXLSX As Windows.Forms.ToolStripButton
End Class

ON-PREMISE OFFLINE SDK

60 Day Free Trial or Visit ByteScout PDF Suite Home Page

Explore ByteScout PDF Suite Documentation

Explore Samples

Sign Up for ByteScout PDF Suite Online Training

ON-DEMAND REST WEB API

Get Your API Key

Explore Web API Docs

Explore Web API Samples

Form1.vb
      
Imports System.Diagnostics
Imports System.Drawing
Imports System.Windows.Forms
Imports Bytescout.PDFExtractor

' This example requires 'PDF Viewer SDK' and 'PDF Extractor SDK' installed.
' Download link: http://cdn.bytescout.com/ByteScoutInstaller.exe

Public Partial Class Form1
    Inherits Form

    Public Sub New()
        InitializeComponent()
    End Sub

    Protected Overrides Sub OnLoad(e As EventArgs)
        ' Preload document into viewer
        pdfViewerControl1.InputFile = ".\sample_ocr.pdf"

        MyBase.OnLoad(e)
    End Sub

    Private Sub tsbOpen_Click(ByVal sender As Object, ByVal e As EventArgs) Handles tsbOpen.Click
        Using openFileDialog As New OpenFileDialog()
            openFileDialog.Title = "Open PDF Document"
            openFileDialog.Filter = "PDF Files (*.pdf)|*.pdf|All Files|*.*"

            If openFileDialog.ShowDialog() = DialogResult.OK Then
                Me.Text = openFileDialog.FileName
                Cursor = Cursors.WaitCursor

                Try
                    pdfViewerControl1.InputFile = openFileDialog.FileName
                Catch exception As Exception
                    MessageBox.Show(exception.Message)
                Finally
                    Cursor = Cursors.[Default]
                End Try
            End If
        End Using
    End Sub

    Private Sub tsbExportToCSV_Click(sender As Object, e As EventArgs) Handles tsbExportToCSV.Click
        ' Get selections from viewer
        Dim selections As RectangleF() = pdfViewerControl1.SelectionInPoints
        Dim outputFile As String = ".\result.csv"

        Using csvExtractor As CSVExtractor = New CSVExtractor()
            ' Load document into extractor
            csvExtractor.LoadDocumentFromFile(pdfViewerControl1.InputFile)

            ' Enable OCR to recognize text from images
            csvExtractor.OCRMode = OCRMode.Auto
            csvExtractor.OCRResolution = 300
            csvExtractor.OCRLanguage = "eng"
            csvExtractor.OCRLanguageDataFolder = "c:\Program Files\Bytescout PDF Extractor SDK\net4.00\tessdata\"

            ' FYI, removing horizontal lines may increase the text recognition quality in some cases
            'csvExtractor.OCRImagePreprocessingFilters.AddHorizontalLinesRemover()
            ' Another filter able to improve the recognition
            'csvExtractor.OCRImagePreprocessingFilters.AddGammaCorrection()

            ' If a selection exists, set the extraction area.
            ' Otherwise the whole page is extracted.
            If selections.Length > 0 Then
                csvExtractor.SetExtractionArea(selections(0))
            End If

            ' Save extraction results for the current page to a CSV file
            csvExtractor.SavePageCSVToFile(pdfViewerControl1.CurrentPageIndex, outputFile)
        End Using

        ' Open the result in the default associated application
        Process.Start(outputFile)
    End Sub

    Private Sub tsbExportToXLSX_Click(sender As Object, e As EventArgs) Handles tsbExportToXLSX.Click
        ' Get selections from viewer
        Dim selections As RectangleF() = pdfViewerControl1.SelectionInPoints
        Dim outputFile As String = ".\result.xlsx"

        Using xlsExtractor As XLSExtractor = New XLSExtractor()
            ' Load document into extractor
            xlsExtractor.LoadDocumentFromFile(pdfViewerControl1.InputFile)

            ' Enable OCR to recognize text from images
            xlsExtractor.OCRMode = OCRMode.Auto
            xlsExtractor.OCRResolution = 300
            xlsExtractor.OCRLanguage = "eng"
            xlsExtractor.OCRLanguageDataFolder = "c:\Program Files\Bytescout PDF Extractor SDK\net4.00\tessdata\"

            xlsExtractor.OutputFormat = SpreadseetOutputFormat.XLSX
            xlsExtractor.RichTextFormatting = False

            ' FYI, removing horizontal lines may increase the text recognition quality in some cases
            'xlsExtractor.OCRImagePreprocessingFilters.AddHorizontalLinesRemover()
            ' Another filter able to improve the recognition
            'xlsExtractor.OCRImagePreprocessingFilters.AddGammaCorrection()

            ' If a selection exists, set the extraction area.
            ' Otherwise the whole page is extracted.
            If selections.Length > 0 Then
                xlsExtractor.SetExtractionArea(selections(0))
            End If

            ' Save extraction results for the current page to an XLSX file
            xlsExtractor.SavePageToXLSFile(pdfViewerControl1.CurrentPageIndex, outputFile)
        End Using

        ' Open the result in the default associated application
        Process.Start(outputFile)
    End Sub
End Class

Program.vb
      
Imports System.Collections.Generic
Imports System.Windows.Forms

NotInheritable Class Program
    Private Sub New()
    End Sub

    ''' <summary>
    ''' The main entry point for the application.
    ''' </summary>
    <STAThread> _
    Friend Shared Sub Main()
        Application.EnableVisualStyles()
        Application.SetCompatibleTextRenderingDefault(False)
        Application.Run(New Form1())
    End Sub
End Class

app.config
      
<?xml version="1.0" encoding="utf-8"?>
<configuration>
    <system.diagnostics>
        <sources>
            <!-- This section defines the logging configuration for My.Application.Log -->
            <source name="DefaultSource" switchName="DefaultSwitch">
                <listeners>
                    <add name="FileLog"/>
                    <!-- Uncomment the below section to write to the Application Event Log -->
                    <!--<add name="EventLog"/>-->
                </listeners>
            </source>
        </sources>
        <switches>
            <add name="DefaultSwitch" value="Information"/>
        </switches>
        <sharedListeners>
            <add name="FileLog" type="Microsoft.VisualBasic.Logging.FileLogTraceListener, Microsoft.VisualBasic, Version=8.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a, processorArchitecture=MSIL" initializeData="FileLogWriter"/>
            <!-- Uncomment the below section and replace APPLICATION_NAME with the name of your application to write to the Application Event Log -->
            <!--<add name="EventLog" type="System.Diagnostics.EventLogTraceListener" initializeData="APPLICATION_NAME"/> -->
        </sharedListeners>
    </system.diagnostics>
    <startup>
        <supportedRuntime version="v4.0" sku=".NETFramework,Version=v4.0"/>
    </startup>
</configuration>
