Easy-to-understand sample source code showing how to remove sensitive data from a scanned document in VB.NET
ByteScout Sensitive Data Suite can remove sensitive data from scanned documents, and it can be used from VB.NET. The suite is a bundle of ByteScout components for working with sensitive and personal data. With these components you can analyze, redact, remove, or black out sensitive data in documents and PDF files.
The following code snippet for ByteScout Sensitive Data Suite is useful when you need to quickly remove sensitive data from a scanned document in your VB.NET application. It works in three steps: OCR makes the scanned PDF searchable, a regular-expression search finds the email addresses in it, and the matches are removed and masked in the output PDF. To try it, copy the VB.NET code into your project. Use of ByteScout Sensitive Data Suite in VB.NET is also described in the documentation included with the product.
You can download a free trial version of ByteScout Sensitive Data Suite from our website; it includes this and other source code samples for VB.NET.
On-demand (REST Web API) version: Web API (on-demand version)
On-premise offline SDK for Windows: 60 Day Free Trial (on-premise)
Imports System.Diagnostics
Imports System.IO
Imports Bytescout.PDFExtractor

Class Program

    Shared Sub Main(ByVal args As String())

        ' The sample works in three steps:
        ' STEP-1: Make the scanned PDF searchable (OCR)
        ' STEP-2: Search the searchable PDF for the sensitive text
        ' STEP-3: Remove the sensitive data

        ' In-memory stream that receives the searchable PDF
        Dim searchablePDFStream As New MemoryStream()

        ' STEP-1: Create Bytescout.PDFExtractor.SearchablePDFMaker instance
        Using searchablePDFMaker As New SearchablePDFMaker("demo", "demo")

            ' Load sample PDF document
            searchablePDFMaker.LoadDocumentFromFile("sampleScannedPDF_EmailAddress.pdf")

            ' Set the location of language data files
            searchablePDFMaker.OCRLanguageDataFolder = "c:\Program Files\Bytescout PDF Extractor SDK\ocrdata\"

            ' Set OCR language: "eng" for English, "deu" for German, "fra" for French,
            ' "spa" for Spanish, etc., according to the files in the "ocrdata" folder
            searchablePDFMaker.OCRLanguage = "eng"

            ' Set PDF document rendering resolution
            searchablePDFMaker.OCRResolution = 300

            ' Run OCR and write the searchable PDF to the memory stream
            searchablePDFMaker.MakePDFSearchable(searchablePDFStream)

            ' STEP-2: Prepare TextExtractor
            Using textExtractor As New TextExtractor("demo", "demo")

                ' Load the searchable PDF stream into TextExtractor
                textExtractor.LoadDocumentFromStream(searchablePDFStream)

                ' Pattern that matches email addresses.
                ' See the complete regular expressions reference at https://msdn.microsoft.com/en-us/library/az24scfc(v=vs.110).aspx
                Dim regexPattern As String = "\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}\b"

                ' Enable RegexSearch
                textExtractor.RegexSearch = True

                ' Set word matching options
                textExtractor.WordMatchingMode = WordMatchingMode.None

                ' Find all matches on page 0 (the first page)
                Dim searchResults() As ISearchResult = textExtractor.FindAll(0, regexPattern, caseSensitive:=False)

                ' STEP-3: Create Bytescout.PDFExtractor.Remover2 instance
                Using remover As New Remover2("demo", "demo")

                    ' Load the searchable PDF stream
                    remover.LoadDocumentFromStream(searchablePDFStream)

                    ' Mask removed text
                    remover.MaskRemovedText = True

                    ' Make output file unsearchable
                    remover.MakePDFUnsearchable = True

                    ' Provide text to remove
                    remover.AddTextToRemove(searchResults)

                    ' Remove the text objects found by the search and save the result
                    remover.PerformRemoval("result1.pdf")

                End Using
            End Using
        End Using

        Console.WriteLine()
        Console.WriteLine("Press any key to continue and open the result PDF file in the default PDF viewer...")
        Console.ReadKey()

        ' UseShellExecute = True opens the file with its associated application
        ' on both .NET Framework and .NET Core / .NET 5+
        Process.Start(New ProcessStartInfo("result1.pdf") With {.UseShellExecute = True})

    End Sub

End Class
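Before running OCR on real documents, it can help to check what the email pattern from the sample actually matches. The following is a minimal sketch that uses only the standard .NET Regex class, not the ByteScout SDK. It assumes that the SDK's RegexSearch mode follows .NET regular-expression syntax, as the MSDN reference in the sample suggests. The module name, the FindEmails helper and the sample text are hypothetical and exist only for illustration.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module RegexPatternCheck

    ' Same email pattern as in the sample
    Private Const EmailPattern As String = "\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}\b"

    ' Hypothetical helper: returns every substring of the text that the
    ' pattern matches, ignoring case (mirrors caseSensitive:=False in the sample)
    Function FindEmails(ByVal text As String) As String()
        Dim matches As MatchCollection = Regex.Matches(text, EmailPattern, RegexOptions.IgnoreCase)
        Dim found(matches.Count - 1) As String
        For i As Integer = 0 To matches.Count - 1
            found(i) = matches(i).Value
        Next
        Return found
    End Function

    Sub Main()
        ' Hypothetical sample text, not taken from the sample PDF
        Dim sample As String = "Contact john.doe@example.com or SALES@Example.ORG for details."

        ' Prints:
        ' john.doe@example.com
        ' SALES@Example.ORG
        For Each address As String In FindEmails(sample)
            Console.WriteLine(address)
        Next
    End Sub

End Module
```

Run this as a separate console project, because it declares its own Sub Main. Note that the {2,6} quantifier limits the top-level domain to six letters, so an address such as name@company.technology is not matched; widen the upper bound if your documents may contain longer domains.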
60 Day Free Trial or Visit ByteScout Sensitive Data Suite Home Page
Explore ByteScout Sensitive Data Suite Documentation
Explore Samples
Sign Up for ByteScout Sensitive Data Suite Online Training
Get Your API Key
Explore Web API Docs
Explore Web API Samples