PDF Extractor SDK

ByteScout PDF Extractor SDK – C# – Profiles

Program.cs

ByteScout PDF Extractor SDK – C# – Parallel Processing

Program.cs

    using System;
    using System.IO;
    using System.Threading;
    using Bytescout.PDFExtractor;

    namespace Parallel_Processing
    {
        class Program
        {
            // Limit to 4 threads in queue.
            // Set this value to number of your processor cores for max performance.
            private static readonly Semaphore ThreadLimiter = new Semaphore(4, 4);

            static void Main(string[] args)
            {
                // Get all PDF files in a folder
                string[] files = Directory.GetFiles(@"..\..\..\..\", "*.pdf");

                // Array of events [...]
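The excerpt above stops before the worker code, so the following is only a minimal sketch of the throttling idea its comments describe: a `Semaphore` that lets a fixed number of threads run at once. It uses only `System.Threading` and does not call the SDK. The names `ThrottledWork`, `RunAll`, and the placeholder file names are hypothetical, and the `Thread.Sleep` call stands in for whatever per-file processing the real sample performs.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

static class ThrottledWork
{
    // Starts one thread per item, but lets at most `maxConcurrent` of them
    // execute `work` at the same time. Returns the highest number of
    // simultaneously running workers that was observed.
    public static int RunAll(IReadOnlyList<string> items, int maxConcurrent, Action<string> work)
    {
        object gate = new object();
        int running = 0;
        int peak = 0;

        using (var limiter = new Semaphore(maxConcurrent, maxConcurrent))
        {
            var threads = new List<Thread>();
            foreach (string item in items)
            {
                var thread = new Thread(() =>
                {
                    limiter.WaitOne();              // block until a slot is free
                    lock (gate)
                    {
                        running++;
                        if (running > peak) peak = running;
                    }
                    try
                    {
                        work(item);
                    }
                    finally
                    {
                        lock (gate) { running--; }
                        limiter.Release();          // hand the slot to a waiting thread
                    }
                });
                threads.Add(thread);
                thread.Start();
            }

            // Wait for every worker before the semaphore is disposed.
            foreach (Thread thread in threads) thread.Join();
        }
        return peak;
    }
}

class Program
{
    static void Main()
    {
        // Hypothetical file names; the real sample reads them from a folder.
        string[] files = { "a.pdf", "b.pdf", "c.pdf", "d.pdf", "e.pdf", "f.pdf" };

        // Assumption: the sleep simulates the per-file work.
        int peak = ThrottledWork.RunAll(files, 4, file => Thread.Sleep(50));

        Console.WriteLine("Peak concurrency: " + peak);
    }
}
```

In this sketch the limit is passed as a parameter rather than hard-coded, so it could be set to `Environment.ProcessorCount`, in line with the comment's advice to match the number of processor cores.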

ByteScout PDF Extractor SDK – C++ – OCR (Optical Character Recognition)

CPPExample.cpp, stdafx.cpp

ByteScout PDF Extractor SDK – C++ – Merge Documents

CPPExample.cpp, stdafx.cpp

ByteScout PDF Extractor SDK – C# – Invoice Parsing

Program.cs

ByteScout PDF Extractor SDK – C# – Find Text With Hyphens

Program.cs

    using System;
    using System.Drawing;
    using Bytescout.PDFExtractor;

    namespace FindText
    {
        class Program
        {
            static void Main(string[] args)
            {
                // Create Bytescout.PDFExtractor.TextExtractor instance
                TextExtractor extractor = new TextExtractor();
                extractor.RegistrationName = "demo";
                extractor.RegistrationKey = "demo";

                // Load sample PDF document
                extractor.LoadDocumentFromFile("words-with-hyphens.pdf");

                int pageCount = extractor.GetPageCount();

                for (int i = 0; i < pageCount; i++)
                {
                    // Search each page for "hyphen" string
                    if (extractor.Find(i, "hyphen", false)) [...]

ByteScout PDF Extractor SDK – C# – Find Text – Smart Match

Program.cs

    using System;
    using Bytescout.PDFExtractor;

    namespace FindTextSmartMatch
    {
        class Program
        {
            static void Main(string[] args)
            {
                TextExtractor extractor = new TextExtractor("demo", "demo");

                // Load the document
                extractor.LoadDocumentFromFile("sample2.pdf");

                // Smart match the search string like Adobe Reader
                extractor.WordMatchingMode = WordMatchingMode.SmartMatch;

                string searchString = "land";

                // Get page count
                int pageCount = extractor.GetPageCount();

                // Iterate through pages
                for (int i = 0; i < pageCount; [...]

ByteScout PDF Extractor SDK – C# – Find Text (Regex)

Program.cs

    using System;
    using Bytescout.PDFExtractor;

    namespace FindText
    {
        class Program
        {
            static void Main(string[] args)
            {
                // Create Bytescout.PDFExtractor.TextExtractor instance
                TextExtractor extractor = new TextExtractor();
                extractor.RegistrationName = "demo";
                extractor.RegistrationKey = "demo";

                // Load sample PDF document
                extractor.LoadDocumentFromFile(@".\Invoice.pdf");

                extractor.RegexSearch = true; // Enable the regular expressions

                int pageCount = extractor.GetPageCount();

                // Search through pages
                for (int i = 0; i < pageCount; i++)
                {
                    // Search [...]

ByteScout PDF Extractor SDK – C# – Find Text

Program.cs

    using System;
    using System.Drawing;
    using Bytescout.PDFExtractor;

    namespace FindText
    {
        class Program
        {
            static void Main(string[] args)
            {
                // Create Bytescout.PDFExtractor.TextExtractor instance
                TextExtractor extractor = new TextExtractor();
                extractor.RegistrationName = "demo";
                extractor.RegistrationKey = "demo";

                // Load sample PDF document
                extractor.LoadDocumentFromFile(@".\sample1.pdf");

                // Set the matching mode.
                // WordMatchingMode.None - treats the search string as substring
                // WordMatchingMode.ExactMatch - treats the search string as separate word
                // WordMatchingMode.SmartMatch - [...]
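The comments in the excerpt distinguish between treating the search string as a substring and treating it as a separate word. The sketch below illustrates that distinction with plain .NET string and regex calls; it is not the SDK's implementation, and it does not cover `SmartMatch`, whose description is cut off in the excerpt. The class and method names are hypothetical, and case-insensitive comparison is an assumption made for the illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class MatchingSketch
{
    // Substring semantics: the search string may occur anywhere,
    // including inside a longer word.
    public static bool ContainsSubstring(string text, string search)
    {
        return text.IndexOf(search, StringComparison.OrdinalIgnoreCase) >= 0;
    }

    // Separate-word semantics: the occurrence must be delimited by
    // word boundaries on both sides.
    public static bool ContainsWholeWord(string text, string search)
    {
        string pattern = @"\b" + Regex.Escape(search) + @"\b";
        return Regex.IsMatch(text, pattern, RegexOptions.IgnoreCase);
    }
}

class Program
{
    static void Main()
    {
        string search = "land";

        Console.WriteLine(MatchingSketch.ContainsSubstring("Iceland", search));     // True
        Console.WriteLine(MatchingSketch.ContainsWholeWord("Iceland", search));     // False
        Console.WriteLine(MatchingSketch.ContainsWholeWord("the land is", search)); // True
    }
}
```

The word "Iceland" contains "land" as a substring but not as a separate word, which is the kind of difference the matching mode setting is meant to control.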

ByteScout PDF Extractor SDK – C# – Find Table And Extract As XML

Program.cs

    using System.Diagnostics;
    using Bytescout.PDFExtractor;

    namespace FindTableAndExtractAsXml
    {
        class Program
        {
            static void Main(string[] args)
            {
                // Create Bytescout.PDFExtractor.XMLExtractor instance
                XMLExtractor xmlExtractor = new XMLExtractor();
                xmlExtractor.RegistrationName = "demo";
                xmlExtractor.RegistrationKey = "demo";

                // Create Bytescout.PDFExtractor.TableDetector instance
                TableDetector tableDetector = new TableDetector();
                tableDetector.RegistrationKey = "demo";
                tableDetector.RegistrationName = "demo";

                // We should define what kind of tables we should detect.
                // So we set min [...]