ByteScout PDF Extractor SDK – C# – Find Phone Number in PDF with Regex

How to find phone numbers in a PDF with regex in C# using ByteScout PDF Extractor SDK

This tutorial shows C# code that finds phone numbers in a PDF using a regular expression.

Every ByteScout tool includes C# source code examples, which you can find here or in the installation folder of the ByteScout product. ByteScout PDF Extractor SDK helps developers extract data from unstructured documents, PDFs, images, and scanned and electronic forms. It includes AI functions such as automatic table detection, automatic table extraction and restructuring, and text recognition and text restoration from PDFs and scanned documents. It also includes PDF to CSV, PDF to XML, PDF to JSON, and PDF to searchable PDF functions, as well as methods for low-level data extraction. It can find phone numbers in a PDF with regex in C#.

The ByteScout PDF Extractor SDK API for C#, together with the instructions and code below, will help you quickly learn how to find phone numbers in a PDF with regex. You can copy and paste the code into your C# project or application and then run it. Detailed tutorials and documentation are installed along with ByteScout PDF Extractor SDK if you’d like to dive deeper into the topic and the details of the API.

Our website provides a free trial version of ByteScout PDF Extractor SDK. It also includes documentation and source code samples.

On-demand (REST Web API) version:
Web API (on-demand version)

On-premise offline SDK for Windows:
60 Day Free Trial (on-premise)

Program.cs

using Bytescout.PDFExtractor;
using System;

namespace FindPhoneNumberRegex
{
    class Program
    {
        static void Main(string[] args)
        {
            try
            {
                // Create Bytescout.PDFExtractor.TextExtractor instance
                using (TextExtractor extractor = new TextExtractor())
                {
                    extractor.RegistrationName = "demo";
                    extractor.RegistrationKey = "demo";

                    // Load sample PDF document
                    extractor.LoadDocumentFromFile("samplePDF_PhoneNo.pdf");

                    extractor.RegexSearch = true; // Enable the regular expressions

                    int pageCount = extractor.GetPageCount();

                    // Search for phone numbers in the format 202-555-0130.
                    // The pattern is the same for every page, so it is defined once before the loop.
                    // See the complete regular expressions reference at https://msdn.microsoft.com/en-us/library/az24scfc(v=vs.110).aspx
                    string regexPattern = "[0-9]{3}-[0-9]{3}-[0-9]{4}";

                    // Search through pages
                    for (int i = 0; i < pageCount; i++)
                    {
                        // Search each page for the pattern
                        if (extractor.Find(i, regexPattern, false))
                        {
                            do
                            {
                                // Iterate through each element in the found text
                                foreach (ISearchResultElement element in extractor.FoundText.Elements)
                                {
                                    Console.WriteLine("Found Phone No: " + element.Text);
                                }
                            }
                            while (extractor.FindNext());
                        }
                    }
                }
            }
            catch (Exception ex)
            {
                Console.WriteLine("Error: " + ex.Message);
            }

            Console.WriteLine();
            Console.WriteLine("Press enter key to continue...");
            Console.ReadLine();
        }
    }
}
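
To see what the pattern itself matches before running it against a PDF, it can help to try it on a plain string. The sketch below does not use ByteScout PDF Extractor SDK at all; it uses only the standard .NET System.Text.RegularExpressions classes. The class name PhonePatternDemo, the helper FindAll, the sample text, and the stricter pattern with lookarounds are all hypothetical additions for illustration. Whether the SDK's regex search accepts the lookaround syntax is an assumption you would need to verify against the SDK documentation.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class PhonePatternDemo
{
    // Same pattern as in Program.cs above.
    public const string Pattern = "[0-9]{3}-[0-9]{3}-[0-9]{4}";

    // Hypothetical stricter variant (not from the original sample):
    // the lookarounds reject candidates that touch another digit on either side.
    public const string StrictPattern =
        "(?<![0-9])[0-9]{3}-[0-9]{3}-[0-9]{4}(?![0-9])";

    // Returns every non-overlapping match of the pattern in the text, in order.
    public static List<string> FindAll(string text, string pattern)
    {
        var results = new List<string>();
        foreach (Match match in Regex.Matches(text, pattern))
        {
            results.Add(match.Value);
        }
        return results;
    }

    public static void Main()
    {
        // Made-up sample text: two phone numbers and one longer order id.
        string sample = "Call 202-555-0130 or 202-555-0147. Order id: 1202-555-01999.";

        // The original pattern also matches a fragment inside the order id.
        foreach (string hit in FindAll(sample, Pattern))
        {
            Console.WriteLine("Pattern: " + hit);
        }

        // The stricter variant skips the order id because it is surrounded by digits.
        foreach (string hit in FindAll(sample, StrictPattern))
        {
            Console.WriteLine("Strict:  " + hit);
        }
    }
}
```

With this sample text the original pattern reports three hits, including the fragment 202-555-0199 taken from inside the order id, while the stricter variant reports only the two standalone numbers. If false positives like this show up in your own documents, anchoring the pattern is one option to consider.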