ByteScout PDF Extractor SDK - C# - Find SSN in PDF with Regex - ByteScout

ByteScout PDF Extractor SDK – C# – Find SSN in PDF with Regex

  • Home
  • /
  • Articles
  • /
  • ByteScout PDF Extractor SDK – C# – Find SSN in PDF with Regex

How to find SSN in PDF with regex in C# using ByteScout PDF Extractor SDK

Write code in C# to find SSN in PDF with regex with this step-by-step tutorial

This sample source code below will demonstrate you how to find SSN in PDF with regex in C#. ByteScout PDF Extractor SDK is the SDK that helps developers to extract data from unstructured documents, pdf, images, scanned and electronic forms. Includes AI functions like automatic table detection, automatic table extraction and restructuring, text recognition and text restoration from pdf and scanned documents. Includes PDF to CSV, PDF to XML, PDF to JSON, PDF to searchable PDF functions as well as methods for low level data extraction. It can be used to find SSN in PDF with regex using C#.

The SDK samples like this one below explain how to quickly make your application do find SSN in PDF with regex in C# with the help of ByteScout PDF Extractor SDK. Just copy and paste the code into your C# application’s code and follow the instruction. Test C# sample code examples whether they respond your needs and requirements for the project.

Download free trial version of ByteScout PDF Extractor SDK from our website with this and other source code samples for C#.

Try it today: Get 60 Day Free Trial or sign up for Web API

<?xml version="1.0" encoding="utf-8"?> <Project Sdk="Microsoft.NET.Sdk"> <PropertyGroup> <OutputType>Exe</OutputType> <TargetFramework>netcoreapp2.0</TargetFramework> <EnableDefaultCompileItems>false</EnableDefaultCompileItems> <GenerateAssemblyCompanyAttribute>false</GenerateAssemblyCompanyAttribute> <GenerateAssemblyConfigurationAttribute>false</GenerateAssemblyConfigurationAttribute> <GenerateAssemblyFileVersionAttribute>false</GenerateAssemblyFileVersionAttribute> <GenerateAssemblyInformationalVersionAttribute>false</GenerateAssemblyInformationalVersionAttribute> <GenerateAssemblyProductAttribute>false</GenerateAssemblyProductAttribute> <GenerateAssemblyTitleAttribute>false</GenerateAssemblyTitleAttribute> <GenerateAssemblyVersionAttribute>false</GenerateAssemblyVersionAttribute> <GenerateAssemblyCopyrightAttribute>false</GenerateAssemblyCopyrightAttribute> <GenerateAssemblyTrademarkAttribute>false</GenerateAssemblyTrademarkAttribute> <GenerateAssemblyCultureAttribute>false</GenerateAssemblyCultureAttribute> <GenerateAssemblyDescriptionAttribute>false</GenerateAssemblyDescriptionAttribute> </PropertyGroup> <ItemGroup> <Compile Include="Program.cs" /> <None Include="samplePDF_SSNNo.pdf"> <CopyToOutputDirectory>Always</CopyToOutputDirectory> </None> </ItemGroup> <ItemGroup> <PackageReference Include="Microsoft.Windows.Compatibility" Version="2.0.0" /> </ItemGroup> <ItemGroup> <Reference Include="Bytescout.PDFExtractor"> <SpecificVersion>False</SpecificVersion> <HintPath>c:\Program Files\Bytescout PDF Extractor SDK\netcoreapp2.0\Bytescout.PDFExtractor.dll</HintPath> </Reference> <Reference Include="Bytescout.PDFExtractor.OCRExtension"> <SpecificVersion>False</SpecificVersion> <HintPath>c:\Program Files\Bytescout PDF Extractor SDK\netcoreapp2.0\Bytescout.PDFExtractor.OCRExtension.dll</HintPath> </Reference> </ItemGroup> </Project>

Try it today: Get 60 Day Free Trial or sign up for Web API

<?xml version="1.0" encoding="utf-8"?> <Project ToolsVersion="15.0" xmlns=""> <Import Project="$(MSBuildExtensionsPath)\$(MSBuildToolsVersion)\Microsoft.Common.props" Condition="Exists('$(MSBuildExtensionsPath)\$(MSBuildToolsVersion)\Microsoft.Common.props')" /> <PropertyGroup> <Configuration Condition=" '$(Configuration)' == '' ">Debug</Configuration> <Platform Condition=" '$(Platform)' == '' ">AnyCPU</Platform> <ProjectGuid>{99735776-2956-463D-9795-EBCE16928C30}</ProjectGuid> <OutputType>Exe</OutputType> <RootNamespace>FindSSNNumberRegex</RootNamespace> <AssemblyName>FindSSNNumberRegex</AssemblyName> <TargetFrameworkVersion>v2.0</TargetFrameworkVersion> <FileAlignment>512</FileAlignment> </PropertyGroup> <PropertyGroup Condition=" '$(Configuration)|$(Platform)' == 'Debug|AnyCPU' "> <PlatformTarget>AnyCPU</PlatformTarget> <DebugSymbols>true</DebugSymbols> <DebugType>full</DebugType> <Optimize>false</Optimize> <OutputPath>bin\Debug\</OutputPath> <DefineConstants>DEBUG;TRACE</DefineConstants> <ErrorReport>prompt</ErrorReport> <WarningLevel>4</WarningLevel> </PropertyGroup> <PropertyGroup Condition=" '$(Configuration)|$(Platform)' == 'Release|AnyCPU' "> <PlatformTarget>AnyCPU</PlatformTarget> <DebugType>pdbonly</DebugType> <Optimize>true</Optimize> <OutputPath>bin\Release\</OutputPath> <DefineConstants>TRACE</DefineConstants> <ErrorReport>prompt</ErrorReport> <WarningLevel>4</WarningLevel> </PropertyGroup> <ItemGroup> <Reference Include="Bytescout.PDFExtractor, Version=, Culture=neutral, PublicKeyToken=f7dd1bd9d40a50eb, processorArchitecture=MSIL"> <SpecificVersion>False</SpecificVersion> <HintPath>c:\Program Files\Bytescout PDF Extractor SDK\net2.00\Bytescout.PDFExtractor.dll</HintPath> </Reference> <Reference Include="System" /> <Reference Include="System.Data" /> <Reference Include="System.Xml" /> </ItemGroup> <ItemGroup> <Compile Include="Program.cs" /> </ItemGroup> <ItemGroup> <None Include="samplePDF_SSNNo.pdf"> <CopyToOutputDirectory>Always</CopyToOutputDirectory> </None> </ItemGroup> <Import Project="$(MSBuildToolsPath)\Microsoft.CSharp.targets" /> </Project>

Try it today: Get 60 Day Free Trial or sign up for Web API

using Bytescout.PDFExtractor; using System; namespace FindSSNNumberRegexp { class Program { static void Main(string[] args) { try { // Create Bytescout.PDFExtractor.TextExtractor instance using (TextExtractor extractor = new TextExtractor()) { extractor.RegistrationName = "demo"; extractor.RegistrationKey = "demo"; // Load sample PDF document extractor.LoadDocumentFromFile("samplePDF_SSNNo.pdf"); extractor.RegexSearch = true; // Enable the regular expressions int pageCount = extractor.GetPageCount(); // Search through pages for (int i = 0; i < pageCount; i++) { // Search SSN in format 202-55-0130 string regexPattern = "[0-9]{3}-[0-9]{2}-[0-9]{4}"; // See the complete regular expressions reference at // Search each page for the pattern if (extractor.Find(i, regexPattern, false)) { do { // Iterate through each element in the found text foreach (ISearchResultElement element in extractor.FoundText.Elements) { Console.WriteLine("Found SSN No: " + element.Text); } } while (extractor.FindNext()); } } } } catch (Exception ex) { Console.WriteLine("Error: " + ex.Message); } Console.WriteLine(); Console.WriteLine("Press enter key to continue..."); Console.ReadLine(); } } }

Try it today: Get 60 Day Free Trial or sign up for Web API


Get 60 Day Free Trial or Visit ByteScout PDF Extractor SDK page

Explore ByteScout PDF Extractor SDK documentation


Sign Up for free Web API key

Explore Web API Documentation