RPA Robotic Process Automation - PDF Invoices to Google Sheets - C# - ByteScout
Announcement
Our ByteScout SDK products are sunsetting as we focus on expanding new solutions.
Learn More Open modal
Close modal
Announcement Important Update
ByteScout SDK Sunsetting Notice
Our ByteScout SDK products are sunsetting as we focus on our new & improved solutions. Thank you for being part of our journey, and we look forward to supporting you in this next chapter!

RPA Robotic Process Automation – PDF Invoices to Google Sheets – C#

  • Home
  • /
  • Articles
  • /
  • RPA Robotic Process Automation – PDF Invoices to Google Sheets – C#

PDF invoices to google sheets in C# using ByteScout Robotic Process Automation

ByteScout Robotic Process Automation is tools and API for rapid manual data entry automation using Robotic Process Automation based on ByteScout data extraction tools.

On-demand (REST Web API) version:
 Web API (on-demand version)

On-premise offline SDK for Windows:
 60 Day Free Trial (on-premise)

InvoicesToGoogleSheets.VS2015.csproj

      
<?xml version="1.0" encoding="utf-8"?> <Project ToolsVersion="14.0" DefaultTargets="Build" xmlns="http://schemas.microsoft.com/developer/msbuild/2003"> <Import Project="$(MSBuildExtensionsPath)\$(MSBuildToolsVersion)\Microsoft.Common.props" Condition="Exists('$(MSBuildExtensionsPath)\$(MSBuildToolsVersion)\Microsoft.Common.props')" /> <PropertyGroup> <Configuration Condition=" '$(Configuration)' == '' ">Debug</Configuration> <Platform Condition=" '$(Platform)' == '' ">AnyCPU</Platform> <ProjectGuid>{BF74D299-912C-4259-A022-71628F76F2B9}</ProjectGuid> <OutputType>Exe</OutputType> <RootNamespace>InvoicesToGoogleSheets</RootNamespace> <AssemblyName>InvoicesToGoogleSheets</AssemblyName> <TargetFrameworkVersion>v4.0</TargetFrameworkVersion> <FileAlignment>512</FileAlignment> <NuGetPackageImportStamp> </NuGetPackageImportStamp> </PropertyGroup> <PropertyGroup Condition=" '$(Configuration)|$(Platform)' == 'Debug|AnyCPU' "> <PlatformTarget>AnyCPU</PlatformTarget> <DebugSymbols>true</DebugSymbols> <DebugType>full</DebugType> <Optimize>false</Optimize> <OutputPath>bin\Debug\</OutputPath> <DefineConstants>DEBUG;TRACE</DefineConstants> <ErrorReport>prompt</ErrorReport> <WarningLevel>4</WarningLevel> </PropertyGroup> <PropertyGroup Condition=" '$(Configuration)|$(Platform)' == 'Release|AnyCPU' "> <PlatformTarget>AnyCPU</PlatformTarget> <DebugType>pdbonly</DebugType> <Optimize>true</Optimize> <OutputPath>bin\Release\</OutputPath> <DefineConstants>TRACE</DefineConstants> <ErrorReport>prompt</ErrorReport> <WarningLevel>4</WarningLevel> </PropertyGroup> <ItemGroup> <Reference Include="BouncyCastle.Crypto, Version=1.7.4137.9688, Culture=neutral, PublicKeyToken=a4292a325f69b123, processorArchitecture=MSIL"> <HintPath>packages\BouncyCastle.1.7.0\lib\Net40-Client\BouncyCastle.Crypto.dll</HintPath> </Reference> <Reference Include="Bytescout.PDFExtractor, Version=6.30.0.2431, Culture=neutral, PublicKeyToken=f7dd1bd9d40a50eb, processorArchitecture=MSIL"> <SpecificVersion>False</SpecificVersion> <HintPath>C:\Program Files\Bytescout PDF Extractor SDK\net2.00\Bytescout.PDFExtractor.dll</HintPath> </Reference> <Reference Include="Google.Apis, Version=1.10.0.25332, Culture=neutral, PublicKeyToken=4b01fa6e34db77ab, processorArchitecture=MSIL"> <HintPath>packages\Google.Apis.1.10.0\lib\net40\Google.Apis.dll</HintPath> </Reference> <Reference Include="Google.Apis.Auth, Version=1.10.0.25333, Culture=neutral, PublicKeyToken=4b01fa6e34db77ab, processorArchitecture=MSIL"> <HintPath>packages\Google.Apis.Auth.1.10.0\lib\net40\Google.Apis.Auth.dll</HintPath> </Reference> <Reference Include="Google.Apis.Auth.PlatformServices, Version=1.10.0.25333, Culture=neutral, PublicKeyToken=4b01fa6e34db77ab, processorArchitecture=MSIL"> <HintPath>packages\Google.Apis.Auth.1.10.0\lib\net40\Google.Apis.Auth.PlatformServices.dll</HintPath> </Reference> <Reference Include="Google.Apis.Core, Version=1.10.0.25331, Culture=neutral, PublicKeyToken=4b01fa6e34db77ab, processorArchitecture=MSIL"> <HintPath>packages\Google.Apis.Core.1.10.0\lib\portable-net40+sl50+win+wpa81+wp80\Google.Apis.Core.dll</HintPath> </Reference> <Reference Include="Google.Apis.PlatformServices, Version=1.10.0.25332, Culture=neutral, PublicKeyToken=4b01fa6e34db77ab, processorArchitecture=MSIL"> <HintPath>packages\Google.Apis.1.10.0\lib\net40\Google.Apis.PlatformServices.dll</HintPath> </Reference> <Reference Include="Google.Apis.Sheets.v4, Version=1.45.0.1930, Culture=neutral, PublicKeyToken=4b01fa6e34db77ab, processorArchitecture=MSIL"> <HintPath>packages\Google.Apis.Sheets.v4.1.45.0.1930\lib\net40\Google.Apis.Sheets.v4.dll</HintPath> </Reference> <Reference Include="log4net, Version=1.2.13.0, Culture=neutral, PublicKeyToken=669e0ddf0bb1aa2a, processorArchitecture=MSIL"> <HintPath>packages\log4net.2.0.3\lib\net40-full\log4net.dll</HintPath> </Reference> <Reference Include="Microsoft.Threading.Tasks, Version=1.0.12.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a, processorArchitecture=MSIL"> <HintPath>packages\Microsoft.Bcl.Async.1.0.168\lib\net40\Microsoft.Threading.Tasks.dll</HintPath> </Reference> <Reference Include="Microsoft.Threading.Tasks.Extensions, Version=1.0.12.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a, processorArchitecture=MSIL"> <HintPath>packages\Microsoft.Bcl.Async.1.0.168\lib\net40\Microsoft.Threading.Tasks.Extensions.dll</HintPath> </Reference> <Reference Include="Microsoft.Threading.Tasks.Extensions.Desktop, Version=1.0.168.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a, processorArchitecture=MSIL"> <HintPath>packages\Microsoft.Bcl.Async.1.0.168\lib\net40\Microsoft.Threading.Tasks.Extensions.Desktop.dll</HintPath> </Reference> <Reference Include="Newtonsoft.Json, Version=7.0.0.0, Culture=neutral, PublicKeyToken=30ad4fe6b2a6aeed, processorArchitecture=MSIL"> <HintPath>packages\Newtonsoft.Json.7.0.1\lib\net40\Newtonsoft.Json.dll</HintPath> </Reference> <Reference Include="System" /> <Reference Include="System.Data" /> <Reference Include="System.Drawing" /> <Reference Include="System.IO, Version=2.6.10.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a, processorArchitecture=MSIL"> <HintPath>packages\Microsoft.Bcl.1.1.10\lib\net40\System.IO.dll</HintPath> </Reference> <Reference Include="System.Net" /> <Reference Include="System.Net.Http, Version=2.2.29.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a, processorArchitecture=MSIL"> <HintPath>packages\Microsoft.Net.Http.2.2.29\lib\net40\System.Net.Http.dll</HintPath> </Reference> <Reference Include="System.Net.Http.Extensions, Version=2.2.29.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a, processorArchitecture=MSIL"> <HintPath>packages\Microsoft.Net.Http.2.2.29\lib\net40\System.Net.Http.Extensions.dll</HintPath> </Reference> <Reference Include="System.Net.Http.Primitives, Version=2.2.29.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a, processorArchitecture=MSIL"> <HintPath>packages\Microsoft.Net.Http.2.2.29\lib\net40\System.Net.Http.Primitives.dll</HintPath> </Reference> <Reference Include="System.Net.Http.WebRequest, Version=2.2.29.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a, processorArchitecture=MSIL"> <HintPath>packages\Microsoft.Net.Http.2.2.29\lib\net40\System.Net.Http.WebRequest.dll</HintPath> </Reference> <Reference Include="System.Runtime, Version=2.6.10.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a, processorArchitecture=MSIL"> <HintPath>packages\Microsoft.Bcl.1.1.10\lib\net40\System.Runtime.dll</HintPath> </Reference> <Reference Include="System.Threading.Tasks, Version=2.6.10.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a, processorArchitecture=MSIL"> <HintPath>packages\Microsoft.Bcl.1.1.10\lib\net40\System.Threading.Tasks.dll</HintPath> </Reference> <Reference Include="System.Web.Extensions" /> <Reference Include="System.Xml" /> <Reference Include="Zlib.Portable, Version=1.11.0.0, Culture=neutral, PublicKeyToken=431cba815f6a8b5b, processorArchitecture=MSIL"> <HintPath>packages\Zlib.Portable.Signed.1.11.0\lib\portable-net4+sl5+wp8+win8+wpa81+MonoTouch+MonoAndroid\Zlib.Portable.dll</HintPath> </Reference> </ItemGroup> <ItemGroup> <Compile Include="Program.cs" /> </ItemGroup> <ItemGroup> <None Include="app.config" /> <None Include="credentials.json"> <CopyToOutputDirectory>Always</CopyToOutputDirectory> </None> <None Include="packages.config" /> <None Include="Sample_Files\InvoiceApr.pdf"> <CopyToOutputDirectory>Always</CopyToOutputDirectory> </None> <None Include="Sample_Files\InvoiceMar.pdf"> <CopyToOutputDirectory>Always</CopyToOutputDirectory> </None> </ItemGroup> <Import Project="$(MSBuildToolsPath)\Microsoft.CSharp.targets" /> <Import Project="packages\Microsoft.Bcl.Build.1.0.21\build\Microsoft.Bcl.Build.targets" Condition="Exists('packages\Microsoft.Bcl.Build.1.0.21\build\Microsoft.Bcl.Build.targets')" /> <Target Name="EnsureNuGetPackageBuildImports" BeforeTargets="PrepareForBuild"> <PropertyGroup> <ErrorText>This project references NuGet package(s) that are missing on this computer. Use NuGet Package Restore to download them. For more information, see http://go.microsoft.com/fwlink/?LinkID=322105. The missing file is {0}.</ErrorText> </PropertyGroup> <Error Condition="!Exists('packages\Microsoft.Bcl.Build.1.0.21\build\Microsoft.Bcl.Build.targets')" Text="$([System.String]::Format('$(ErrorText)', 'packages\Microsoft.Bcl.Build.1.0.21\build\Microsoft.Bcl.Build.targets'))" /> </Target> <!-- To modify your build process, add your task inside one of the targets below and uncomment it. Other similar extension points exist, see Microsoft.Common.targets. <Target Name="BeforeBuild"> </Target> <Target Name="AfterBuild"> </Target> --> </Project>

InvoicesToGoogleSheets.VS2015.sln

      
Microsoft Visual Studio Solution File, Format Version 12.00 # Visual Studio Version 16 VisualStudioVersion = 16.0.29001.49 MinimumVisualStudioVersion = 10.0.40219.1 Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "InvoicesToGoogleSheets.VS2015", "InvoicesToGoogleSheets.VS2015.csproj", "{BF74D299-912C-4259-A022-71628F76F2B9}" EndProject Global GlobalSection(SolutionConfigurationPlatforms) = preSolution Debug|Any CPU = Debug|Any CPU Release|Any CPU = Release|Any CPU EndGlobalSection GlobalSection(ProjectConfigurationPlatforms) = postSolution {BF74D299-912C-4259-A022-71628F76F2B9}.Debug|Any CPU.ActiveCfg = Debug|Any CPU {BF74D299-912C-4259-A022-71628F76F2B9}.Debug|Any CPU.Build.0 = Debug|Any CPU {BF74D299-912C-4259-A022-71628F76F2B9}.Release|Any CPU.ActiveCfg = Release|Any CPU {BF74D299-912C-4259-A022-71628F76F2B9}.Release|Any CPU.Build.0 = Release|Any CPU EndGlobalSection GlobalSection(SolutionProperties) = preSolution HideSolutionNode = FALSE EndGlobalSection GlobalSection(ExtensibilityGlobals) = postSolution SolutionGuid = {EDABDEE2-49E8-4139-9277-143673CEC8F3} EndGlobalSection EndGlobal

Program.cs

      
using Bytescout.PDFExtractor; using Google.Apis.Auth.OAuth2; using Google.Apis.Sheets.v4; using Google.Apis.Sheets.v4.Data; using Google.Apis.Util.Store; using System; using System.Collections.Generic; using System.Drawing; using System.IO; using System.Threading; namespace InvoicesToGoogleSheets { class Program { // If modifying these scopes, delete your previously saved credentials // at ~/.credentials/sheets.googleapis.com-dotnet-quickstart.json static string[] Scopes = new string[] { SheetsService.Scope.Spreadsheets }; static string ApplicationName = "Google Sheets Demo"; static void Main(string[] args) { var allInputFiles = new string[] { "Sample_Files\\InvoiceMar.pdf", "Sample_Files\\InvoiceApr.pdf" }; // Google Sheet data to write var reqRowData = new List<RowData>(); // Set page index var pageIndex = 0; // Loop through all input files foreach (var itmFile in allInputFiles) { var invName = Path.GetFileNameWithoutExtension(itmFile); // Create TextExtractor instance using (var tableExtractor = new StructuredExtractor("demo", "demo")) { var tableExtractionArea = GetTableExtractionArea(itmFile); if (tableExtractionArea.HasValue) { // Load document from file tableExtractor.LoadDocumentFromFile(itmFile); // Set extraction area tableExtractor.SetExtractionArea(tableExtractionArea.Value); // Prepare table structure tableExtractor.PrepareStructure(pageIndex); int rowCount = tableExtractor.GetRowCount(pageIndex); // Ignoring first title row for (int row = 1; row < rowCount; row++) { // Google Sheet RowData Input var itmRowData = new RowData(); // Google Sheet CellData Input var lstCellData = new List<CellData>(); int columnCount = tableExtractor.GetColumnCount(pageIndex, row); // Add invName to CellData Collection lstCellData.Add(new CellData { UserEnteredValue = new ExtendedValue() { StringValue = invName } }); for (int col = 0; col < columnCount; col++) { // Get table cell value var tableCellValue = tableExtractor.GetCellValue(pageIndex, row, col); // Google Sheet CellData Input var itmCellData = new CellData { UserEnteredValue = new ExtendedValue() { StringValue = tableCellValue } }; // Add to CellData Collection lstCellData.Add(itmCellData); } // Add to Google Sheet RowData Request reqRowData.Add(new RowData { Values = lstCellData }); } } } } // Proceed with writing to Google Sheets UserCredential credential; // Follow article "https://developers.google.com/sheets/api/quickstart/dotnet" to generate credentials.json using (var stream = new FileStream("credentials.json", FileMode.Open, FileAccess.Read)) { // The file token.json stores the user's access and refresh tokens, and is created // automatically when the authorization flow completed for the first time. string credPath = "token.json"; credential = GoogleWebAuthorizationBroker.AuthorizeAsync( GoogleClientSecrets.Load(stream).Secrets, Scopes, "user", CancellationToken.None, new FileDataStore(credPath, true)).Result; Console.WriteLine({code}quot;Credential file saved to : {credPath}"); } // Create Google Sheets API Service var service = new SheetsService(new Google.Apis.Services.BaseClientService.Initializer() { HttpClientInitializer = credential, ApplicationName = ApplicationName }); // Fill following parameter with your google spreadsheet Id // Usually find at url of spreadsheet: https://docs.google.com/spreadsheets/d/...ThisIsSpreadsheetId.../edit var spreadsheetId = "xxxxxxSpreadsheetIdxxxxxxxxxx"; // Get exising row counts var existingRowCount = service.Spreadsheets.Values.Get(spreadsheetId, range: "A1:D").Execute().Values.Count; var lstRequsts = new List<Request>(); lstRequsts.Add(new Request { // New data to be written UpdateCells = new UpdateCellsRequest() { Range = new GridRange { StartRowIndex = existingRowCount, StartColumnIndex = 0 }, Rows = reqRowData, Fields = "*" } }); // Prepare batch update spreadsheet request var batchUpdateSpreadsheetRequest = new BatchUpdateSpreadsheetRequest() { Requests = lstRequsts }; var batchUpdateRequest = service.Spreadsheets.BatchUpdate(batchUpdateSpreadsheetRequest, spreadsheetId); // Perform google sheet updation batchUpdateRequest.Execute(); Console.WriteLine("Google spreadsheet updated!"); Console.ReadLine(); } /// <summary> /// Gets table extraction area /// </summary> private static RectangleF? GetTableExtractionArea(string inputFile, int pageIndex = 0) { // Create Bytescout.PDFExtractor.TableDetector instance using (var tableDetector = new TableDetector("demo", "demo")) { // Set table detection mode to "bordered tables" - best for tables with closed solid borders. tableDetector.ColumnDetectionMode = ColumnDetectionMode.BorderedTables; // We should define what kind of tables we should detect. // So we set min required number of columns to 2 ... tableDetector.DetectionMinNumberOfColumns = 2; // ... and we set min required number of rows to 2 tableDetector.DetectionMinNumberOfRows = 2; tableDetector.LoadDocumentFromFile(inputFile); if (!tableDetector.FindTable(pageIndex)) { return null; } // Getting first found table location return tableDetector.FoundTableLocation; } } } }

app.config

      
<?xml version="1.0" encoding="utf-8"?> <configuration> <runtime> <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1"> <dependentAssembly> <assemblyIdentity name="System.Runtime" publicKeyToken="b03f5f7f11d50a3a" culture="neutral" /> <bindingRedirect oldVersion="0.0.0.0-2.6.10.0" newVersion="2.6.10.0" /> </dependentAssembly> <dependentAssembly> <assemblyIdentity name="System.Threading.Tasks" publicKeyToken="b03f5f7f11d50a3a" culture="neutral" /> <bindingRedirect oldVersion="0.0.0.0-2.6.10.0" newVersion="2.6.10.0" /> </dependentAssembly> <dependentAssembly> <assemblyIdentity name="System.Net.Http" publicKeyToken="b03f5f7f11d50a3a" culture="neutral" /> <bindingRedirect oldVersion="0.0.0.0-2.2.29.0" newVersion="2.2.29.0" /> </dependentAssembly> <dependentAssembly> <assemblyIdentity name="System.Net.Http.Primitives" publicKeyToken="b03f5f7f11d50a3a" culture="neutral" /> <bindingRedirect oldVersion="0.0.0.0-2.2.29.0" newVersion="2.2.29.0" /> </dependentAssembly> </assemblyBinding> </runtime> </configuration>

credentials.json

      
/* ===================================================== Follow article https://developers.google.com/sheets/api/quickstart/dotnet to get credentials.json ===================================================== */

packages.config

      
<?xml version="1.0" encoding="utf-8"?> <packages> <package id="BouncyCastle" version="1.7.0" targetFramework="net40" /> <package id="Google.Apis" version="1.10.0" targetFramework="net40" /> <package id="Google.Apis.Auth" version="1.10.0" targetFramework="net40" /> <package id="Google.Apis.Core" version="1.10.0" targetFramework="net40" /> <package id="Google.Apis.Sheets.v4" version="1.45.0.1930" targetFramework="net40" /> <package id="log4net" version="2.0.3" targetFramework="net40" /> <package id="Microsoft.Bcl" version="1.1.10" targetFramework="net40" /> <package id="Microsoft.Bcl.Async" version="1.0.168" targetFramework="net40" /> <package id="Microsoft.Bcl.Build" version="1.0.21" targetFramework="net40" /> <package id="Microsoft.Net.Http" version="2.2.29" targetFramework="net40" /> <package id="Newtonsoft.Json" version="7.0.1" targetFramework="net40" /> <package id="Zlib.Portable.Signed" version="1.11.0" targetFramework="net40" /> </packages>

VIDEO

ON-PREMISE OFFLINE SDK

Get 60 Day Free Trial

See also:

ON-DEMAND REST WEB API

Get Your API Key

See also:

Tutorials:

prev
next