How to extract data from PDF to Text or CSV in PHP using Cloud API (low level) - ByteScout

You may use the source code samples below to extract data from PDF to Text or CSV in PHP using Cloud API (low level).

Also, check these code samples showing how to extract and convert spreadsheets between various file formats in PHP using Cloud API.

PDF to Plain Text

POST/GET endpoint:

https://bytescout.io/v1/pdf/convert/to/text

Code Sample (PDF to Text):

<?php
require_once(__DIR__ . '/vendor/autoload.php');
 
// Configure API key authorization: api_key
Bytescout\Client\API\Configuration::getDefaultConfiguration()->setApiKey('x-api-key', 'YOUR_API_KEY');
// Uncomment below to setup prefix (e.g. Bearer) for API key, if needed
// Bytescout\Client\API\Configuration::getDefaultConfiguration()->setApiKeyPrefix('x-api-key', 'Bearer');
 
$api_instance = new Swagger\Client\Api\DefaultApi();
$pages = ''; // String | Comma-separated list of page indices (or ranges) to process. Leave empty for all pages. Example: '0,2-5,7-'.
$name = 'result.txt'; // String | File name for generated result (placeholder value).
$url = 'https://example.com/sample.pdf'; // String | URL of the source PDF file (placeholder value).
 
try {
    $result = $api_instance->pdfConvertToTextPost($pages, $name, $url);
    print_r($result);
} catch (Exception $e) {
    echo 'Exception when calling DefaultApi->pdfConvertToTextPost: ', $e->getMessage(), PHP_EOL;
}
?>

Although the program is simple and straightforward, let's walk through its main parts.

1. First, we provide the API key so that the server can authenticate and process the request. The API key is issued upon registration with pdf.co.

Bytescout\Client\API\Configuration::getDefaultConfiguration()->setApiKey('x-api-key', 'YOUR_API_KEY');
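Hardcoding the key in source files makes it easy to leak. As a minimal sketch, the key could be read from an environment variable instead. The variable name BYTESCOUT_API_KEY and the helper loadApiKey() are assumptions made for this illustration and are not part of the SDK.

```php
<?php
// Hypothetical helper: reads the API key from an environment variable.
// The variable name BYTESCOUT_API_KEY is an assumption for this sketch.
function loadApiKey(string $envName = 'BYTESCOUT_API_KEY'): string
{
    $key = getenv($envName);
    if ($key === false || trim($key) === '') {
        throw new RuntimeException("Environment variable {$envName} is not set.");
    }
    return trim($key);
}

// Possible usage with the configuration call shown above:
// Bytescout\Client\API\Configuration::getDefaultConfiguration()
//     ->setApiKey('x-api-key', loadApiKey());
```

Failing early with an exception keeps a missing key from surfacing later as an authentication error from the server.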

2. Next, we prepare the API request by setting the necessary parameters: the pages to extract text from, the name of the output file, and the URL of the source file.

$api_instance = new Swagger\Client\Api\DefaultApi();
$pages = ''; // String | Comma-separated list of page indices (or ranges) to process. Leave empty for all pages. Example: '0,2-5,7-'.
$name = 'result.txt'; // String | File name for generated result (placeholder value).
$url = 'https://example.com/sample.pdf'; // String | URL of the source PDF file (placeholder value).
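Since the pages parameter is a free-form string, it can help to check it locally before sending the request. The sketch below validates the format described in the comment above (indices and ranges such as '0,2-5,7-'). The function name isValidPagesSpec() is hypothetical, and the exact rules the server applies are an assumption; for example, this sketch rejects ranges without a start index.

```php
<?php
// Hypothetical validator for the "pages" string, e.g. '0,2-5,7-'.
// Assumed rules: each item is an index ("3"), a closed range ("2-5"),
// or an open-ended range ("7-"); an empty string means all pages.
function isValidPagesSpec(string $pages): bool
{
    if ($pages === '') {
        return true; // empty means "process all pages"
    }
    foreach (explode(',', $pages) as $item) {
        if (!preg_match('/^(\d+)(?:-(\d*))?$/D', trim($item), $m)) {
            return false; // not an index or a range
        }
        // For closed ranges, the end must not precede the start.
        if (isset($m[2]) && $m[2] !== '' && (int) $m[2] < (int) $m[1]) {
            return false;
        }
    }
    return true;
}
```

Calling isValidPagesSpec($pages) before the API call lets the script reject a malformed value without spending a request.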

3. Finally, we execute the request and process the response.

$result = $api_instance->pdfConvertToTextPost($pages, $name, $url);
print_r($result);
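Because the endpoint accepts GET as well as POST, the same request can also be prepared without the generated client. The following is a rough sketch that only builds the request URL. It assumes the query parameter names match the SDK argument names (pages, name, url) and that the key is sent as an x-api-key HTTP header; neither is confirmed here, and buildConvertUrl() is a hypothetical helper.

```php
<?php
// Endpoint taken from this article.
const PDF_TO_TEXT_ENDPOINT = 'https://bytescout.io/v1/pdf/convert/to/text';

// Hypothetical helper: builds a GET URL for the conversion endpoint.
// Assumption: query parameter names mirror the SDK arguments.
function buildConvertUrl(
    string $endpoint,
    string $sourceUrl,
    string $name = '',
    string $pages = ''
): string {
    $params = ['url' => $sourceUrl];
    if ($name !== '') {
        $params['name'] = $name;
    }
    if ($pages !== '') {
        $params['pages'] = $pages;
    }
    // RFC 3986 encoding so that ':' '/' and ',' are percent-encoded.
    return $endpoint . '?' . http_build_query($params, '', '&', PHP_QUERY_RFC3986);
}

// Possible usage (assumes the key travels in an x-api-key header):
// $context = stream_context_create(['http' => ['header' => 'x-api-key: YOUR_API_KEY']]);
// $response = file_get_contents(
//     buildConvertUrl(PDF_TO_TEXT_ENDPOINT, 'https://example.com/sample.pdf'),
//     false,
//     $context
// );
```

Passing the CSV endpoint as the first argument would produce the equivalent URL for the PDF to CSV conversion.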

PDF to CSV

POST/GET endpoint:

https://bytescout.io/v1/pdf/convert/to/csv

Code Sample (PDF to CSV):

<?php
require_once(__DIR__ . '/vendor/autoload.php');
 
// Configure API key authorization: api_key
Bytescout\Client\API\Configuration::getDefaultConfiguration()->setApiKey('x-api-key', 'YOUR_API_KEY');
// Uncomment below to setup prefix (e.g. Bearer) for API key, if needed
// Bytescout\Client\API\Configuration::getDefaultConfiguration()->setApiKeyPrefix('x-api-key', 'Bearer');
 
$api_instance = new Swagger\Client\Api\DefaultApi();
$pages = ''; // String | Comma-separated list of page indices (or ranges) to process. Leave empty for all pages. Example: '0,2-5,7-'.
$name = 'result.csv'; // String | File name for generated result (placeholder value).
$url = 'https://example.com/sample.pdf'; // String | URL of the source PDF file (placeholder value).
 
try {
    $result = $api_instance->pdfConvertToCsvPost($pages, $name, $url);
    print_r($result);
} catch (Exception $e) {
    echo 'Exception when calling DefaultApi->pdfConvertToCsvPost: ', $e->getMessage(), PHP_EOL;
}
?>

This program follows the same logic to generate CSV output from the input PDF.
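Once the CSV text is available as a string, it can be turned into rows with PHP's built-in CSV functions. The sketch below assumes the converted output has already been obtained as plain comma-separated text; how the API delivers it is not covered here, and parseCsvText() is a hypothetical helper.

```php
<?php
// Hypothetical helper: parses CSV text into an array of rows.
// A temporary stream is used so that quoted fields containing commas
// or line breaks are handled by fgetcsv().
function parseCsvText(string $csvText): array
{
    $stream = fopen('php://temp', 'r+');
    fwrite($stream, $csvText);
    rewind($stream);

    $rows = [];
    while (($row = fgetcsv($stream, 0, ',', '"', '\\')) !== false) {
        if ($row === [null]) {
            continue; // fgetcsv() reports a blank line as [null]
        }
        $rows[] = $row;
    }
    fclose($stream);

    return $rows;
}
```

Each row is returned as an array of strings, so numeric columns still need to be cast before any arithmetic.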

We hope you find this article useful.
Happy coding!
