How to PDF to HTML in JavaScript & jQuery using Cloud API (low level) - ByteScout

Use the sample source code below to convert PDF to HTML in JavaScript & jQuery using ByteScout Cloud API (low level).

It is also possible to convert HTML to PDF using Cloud API.

Let’s view the source code for this example first, and then we’ll analyze it.

Sample.html

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="utf-8">
    <title>Cloud API JQuery sample</title>

    <script src="https://ajax.googleapis.com/ajax/libs/jquery/1.12.0/jquery.min.js"></script>
    <script src="pdf_to_html.js" type="text/javascript"></script>

</head>
<body>
 
    <form id="form" enctype="multipart/form-data">
        <p>
            <label>Copy-paste your API Key for ByteScout Cloud API here</label>
            <input type="text" id="apiKey" placeholder="your cloud API Key" value=""/> 
            <a href="https://secure.bytescout.com/sign_up" target="_blank">no api key yet? sign up here</a>
        </p>
        <p>
            <label>Input PDF File</label>
            <input type="file" name="file" id="inputFile" />
        </p>
        <p>
            <label>Convert To</label>
            <select id="convertType">
                <option value="text"> TXT  </option>
                <option value="csv"> CSV  </option>
                <option value="json"> JSON  </option>                
                <option value="xml"> XML  </option>
                <option value="xls"> XLS  </option>
                <option value="xlsx"> XLSX  </option>
                <option value="html"> HTML  </option>
            </select>
        </p>
        <p>
            <label>Output As</label>
            <select id="outputType">
                <option value="link"> URL to output file  </option>
                <option value="inline"> inline content</option>
            </select>
        </p>        
        <button type="button" id="submit">Convert</button> <span id="status"></span>
    </form>
 
<div id="errorBlock">
<h2>Error:</h2>
<h4>Code: <span id="statusCode"></span></h4>
<ul id="errors"></ul>
</div>

<div id="resultBlock">
<h2>Output:</h2>
<a id="result" href="" target="_blank"></a>
<div id="inlineOutput"></div>
</div>

</body>
</html>

pdf_to_html.js

//*******************************************************************************************//
//                                                                                           //
// Download Free Evaluation Version From: https://bytescout.com/download/web-installer       //
//                                                                                           //
// Also available as Web API! Free Trial Sign Up: https://secure.bytescout.com/users/sign_up //
//                                                                                           //
// Copyright © 2017-2018 ByteScout Inc. All rights reserved.                                 //
// https://bytescout.com                                                                  //
//                                                                                           //
//*******************************************************************************************//


$(document).ready(function () {
    $("#resultBlock").hide();
    $("#errorBlock").hide();
    $("#result").attr("href", '').html('');
});
 
$(document).on("click", "#submit", function () {
    $("#resultBlock").hide();
    $("#errorBlock").hide();
    $("#inlineOutput").text(''); // inline output div
    $("#status").text(''); // status div
 
    var apiKey = $("#apiKey").val().trim(); //Get your API key at https://secure.bytescout.com/cloudapi.html
 
    var formData = $("#form input[type=file]")[0].files[0]; // file to upload
    var toType = $("#convertType").val(); // output type
    var isInline = $("#outputType").val() == "inline"; // if we need output as inline content or link to output file

    $("#status").text('requesting presigned url for upload...');

    $.ajax({
        url: 'https://bytescout.io/v1/file/upload/get-presigned-url?name=test.pdf&contenttype=application/pdf&encrypt=true',
        type: 'GET',
        headers: {'x-api-key': apiKey}, // passing our api key
        success: function (result) {    

            if (result['error'] === false) {
                var presignedUrl = result['presignedUrl']; // reading provided presigned url to put our content into
                var accessUrl = result['url']; // reading output url that will indicate uploaded file

                $("#status").text('uploading...');

                $.ajax({
                    url: presignedUrl, // no api key is required to upload file
                    type: 'PUT',
                    headers: {'content-type': 'application/pdf'}, // setting to pdf type as we are uploading pdf file
                    data: formData,
                    processData: false,
                    success: function (result) {                               
                        
                        $("#status").text('converting...');

                        $.ajax({
                            url: 'https://bytescout.io/v1/pdf/convert/to/'+toType+'?url='+ presignedUrl + '&encrypt=true&inline=' + isInline,
                            type: 'POST',
                            headers: {'x-api-key': apiKey},
                            success: function (result) { 

                                $("#status").text('done converting.');

                                // console.log(JSON.stringify(result));
                                
                                $("#resultBlock").show();

                                if (isInline)
                                {                                    
                                    $("#inlineOutput").text(result['body']);
                                }
                                else {
                                    $("#result").attr("href", result['url']).html(result['url']);
                                }
                                
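                            },
                            error: function () {
                                // report a failed conversion request instead of leaving 'converting...' on screen
                                $("#status").text('error');
                            }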
                        });
                

                    },
                    error: function () {
                        $("#status").text('error');
                    }
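                });
            } else {
                // the API answered but reported an error
                $("#status").text('error');
            }
        },
        error: function () {
            // the request for a presigned url itself failed
            $("#status").text('error');
        }
    });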
});
 

Let’s review the logic.

Sample.html

This HTML file defines the page structure. The following are the main elements:

  • A text input for the ByteScout Cloud API key.
  • A file input for the PDF file to be uploaded.
  • A select element listing the formats the input PDF can be converted to. In this sample the formats are “TXT”, “CSV”, “JSON”, “XML”, “XLS”, “XLSX” and “HTML”.
  • A select element for the output type. Choose “link” to receive a URL to the result file, or “inline” to receive the content of the result file directly.
  • A button labeled “Convert”, which starts the conversion process when clicked.
  • Span and div elements that display the status of the conversion, any errors, and the result.
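
The sample reads these fields without checking them first. A minimal sketch of a validation step follows; `validateInputs` is a hypothetical helper that is not part of the ByteScout sample, and the allowed values simply mirror the option values in Sample.html.

```javascript
// Hypothetical helper, not part of the ByteScout sample.
// The allowed values mirror the <option> values of #convertType and #outputType.
var FORMATS = ["text", "csv", "json", "xml", "xls", "xlsx", "html"];
var OUTPUT_TYPES = ["link", "inline"];

function validateInputs(apiKey, file, toType, outputType) {
    var errors = [];
    if (!apiKey || apiKey.trim() === "") {
        errors.push("API key is required");
    }
    if (!file) {
        errors.push("No PDF file selected");
    }
    if (FORMATS.indexOf(toType) === -1) {
        errors.push("Unsupported target format: " + toType);
    }
    if (OUTPUT_TYPES.indexOf(outputType) === -1) {
        errors.push("Unsupported output type: " + outputType);
    }
    return errors; // an empty array means the inputs look usable
}
```

In the click handler, a non-empty result could be listed in the “errors” list inside “errorBlock” before any request is sent.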

Sample.html loads the script file “pdf_to_html.js”, which contains the event binding, the file conversion logic, and the UI updates. The script is built on jQuery.

pdf_to_html.js

This JS file handles the click on the submit button: it uploads the selected PDF file, runs the conversion, and updates the UI. Let’s review the click handler.

First, we upload the file to the cloud and get its URL so that it can be converted to the format of choice. If you already have a URL pointing to a PDF file, you can skip the steps of obtaining a pre-signed URL and uploading the file to it.

To get a pre-signed URL, we prepare a request URL like the one below, passing “test.pdf” as the file name.

url: 'https://bytescout.io/v1/file/upload/get-presigned-url?name=test.pdf&contenttype=application/pdf&encrypt=true'
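
The request URL above hard-codes the file name. A sketch of building it from an arbitrary name might look like the following; `buildPresignedUrlRequest` is a hypothetical helper, the endpoint and parameter names are copied from the sample, and it assumes the service accepts standard form-encoded query values.

```javascript
// Hypothetical helper; endpoint and parameter names are copied from the sample.
// URLSearchParams escapes spaces, '&' and other special characters in the values.
function buildPresignedUrlRequest(fileName) {
    var params = new URLSearchParams({
        name: fileName,
        contenttype: "application/pdf",
        encrypt: "true"
    });
    return "https://bytescout.io/v1/file/upload/get-presigned-url?" + params.toString();
}
```

In the click handler this could be called with the “name” property of the selected file instead of the fixed “test.pdf”.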

Once the URL is ready, we pass the API key in the request header and execute the GET request, which returns a pre-signed URL in the response.

headers: {'x-api-key': apiKey}, // passing our api key

Now that we have a pre-signed URL, we can upload the file to it. We pass the file selected by the user as the request data and execute a PUT request, as in the code snippet below. As you may have noticed, this is a two-step process: first we get a pre-signed URL, and then we upload the file to it.

...
$.ajax({
url: presignedUrl, // no api key is required to upload file
type: 'PUT',
headers: {'content-type': 'application/pdf'}, // setting to pdf type as we are uploading pdf file
data: formData,
...

Once the file is uploaded to the pre-signed URL, we start the conversion. First, we prepare the request URL as below.

url: 'https://bytescout.io/v1/pdf/convert/to/'+toType+'?url='+ presignedUrl + '&encrypt=true&inline=' + isInline,

The destination format is passed in the “toType” variable as the path segment after “to”. The pre-signed URL goes into the “url” query string parameter, and the “inline” query string parameter specifies whether we want the result as inline content or as a URL.
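
The sample concatenates the file URL into the query string unescaped. A pre-signed URL typically carries its own query string, so escaping it with `encodeURIComponent` is the conventional way to keep its “&” and “=” characters from being read as separate parameters. The sketch below uses a hypothetical helper, `buildConvertUrl`, and assumes the service decodes the “url” parameter in the standard way.

```javascript
// Hypothetical helper; endpoint and parameter names are copied from the sample.
// encodeURIComponent keeps '?', '&' and '=' inside fileUrl from leaking into
// the outer query string.
function buildConvertUrl(toType, fileUrl, isInline) {
    return "https://bytescout.io/v1/pdf/convert/to/" + encodeURIComponent(toType) +
        "?url=" + encodeURIComponent(fileUrl) +
        "&encrypt=true&inline=" + (isInline ? "true" : "false");
}
```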

Lastly, we execute the POST request, passing the API key in the headers, and handle the response, which carries the converted output either as inline content in “body” or as a link in “url”.
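
The three requests are nested as callbacks in the sample. As a rough sketch, the same flow can be written as a promise chain. Here `convertPdf` and its `request` parameter are hypothetical: `request` is assumed to take the same options object as `$.ajax` and return a promise, for example `function (o) { return Promise.resolve($.ajax(o)); }`. The URLs, headers, and response fields are the ones used in the sample.

```javascript
// Sketch only. `request` is a hypothetical function (options) -> Promise(result),
// injected so the flow can be exercised without a network connection.
function convertPdf(request, apiKey, file, toType, isInline) {
    // Step 1: ask for a pre-signed upload URL
    return request({
        url: "https://bytescout.io/v1/file/upload/get-presigned-url" +
            "?name=test.pdf&contenttype=application/pdf&encrypt=true",
        type: "GET",
        headers: { "x-api-key": apiKey }
    }).then(function (result) {
        if (result.error !== false) {
            throw new Error("could not get a pre-signed URL");
        }
        var presignedUrl = result.presignedUrl;
        // Step 2: upload the file to the pre-signed URL (no API key needed)
        return request({
            url: presignedUrl,
            type: "PUT",
            headers: { "content-type": "application/pdf" },
            data: file,
            processData: false
        }).then(function () {
            // Step 3: request the conversion, using the same URL shape as the sample
            return request({
                url: "https://bytescout.io/v1/pdf/convert/to/" + toType +
                    "?url=" + presignedUrl + "&encrypt=true&inline=" + isInline,
                type: "POST",
                headers: { "x-api-key": apiKey }
            });
        });
    });
}
```

A failure at any step rejects the returned promise, so a single `.catch` can set the status text to 'error' for all three requests.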

That’s all guys. It’s easy to convert PDF documents to other formats with ByteScout API.

Happy Coding!
