How to Extract PDF with Multiple Columns to Text using PDF Multitool? - ByteScout

Step by step guide:

1. In this tutorial, we will learn how to extract text from a PDF document with multiple columns, such as a periodical, a newspaper, or a school newspaper. We will use a PDF document with three columns and preserve its text formatting in the output.

Screenshot of Source PDF File With Multiple Columns

2. First, let’s open our PDF file in PDF Multitool.

Open Document In PDF Multitool

3. Next, in the left navigation panel, click Extract as TXT under the Data Extraction folder.

PDF Multitool Extract as TXT Function

4. This opens a small window where you can set options for the text extraction. This demonstration covers only the options below. The OCR settings are useful for scanned PDF files and are covered in a separate tutorial.

  • The Column layout option combines multiple columns into a single column.
  • The Preserve text formatting option keeps the original document’s format.
  • The Extract current page option extracts only the page currently shown in PDF Multitool.
  • The Extract page range option lets you set the page numbers that you want to extract.
  • The Preview button lets you preview the output before you save it to your computer.
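To make the difference between a preserved layout and a single column more concrete, here is a minimal Python sketch on toy data. It is only an illustration of the two reading orders, not how PDF Multitool works internally. The sample rows, the fixed column width of 13 characters, and the helper names split_columns and to_single_column are all assumptions made for this example.

```python
# Toy three-column page. Each tuple is one visual row across the columns.
# Illustrative data only, not taken from the tutorial's PDF.
ROWS = [
    ("Alpha one", "Beta one", "Gamma one"),
    ("Alpha two", "Beta two", "Gamma two"),
    ("Alpha three", "Beta three", "Gamma three"),
]

# Assumed fixed column width, so columns start at offsets 0, 13 and 26.
WIDTH = 13
COLUMN_STARTS = [0, WIDTH, 2 * WIDTH]

# "Preserved formatting": every text line keeps all three columns side by side.
PAGE = ["".join(cell.ljust(WIDTH) for cell in row).rstrip() for row in ROWS]


def split_columns(lines, starts):
    """Slice each line at the given offsets and return one list per column."""
    bounds = list(starts) + [None]  # None means "to the end of the line"
    columns = [[] for _ in starts]
    for line in lines:
        for i in range(len(starts)):
            cell = line[bounds[i]:bounds[i + 1]].strip()
            if cell:
                columns[i].append(cell)
    return columns


def to_single_column(lines, starts):
    """Read column 1 top to bottom, then column 2, then column 3."""
    merged = []
    for column in split_columns(lines, starts):
        merged.extend(column)
    return merged


if __name__ == "__main__":
    print("Preserved layout:")
    print("\n".join(PAGE))
    print()
    print("Single column:")
    print("\n".join(to_single_column(PAGE, COLUMN_STARTS)))
```

In the preserved layout, each output line still contains text from all three columns. In the single-column version, the whole first column comes first, followed by the second and the third.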

We will use the default settings and click on the Extract to File button to save the TXT file.

Extract as TXT Settings Page

5. Great! We have converted the PDF document to TXT successfully.

In this tutorial, we learned how to convert a PDF document with multiple columns into text. We preserved the three-column format in the TXT output and covered some of the basic options in PDF Multitool.

Screenshot of PDF to TXT With Multiple Columns Output
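
If you want to check a saved TXT file for a preserved column layout without opening it, the following Python sketch is one possible approach. It assumes the columns are separated by runs of spaces that stay blank on every non-empty line, which may not hold for real documents (for example, a headline spanning all columns breaks the check). The function names find_gutters and count_columns are made up for this example.

```python
def find_gutters(text, min_width=2):
    """Return (start, end) character ranges that are blank on every non-empty line.

    Ranges narrower than min_width are ignored, so ordinary single spaces
    between words are not mistaken for column gaps. Leading indentation
    (a blank run starting at position 0) is ignored as well.
    """
    lines = [line.rstrip() for line in text.splitlines() if line.strip()]
    if not lines:
        return []
    width = max(len(line) for line in lines)
    padded = [line.ljust(width) for line in lines]
    blank = [all(line[i] == " " for line in padded) for i in range(width)]

    gutters = []
    start = None
    # The trailing False closes any blank run that reaches the end.
    for i, is_blank in enumerate(blank + [False]):
        if is_blank and start is None:
            start = i
        elif not is_blank and start is not None:
            if i - start >= min_width and start > 0:
                gutters.append((start, i))
            start = None
    return gutters


def count_columns(text, min_width=2):
    """Estimate the number of side-by-side columns as gutters plus one."""
    return len(find_gutters(text, min_width)) + 1
```

To try it on the extracted file, read the file into a string first, for example with open("output.txt", encoding="utf-8").read(), where output.txt is a placeholder for the name you chose in the Extract to File dialog, and pass the result to count_columns. Checking one page at a time is likely to work better than checking the whole file, since column positions can differ between pages.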