How to Extract PDF with Multiple Columns to Text using PDF Multitool? - ByteScout

In this tutorial, we will learn how to extract text from a PDF document with multiple columns, such as a periodical, newspaper, or school newspaper. We will use a PDF document with three columns and preserve its text formatting in the output.

  1. Open File in PDF Multitool
  2. Click Extract as TXT
  3. Set Function for Text Extraction
  4. Click Extract to File to Save TXT File

Screenshot of Source File: PDF With Multiple Columns

1. Open File in PDF Multitool

First, let’s open our PDF file in the PDF Multitool.

Open Document In PDF Multitool

2. Click Extract as TXT

Next, in the left navigation panel, click Extract as TXT under the Data Extraction folder.

PDF Multitool Extract as TXT Function

3. Set Function for Text Extraction

This opens a small window where you can choose and set some of the functions for the text extraction. In this demonstration, we will only cover the functions below. The OCR Settings function is very useful for scanned PDF files and is covered in a separate tutorial.

  • Column layout combines multiple columns into a single column.
  • Preserve text formatting keeps the original document’s formatting.
  • Extract current page extracts only the page currently displayed in PDF Multitool.
  • Extract page range lets you specify which pages to extract.
  • The Preview button shows what the output will look like before you save it to your computer.
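
To make the idea of a column layout more concrete, here is a minimal Python sketch of a column-aware reading order. It is not PDF Multitool's actual algorithm, which this tutorial does not describe. The Word type, the min_gap threshold, and both function names are hypothetical and exist only for this illustration. The sketch assumes each word comes with the x and y position of its left edge, and that columns are separated by a horizontal gap wider than min_gap.

```python
from typing import NamedTuple


class Word(NamedTuple):
    """Hypothetical word record; not a PDF Multitool type."""
    x: float  # left edge of the word on the page
    y: float  # distance from the top of the page
    text: str


def split_into_columns(words, min_gap=50.0):
    """Start a new column wherever the sorted x positions jump by more than min_gap."""
    columns = []
    last_x = None
    for word in sorted(words, key=lambda w: w.x):
        if last_x is None or word.x - last_x > min_gap:
            columns.append([])
        columns[-1].append(word)
        last_x = word.x
    return columns


def to_single_column(words, min_gap=50.0):
    """Read each column top to bottom, leftmost column first."""
    lines = []
    for column in split_into_columns(words, min_gap):
        rows = {}  # y position -> words on that row, kept in top-to-bottom order
        for word in sorted(column, key=lambda w: (w.y, w.x)):
            rows.setdefault(word.y, []).append(word.text)
        lines.extend(" ".join(row) for row in rows.values())
    return "\n".join(lines)


# Toy page with two columns and two rows (made-up data).
page = [
    Word(10, 0, "Local"), Word(40, 0, "news"),
    Word(200, 0, "Sports"), Word(230, 0, "scores"),
    Word(10, 12, "starts"), Word(40, 12, "here"),
    Word(200, 12, "follow"), Word(230, 12, "here"),
]
print(to_single_column(page))
```

On this toy page the left column is emitted in full before the right column, so each story stays together instead of being interleaved row by row. Real documents need more robust column detection than a fixed gap threshold.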

4. Click Extract to File to Save TXT File

We will use the default settings and click on the Extract to File button to save the TXT file.

Extract as TXT Settings Page

Great! We have converted the PDF document to TXT successfully.

In this tutorial, we learned how to convert a PDF document with multiple columns into text. We preserved the three-column format in the TXT output and covered some of the basic text extraction functions in PDF Multitool.

Screenshot of Output TXT: PDF to TXT With Multiple Columns
