PDF rectangle to text tutorial - extract text from PDF in C# and VB - ByteScout

PDF rectangle to text tutorial – extract text from PDF in C# and VB

This tutorial shows how to extract text from a given rectangle on a PDF page in C# or Visual Basic using PDF Extractor SDK. Use the source code samples below to extract text from PDF files.

C#

using System;
using System.Drawing;
using Bytescout.PDFExtractor;

// create a TextExtractor object from PDF Extractor SDK ("demo", "demo" are the trial registration name and key)
TextExtractor extractor = new TextExtractor("demo", "demo");

// load the PDF document to extract text from
extractor.LoadDocumentFromFile("sample2.pdf");

// define the rectangle to read: origin at (0, 0), 200 wide and 200 high
RectangleF location = new RectangleF(0, 0, 200, 200);

// restrict the extractor to this rectangle
extractor.SetExtractionArea(location);

// page indexes are zero-based, so 0 is the first page
int pageIndex = 0;

// get the text inside the rectangle on that page
string extractedString = extractor.GetTextFromPage(pageIndex);

// write the extracted text to the console
Console.WriteLine("Extracted from page #" + pageIndex + ":\r\n" + extractedString);

VB

' requires Imports System.Drawing and Imports Bytescout.PDFExtractor at the top of the file

' create a TextExtractor object from PDF Extractor SDK ("demo", "demo" are the trial registration name and key)
Dim extractor As New TextExtractor("demo", "demo")

' load the PDF document to extract text from
extractor.LoadDocumentFromFile("sample2.pdf")

' define the rectangle to read: origin at (0, 0), 200 wide and 200 high
Dim location As New RectangleF(0, 0, 200, 200)

' restrict the extractor to this rectangle
extractor.SetExtractionArea(location)

' page indexes are zero-based, so 0 is the first page
Dim pageIndex As Integer = 0

' get the text inside the rectangle on that page
Dim extractedString As String = extractor.GetTextFromPage(pageIndex)

' write the extracted text to the console
Console.WriteLine("Extracted from page #" & pageIndex & ":" & vbCrLf & extractedString)
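
The samples pass raw numbers to RectangleF without naming a unit. Assuming the SDK interprets them as PDF points (1/72 inch), which is the default PDF user space unit but should be confirmed in the SDK documentation, a small helper can build the rectangle from millimeters measured on the page. This is only a sketch: PdfUnits and FromMillimeters are hypothetical names introduced for illustration and are not part of PDF Extractor SDK.

```csharp
using System;
using System.Drawing;

// Hypothetical helper, not part of PDF Extractor SDK.
static class PdfUnits
{
    // Default PDF user space: 72 points per inch (assumed to be the SDK's unit).
    const float PointsPerInch = 72f;
    const float MillimetersPerInch = 25.4f;

    // Convert a rectangle given in millimeters to a rectangle in PDF points.
    public static RectangleF FromMillimeters(float x, float y, float width, float height)
    {
        float scale = PointsPerInch / MillimetersPerInch;
        return new RectangleF(x * scale, y * scale, width * scale, height * scale);
    }
}

static class Program
{
    static void Main()
    {
        // A region 25.4 mm (1 inch) wide and 50.8 mm (2 inches) high at the origin.
        RectangleF area = PdfUnits.FromMillimeters(0f, 0f, 25.4f, 50.8f);
        Console.WriteLine("Width in points: " + area.Width);
        Console.WriteLine("Height in points: " + area.Height);
    }
}
```

The resulting RectangleF could then be passed to SetExtractionArea in place of the hard-coded 200x200 rectangle, provided the unit assumption holds for your SDK version.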
