Extract PDF to XML file in C# using PDF Extractor SDK - ByteScout


The sample below shows how to convert a PDF document to an XML (eXtensible Markup Language) file using Bytescout PDF Extractor SDK. The same conversion is also available from the PDF Extractor SDK Dashboard and from Bytescout PDF Viewer (Data Extraction > Extract as XML):

C#

using System;
using System.Diagnostics;
using Bytescout.PDFExtractor;

namespace PDFtoXML
{
    class Program
    {
        static void Main(string[] args)
        {
            // Create Bytescout.PDFExtractor.XMLExtractor instance
            XMLExtractor extractor = new XMLExtractor();
            extractor.RegistrationName = "demo";
            extractor.RegistrationKey = "demo";

            // Load sample PDF document
            extractor.LoadDocumentFromFile("sample3.pdf");

            // Extract the document content and save it as XML
            extractor.SaveXMLToFile("output.xml");

            // Release resources held by the extractor
            extractor.Dispose();

            Console.WriteLine();
            Console.WriteLine("Data has been extracted to 'output.xml' file.");
            Console.WriteLine();
            Console.WriteLine("Press any key to continue and open output.xml in the default viewer...");
            Console.ReadKey();

            // Open the result in the default viewer. UseShellExecute is set explicitly
            // because it defaults to false on .NET Core and later.
            Process.Start(new ProcessStartInfo("output.xml") { UseShellExecute = true });
        }
    }
}
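
Once the sample above has written output.xml, a quick way to see what the file contains is to count its elements by name. The sketch below is only an illustration: the article does not describe the XML layout the SDK produces, so the code assumes no particular element names and uses only the standard System.Xml.Linq API. The names XmlSummary, CountElements and XmlSummaryDemo are hypothetical helpers, not part of PDF Extractor SDK.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

static class XmlSummary
{
    // Counts how many times each element name occurs in the document
    // (the root element is included in the count).
    public static Dictionary<string, int> CountElements(XDocument doc)
    {
        return doc.Descendants()
                  .GroupBy(e => e.Name.LocalName)
                  .ToDictionary(g => g.Key, g => g.Count());
    }
}

class XmlSummaryDemo
{
    static void Main(string[] args)
    {
        // Defaults to the file written by the sample above;
        // another path can be passed as the first argument.
        string path = args.Length > 0 ? args[0] : "output.xml";
        XDocument doc = XDocument.Load(path);

        // Print one "name: count" line per element name, sorted by name.
        foreach (var pair in XmlSummary.CountElements(doc).OrderBy(p => p.Key))
        {
            Console.WriteLine($"{pair.Key}: {pair.Value}");
        }
    }
}
```

The element names printed by this helper give a starting point for targeted queries, for example with doc.Descendants("name"), once the actual structure of the output is known.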
