In this program, we’re going to see how you can take a PDF and export its data to different formats such as CSV, JSON, XML, and Excel. We have one input file, which contains a simple table. Let’s include the PDF file in our program: copy and paste it into the Solution Explorer window, and make sure it is copied to the output directory. Next, we create an instance of the extractor for the format we want. For CSV, we have the CSVExtractor. For JSON, we have the JSONExtractor.
For XML we have the XMLExtractor, and for Excel we have the XLSExtractor. Then we load the document and set extra options, if any. For example, to change the default CSV separator we can use CSVSeparatorSymbol. In the Excel extractor, to put each page on its own worksheet we can enable the PageToWorksheet property. Lastly, we save the output.
For the CSV extractor: using (CSVExtractor csvExtractor = new CSVExtractor("demo", "demo")), passing the registration name and key. Then load the document, csvExtractor.LoadDocumentFromFile("sample_program10.pdf"), and write the output, csvExtractor.SaveCSVToFile("result.csv"). Now execute it. We can see the program compiled and the output written to the bin folder. There is the result.csv file, which contains all the data in CSV format.
Now let us see how to change the separator symbol: csvExtractor.CSVSeparatorSymbol = "^"; With the caret in place of the comma, we execute it again, and we can see the file contains the data correctly, separated by carets.
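Put together, the CSV steps described so far might look like the sketch below. It is assembled from the calls named in the walkthrough. The Bytescout.PDFExtractor namespace and the exact casing of SaveCSVToFile are assumptions here, so check them against the SDK reference for your version.

```csharp
using Bytescout.PDFExtractor; // assumed namespace for the extractor classes

class Program
{
    static void Main()
    {
        // "demo" / "demo" are the registration name and key used in the walkthrough.
        using (CSVExtractor csvExtractor = new CSVExtractor("demo", "demo"))
        {
            // The PDF must have been copied to the output directory.
            csvExtractor.LoadDocumentFromFile("sample_program10.pdf");

            // Optional: replace the default comma separator with a caret.
            csvExtractor.CSVSeparatorSymbol = "^";

            // Method casing assumed; the file lands next to the executable in bin.
            csvExtractor.SaveCSVToFile("result.csv");
        }
    }
}
```

Leaving out the CSVSeparatorSymbol line gives the comma-separated output from the first run.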
Likewise for XML: using (XMLExtractor xmlExtractor = new XMLExtractor("demo", "demo")); then xmlExtractor.LoadDocumentFromFile("sample_program10.pdf"); and then xmlExtractor.SaveXMLToFile("result.xml"). Let’s execute it, and we have result.xml, which has all the data in XML format.
Likewise for JSON: using (JSONExtractor jsonExtractor = new JSONExtractor("demo", "demo")); then jsonExtractor.LoadDocumentFromFile("sample_program10.pdf"); and then jsonExtractor.SaveJSONToFile("result.json"). Let’s run it, and we have result.json, which has all the data in JSON format.
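The XML and JSON steps follow the same load-then-save pattern, so they can sit side by side in one program. This is a sketch only, built from the calls named in the walkthrough. The Bytescout.PDFExtractor namespace is an assumption, and each extractor is given its own using block so that it is disposed as soon as its file is written.

```csharp
using Bytescout.PDFExtractor; // assumed namespace for the extractor classes

class Program
{
    static void Main()
    {
        const string input = "sample_program10.pdf";

        // XML export: same registration name and key as in the walkthrough.
        using (XMLExtractor xmlExtractor = new XMLExtractor("demo", "demo"))
        {
            xmlExtractor.LoadDocumentFromFile(input);
            xmlExtractor.SaveXMLToFile("result.xml");
        }

        // JSON export: only the extractor class and the save call change.
        using (JSONExtractor jsonExtractor = new JSONExtractor("demo", "demo"))
        {
            jsonExtractor.LoadDocumentFromFile(input);
            jsonExtractor.SaveJSONToFile("result.json");
        }
    }
}
```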
Lastly, let us go for XLS: using (XLSExtractor xlsExtractor = new XLSExtractor("demo", "demo")); then xlsExtractor.LoadDocumentFromFile("sample_program10.pdf"); and then xlsExtractor.SaveXLSToFile("result.xls"). Let us run it.
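For the Excel export, the per-page worksheet option mentioned earlier can be switched on before saving. The sketch below makes three assumptions: that PageToWorksheet is a boolean property, that the classes live in the Bytescout.PDFExtractor namespace, and that the save method is named as spoken in the walkthrough. Your SDK version may spell the save method differently, so confirm it in the SDK reference.

```csharp
using Bytescout.PDFExtractor; // assumed namespace for the extractor classes

class Program
{
    static void Main()
    {
        using (XLSExtractor xlsExtractor = new XLSExtractor("demo", "demo"))
        {
            xlsExtractor.LoadDocumentFromFile("sample_program10.pdf");

            // Assumed boolean: write each PDF page to its own worksheet.
            xlsExtractor.PageToWorksheet = true;

            // Method name taken from the walkthrough; verify against the SDK reference.
            xlsExtractor.SaveXLSToFile("result.xls");
        }
    }
}
```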
It’s that easy to get data and convert it to whatever format you like.