Key Benefits
- Replace manual data entry workflows with an automatic, low-code data extraction engine;
- Built-in data extraction engine provides up to 10x faster time to market;
- No cloud or Internet required to process documents;
- Battle-tested by thousands of companies;
- Optional customization, integration, training, expert review, maintenance.
- REST API interface for use from almost all programming and scripting languages;
- Available Modules:
  - PDF Extractor – extract unstructured data from PDF as CSV, XML, JSON or plain text. Get attachments and form fields, and run OCR processing in multiple languages;
  - HTML to PDF – generate high-quality PDF from HTML and HTML-based templates;
  - PDF Generator – split PDF, merge PDF, fill PDF forms, add text and images to PDF, set or remove security options;
  - PDF To HTML – convert PDF into an HTML representation with text layout, tables, styles, font styles, images and vectors;
  - PDF To Image – render PDF to PNG or JPG with high quality and high resolution. Control rendering by enabling layers of text, images and vectors;
  - Document Parser – extract data from documents based on templates that require no code. Automatically detects tables, extracts fields and supports multi-page tables;
  - Barcode Reader – read barcodes from images and PDF documents. Reads QR Code, Datamatrix, Code 128, Code 39 and dozens of other 1D and 2D barcode types;
  - Barcode Generator – generate QR Code, QR Code with images inside, Datamatrix, PDF417, Code 128, GS1 and postal codes. All popular barcode types are supported;
  - Email Extractor – extract detailed information about email messages and extract attachments;
  - Spreadsheet – convert XLS and XLSX to CSV, XML, JSON. Generate PDF from spreadsheets;
  - Document To PDF – convert documents and spreadsheets to PDF.
Key Features
- Deploy in less than 30 minutes;
- Optimized for use on Amazon AWS, Microsoft Azure and Google Cloud servers;
- Works offline and can process data solely on your server (even without access to the Internet);
- Extract data and text, tables, emails, text from images, attachments, scanned documents, spreadsheets, barcodes and other types of documents;
- Powered by AI and machine learning, supports damaged and mixed content documents;
- Supports multiple approaches to unstructured data extraction: automatic, low-code templates, raw data extraction;
- Process sensitive data on your own server. Optional cloud storage is also supported;
- REST API interfaces for Java, VB, C#, JavaScript, PHP, cURL, ASP.NET, Python, etc.;
- Hundreds of source code samples that you can copy and paste from;
- High priority technical support and updates are included;
Enterprise Features
- Easy to deploy;
- Scalable and configurable;
- Customizable with multiple modules;
- Configured for max performance;
- Can work with sensitive data, no cloud is required;
- High priority support and updates from our experts;
Technical Details
- REST API with dozens of data handling methods that can be called from Java, PHP, JavaScript, .NET languages or even from cURL inside your intranet;
- Built upon modular architecture so you may select functions you need and turn off other functions;
- Battle-tested on millions of files by thousands of users of ByteScout Tools and the on-demand Web API;
- Secure document storage with options for:
  - local file storage (on the same server);
  - optional remote secure storage based on an Amazon AWS S3 bucket with optional encryption;
- Background jobs are supported for processing large files (for example, PDF documents with 2000+ pages);
- Runs on standard Windows IIS Server (on Windows Server 2012 and higher). Can run along with other web applications on the machine;
- Control over performance: use optimized mode if ByteScout API Server shares the server with other applications, or use “exclusive” mode, which dedicates the server’s CPU and memory to maximum performance;
- Data extraction functionality included: PDF tables to CSV, JSON and XML. Image-to-text extraction is supported through built-in Optical Character Recognition (OCR). Non-English languages such as Spanish, German and many others are supported for scanned documents;
- Document Data extraction tools: invoice parser engine for automated invoice parsing, custom data extraction based on pre-made templates (full-featured template editor is included);
- PDF tools functionality included: PDF optimization, PDF merging and splitting, PDF page reordering, rendering of PDF to PNG, PDF to JPG, PDF to multi-page TIFF, PDF to Searchable PDF using OCR;
- Full set of PDF generation tools: converting HTML to PDF, URL to PDF, images to new PDF, JPEG to PDF, PNG to PDF, TIFF to PDF, RTF to PDF, DOC to PDF, DOCX to PDF, PPT to PDF;
- Barcode tools:
  - Can decode dozens of popular barcode types from both images and PDF. Supported types include Code 128, Code 39, QR Code, Datamatrix, PDF417, UPC-A, EAN, Interleaved 2 of 5, GS1, GS1 Databar and many others. Supports noisy pictures and scanned PDF files with rotated, skewed and damaged barcodes;
  - Can generate all modern barcode types: Code 39, Code 128, QR Code, PDF417, Datamatrix, GS1 Databar barcodes, EAN, UPC-A, UPC-E and postal barcodes. Supports options and variations for subformats;
- E-Signature tools: build your own e-signature solution;
- Support and updates are available from the engineering team. We can help you create your first Proof of Concept based on ByteScout API Server.
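
As a rough illustration of the REST API and background-job items above, here is a minimal Python sketch of how a client inside the intranet might submit a large document and wait for the result. Everything specific in it is an assumption made for illustration: the base URL, the `pdf/extract/csv` path, the `async` flag, and the `status` field with its `success` and `failed` values are hypothetical placeholders, not documented ByteScout API Server routes or fields. Check the product documentation for the real names.

```python
import json
import time
import urllib.request

# Hypothetical base URL of a server inside your intranet (placeholder only).
BASE_URL = "http://localhost:8080/api"


def build_json_request(path, payload, base_url=BASE_URL):
    """Build (but do not send) a POST request carrying a JSON body."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{path.lstrip('/')}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=30):
    """Send a prepared request and decode the JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def wait_for_job(fetch_status, interval=2.0, max_attempts=30, sleep=time.sleep):
    """Poll a background job until it reports a final state.

    fetch_status is any callable returning a dict with a "status" key.
    The key name and the "success"/"failed" values are assumptions.
    """
    for _ in range(max_attempts):
        status = fetch_status()
        if status.get("status") in ("success", "failed"):
            return status
        sleep(interval)
    raise TimeoutError("background job did not finish in time")


# Hypothetical request: extract tables from a large PDF as a background job.
request = build_json_request(
    "pdf/extract/csv", {"url": "file:///data/report.pdf", "async": True}
)
```

Polling is kept separate from the transport so that the same loop works with `send`, with cURL wrapped in a subprocess, or with any other HTTP client.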
Server Requirements
- Windows Server 2012 or higher;
- Microsoft .NET Framework 4.0 or higher installed;
- IIS server;
- 500 MB or more free file space.
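
Before installing, the disk space requirement above can be checked with a short script. This is a minimal sketch, assuming that 500 MB means 500 × 1024 × 1024 bytes and that the check runs against the drive you plan to install on. The function names are made up for illustration and are not part of the product.

```python
import shutil

# 500 MB from the requirements list, interpreted here as binary megabytes (assumption).
REQUIRED_FREE_BYTES = 500 * 1024 * 1024


def has_enough_space(free_bytes, required=REQUIRED_FREE_BYTES):
    """Return True if the given free space meets the requirement."""
    return free_bytes >= required


def check_install_path(path="."):
    """Check free space on the drive that contains the given path."""
    usage = shutil.disk_usage(path)
    return has_enough_space(usage.free)


if __name__ == "__main__":
    print("Enough free space:", check_install_path("."))
```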