Machine Learning Technology


Many businesses still receive unstructured documents in their purchase-to-pay process. Documents such as invoices are printed on paper and sent by regular post, or sent as PDF attachments to e-mails. Neither route is efficient, because ERP systems need structured, validated data before these documents can be imported. The documents must therefore first be converted to a structured version. Today, the most commonly used solutions rely either on OCR technology or on semi-manual retyping.

The first category, OCR, is known for fast processing, but it is tedious to set up, inaccurate, and requires human intervention to create templates. Optical Character Recognition (OCR) is the technology that allows machines to scan printed or handwritten documents, PDFs, or images taken by a camera, and automatically convert the image data into editable text for further processing.

Current OCR systems provide a partial solution to some manual data capture issues, but they also create new ones. Human operators must write rules and templates for every invoice layout, making maintenance a never-ending task.

The second category, retyping, gives more accurate results overall, but it is much slower because it relies on human effort, it is far more expensive than OCR, and it is difficult to scale because scaling requires additional trained staff.

Today D Soft introduces DocFlows, a new approach to the problem that combines the best of both worlds. It is fast, scalable, and affordable, and it does not require any manual input from the customer. Unlike traditional OCR systems, DocFlows does not require templates.

Because the system is document-structure agnostic and relies on machine-learning technology, it delivers increasingly accurate results with continued use.


Fast, scalable Infrastructure

DocFlows runs on a high-performance, highly scalable Kubernetes platform. We guarantee that every document is processed within seconds.


DocFlows uses a transactional model.


Because DocFlows validates against a predefined XML schema, you can rely on the result to be compliant.


Converting documents to structured files provides basic data for further in-depth analysis.

Our goal is to provide a fully hands-off document processing service. D Soft trains DocFlows for your specific market and type of documents. You provide us with samples, we train the model. DocFlows is fully integrated with DocTrails.


DocFlows is a three-step process

DocFlows relies on Machine Learning technology and custom built-in logic. Every step of the process is governed by confidence scores. DocFlows selects the candidate with the highest score and learns from that process.
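The idea of selecting the highest-scoring candidate can be illustrated with a minimal sketch. It is not DocFlows code: the `Candidate` type, the field names, and the 0.0 to 1.0 confidence scale are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One possible value for a document field (hypothetical structure)."""
    field_name: str
    value: str
    confidence: float  # assumed scale: 0.0 (no trust) to 1.0 (certain)


def select_best(candidates):
    """Return, per field, the candidate with the highest confidence score."""
    best = {}
    for candidate in candidates:
        current = best.get(candidate.field_name)
        if current is None or candidate.confidence > current.confidence:
            best[candidate.field_name] = candidate
    return best


# Illustrative data only: two competing readings for each field.
candidates = [
    Candidate("invoice_number", "2024-0017", 0.97),
    Candidate("invoice_number", "2024-0011", 0.41),
    Candidate("total", "121.00", 0.88),
    Candidate("total", "100.00", 0.62),
]
selected = select_best(candidates)
```

In this sketch `selected` maps each field name to its winning candidate. How DocFlows feeds the chosen candidates back into training is not described here, so the example leaves that part out.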

Step 1:
First, an incoming document is classified. During this step, DocFlows uses its built-in Machine Learning algorithms to determine Sender and Receiver. For new Senders, DocFlows requires 5 similar documents to train its models.

Step 2:
Secondly, DocFlows extracts data from the document and transforms it into a structured document, based on the labeled data stored in its models.

Step 3:
Finally, DocFlows evaluates the result against the required structured output. It checks line-item integrity, VAT totals, and much more.
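A consistency check of the kind mentioned in Step 3 might look like the sketch below. The function name, the dictionary keys, and the rounding rule (half-up to two decimals) are assumptions for illustration; the actual DocFlows checks are not documented here.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def check_invoice(lines, stated_net, stated_vat, stated_total):
    """Compare stated invoice amounts with amounts recomputed from line items.

    Returns a list of problem descriptions; an empty list means consistent.
    """
    net = sum((l["quantity"] * l["unit_price"] for l in lines), Decimal("0"))
    vat = sum(
        (l["quantity"] * l["unit_price"] * l["vat_rate"] for l in lines),
        Decimal("0"),
    )
    # Assumed rounding rule: round once, on the totals, half-up to cents.
    net = net.quantize(CENT, rounding=ROUND_HALF_UP)
    vat = vat.quantize(CENT, rounding=ROUND_HALF_UP)

    problems = []
    if net != stated_net:
        problems.append(f"net: expected {net}, document states {stated_net}")
    if vat != stated_vat:
        problems.append(f"vat: expected {vat}, document states {stated_vat}")
    if net + vat != stated_total:
        problems.append(f"total: expected {net + vat}, document states {stated_total}")
    return problems


# Illustrative line items with two different VAT rates.
lines = [
    {"quantity": 2, "unit_price": Decimal("50.00"), "vat_rate": Decimal("0.21")},
    {"quantity": 1, "unit_price": Decimal("20.00"), "vat_rate": Decimal("0.06")},
]
ok = check_invoice(lines, Decimal("120.00"), Decimal("22.20"), Decimal("142.20"))
bad = check_invoice(lines, Decimal("120.00"), Decimal("22.20"), Decimal("140.00"))
```

`Decimal` is used instead of floating point so that monetary amounts compare exactly.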


DocFlows works with its own system of quality labels. The result of processing in DocFlows always receives a quality label ranging from F to A. Files with an A label are validated on various criteria, such as the presence of all legally required fields, consistency between line items, invoice totals, VAT, etc. Files with an A label can in principle be imported directly.


DocFlows provides multiple input and output channels that can be combined as workflows. Simply choose one input channel that will accept your documents and combine it with an output channel that will return your processed XML documents.

Web Portal
Using the web portal, users upload documents to be processed, configure document flows, and monitor processing and overall system performance.
DocFlows Connector
When installed, DocFlows Connector will monitor a folder on your system for incoming PDF files, process them, and put the result back in another folder.
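The folder-watching pattern described for the Connector can be sketched as follows. This is only an illustration of the pattern, not the Connector itself: `process_pdf` is a hypothetical stand-in for the real conversion, and the three-folder layout (inbox, outbox, processed) is an assumption.

```python
from pathlib import Path


def process_pdf(pdf_path: Path) -> str:
    """Hypothetical stand-in for the real conversion; returns placeholder XML."""
    return f'<document source="{pdf_path.name}"/>'


def poll_once(inbox: Path, outbox: Path, processed: Path) -> list:
    """Process every PDF currently in `inbox` and write one XML file per PDF.

    Handled PDFs are moved to `processed` so they are not picked up again.
    Returns the paths of the XML files that were written.
    """
    outbox.mkdir(parents=True, exist_ok=True)
    processed.mkdir(parents=True, exist_ok=True)
    written = []
    for pdf in sorted(inbox.glob("*.pdf")):
        xml_path = outbox / (pdf.stem + ".xml")
        xml_path.write_text(process_pdf(pdf), encoding="utf-8")
        pdf.rename(processed / pdf.name)  # move out of the inbox
        written.append(xml_path)
    return written
```

A long-running watcher would call `poll_once` in a loop with a short sleep between passes. Files that are not PDFs are left untouched in the inbox.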


E-mail is certainly the simplest way to get up and running with DocFlows: just scan your documents and forward them to your DocFlows e-mail address.
Cloud Sharing Platform
Simply link DocFlows with your favorite Cloud Sharing Platform account and start processing PDF documents right away.    
Legacy environments might prefer the old-school File Transfer Protocol. Upload your documents to your DocFlows sFTP account and DocFlows returns the result in another folder.





DocFlows provides an extensive REST API with methods to configure channels, upload and download documents, and monitor performance.


DocFlows PrinterDriver

For easy communication with DocFlows, choose our DocFlows PrinterDriver. Download it from the Windows Store, install it, and send your documents to DocFlows from within any Windows app.