Solution blueprint
Introduction
This is a high-level diagram of our solution, detailing the primary modules and the steps involved in processing and accessing the information.
Platform Architecture Overview
AI Data Processor Module: This module is responsible for interpreting the uploaded files and extracting the relevant information. The type of information extracted depends on the file type; currently, we support utility invoices exclusively. Our development efforts are focused on expanding capabilities to include certificates, purchase orders, other invoice types, and more complex documents.
Data Validation & Heuristic Module: This module executes various heuristics and validations on the information extracted from documents. Its primary objective is to ascertain the accuracy of the information and maintain consistency across all processed documents.
Auditable Storage Module: This module manages the storage of all processed information. It creates a data point that encompasses the structured information extracted and the original document from which this information was sourced. At the moment, we handle utility data points that encompass electricity, gas, and water utilities. Any alterations to a data point are meticulously tracked to document what changes were made to the information and who made them. Optionally, we can leverage blockchain technology to enable public auditing of this information.
Audit View: Data Points are accessible via our Document AI API or through a link that generates an HTML page. This feature is particularly beneficial for users who wish to make the information available for auditing purposes. The Audit View displays the entire change history of a data point, along with the original document.
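To make the validation and audit trail described above more concrete, here is a minimal sketch in Python. It is not our implementation (the platform itself is built with Elixir): the class names, field names such as `consumption`, `period_start`, and `period_end`, and the specific checks are all assumptions chosen for illustration. It shows a data point that keeps the extracted values together with a reference to the original document, records who changed what, and runs a few simple heuristics.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any

SUPPORTED_UTILITIES = {"electricity", "gas", "water"}


@dataclass
class Change:
    """One audit-trail entry: which field changed, how, and who changed it."""
    field_name: str
    old_value: Any
    new_value: Any
    changed_by: str
    changed_at: datetime


@dataclass
class DataPoint:
    """Extracted values plus a reference to the original document."""
    utility: str
    values: dict[str, Any]
    source_document: str  # hypothetical reference to the uploaded file
    history: list[Change] = field(default_factory=list)

    def update(self, field_name: str, new_value: Any, changed_by: str) -> None:
        """Apply a change and record it so the full history can be displayed."""
        old_value = self.values.get(field_name)
        self.history.append(
            Change(field_name, old_value, new_value, changed_by,
                   datetime.now(timezone.utc))
        )
        self.values[field_name] = new_value


def validate(point: DataPoint) -> list[str]:
    """Run illustrative heuristics; return the problems found (empty if none)."""
    problems = []
    if point.utility not in SUPPORTED_UTILITIES:
        problems.append(f"unsupported utility: {point.utility}")
    consumption = point.values.get("consumption")
    if not isinstance(consumption, (int, float)) or consumption < 0:
        problems.append("consumption must be a non-negative number")
    # ISO 8601 date strings sort chronologically, so plain comparison works.
    start = point.values.get("period_start")
    end = point.values.get("period_end")
    if start is not None and end is not None and start > end:
        problems.append("billing period starts after it ends")
    return problems


# Usage: validate an extracted data point, then correct a value.
point = DataPoint(
    utility="electricity",
    values={"consumption": 412.5,
            "period_start": "2024-01-01",
            "period_end": "2024-01-31"},
    source_document="invoice-001.pdf",
)
problems = validate(point)
point.update("consumption", 421.5, changed_by="auditor@example.com")
```

Appending to `history` on every update, rather than overwriting values silently, is what allows a view like the Audit View to show the complete change history next to the source document.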
Infrastructure
Our infrastructure is divided into two main components:
Studio / API: This component runs on fly.io and is built with Phoenix and Elixir.
Our AI module uses the OpenAI APIs and the AWS Textract service.
Tesseract OCR and Google Document AI OCR (coming soon)
Authentication
Our APIs are designed for backend access by our users. To use them, you must have a valid access token. You can choose how long the token remains valid; the available options are 30, 60, or 90 days. The token allows you to execute all operations and access all our endpoints. You may also revoke the token at any time.
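The token lifecycle described above can be sketched as follows. This is an illustrative model only, not our API: `AccessToken`, `issue_token`, and `is_valid` are hypothetical names, and the token format is an assumption. It shows the three rules from the text: the duration must be 30, 60, or 90 days, the token stops working once it expires, and it can be revoked at any time.

```python
import secrets
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

ALLOWED_DURATIONS_DAYS = (30, 60, 90)


@dataclass
class AccessToken:
    """Hypothetical model of an access token with expiry and revocation."""
    value: str
    expires_at: datetime
    revoked: bool = False

    def revoke(self) -> None:
        """Invalidate the token immediately, regardless of its expiry date."""
        self.revoked = True

    def is_valid(self, now: Optional[datetime] = None) -> bool:
        """A token is valid if it has not been revoked and has not expired."""
        now = now or datetime.now(timezone.utc)
        return not self.revoked and now < self.expires_at


def issue_token(duration_days: int, now: Optional[datetime] = None) -> AccessToken:
    """Create a token valid for one of the allowed durations."""
    if duration_days not in ALLOWED_DURATIONS_DAYS:
        raise ValueError(
            f"duration must be one of {ALLOWED_DURATIONS_DAYS} days"
        )
    now = now or datetime.now(timezone.utc)
    # The random URL-safe string is an assumed format for illustration.
    return AccessToken(
        value=secrets.token_urlsafe(32),
        expires_at=now + timedelta(days=duration_days),
    )


# Usage: issue a 30-day token, check it over time, then revoke it.
issued_at = datetime(2024, 1, 1, tzinfo=timezone.utc)
token = issue_token(30, now=issued_at)
valid_on_day_29 = token.is_valid(issued_at + timedelta(days=29))
valid_on_day_31 = token.is_valid(issued_at + timedelta(days=31))
token.revoke()
valid_after_revoke = token.is_valid(issued_at + timedelta(days=1))
```

Because a single token grants access to every endpoint, choosing the shortest duration that fits your workflow and revoking tokens you no longer need limits the impact of a leaked token.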