Discover the latest updates within Document AI

Google Cloud

At Google Cloud Next, Google announced some major updates, including its new AI Agents. These updates aim to drive business results faster using Translation Hub, Document AI & Contact Centre AI. The updates are available as of 12/10. Within Document AI, Google announced two new features: Document AI Workbench & Document AI Warehouse. In this blog post, we take a closer look at the Document AI updates, and at Document AI Warehouse in particular.

Discover the two new features within Document AI: 

  • Document AI Workbench: a tool that lowers the barriers to creating custom document processors & extracting the fields that companies want to use. With the new feature, less training data is needed and the interface for labelling data is more straightforward.
  • Document AI Warehouse: a tool that brings Google Search and Document AI together in an API & UI. This makes documents searchable, enabling a host of new workflows. We take a closer look at this feature below.

Why is Document AI Warehouse such a cool new feature?

Document AI Warehouse brings your documents, AI-extracted data & metadata together in a single platform, where they are searchable & manageable through both the interface and the API.

Technologies:

  • Google semantic search technology
  • Document AI for classification and data extraction
  • Elasticsearch managed service
  • Compute, storage & databases

Key features

Let’s take a look at the key features of Warehouse. Some new features are not yet available to the public; we’ll keep you updated on their launch dates.

  • Warehouse is built API first. Everything available in the UI, and every feature described here, is also available through the API. This makes it possible to build workflows that integrate with our Document AI Proxy.
  • All the features of Document AI Warehouse use IAM, which gives fine-grained access control. Documents and folders can be assigned to users or groups. Permissions are created using Cloud Identity, or imported via LDAP or an identity provider.
  • One of the key features is the use of Google Search technology. Everything is queryable: the OCR output of the documents, metadata, AI extraction results, … You can use text search, data filtering with schemas and synonym search to extend searches (like Googling something). This lets you do things like:
    • Requesting all document URIs with a query: “invoices that contain a tax value higher than 50 euros AND in the previous year”.
    • Finding all the Google Invoices in the UI using a keyword filter with exact match: “Google Cloud Invoices”
  • Document AI Warehouse stores the documents & results. The folders structure certain data for a user-application. This doesn’t affect the search results, unless certain folder names are filtered or users do not have access to the folder containing a document.
  • Besides the API, there is of course also an interface, which lets you search through the documents and use filters and other organisation options. Processors can handle documents manually there, which provides an integrated form of Human in the Loop for any required changes.
  • Connectors are completely new for Document AI. Google offers a Warehouse connector that can be linked to different repositories. There are also a number of out-of-the-box connectors, such as SharePoint, Amazon S3, IBM FileNet, …
  • Document Workflows allows you to bring some extra automation to your documents. You can create a pipeline that covers human inspection & the handling of failed documents, and add your own conditional notifications, which can trigger a Pub/Sub interaction, for example.
  • Three new file formats are added to the list. Previously, the options were PDFs and images (TIFF, JPG, PNG). Now Google also supports the following Office formats: DOCX, PPTX & XLSX.
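
To make the search example above more concrete, here is a minimal Python sketch of what the query “invoices that contain a tax value higher than 50 euros AND in the previous year” boils down to. It does not call the Warehouse API: the `Document` record, its field names and the example URIs are all hypothetical, and the filter runs over a plain in-memory list purely for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Document:
    # Hypothetical record: field names are illustrative, not the Warehouse schema.
    uri: str
    doc_type: str
    tax_eur: float
    issued: date


def invoices_with_high_tax_last_year(docs, today, min_tax_eur=50.0):
    """Return URIs of invoices with tax above min_tax_eur issued in the previous calendar year."""
    previous_year = today.year - 1
    return [
        d.uri
        for d in docs
        if d.doc_type == "invoice"
        and d.tax_eur > min_tax_eur
        and d.issued.year == previous_year
    ]


# Made-up sample data, used only to exercise the filter.
docs = [
    Document("gs://example-bucket/a.pdf", "invoice", 72.5, date(2021, 3, 14)),
    Document("gs://example-bucket/b.pdf", "invoice", 12.0, date(2021, 6, 1)),
    Document("gs://example-bucket/c.pdf", "invoice", 90.0, date(2022, 1, 9)),
    Document("gs://example-bucket/d.docx", "contract", 80.0, date(2021, 5, 5)),
]

print(invoices_with_high_tax_last_year(docs, today=date(2022, 10, 12)))
# → ['gs://example-bucket/a.pdf']
```

In Warehouse itself, the equivalent filtering would be expressed against the document schema and executed by the search service rather than in application code.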

Result of the key features

The key features give you the ability to manage documents, their properties and workflows using a single API. You can structure and restructure the documents, which are stored behind an interface that allows human inspection and the handling of failures.
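
As an illustration of the conditional notifications mentioned under Document Workflows, the sketch below routes a document to human inspection when its extraction confidence is low. Everything here is an assumption made for the example: the `route_document` function, the 0.8 threshold, the `document-review` topic name and the `publish` callable, which merely stands in for a real Pub/Sub publisher and is not part of any Google API.

```python
from typing import Callable


def route_document(
    doc_uri: str,
    confidence: float,
    publish: Callable[[str, dict], None],
    threshold: float = 0.8,
) -> str:
    """Accept confident extractions; otherwise notify reviewers and flag for inspection."""
    if confidence >= threshold:
        return "accepted"
    # Conditional notification: only low-confidence documents produce a message.
    publish("document-review", {"uri": doc_uri, "confidence": confidence})
    return "needs_review"


# Stand-in publisher that records messages instead of sending them anywhere.
sent = []


def fake_publish(topic: str, message: dict) -> None:
    sent.append((topic, message))


print(route_document("gs://example-bucket/a.pdf", 0.95, fake_publish))  # → accepted
print(route_document("gs://example-bucket/b.pdf", 0.40, fake_publish))  # → needs_review
print(len(sent))  # → 1
```

Passing the publisher in as a parameter keeps the routing rule testable on its own; a real deployment would swap `fake_publish` for whatever messaging client the workflow uses.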

Search and organise your documents faster. With the simple UI, you explore, view, update & organise your documents into folders. Combined with strong governance, access control and on-premise or cloud storage, this lets you manage large repositories of documents on an elastic, cloud-scale document platform.

Experts’ opinion within the DocAI team

The DocAI team is always happy to try out new technology. What we really love is that we can create a host of new workflows & integrations with Document AI & Warehouse. This enables us to automate various new features using our Document AI Proxy. We’ll keep you posted on the use of these new features in future blog posts. Stay tuned!

Do you want to discuss your own Document AI project with us? 

Get in touch with

Mark De Winne

Google Cloud Business Developer at Devoteam G Cloud