Enhanced Email Extractor for IBM Datacap

Posted on July 15th, 2021

In a post-COVID world, we’re seeing an increase in electronic communication. Due to an abundance of digital content and a dispersed workforce, our customers need a more robust way to ingest and manage their growing volume of email data.  

MagicLamp developed the Enhanced Email Extractor to work with IBM Datacap to better capture and manage the flow of email data in an organization. 

As more companies send and receive documents digitally, it makes sense that a growing share of documents is ingested via email rather than by physical scanning.

Email Ingestion vs. Physical Scanning

Email ingestion has several benefits over physical scanning that make it an easy business decision when it is an option. It preserves the quality of the source document used during processing, whereas scanning can degrade the image when the scanner is improperly configured or poorly maintained.

Email ingestion is also an automated task: the system monitors the desired mailbox and ingests each new email as it arrives. This is a significant benefit over physical scanning, which requires operators to run the scanners as well as staff to maintain them on a regular basis.

Email ingestion is not without its pitfalls, however. The system must handle a wide variety of file types, cope with corrupt files, process emails from many different email programs (Outlook, Mac Mail, Gmail, Yahoo Mail, Thunderbird, etc.), each of which can generate and attach files in slightly different ways, and support different mailbox configurations (standalone vs. shared mailbox, traditional login vs. OAuth).

Enhanced Email Extractor

This is where the Enhanced Email Extractor comes into play. Developed with years of experience in triaging and resolving email ingestion issues, the extractor can handle 99% of the most common email ingestion pitfalls.

We leverage multiple methods of email parsing to ensure that all source documents contained in the EML are properly identified and ingested into the system, regardless of whether they are attached traditionally or inline, or whether they originate from Mac Mail or Outlook. The Enhanced Email Extractor also features robust fault detection, which allows the system to identify emails that would traditionally either lock up a system or cause it to degrade.
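To make the attached vs. inline distinction concrete, here is a minimal sketch using Python's standard email package. It is not the Enhanced Email Extractor's actual parsing logic, which is not described here; the function names extract_documents and build_sample, and the rule that a part counts as a document if it has a filename or is not a text part, are assumptions made for illustration.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def extract_documents(raw_eml: bytes):
    """Return (filename, disposition, payload) for each document part of an EML."""
    msg = message_from_bytes(raw_eml, policy=policy.default)
    found = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # container parts carry no document bytes themselves
        filename = part.get_filename()
        # Assumption: an unnamed text part is the message body, not a document.
        if filename is None and part.get_content_maintype() == "text":
            continue
        # 'attachment', 'inline', or None when the sender set no disposition
        disposition = part.get_content_disposition() or "unspecified"
        payload = part.get_payload(decode=True) or b""
        found.append((filename or "unnamed", disposition, payload))
    return found


def build_sample() -> bytes:
    """Build a test EML with one traditional attachment and one inline image."""
    msg = EmailMessage()
    msg["Subject"] = "Invoice"
    msg["From"] = "sender@example.com"
    msg["To"] = "capture@example.com"
    msg.set_content("See attached.")
    msg.add_attachment(b"%PDF-1.4 fake", maintype="application",
                       subtype="pdf", filename="invoice.pdf")
    msg.add_attachment(b"\x89PNG fake", maintype="image", subtype="png",
                       filename="logo.png", disposition="inline")
    return msg.as_bytes()


if __name__ == "__main__":
    for name, disposition, payload in extract_documents(build_sample()):
        print(name, disposition, len(payload))
```

Walking every part of the message, rather than looking only at parts marked as attachments, is what lets a single pass pick up both kinds of document. A production parser would likely need further fallbacks for clients that omit filenames or dispositions altogether.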

One such example is a corrupt email that is detected but fails to be ingested, so it remains in the inbox and locks up the system. The Enhanced Email Extractor identifies this issue and automatically moves the email into a ‘Problem’ folder within the mailbox, where the administration team can triage it while normal processing of the affected mailbox continues.
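The quarantine behavior described above can be sketched as follows. This is an illustration only: InMemoryMailbox is a hypothetical stand-in for a real mail server connection, the ‘Processed’ folder is an assumption, and the corruption check (parser defects or a missing From header) is a deliberately simple placeholder for whatever fault detection the product actually performs.

```python
from email import message_from_bytes, policy


class InMemoryMailbox:
    """Hypothetical stand-in for a mailbox; folders map message ids to raw bytes."""

    def __init__(self, inbox):
        self.folders = {"Inbox": dict(inbox), "Processed": {}, "Problem": {}}

    def move(self, uid, source, target):
        self.folders[target][uid] = self.folders[source].pop(uid)


def ingest(raw: bytes) -> str:
    """Parse one email and return its subject; raise if it looks malformed."""
    msg = message_from_bytes(raw, policy=policy.default)
    # Placeholder check: treat parser defects or a missing From header as corrupt.
    if msg.defects or msg["From"] is None:
        raise ValueError("message is malformed")
    return str(msg["Subject"] or "")


def process_inbox(mailbox: InMemoryMailbox):
    """Ingest every email; move failures aside so they cannot block the queue."""
    ingested = []
    for uid in list(mailbox.folders["Inbox"]):
        try:
            ingested.append(ingest(mailbox.folders["Inbox"][uid]))
            mailbox.move(uid, "Inbox", "Processed")
        except Exception:
            # The bad email leaves the inbox, so the next poll is not stuck on it.
            mailbox.move(uid, "Inbox", "Problem")
    return ingested


if __name__ == "__main__":
    box = InMemoryMailbox({
        1: b"From: a@example.com\r\nSubject: First\r\n\r\nBody\r\n",
        2: b"\x00\x01 not an email",
        3: b"From: b@example.com\r\nSubject: Third\r\n\r\nBody\r\n",
    })
    print(process_inbox(box), sorted(box.folders["Problem"]))
```

The key design point is that a failure is handled per email: the problem message is moved out of the inbox inside the loop, and the remaining emails continue to be processed.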

The Enhanced Email Extractor also offers ‘round robin’ email pulling, driven by a database table. The table contains the encrypted login information for one or more mailboxes, and the Enhanced Email Extractor cycles through the entries, checking each mailbox for emails to ingest. This means that instead of configuring a processing thread for each mailbox, a single processing thread can achieve the same results.
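As a rough sketch of the round robin idea, the example below uses an in-memory SQLite table and a single loop. The table name mailboxes, its columns, and the functions poll_round_robin, make_demo_db, and placeholder_decrypt are all hypothetical; the real schema and encryption scheme are not described here. Note in particular that the demo uses base64 purely as a placeholder, and base64 is not encryption.

```python
import base64
import sqlite3
from typing import Callable, List, Tuple


def make_demo_db() -> sqlite3.Connection:
    """Create a hypothetical mailbox table with two demo entries."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mailboxes ("
        "id INTEGER PRIMARY KEY, host TEXT, username TEXT, encrypted_password BLOB)"
    )
    rows = [
        ("mail.example.com", "invoices@example.com", base64.b64encode(b"secret-1")),
        ("mail.example.com", "claims@example.com", base64.b64encode(b"secret-2")),
    ]
    conn.executemany(
        "INSERT INTO mailboxes (host, username, encrypted_password) VALUES (?, ?, ?)",
        rows,
    )
    return conn


def placeholder_decrypt(blob: bytes) -> str:
    """Placeholder only: base64 is NOT encryption. Swap in real decryption."""
    return base64.b64decode(blob).decode("utf-8")


def poll_round_robin(
    conn: sqlite3.Connection,
    decrypt: Callable[[bytes], str],
    check_mailbox: Callable[[str, str, str], int],
    cycles: int = 1,
) -> List[Tuple[str, int]]:
    """Visit every mailbox in table order, once per cycle, on a single thread."""
    results = []
    for _ in range(cycles):
        # Re-read the table each cycle so newly added mailboxes are picked up.
        rows = conn.execute(
            "SELECT host, username, encrypted_password FROM mailboxes ORDER BY id"
        ).fetchall()
        for host, username, secret in rows:
            count = check_mailbox(host, username, decrypt(secret))
            results.append((username, count))
    return results


if __name__ == "__main__":
    def fake_check(host: str, username: str, password: str) -> int:
        return 0  # a real implementation would connect and count new emails

    print(poll_round_robin(make_demo_db(), placeholder_decrypt, fake_check, cycles=2))
```

Because the mailbox list lives in a table rather than in per-thread configuration, adding a mailbox in this sketch is a matter of inserting a row, and the same loop picks it up on its next cycle.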