Microsoft Flow Examples - Extracting Email Attachments | Serverless360 Blogs

How to properly extract email attachments with Microsoft Flow

Posted: August 1, 2018 | Categories: Microsoft Flow

In an earlier post on my blog, “Flow’s to the help: How to extend SmartDocumentor OCR receive processing points (or locations) easily with Microsoft Flow”, I explained how to extract attachments from emails so that SmartDocumentor OCR can process them. The same approach works for any other product or implementation. If you remember, I mentioned that I intentionally kept that first approach as simple as possible, simulating what a regular business user would do, and that doing so introduced some technical limitations, mainly when dealing with HTML emails that have pictures in the signature:

  • You can send multiple Evaluation Form (survey) pictures as attachments, but you shouldn’t attach any other kind of picture (one that isn’t a survey). We could easily create Flow rules to validate this, for example by allowing only certain file extensions such as PNG, JPEG, TIFF or PDF, but we will not implement that kind of rule here.
  • Avoid sending email signatures with pictures. The process will still work, but some SmartDocumentor and Logic Apps runs will end in failure. In this article, we will address this issue and show how you can implement this type of validation.

How can you properly process email attachments ignoring signature pictures?

At first sight, you may think this is simple, but there is no out-of-the-box functionality that does it, mainly because of how the connector works. In this case, we will be using the Gmail connector.

  • The connector will receive the email body as:
    • an HTML string if it is an HTML email
    • or a text string if it is a plain text or rich text email
  • It will receive the files included as attachments:
    • all the attachments that you include in the email, obviously;
    • and all the pictures that are part of your email signature, like Facebook or Twitter icons, and so on.

This last point can be a problem because it may introduce a lot of noise in your processes, such as unnecessary Flow or Logic App triggers and actions (both are billed per use) or unnecessary processing (CPU, RAM) in the product, all of which we can easily avoid.

The optimized solution

To optimize the attachment extraction from email, and with it the performance of your entire solution, you need to:

  • Access the Microsoft Flow portal: https://flow.microsoft.com/ and sign in with your account
  • In flow.microsoft.com, select “My flows” in the top navigation bar, and then select “+ Create from blank” to create a Flow from scratch.
    Create from blank
  • On the “Create a flow from blank” page, select “Create from blank” to start creating an empty Flow
    Create a flow

Configure the GMail Connector

  • In the Flow designer, in the trigger search box, enter “Gmail” and select the option “Gmail – When a new email arrives”
    GMail Connector
  • We want every email received with “Subject: SmartDocumentorSurvey” to trigger this flow, so in the Gmail trigger configuration:
    • Confirm that “Inbox” is configured in the “Label” property
    • Specify the filter, in my case all emails whose “Subject” is equal to “SmartDocumentorSurvey”
    • And confirm that the “Include Attachments” property is set to “Yes”
      GMail Connector
Be aware that “Has Attachments” is a Boolean property that lets you define whether you want to receive only emails with attachments. This is the first filtering rule, since we are looking for emails with surveys attached.
The “Include Attachments” property is another Boolean; it lets you specify whether you want to retrieve the attachments along with the email.

Set global variables

  • To avoid having multiple actions with the same configuration in different condition branches, which may lead to mismatched configurations, we will use a variable. This is optional, however.
    • Add the next step by clicking the “+New step” button and then choose the “Add an action” option
      Set global variables
    • Go to the “Choose an action” window, enter “Variables” and select the action “Variables – Initialize variable”
      Initialize variables
    • Under the Variable action configuration:
      • “Name” property, type “IsValidAttachment”
      • “Type” property, specify the type as “Boolean”
      • “Value” property, type “false”
        Initialize variables
The “Initialize variable” action must be performed at the top level, in other words, outside any loop (Do Until or For Each). Initializing a variable inside a loop is not permitted.
  • Now we will iterate over each attachment. To do that, add the next step by clicking the “+New step” button, then choose “… More” and finally select the “Add an apply to each” option
    Apply global variables
  • On the Apply to each action configuration:
    • On the “Select an output from previous steps” property, select from the list of tokens the “Attachments” token from the “When a new email arrives” trigger
      Add attachment
All further actions and conditions will be applied inside the “Apply to each” cycle.
  • The first step inside the “Apply to each” cycle is to reset the variable “IsValidAttachment” to false. We can do this by:
    • Clicking the “Add an action” button at the bottom of the “Apply to each” and, in the “Choose an action” window, entering “Variables” and selecting the action “Variables – Set variable”
      Set variables
    • On the Variable action configuration:
      • Under the “Name” property, select from the combo box the “IsValidAttachment” variable
      • On the “Value” property, type “false”
        Reset Invalid Attachment
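
For readers who prefer to see the underlying definition, the steps above correspond roughly to the following workflow definition fragment (the JSON format that Logic Apps exposes in its code view). This is a minimal sketch under stated assumptions, not an export of the actual flow: the action names (“Initialize_variable”, “Apply_to_each”, “Reset_IsValidAttachment”) and the “Attachments” property on the trigger body are hypothetical, so compare them with what the designer generates for you.

```json
{
  "Initialize_variable": {
    "type": "InitializeVariable",
    "description": "Top-level action: variables cannot be initialized inside a loop.",
    "inputs": {
      "variables": [
        { "name": "IsValidAttachment", "type": "boolean", "value": false }
      ]
    },
    "runAfter": {}
  },
  "Apply_to_each": {
    "type": "Foreach",
    "description": "Assumption: the trigger exposes the attachments as an 'Attachments' array.",
    "foreach": "@triggerBody()?['Attachments']",
    "actions": {
      "Reset_IsValidAttachment": {
        "type": "SetVariable",
        "description": "Reset the flag at the start of every iteration.",
        "inputs": { "name": "IsValidAttachment", "value": false },
        "runAfter": {}
      }
    },
    "runAfter": { "Initialize_variable": [ "Succeeded" ] }
  }
}
```

Because the loop body writes to a shared variable, the iterations should run one at a time. If you port this pattern to Logic Apps, where For each loops run in parallel by default, that means enabling sequential execution in the loop settings.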

Check the type of email (HTML or text) and then process the images

The next step is to add a condition that checks what type of email we are handling:

  • If it’s an HTML email, we need to be sure that the current attachment is not part of the email signature;
  • If it is text, we will assume that it is a valid attachment.

For that we need to:

  • Inside the “Apply to each” cycle, click the “Add a condition” option and apply the following configuration on the Condition action:
    • On the left “Choose a value” property, select from the list of tokens the “Is HTML” token from the “When a new email arrives” trigger
    • On the condition type, select the “is equal to” option
    • And on the right “Choose a value” property, type “true”
      Add Condition
  • Inside the “If no” branch, click the “Add an action” option again and, in the “Choose an action” window, enter “Variables” and select the action “Variables – Set variable”
  • This time, on the Variable action configuration:
    • On the “Name” property, select from the combo box the “IsValidAttachment” variable
    • On the “Value” property, type “true”
      Set Invalid Attachment Variable
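
As a rough illustration, the condition and its “If no” branch might look like this in the workflow definition. Treat it as a sketch: the action names, the “IsHtml” property name on the trigger body, and the name of the preceding reset action (“Reset_IsValidAttachment”) are assumptions, not values taken from the actual flow. The “If yes” branch is left empty here.

```json
{
  "Condition_Is_HTML": {
    "type": "If",
    "description": "Assumption: the 'Is HTML' token maps to 'IsHtml' on the trigger body.",
    "expression": {
      "and": [
        { "equals": [ "@triggerBody()?['IsHtml']", true ] }
      ]
    },
    "actions": {},
    "else": {
      "actions": {
        "Set_IsValidAttachment_true": {
          "type": "SetVariable",
          "description": "Plain text email: every attachment is treated as valid.",
          "inputs": { "name": "IsValidAttachment", "value": true },
          "runAfter": {}
        }
      }
    },
    "runAfter": { "Reset_IsValidAttachment": [ "Succeeded" ] }
  }
}
```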

If the email is in HTML format, we need more intelligence and care in the logic we apply. There isn’t any default action or property in the Gmail trigger that helps you decide whether an attachment is a valid one that we want to process or just a picture included in the email signature. Therefore, to be able to decide this, I ended up creating an Azure Function (Generic C# Webhook) that receives a simple JSON message composed of:

  • “EmailBody” element: that will contain the entire HTML email body
  • “PictureName” element: that will contain the name of the attachment

Something like this:

{
"EmailBody":"<body>asasas</body>",
"PictureName":"img.png"
}

The function then inspects all the IMG tags inside the HTML to see whether any of them includes the name of that picture:

  • If so, it returns a Boolean “true”, meaning that it is a signature picture
    • We know it is a signature picture because its name appears inside an IMG tag in the HTML;
    • Attachment files are not referenced by an IMG tag inside the body of the message;
  • If not, it returns a Boolean “false”, meaning that it is a valid attachment that needs to be processed

The code will look like this:

using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;
using HtmlAgilityPack;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.Http;
using Microsoft.Azure.WebJobs.Host;
using Newtonsoft.Json.Linq;

namespace CheckIfIsSignaturePicture
{
    public static class CheckIfIsSignaturePicture
    {
        [FunctionName("CheckIfIsSignaturePicture")]
        public static async Task<object> Run([HttpTrigger(WebHookType = "genericJson")]HttpRequestMessage req, TraceWriter log)
        {
            log.Info("Webhook was triggered!");
            bool isSignature = false;
            try
            {
                // Read the request: { "EmailBody": "...", "PictureName": "..." }
                string jsonContent = await req.Content.ReadAsStringAsync();
                JObject data = JObject.Parse(jsonContent);
                string emailBody = (string)data["EmailBody"];
                string pictureName = (string)data["PictureName"];

                var htmlDoc = new HtmlDocument();
                htmlDoc.LoadHtml(emailBody);

                // SelectNodes returns null (not an empty list) when no IMG tag exists
                var htmlNodes = htmlDoc.DocumentNode.SelectNodes("//img");
                if (htmlNodes != null)
                {
                    foreach (var node in htmlNodes)
                    {
                        // The picture is referenced by an IMG tag: treat it as a signature picture
                        if (node.OuterHtml.Contains(pictureName))
                        {
                            isSignature = true;
                            break;
                        }
                    }
                }
            }
            catch (Exception ex)
            {
                // On any failure the attachment is reported as not being a signature picture
                log.Error(ex.Message, ex);
            }

            JObject eval =
                new JObject(
                    new JProperty("isSignature", isSignature));

            return req.CreateResponse(HttpStatusCode.OK, new
            {
                MsgEval = eval
            });
        }
    }
}

The response will be something like this:

{
"MsgEval": {
"isSignature": false
}
}

This Azure Function was created in Visual Studio. We will not cover the creation of the Function in this article; we will assume that you are familiar with Azure Functions. You can read this blog post by Steef Jan to learn more about Azure Functions.

Invoking Azure Function from Microsoft Flow

The next step is to invoke this Azure Function from our Flow. Unfortunately, Azure Functions are not yet integrated with Flow, so we do not have a first-class experience connecting to them (it is planned on the roadmap). At the moment, the only way to connect to Azure Functions is to use the HTTP connector. To do that, we need to:

  • In the “If yes” branch, click the “Add an action” option again and, in the “Choose an action” window, enter “HTTP” and select the “HTTP – HTTP” action
    Choose an action
  • On the HTTP action configuration:
    • In the “Method” property, select from the combo box the “POST” method
    • On the “URI” property, type the proper URI of your Azure Function that you can retrieve from the Azure Portal.
    • On the “Headers” property, type:
      • “Content-Type” on the “Enter Key” property
      • And “application/json” on the “Enter value” property
    • On the “Body” property, specify the JSON message as shown in the picture below. The “Body” and “Name” tokens are from the “When a new email arrives” trigger
      HTTP Method
  • We will then use a “Parse JSON” action to tokenize the response from the previous HTTP action. Click the “Add an action” option again and, in the “Choose an action” window, enter “Parse” and select the “Data Operations – Parse JSON” action
    Parse JSON
  • On the Data Operation action configuration:
    • On the “Content” property, select from the list of tokens the “Body” token from the previous “HTTP – HTTP” action
    • And on the “Schema” property, use the JSON response described above as the sample payload to generate the JSON schema
      JSON Schema
  • Now we will set our control variable according to the output of our Azure Function. Click the “Add an action” option and, in the “Choose an action” window, enter “Variables” and select the action “Variables – Set variable”
  • This time, on the Variable action configuration:
    • In the “Name” property, select from the combo box the “IsValidAttachment” variable
    • On the “Value” property, use the “isSignature” token from the previous “Parse JSON” action, negated. The function returns “true” for signature pictures, so a valid attachment is one for which “isSignature” is false. One way to do this is an expression such as not(body('Parse_JSON')?['MsgEval']?['isSignature']), assuming the action kept its default name “Parse JSON”
      Parse JSON action
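
Put together, the three actions in the “If yes” branch could be sketched as the following workflow definition fragment. Several details are assumptions rather than values from the actual flow: the URI is a placeholder for the function URL you copy from the Azure Portal, the action names and the “Body” and “Name” property paths are hypothetical, and the schema is what the sample response shown earlier would typically produce. Note that the last action negates “isSignature”, because the function returns true for signature pictures while the variable tracks valid attachments.

```json
{
  "HTTP": {
    "type": "Http",
    "description": "The URI is a placeholder; use the function URL from the Azure Portal.",
    "inputs": {
      "method": "POST",
      "uri": "https://<your-function-app>.azurewebsites.net/api/CheckIfIsSignaturePicture",
      "headers": { "Content-Type": "application/json" },
      "body": {
        "EmailBody": "@{triggerBody()?['Body']}",
        "PictureName": "@{items('Apply_to_each')?['Name']}"
      }
    },
    "runAfter": {}
  },
  "Parse_JSON": {
    "type": "ParseJson",
    "inputs": {
      "content": "@body('HTTP')",
      "schema": {
        "type": "object",
        "properties": {
          "MsgEval": {
            "type": "object",
            "properties": {
              "isSignature": { "type": "boolean" }
            }
          }
        }
      }
    },
    "runAfter": { "HTTP": [ "Succeeded" ] }
  },
  "Set_IsValidAttachment_from_function": {
    "type": "SetVariable",
    "description": "isSignature is true for signature pictures, so the value is negated.",
    "inputs": {
      "name": "IsValidAttachment",
      "value": "@not(body('Parse_JSON')?['MsgEval']?['isSignature'])"
    },
    "runAfter": { "Parse_JSON": [ "Succeeded" ] }
  }
}
```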

Decide if the attachment has to be processed

To finalize our Flow, we need to decide whether the attachment is to be processed (adding it to a Dropbox folder and notifying the client by email) or ignored. To do that we need to:

  • On the bottom of the “Apply to each” cycle, select the “Add a condition” option
    Add a condition
  • And on the Condition Action:
    • On the left “Choose a value” property, select from the list of tokens the “IsValidAttachment” variable
    • On the condition type, set as “is equal to” option
    • And on the right “Choose a value” property type “true”
      Condition Action
  • Inside the “If yes” branch, click the “Add an action” option and, in the “Choose an action” window, enter “Dropbox” and select the action “Dropbox – Create file”
    Dropbox – Create file
  • On the Dropbox action configuration:
    • Specify the folder in which you want to store the file, in my case: /publicdemos/smartdocumentorsurvey
    • On the “File Name” property, select from the list of tokens the “Name” token from the “When a new email arrives” trigger
    • On the “File Content” property, select from the list of tokens the “Content” token from the “When a new email arrives” trigger
      Dropbox action configuration
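
In the workflow definition, this final condition reads the variable through the variables() expression function. The fragment below is only a sketch: the action names are hypothetical, and the Dropbox and Gmail actions that belong in the “If yes” branch are left out because their connector-specific inputs are generated by the designer.

```json
{
  "Condition_Is_Valid_Attachment": {
    "type": "If",
    "description": "'actions' is the 'If yes' branch: the Dropbox and Gmail actions go there.",
    "expression": {
      "and": [
        { "equals": [ "@variables('IsValidAttachment')", true ] }
      ]
    },
    "actions": {},
    "runAfter": { "Condition_Is_HTML": [ "Succeeded" ] }
  }
}
```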

Send email notification to the User

The last step, in the “If yes” branch, is to notify the user who is testing the SmartDocumentor Survey solution that their email was received and is being processed. To do that we need to:

  • Add the next step by clicking the “+New step” button and then choose the “Add an action” option
  • In the “Choose an action” window, enter “Gmail” and select the action “Gmail – Send email”
    GMail Trigger
  • On the Gmail action configuration:
    • On the “To” property, select from the list of tokens the “From” token from the “When a new email arrives” trigger
    • On the “Subject” property, specify the email subject, in my case: “SmartDocumentor submitted document”
    • On the “Body” property, we will specify the name of the file we are processing as well as the URL where you can consult the Survey Report. So, we are combining static text with some tokens provided from previous actions as you see in the picture below.
      Gmail action configuration

Finally, the flow will look like this:
Microsoft Flow

And once it is triggered:

  • It will copy the attachment to a Dropbox folder, which the Dropbox desktop client synchronizes with our local SmartDocumentor server;
  • SmartDocumentor OCR will process this survey picture as described in my previous post;
  • And finally, it will send an email to the user who is testing this SmartDocumentor OCR solution
    Email

This post shows how to automate the extraction of email attachments in a simple and time-efficient way with Microsoft Flow, without having to write much code. Hope this helps!


Author: Sandro Pereira

Sandro Pereira lives in Portugal and works as a consultant at DevScope. In the past years, he has been working on implementing Integration scenarios both on-premises and cloud for various clients, each with different scenarios from a technical point of view, size, and criticality, using Microsoft Azure, Microsoft BizTalk Server and different technologies like AS2, EDI, RosettaNet, SAP, TIBCO etc. He is a regular blogger, international speaker, and technical reviewer of several BizTalk books all focused on Integration. He is also the author of the book “BizTalk Mapping Patterns & Best Practices”. He has been awarded MVP since 2011 for his contributions to the integration community.