In a recent RPA project for a major insurance and finance company, we ran into a situation where the input format in the production environment differed from the one we had seen in UAT. The main deviation was that the input file type had changed from MSG to EML.
Let’s start with the EML file format. EML is the file extension for an email message saved to a file in the MIME (RFC 822) standard format by Microsoft Outlook Express and some other email programs. It contains the email headers, subject, body, hyperlinks, attachments and so on.
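Because an EML file is plain text in the RFC 822 layout, its header block can be inspected even without a mail library. The following is a minimal sketch of that idea, not the approach used in this project: `EmlPeek` and `ReadHeaders` are hypothetical names, the code only reads the header fields up to the first blank line, and it does not decode MIME encoded words, bodies or attachments.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not part of CDO or Pega Robotics.
public static class EmlPeek
{
    // Returns the header fields of a raw EML text, unfolding continuation lines.
    // If a field is repeated (for example "Received"), the last occurrence wins.
    public static Dictionary<string, string> ReadHeaders(string rawEml)
    {
        var headers = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        string lastName = null;
        foreach (string line in rawEml.Replace("\r\n", "\n").Split('\n'))
        {
            // A blank line separates the header block from the body.
            if (line.Length == 0) break;

            // A line starting with whitespace continues the previous field.
            if ((line[0] == ' ' || line[0] == '\t') && lastName != null)
            {
                headers[lastName] += " " + line.Trim();
                continue;
            }

            int colon = line.IndexOf(':');
            if (colon <= 0) continue; // not a "Name: value" line; skip it

            lastName = line.Substring(0, colon).Trim();
            headers[lastName] = line.Substring(colon + 1).Trim();
        }
        return headers;
    }
}
```

For example, `EmlPeek.ReadHeaders(System.IO.File.ReadAllText(path))["Subject"]` would return the raw subject line of the file at `path`, assuming the file has one.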
MSG is the file extension for email messages used by default by Microsoft Outlook and Exchange. It contains the email headers, subject, sender, body, hyperlinks, attachments and so on.
During the development phase we received only MSG files from the source application, and we were able to parse them using the Microsoft Office Interoperability framework. During go-live, however, we discovered that all the inputs we received were EML files, which the solution failed to parse.
The Microsoft Office Interoperability framework cannot parse EML files, so the existing logic for reading message (MSG) files could not be reused. Many third-party libraries for parsing EML files are available online, but they can pose a security or compliance risk, and licensing would add to the cost of the solution.
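Since the same process may receive either format, it can help to check which kind of file has arrived before choosing a parser. The sketch below is one possible way to do that and is not taken from the project: it relies on the fact that MSG files are OLE compound files, which start with a fixed 8-byte signature, and it assumes that anything else is an EML file. The names `MailFileKind` and `MailFileSniffer` are hypothetical.

```csharp
using System;
using System.IO;

public enum MailFileKind { Msg, Eml }

// Hypothetical helper for illustration only.
public static class MailFileSniffer
{
    // Signature at the start of every OLE compound file, the container used by .msg files.
    private static readonly byte[] CompoundFileMagic =
        { 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1 };

    // Classifies a file from its leading bytes.
    // Assumption: whatever is not a compound file is treated as EML.
    public static MailFileKind Classify(byte[] leadingBytes)
    {
        if (leadingBytes.Length < CompoundFileMagic.Length) return MailFileKind.Eml;
        for (int i = 0; i < CompoundFileMagic.Length; i++)
        {
            if (leadingBytes[i] != CompoundFileMagic[i]) return MailFileKind.Eml;
        }
        return MailFileKind.Msg;
    }

    // Reads the first bytes of the file and classifies it.
    public static MailFileKind ClassifyFile(string path)
    {
        byte[] buffer = new byte[CompoundFileMagic.Length];
        using (FileStream fs = File.OpenRead(path))
        {
            int read = fs.Read(buffer, 0, buffer.Length);
            Array.Resize(ref buffer, read);
        }
        return Classify(buffer);
    }
}
```

Checking the content rather than the extension means a renamed file is still routed to the right parser.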
After a good deal of online research, we worked out a way to parse EML files without any third-party libraries. The article below turned out to be very useful in achieving the desired result.
1. Open a new project in Pega Robotics Studio. In Solution Explorer, navigate to the project’s References, right-click and select “Add Reference”.
2. Go to the “COM” tab, find the library “Microsoft CDO for Windows 2000 Library”, select it and add it to the project references.
3. Right-click References again and select “Add Reference”, navigate to the “.NET” tab and select the assembly reference “adodb”.
4. Add a script component to your Global Container and add the methods below to it.
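// Loads the EML file at emlFileName (full path to the file) and returns a message object
public CDO.Message LoadEmlFromFile(string emlFileName)
{
    CDO.Message msg = new CDO.Message();
    ADODB.Stream stream = new ADODB.Stream();
    // Assumption: the original listing omits the loading steps; the calls below follow the usual CDO/ADODB pattern
    stream.Open(Type.Missing, ADODB.ConnectModeEnum.adModeUnknown, ADODB.StreamOpenOptionsEnum.adOpenStreamUnspecified, string.Empty, string.Empty);
    stream.LoadFromFile(emlFileName);
    msg.DataSource.OpenObject(stream, "_Stream");
    return msg;
}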
// This method calls the method above to get a message object, which exposes the required properties and methods such as Subject, Sender, TextBody and Attachments.
// Parameters: string msgfile (full path to the EML file), string Directory (folder location of the EML file, where attachments are saved),
// out DataTable details (populated with the EML file details), out string MsgSubject (receives the message subject), out string MBody (receives the message body)
public bool ReadMsgFile(string msgfile, string Directory, out DataTable details, out string MsgSubject, out string MBody)
{
    CDO.Message msg = LoadEmlFromFile(msgfile);
    // Reading the email subject
    MsgSubject = msg.Subject.ToUpper();
    // Reading the sender from the message
    string sendermail = msg.Sender;
    // Reading the body of the message
    MBody = msg.TextBody;
    // Initializing the DataTable object
    details = new DataTable();
    // Assumption: the original listing does not define the columns, so these names are illustrative
    foreach (string column in new[] { "SenderMail", "Sender", "SavePath", "Extension", "Subject" })
    {
        details.Columns.Add(column);
    }
    // Checking whether the email message contains attachments
    if (msg.Attachments.Count > 0)
    {
        // CDO collections are 1-based
        for (int i = 1; i <= msg.Attachments.Count; i++)
        {
            // Creating a path for saving the attachment (Path.Combine adds the separator if Directory lacks one)
            string savepath = System.IO.Path.Combine(Directory, msg.Attachments[i].FileName);
            // Saving the attachment to the path created above
            msg.Attachments[i].SaveToFile(savepath);
            // Retrieving the extension of the attachment saved above
            string ext = System.IO.Path.GetExtension(savepath);
            // Adding all the acquired details to the DataTable
            details.Rows.Add(sendermail, msg.Sender, savepath, ext, MsgSubject);
        }
    }
    // Assumption: the original listing does not show the return value; true signals that the file was read
    return true;
}
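The save path in the method above is built from the attachment's file name, which comes from the email itself and so cannot be fully trusted. A minimal sketch of a more defensive way to build that path is shown here; `AttachmentPaths`, `BuildSavePath` and the fallback name `attachment.bin` are hypothetical and not part of the original solution.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration only.
public static class AttachmentPaths
{
    // Builds the path an attachment is saved to, inside the given directory.
    public static string BuildSavePath(string directory, string attachmentFileName)
    {
        // Keep only the last path segment so a name like "../x" cannot leave the target folder.
        string safeName = Path.GetFileName(attachmentFileName ?? string.Empty);
        if (string.IsNullOrWhiteSpace(safeName))
        {
            safeName = "attachment.bin"; // assumed fallback for attachments without a usable name
        }
        // Path.Combine inserts the directory separator when it is missing.
        return Path.Combine(directory, safeName);
    }
}
```

In the loop above, the save path could then be obtained with `AttachmentPaths.BuildSavePath(Directory, msg.Attachments[i].FileName)`.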