Automation and processing of email based on content in email

wfninpa
Hello fellow geniuses.

In my best efforts to save time I need to come up with a solution for the following scenario:

1. An email message is received. It has an attached PDF file that contains an order number in the format 123-456789-0123456 and a single UPS, FedEx or USPS tracking number somewhere in the PDF.

2. I need to be able to automatically extract the order number and the tracking number.  For example we will say that our data looks like this somewhere in the PDF:

123-456789-0123456
9400 4461 0122 2046 4902 26

The tracking number may or may not contain spaces for USPS and FedEx formats.  So I need to find a tracking number inside the PDF that is a UPS, FedEx or USPS tracking number.

3. I need to save the extracted information into a MySQL database for later use.

My thoughts were to have some sort of application that can receive the emails at a dedicated email address so the data can be extracted and saved.

What is the shortest path to a solution for this?  What could the solution be?

I have used software by http://www.abbyy.com/

I am not sure, though, whether it can read information in a PDF from within Outlook.

They have software that can extract information from documents using fuzzy matching (for order numbers, etc.) and write it to a SQL database.

FlexiCapture may do this.

http://www.abbyy.com/flexicapture_engine/

It may be worth talking to them to see if what you need is possible. It is not free software though.
You might want to have a look at the free automation software from AutoIt.  This tool can help move the data from a file into a DB.

http://www.autoitscript.com/site/
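
As a rough illustration of the "into a DB" step, here is a minimal sketch in Python rather than AutoIt. It uses the standard-library sqlite3 module as a stand-in for MySQL so that it runs without a server. The table name `shipments`, its columns, and the function `save_shipment` are all made up for the example.

```python
import sqlite3


def save_shipment(conn, order_number, carrier, tracking_number):
    """Store one order/tracking pair, replacing any earlier row for the order."""
    # Hypothetical schema: one row per order number.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS shipments ("
        " order_number TEXT PRIMARY KEY,"
        " carrier TEXT,"
        " tracking_number TEXT NOT NULL)"
    )
    # Parameter placeholders keep the extracted text out of the SQL string.
    conn.execute(
        "INSERT OR REPLACE INTO shipments VALUES (?, ?, ?)",
        (order_number, carrier, tracking_number),
    )
    conn.commit()


# In-memory database used only for the demonstration.
conn = sqlite3.connect(":memory:")
save_shipment(conn, "123-456789-0123456", "USPS", "9400446101222046490226")
rows = conn.execute("SELECT * FROM shipments").fetchall()
print(rows)
```

Against MySQL the same idea should carry over, but the details differ: `INSERT OR REPLACE` is SQLite syntax (MySQL offers `REPLACE INTO` or `INSERT ... ON DUPLICATE KEY UPDATE`), and the common MySQL drivers for Python use `%s` placeholders instead of `?`.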

Thanks
JC
If you don't mind paying for the solution, you could try

You could use AutoIt to monitor a mailbox, extract the attached PDF file, call a free pdftotext command-line utility, and run a regex over the resulting text file to find your numbers.

http://www.autoitscript.com/site/
http://www.colorpilot.com/pdf2text-command-line.html
http://gummydev.com/regex/ (see common patterns at bottom)
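
To give an idea of the regex step, here is a minimal sketch in Python (not AutoIt) that works on text already extracted from the PDF. The order-number pattern comes straight from the question. The carrier patterns are assumptions based on commonly seen formats (UPS as "1Z" plus 16 characters, USPS as 20 to 22 digits starting with 9, FedEx as 12 or 15 digits) and should be checked against real labels. The function names are made up for the example.

```python
import re

# Order number format taken from the question: 123-456789-0123456
ORDER_RE = re.compile(r"(?<!\d)\d{3}-\d{6}-\d{7}(?!\d)")

# Assumed carrier formats (verify against real labels before relying on them):
#   UPS:   "1Z" followed by 16 letters/digits
#   USPS:  20 to 22 digits, starting with 9
#   FedEx: 12 or 15 digits
UPS_RE = re.compile(r"\b1Z(?: ?[0-9A-Z]){16}\b")
# A run of 12 to 22 digits, optionally separated by single spaces.
DIGITS_RE = re.compile(r"(?<![\d-])\d(?: ?\d){11,21}(?![\d-])")


def classify_digits(digits):
    """Guess the carrier from an all-digit tracking number, or return None."""
    if len(digits) in (20, 21, 22) and digits.startswith("9"):
        return "USPS"
    if len(digits) in (12, 15):
        return "FedEx"
    return None


def extract(text):
    """Return (order_number, carrier, tracking_number); missing parts are None."""
    order_match = ORDER_RE.search(text)
    order = order_match.group(0) if order_match else None
    # Blank out the order number so its digits are not mistaken for tracking digits.
    remaining = ORDER_RE.sub("\n", text)

    ups = UPS_RE.search(remaining)
    if ups:
        return order, "UPS", ups.group(0).replace(" ", "")

    for match in DIGITS_RE.finditer(remaining):
        digits = match.group(0).replace(" ", "")
        carrier = classify_digits(digits)
        if carrier:
            return order, carrier, digits
    return order, None, None


sample = "Order 123-456789-0123456\nTracking: 9400 4461 0122 2046 4902 26\n"
print(extract(sample))
```

Stripping the spaces before classifying means the spaced and unspaced forms of a number are treated the same. Other long digit runs in the PDF (phone numbers, invoice numbers) could produce false matches, so real documents may need tighter patterns.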

Author commented:
AutoIt looks good, although I really need a solution that is not GUI-based.  I need something that basically checks for new emails, downloads them with the PDF attachment, and parses the email sender, subject and body as well as the PDF attachment.
AutoIt was originally designed to make Windows GUI automation very easy. It has, however, matured into a powerful scripting language that is easy to learn. I say that to make the point that GUI automation is not the only kind of project you can take on with AutoIt.

If I had some additional details I could provide some proof-of-concept code. Is this script going to run on a personal computer or a server? If a PC, what mail client is in use? If a server, what mail server is in use? Can you provide an example message format and PDF attachment?
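
In the meantime, to show that the mail-handling part needs no GUI at all, here is a rough sketch in Python rather than AutoIt, using only the standard library. It assumes the dedicated mailbox is reachable over IMAP with SSL, which has not been confirmed in this thread. The function names, host, and credentials are placeholders.

```python
import email
import imaplib
from email import policy
from email.message import EmailMessage


def parse_message(raw_bytes):
    """Return sender, subject, plain-text body and PDF attachments of one message."""
    msg = email.message_from_bytes(raw_bytes, policy=policy.default)
    body_part = msg.get_body(preferencelist=("plain",))
    # Keep only attachments declared as PDFs: (filename, raw bytes) pairs.
    pdfs = [
        (part.get_filename(), part.get_payload(decode=True))
        for part in msg.iter_attachments()
        if part.get_content_type() == "application/pdf"
    ]
    return {
        "sender": str(msg["From"]),
        "subject": str(msg["Subject"]),
        "body": body_part.get_content() if body_part else "",
        "pdfs": pdfs,
    }


def fetch_unseen(host, user, password):
    """Yield parsed unread messages from INBOX (host and credentials are placeholders)."""
    with imaplib.IMAP4_SSL(host) as conn:
        conn.login(user, password)
        conn.select("INBOX")
        _, data = conn.search(None, "UNSEEN")
        for num in data[0].split():
            _, parts = conn.fetch(num, "(RFC822)")
            yield parse_message(parts[0][1])


# Offline demonstration: build a message in memory instead of contacting a server.
demo = EmailMessage()
demo["From"] = "shipper@example.com"
demo["Subject"] = "Shipment notice"
demo.set_content("Your order has shipped.")
demo.add_attachment(b"%PDF-1.4 demo", maintype="application",
                    subtype="pdf", filename="label.pdf")
parsed = parse_message(demo.as_bytes())
print(parsed["sender"], parsed["subject"], [name for name, _ in parsed["pdfs"]])
```

The PDF bytes returned in `pdfs` could be written to disk and handed to a PDF-to-text tool. A script like this can run from a scheduled task or cron job on either a PC or a server.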
You should try the email parser / attachment parser from mailparser.io and then you can send the data to your DB.

Blog article is linked above. Their parser is quite flexible about what it extracts, and it offers options to send the data on via native integrations, webhooks, or Zapier.
