tonelm54


Read MSG

I've been looking for a way to read MSG files from Outlook.

I'm developing an IT helpdesk. At the moment users drop emails into my web app, which uploads the file, and that works fine. However, I would like to be able to read the MSG file and extract the body etc. to display on the page.

I've had a look on Google and found several readers, but haven't managed to get any of them to work.

As usual I have ZERO budget, so I'm looking for a free solution if possible.

I've even looked at using something like Zamzar to upload the MSG and convert it to EML, which I can then handle, but I've found it quite expensive and, as I said earlier, I have ZERO budget for testing :-(

Any ideas?

Thank you
Scott Fell

I think it would be an easier workflow to read the email as it comes in: https://www.php.net/manual/en/book.imap.php
tonelm54

ASKER

I already read the email as it comes into the helpdesk mailbox, but users still email the engineers directly. At the moment the engineers drag the email across and put a synopsis of it into the ticket; if anyone wants to read the email, they download it and read it.

I just thought it would be nice to be able to read the email, turn it into searchable text, and display it in the ticket.
Have you tried a C# solution?
There is this NuGet package, which claims to be able to read MSG files without opening Outlook:

https://www.nuget.org/packages/MSGReader/

I don't know of any already-made MSG file readers for PHP. Since usually MSG files are accessed within the context of Windows, there is more stuff available in the Windows world, like the package Julian mentioned.

That said, I know MSG is just an extension of the compound file binary format (CFBF) that Microsoft uses for a bunch of stuff. Basically it's just a bunch of streams/blocks.
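
A quick way to confirm that an uploaded file really is a compound file is to check its first bytes. The sketch below is only an illustration: `readCfbfHeader()` is a made-up helper name, the field offsets are as I understand them from the MS-CFB header layout, and the header bytes are synthetic rather than taken from a real MSG file.

```php
<?php
// The 8-byte signature that opens every compound file (CFBF), MSG included.
const CFBF_SIGNATURE = "\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1";

/**
 * Hypothetical helper: parse the start of a compound file header.
 * Returns null when the data is too short or the signature does not match.
 */
function readCfbfHeader(string $bytes): ?array
{
    if (strlen($bytes) < 32 || substr($bytes, 0, 8) !== CFBF_SIGNATURE) {
        return null;
    }
    // Offsets 24..31 hold four unsigned 16-bit little-endian values ("v"):
    // minor version, major version, byte-order mark, sector shift.
    $f = unpack('vminor/vmajor/vbyteOrder/vsectorShift', substr($bytes, 24, 8));

    return [
        'majorVersion' => $f['major'],
        'sectorSize'   => 1 << $f['sectorShift'], // shift 9 means 512-byte sectors
    ];
}

// Synthetic 32-byte header for demonstration: signature, 16 zero bytes (CLSID),
// then minor version, major version 3, byte order 0xFFFE, sector shift 9.
$header = CFBF_SIGNATURE . str_repeat("\x00", 16) . pack('v4', 0x3E, 3, 0xFFFE, 9);
$info = readCfbfHeader($header);
```

With a real upload you would pass the first 32 bytes read with `fread()` instead of the synthetic string, and reject the file when the helper returns null.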

If you're not familiar with reading dynamic-length files, the gist is that there is a handful of bytes that represent the length of the data, so the file reader knows how much to read.

It's like me telling you a story but you don't really know when I've stopped telling the story because for all you know, the words "The End" and everything I say afterwards could also be part of the story I'm telling you. So before I start telling you the story, I tell you, "This story has exactly 123 words." So then I start telling you the story and after the 123rd word, you know the story has ended and now I'm just talking to you normally again.

This is the technique used in a lot of file formats (e.g. PDF files work the same way).
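
To make the "story with exactly 123 words" idea concrete, here is a minimal sketch of reading length-prefixed data in PHP. The record layout (a 4-byte little-endian length followed by the data) and the `readRecord()` name are assumptions made up for this example; it is not the actual MSG layout, just the general technique.

```php
<?php
/**
 * Hypothetical helper: read one length-prefixed record from an open handle.
 * Assumed layout: 4-byte little-endian length, then that many bytes of data.
 * Returns null at end of file or when the record is truncated.
 */
function readRecord($fp): ?string
{
    $prefix = fread($fp, 4);
    if ($prefix === false || strlen($prefix) < 4) {
        return null;
    }
    $length = unpack('V', $prefix)[1]; // "V" = unsigned 32-bit little-endian
    if ($length === 0) {
        return ''; // fread() rejects a zero length, so handle it here
    }
    $data = fread($fp, $length);

    return ($data !== false && strlen($data) === $length) ? $data : null;
}

// Build two records in memory, then read them back.
$fp = fopen('php://memory', 'w+b');
fwrite($fp, pack('V', 5) . 'Hello' . pack('V', 7) . 'The End');
rewind($fp);

$first  = readRecord($fp); // the 5 bytes after the first prefix
$second = readRecord($fp); // "The End" is data here, not a terminator
$third  = readRecord($fp); // nothing left to read
fclose($fp);
```

Because the reader trusts the length prefix, the words "The End" inside the data cause no confusion, which is the whole point of the technique.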

In addition to the header data that says "the data is # bytes long", there are usually other flags to read (e.g. in ZIP files, there are flags to indicate properties like the file's modified timestamp and the compression method).

It works mostly the same way with MSG files. You've got a bunch of streams and properties for each one. And each stream in an MSG file can be either an "email" type, a "recipient" type, or an "attachment" type.

The catch is that an MSG file is hierarchical, so you could have nested messages (think of emails that contain forwarded emails as attachments), and you have to ask yourself how you want to handle that situation.
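
As a rough illustration of telling those types apart: as far as I recall from MS-OXMSG, the entries inside the file are named with fixed prefixes, and the property entries encode a property ID and a property type as eight hex digits. The sketch below assumes you have already walked the compound file directory and decoded the entry names to plain strings; `classifyEntry()` is a made-up helper, not part of any library.

```php
<?php
/**
 * Hypothetical helper: classify one directory entry name from an MSG file.
 * The name prefixes are the ones I recall from the MS-OXMSG specification.
 */
function classifyEntry(string $name): array
{
    // Property entries: "__substg1.0_" + 4 hex digits of ID + 4 hex digits of type.
    if (preg_match('/^__substg1\.0_([0-9A-F]{4})([0-9A-F]{4})/i', $name, $m)) {
        return ['kind' => 'property', 'id' => hexdec($m[1]), 'type' => hexdec($m[2])];
    }
    if (strpos($name, '__recip_version1.0_#') === 0) {
        return ['kind' => 'recipient'];
    }
    if (strpos($name, '__attach_version1.0_#') === 0) {
        return ['kind' => 'attachment'];
    }
    if (strpos($name, '__properties_version1.0') === 0) {
        return ['kind' => 'property-table'];
    }

    return ['kind' => 'other'];
}

// 0x1000 is the message body property and 0x001F means a Unicode string.
$body = classifyEntry('__substg1.0_1000001F');
// First attachment of the message.
$attachment = classifyEntry('__attach_version1.0_#00000000');
// ID 0x3701 with type 0x000D marks an embedded object such as a nested message,
// which is where a parser has to decide whether to recurse.
$nested = classifyEntry('__substg1.0_3701000D');
```

One possible design is to recurse into an attachment only when it contains that embedded-object entry, and to cap the recursion depth so a crafted file cannot nest forever.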

In any event, if you've never parsed a file with streams before, it might be good practice to give it a try here and build your own parser. It's pretty valuable experience, given how often that technique is used in most file formats that we all use day to day.

Microsoft publishes the specifications of their file formats including the MSG file (I think it's referred to as MS-OXMSG). Their support blog also published a two-part series on it quite a while ago, which is how I learned it and did my own parser in C#. In any event, it's a good exercise in both binary file reading and OOP PHP.
I've been researching gr8gonzo's comment from 2021-01-07 about CFBF and MS-OXMSG, but I'm struggling to read the data in PHP. Does anyone have any pointers?
What parts specifically are you struggling with? The methods/functions to read the file?
If it helps, here's a generic class I created that was a base for some other file-format-reading classes that I wrote a long time ago. I haven't modified it in about 10 years, but I think it should still work.

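class FileStream
{
   public $fp = null;
   public $unpackFormat = "";
   
   // $fp is a handle from fopen(); $unpackFormat is the unpack() format applied
   // to non-raw reads ("H*" returns the bytes as a hex string).
   public function __construct($fp,$unpackFormat = "H*")
   {
      if(!is_resource($fp))
      {
         throw new Exception("First parameter to FileStream should be a file handle from fopen!");
      }
      $this->fp = $fp;
      $this->unpackFormat = $unpackFormat;
   }
   
   // Reads $numChars bytes and returns them raw or decoded with $unpackFormat.
   public function getChars($numChars, $raw = false)
   {
      $data = "";
      for($i = 0; $i < $numChars; $i++)
      {
         $char = fgetc($this->fp);
         if($char === false)
         {
            // fgetc() returns false at end of file; fail instead of returning short data
            throw new Exception("Unexpected end of file.");
         }
         $data .= $char;
      }
      if($raw)
      {
         return $data;
      }
      else
      {
         // array_shift() takes its argument by reference, so it needs a variable
         $unpacked = unpack($this->unpackFormat,$data);
         return array_shift($unpacked);
      }
   }
   
   public function nextChar($raw = false)
   {
      return $this->getChars(1,$raw);
   }
   
   // Reads until $untilChar is found ("00" is a null byte in the default hex format).
   // The terminator is consumed but not returned; hitting end of file first throws.
   public function getCharsUntil($untilChar = "00", $raw = false)
   {
      $data = "";
      while(($nextChar = $this->nextChar($raw)) != $untilChar)
      { 
         $data .= $nextChar;
      }
      return $data;
   }
   
   // Reads an unsigned little-endian integer of $numBytes bytes.
   public function getLong($numBytes = 4)
   {
      $return = 0;
      for($i = 0; $i < $numBytes; $i++)
      {
         $multiplier = pow(2, 8*$i); // Least significant byte first: byte $i is worth 256^$i
         $nextChar = ord($this->nextChar(true)); // raw byte, independent of $unpackFormat
         $return += ($nextChar) * $multiplier;
      }
      return $return;
   }
}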



Example:
$fp = fopen("somefile.msg","rb"); // read-only, binary-safe
$FS = new FileStream($fp);
$firstTenChars = $FS->getChars(10);
$thenALong = $FS->getLong();



I don't even know if this is the latest version of that class (and the endian order is sort of forced in getLong), but it should give you a general idea.
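
On the endian point: PHP's built-in `unpack()` can decode fixed-size integers directly once you have the raw bytes, which avoids hand-rolling the arithmetic. A small sketch with made-up byte values (compound files store their integers little-endian, so "V" and "v" are the codes you would normally reach for):

```php
<?php
$bytes = "\x78\x56\x34\x12";

// "V" = unsigned 32-bit little-endian: least significant byte comes first.
$littleEndian = unpack('V', $bytes)[1];

// "N" = unsigned 32-bit big-endian: same bytes, opposite order.
$bigEndian = unpack('N', $bytes)[1];

// "v" = unsigned 16-bit little-endian.
$short = unpack('v', "\x09\x00")[1];

printf("%08X %08X %d\n", $littleEndian, $bigEndian, $short);
```

Note that `unpack()` returns a 1-indexed array, hence the `[1]`.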