Read JPEG File Summary Fields with PHP

Posted on 2007-04-04
Last Modified: 2013-12-20
In PHP4, I would like to display the "Summary" tab of a file's properties. For example, in Windows, if you right-click a JPEG and go to Properties, the last tab is labeled "Summary". It displays Title, Subject, Author, Category, Keywords, and Comments. Can I somehow read these fields into variables via PHP? I am trying to show those fields in an online photo album. I have found a script that reads the EXIF data, but the fields above are not included in its output. Any help would be appreciated! Thanks!

I am using PHP Version 4.3.11.
Question by:eflanigan

Expert Comment

ID: 18853386
Well, actually you can't, at least not with PHP alone. This functionality is not available by default even in languages like VBScript or through WMI. Windows does have an object that gives access to this metadata: the DSOFile object (DSOfile.dll), which has to be registered in Windows first. There is a lot of material on the subject; just search for "DSOfile + PHP".

The next question you should ask yourself is where this information is stored. We know that a lot of file metadata, such as security objects, is stored in the NTFS file system. Copying files from one networked system to another will show you that this information is determined by the machine on which the file resides. This means that a lot of this security information is not preserved when a file is moved or copied, even between different physical disks within the same system. I'm not sure whether the summary is stored in the file system like all this other metadata or encoded inside the file itself.

Anyway, you might want to experiment with it using the DSOfile.dll module.

Good luck! :D

Accepted Solution

Steve Bink earned 250 total points
ID: 18853903
Chris_Gralike was correct in that there is no BUILT-IN functionality in PHP to read the tags, but we can manipulate the file once we know the structure.  This involves reading the file in a binary fashion and parsing through it.  Yes, this is the hard way....

It IS the EXIF fields you are looking for, and these links are going to be very important for you:

The first is the standard for JPG. It turns out that the EXIF data in JPG is pretty strictly based on the TIFF 6.0 format, which makes this pretty simple. Page 17 of the first link describes the overall format of the JPG wrapper. Notice what the 4th field is in the Application Marker? Once you find the TIFF header, it's time to look at the second link, starting on page 13. In the JPG I was using to experiment, the TIFF header started at byte 0Eh. It is important to keep the offset of the BEGINNING of the TIFF header in mind: all other TIFF offsets are measured from the beginning of that header, or <beginning of file>+0Eh. According to my sample file, I have two bytes for the byte-order marker (II, or 49h-49h), two bytes for a second marker (2Ah-00h, the TIFF magic number 42), then the next 4 bytes are the offset FROM THE BEGINNING OF THE TIFF HEADER of the 0th IFD. My value is 8, so I count 0Eh + 8 = 16h. So I move my cursor forward to position 16h.
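To make the header walk above concrete, here is a minimal sketch in PHP. It is not from the original answer: the function name findTiffHeader() is made up for illustration, it uses PHP 7+ syntax rather than PHP4, and it takes the shortcut of searching for the "Exif" identifier instead of stepping through the JPG segments one by one.

```php
<?php
// Locate the TIFF header inside a JPEG's Exif data and work out where the 0th IFD starts.
// Assumptions: $data holds the raw file contents, and searching for "Exif\0\0"
// is good enough (a stricter parser would walk the JPEG segment markers instead).
function findTiffHeader(string $data): ?array
{
    // The Exif payload begins with "Exif" plus two null bytes;
    // the TIFF header follows that 6-byte identifier.
    $pos = strpos($data, "Exif\0\0");
    if ($pos === false) {
        return null;
    }
    $tiffStart = $pos + 6;
    if (strlen($data) < $tiffStart + 8) {
        return null; // not enough bytes for a TIFF header
    }

    // "II" = little-endian (Intel), "MM" = big-endian (Motorola).
    $order = substr($data, $tiffStart, 2);
    if ($order !== 'II' && $order !== 'MM') {
        return null;
    }
    $littleEndian = ($order === 'II');

    // Two-byte magic number 42 (2Ah), then the 4-byte offset of the 0th IFD.
    $format = $littleEndian ? 'vmagic/Vifd' : 'nmagic/Nifd';
    $fields = unpack($format, substr($data, $tiffStart + 2, 6));
    if ($fields['magic'] !== 42) {
        return null;
    }

    return [
        'tiffStart'    => $tiffStart,
        'littleEndian' => $littleEndian,
        // Absolute file position = start of TIFF header + relative offset.
        'ifd0'         => $tiffStart + $fields['ifd'],
    ];
}
```

For a file like the sample described above, 'tiffStart' would come back as 14 (0Eh) and 'ifd0' as 22 (16h).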

Page 14 of the second link describes the format of the IFD header, and the 12-byte IFD entries you will have to parse. The first two bytes are the number of entries. Skip ahead two bytes (to 18h) and I find the beginning of the first entry. Mine looks like this:

9b 9c 01 00 0c 00 00 00 4a 00 00 00

In the entry itself:
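bytes 0-1: tag number
bytes 2-3: field type
bytes 4-7: count (number of values of that type)
bytes 8-11: offset of the value from the beginning of the TIFF header (or the value itself, if it fits in 4 bytes)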
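The entry layout above maps directly onto PHP's unpack(). A small sketch, assuming the file is little-endian ("II") as in the sample; parseIfdEntry() is a made-up helper name, not something from the spec or from PHP itself:

```php
<?php
// Split one 12-byte IFD entry into its four fields.
// Assumption: the TIFF header said "II", so every number is little-endian.
function parseIfdEntry(string $entry): array
{
    if (strlen($entry) !== 12) {
        throw new InvalidArgumentException('An IFD entry is exactly 12 bytes');
    }
    // v = unsigned 16-bit little-endian, V = unsigned 32-bit little-endian.
    return unpack('vtag/vtype/Vcount/Voffset', $entry);
}

// The sample entry quoted above, written as one hex string.
$entry = parseIfdEntry(hex2bin('9b9c01000c0000004a000000'));

printf(
    "tag=%04Xh type=%d count=%d offset=%d\n",
    $entry['tag'],
    $entry['type'],
    $entry['count'],
    $entry['offset']
);
// prints: tag=9C9Bh type=1 count=12 offset=74
```

Type 1 is BYTE in the TIFF field-type list, and the offset of 74 (4Ah) is relative to the start of the TIFF header, not the start of the file.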

Now to interpret the entry. I could not reconcile the tag numbers listed in the specs with the tag numbers I'm seeing in my sample, so here's the table I've figured out for these values (bytes are shown in file order; the file is little-endian, so 9b 9c is tag number 9C9Bh):

9b 9c = title
9c 9c = comment
9d 9c = author
9e 9c = keyword
9f 9c = subject
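In code, the table above could be kept as a simple lookup array. This is only a sketch: the constant and function names are invented for illustration, and the keys are the tag numbers after swapping the two bytes out of little-endian file order.

```php
<?php
// Tag numbers from the table above, written as 16-bit values.
// The hex dump shows them in file (little-endian) order, so "9b 9c" is 9C9Bh.
const SUMMARY_TAGS = [
    0x9C9B => 'title',
    0x9C9C => 'comment',
    0x9C9D => 'author',
    0x9C9E => 'keyword',
    0x9C9F => 'subject',
];

// Returns the field name for a tag number, or null for tags we do not care about.
function summaryFieldName(int $tag): ?string
{
    return SUMMARY_TAGS[$tag] ?? null;
}
```

Any tag that is not in the array (camera make, orientation, and so on) simply comes back as null and can be skipped.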

So my first tag is the title, it is of type 'BYTE', has a length of 12 bytes, and begins 74 bytes (4Ah) after the beginning of the TIFF header. If I move 74 bytes down from the start of the TIFF header, I surely do see 'title' buried in the hex. But why is title 12 bytes long? Because it is stored as:

74 00 69 00 74 00 6c 00 65 00 00 00

In sorta-plain English:
(t) (null) (i) (null) (t) (null) (l) (null) (e) (null) (null-termination) (null)

This is what a 2-byte Unicode encoding such as little-endian UTF-16 looks like, though the specification claims only the comment field can handle 2-byte character codes.
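Turning those 12 bytes back into a normal string might look like the sketch below. It assumes the bytes really are little-endian UTF-16 (which matches the sample but is my reading, not something the spec says about these tags) and that either the mbstring or the iconv extension is available; decodeSummaryString() is a made-up name.

```php
<?php
// Convert a null-terminated, 2-bytes-per-character value to a UTF-8 string.
// Assumption: the value is UTF-16LE, as the sample bytes suggest.
function decodeSummaryString(string $raw): string
{
    if (function_exists('mb_convert_encoding')) {
        $text = mb_convert_encoding($raw, 'UTF-8', 'UTF-16LE');
    } else {
        // Fallback for builds without mbstring; iconv() returns false on failure.
        $text = (string) iconv('UTF-16LE', 'UTF-8', $raw);
    }
    // Drop the terminating null character(s).
    return rtrim($text, "\0");
}

// The 12 bytes quoted above.
echo decodeSummaryString(hex2bin('7400690074006c0065000000')), "\n";
```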

Now that I've retrieved the title, I continue to the next IFD entry.  The file said there were 5 entries (go figger...there's 5 fields I'm playing with), and you can follow the pointers to each one.
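Putting the whole walk together, a loop over the IFD entries could look roughly like this. Treat it as a sketch under several assumptions: little-endian ("II") data, the five tag numbers from the table above, 2-byte little-endian text, mbstring or iconv being installed, and a $tiffStart you have already located (0Eh in my sample). readSummaryFields() is an invented name.

```php
<?php
// Walk every entry of the 0th IFD and collect the summary fields.
// Assumptions: little-endian data, and $tiffStart points at the TIFF header.
function readSummaryFields(string $data, int $tiffStart): array
{
    $names = [
        0x9C9B => 'title', 0x9C9C => 'comment', 0x9C9D => 'author',
        0x9C9E => 'keyword', 0x9C9F => 'subject',
    ];
    if (strlen($data) < $tiffStart + 8) {
        return [];
    }

    // Bytes 4-7 of the TIFF header hold the offset of the 0th IFD.
    $ifd = $tiffStart + unpack('V', substr($data, $tiffStart + 4, 4))[1];
    if (strlen($data) < $ifd + 2) {
        return [];
    }

    // The IFD starts with a 2-byte entry count, followed by the 12-byte entries.
    $count = unpack('v', substr($data, $ifd, 2))[1];

    $found = [];
    for ($i = 0; $i < $count; $i++) {
        $raw = substr($data, $ifd + 2 + 12 * $i, 12);
        if (strlen($raw) < 12) {
            break; // truncated file
        }
        $e = unpack('vtag/vtype/Vcount/Voffset', $raw);
        if (!isset($names[$e['tag']])) {
            continue; // not one of the five fields
        }
        // These tags are type BYTE, so count is the length in bytes.
        // Values of 4 bytes or fewer sit inside the entry itself;
        // longer ones live at TIFF start + offset.
        $bytes = $e['count'] <= 4
            ? substr($raw, 8, $e['count'])
            : substr($data, $tiffStart + $e['offset'], $e['count']);

        $text = function_exists('mb_convert_encoding')
            ? mb_convert_encoding($bytes, 'UTF-8', 'UTF-16LE')
            : (string) iconv('UTF-16LE', 'UTF-8', $bytes);
        $found[$names[$e['tag']]] = rtrim($text, "\0");
    }
    return $found;
}
```

You would call it with something like readSummaryFields(file_get_contents($path), 0x0E), where $path is your own file and the second argument is wherever the TIFF header starts in it.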

Good luck!

Author Comment

ID: 18855197
Thank you both for your quick response!  The reason I asked how to do it in PHP4 is that my host is GoDaddy and I *thought* they only allowed PHP4.  Fortunately I just found out that if you change the extension of your PHP file to ".php5" you can use PHP5.  Therefore I am just going to use the handy "exif_read_data();".  For those of you checking out this question you can find more information about the exif function here:
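For anyone taking the same route, here is a rough sketch of how exif_read_data() might be used for this. The file name is a placeholder, pickSummaryFields() is a made-up helper, and the key names it looks for (Title, Subject, Author, Keywords, Comments) are an assumption about how the exif extension labels these fields; dump the full array with print_r() on one of your own photos to confirm what your PHP build reports.

```php
<?php
// Pick the Summary-tab fields out of the array returned by exif_read_data().
// Assumption: the key names below match what the exif extension reports.
function pickSummaryFields(array $exif): array
{
    $wanted = ['Title', 'Subject', 'Author', 'Keywords', 'Comments'];
    $found = [];
    // With $as_arrays = true the result is grouped into sections,
    // so look inside each section for the keys we want.
    foreach ($exif as $section) {
        if (!is_array($section)) {
            continue;
        }
        foreach ($wanted as $key) {
            if (isset($section[$key]) && !isset($found[$key])) {
                $found[$key] = $section[$key];
            }
        }
    }
    return $found;
}

$path = 'photo.jpg'; // hypothetical file name
if (function_exists('exif_read_data') && is_file($path)) {
    // Third argument true groups the tags by section.
    $exif = exif_read_data($path, null, true);
    if ($exif !== false) {
        print_r(pickSummaryFields($exif));
    }
}
```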
