Solved

Decoding SMS PDUs

Posted on 2009-04-12
Last Modified: 2013-12-09
Dear experts,

I am developing a library that can be used to encode and decode SMS text messages. I have attached a debug log from the DLL.

The DLL works perfectly for plain text SMS messages (with no EMS content), including concatenated/multi-part messages. My challenge is when the SMS has an EMS attachment such as a SmallPicture, PredefinedSound, or Formatted Text. It then fails to decode the User Data part (SM) when the default 7-bit alphabet has been used; see the attached log for details. However, if you remove some bytes from the beginning of the User Data, you can decode part of the message correctly.

For example, in the log there is the PDU message (the very last message in the log as received from the GSM modem):

07916277010120F4440B916277640266F300009040210145908031090003FC02020B022505907683E8F737081E96D3E72E90CC4D06C1C3723A081D9E83C2A07699FD26E77520

All details (MTI, Addresses, Time Stamp, User Data Header, Short Message) are correct, but I am failing to decode the User Data (i.e. Short Message):

907683E8F737081E96D3E72E90CC4D06C1C3723A081D9E83C2A07699FD26E77520

When I try to decode this message using my DLL I am getting garbage like this:

"Hvù"?àù$å?'Éw$Fs&ù$å?'ìBÅßì
ùj.fü&NW¥"

whereas I should be getting the text:

"in two parts. 2nd part has a melody: "

If I remove the first 6 characters from the beginning of the user data, I am able to get part of the text like so:

E8F737081E96D3E72E90CC4D06C1C3723A081D9E83C2A07699FD26E77520

gives me this:  

"two parts. 2nd part has a melody: "

Any idea where I could be going wrong?
SM-Debug.txt
Question by: bmatumbura
8 Comments
 
Expert Comment by: xtravagan
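Judging from your data string, the first octet of the SMS-DELIVER is 44, which means TP-UDHI is set and the TP-UD begins with a User Data Header.

If I am not off:

31 = TP-UDL = 49 (0x31), counted in septets because the default 7-bit alphabet is used
09 = Length of user data header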

0003 FC0202 - IE A (concatenated message?)
0B02 2505 - IE B

UD
907683E8F737081E96D3E72E90CC4D06C1C3723A081D9E83C2A07699FD26E77520
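
The field breakdown above can be reproduced with a short sketch. It assumes an SMS-DELIVER PDU with an SMSC prefix and the absolute 7-octet time stamp, as in the message from the question; the function name `split_sms_deliver` and the returned keys are made up for illustration and are not part of the asker's DLL.

```python
def split_sms_deliver(pdu_hex):
    """Split an SMS-DELIVER PDU (with SMSC prefix) into its main fields."""
    data = bytes.fromhex(pdu_hex)
    pos = 0
    smsc_len = data[pos]                 # octets of SMSC info that follow
    pos += 1 + smsc_len
    first_octet = data[pos]              # 0x44 here: TP-UDHI (bit 6) is set
    pos += 1
    oa_digits = data[pos]                # originating address length in digits
    pos += 1
    pos += 1 + (oa_digits + 1) // 2      # type-of-address + packed digits
    pid, dcs = data[pos], data[pos + 1]
    pos += 2
    pos += 7                             # service centre time stamp
    udl = data[pos]                      # septets when the 7-bit alphabet is used
    pos += 1
    has_udh = bool(first_octet & 0x40)
    udhl = data[pos] if has_udh else 0
    udh = data[pos + 1:pos + 1 + udhl] if has_udh else b""
    body = data[pos + 1 + udhl:] if has_udh else data[pos:]
    return {"first_octet": first_octet, "pid": pid, "dcs": dcs,
            "udl": udl, "udhl": udhl, "udh": udh, "ud": body}


PDU = ("07916277010120F4440B916277640266F300009040210145908031090003FC0202"
       "0B022505907683E8F737081E96D3E72E90CC4D06C1C3723A081D9E83C2A07699FD26E77520")
fields = split_sms_deliver(PDU)
print(fields["udl"], fields["udhl"], fields["udh"].hex().upper())
```

For this PDU the sketch reports a UDL of 49, a header length of 9, and the header bytes 0003FC02020B022505, leaving 33 octets of packed user data.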

You said you removed 6 bytes to get your text, but in your example you removed only 3?

I will have to encode your text to be sure what the text should look like.

Author Comment by: bmatumbura
Thanks for the timely response xtravagan:

Correct, the UDL + User Data Header + SM is:

31090003FC02020B022505907683E8F737081E96D3E72E90CC4D06C1C3723A081D9E83C2A07699FD26E77520

Thus the User Data Header (after its length octet, 09) is:

0003FC02020B022505

and the SM is:

907683E8F737081E96D3E72E90CC4D06C1C3723A081D9E83C2A07699FD26E77520

Note: I said I removed 6 characters (hex digits), which is 3 bytes.

Let me know if you do succeed in decoding the message correctly.

Author Comment by: bmatumbura
The user data header is being correctly decoded as shown in the extract from the log in the code window below.

Thus the header:

0003FC02020B022505

has two information elements:

0003FC0202, interpreted as follows :-

00 - ConcatenatedShortMessage8BitRef
03 - Data Length
FC - Message Reference
02 - Total number of concatenated parts
02 - Second part of concatenated message
and

0B022505, interpreted as follows:-

0B - PredefinedSound/Melody
02 - Data Length
25 - ???
05 - Predefined Melody number???

+ SM User Data Header +
=========================
Length of UDH: 9
Number of Information Elements: 2

Information Element 1
+ UDH Information Element +
=============================
  IE Identifier: ConcatenatedShortMessage8BitRef
  IE Data Length: 3
  IE Data: FC0202

Information Element 1
+ UDH Information Element +
=============================
  IE Identifier: PredefinedSound
  IE Data Length: 2
  IE Data: 2505
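
The walk over the header that yields the two Information Elements above can be sketched as follows. Each element is an identifier octet, a length octet, and that many data octets. The function name `parse_information_elements` is hypothetical. As far as I read 3GPP TS 23.040, the two data octets of the PredefinedSound element are a position in the text followed by the sound number, which would make 25 a position of 37 and 05 the sound number; treat that reading as an assumption to check against the spec.

```python
def parse_information_elements(udh):
    """Return (identifier, data) pairs from a UDH given without its length octet."""
    elements = []
    i = 0
    while i + 2 <= len(udh):
        iei = udh[i]             # Information Element Identifier
        length = udh[i + 1]      # number of data octets that follow
        elements.append((iei, udh[i + 2:i + 2 + length]))
        i += 2 + length
    return elements


UDH = bytes.fromhex("0003FC02020B022505")
for iei, ie_data in parse_information_elements(UDH):
    print(f"IEI {iei:02X}  length {len(ie_data)}  data {ie_data.hex().upper()}")
```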

Expert Comment by: xtravagan
I don't think I follow now. I would have thought that the SM is the melody in some sort of format, and that the first chunk was the above-mentioned text?

I ask because I can't seem to find the text in
E8F737081E96D3E72E90CC4D06C1C3723A081D9E83C2A07699FD26E77520
with or without the leading 907683.

To me, decoded as 7-bit GSM alphabet, this is not

"two parts. 2nd part has a melody: "

?


Author Comment by: bmatumbura
You need to take into account some Fill Bits at the beginning of the SM:

907683E8F737081E96D3E72E90CC4D06C1C3723A081D9E83C2A07699FD26E77520

The byte 90 has some fill bits in it as the User Data Header does not end on a septet boundary. Please refer to page 71 - Figure 9.2.3.24 (a)  of the "3GPP TS 23.040" specification (http://www.3gpp.org/ftp/Specs/archive/23_series/23.040/23040-840.zip) for details.
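
A minimal sketch of what the fill bits mean in practice: work out how many bits pad the header up to a septet boundary, then start unpacking the SM at that bit offset. The helper names are made up for illustration, and `chr()` is used as a shortcut that only holds because every character in this particular message has the same code in the GSM default alphabet and in ASCII; a real decoder needs the full GSM 7-bit table.

```python
def fill_bit_count(header_octets):
    """Bits needed to pad header_octets (UDHL octet included) to a septet boundary."""
    return (7 - (header_octets * 8) % 7) % 7


def unpack_septets(data, bit_offset, count):
    """Read `count` 7-bit values from `data`, starting `bit_offset` bits in."""
    # GSM packing fills each octet from its least significant bit, which is
    # exactly the bit order of a little-endian integer.
    bits = int.from_bytes(data, "little") >> bit_offset
    return [(bits >> (7 * i)) & 0x7F for i in range(count)]


UDL = 0x31                                   # 49 septets, header included
HEADER_OCTETS = 10                           # UDHL octet + 9 header octets
SM = bytes.fromhex("907683E8F737081E96D3E72E90CC4D06C1C3723A"
                   "081D9E83C2A07699FD26E77520")

fill = fill_bit_count(HEADER_OCTETS)
header_septets = (HEADER_OCTETS * 8 + fill) // 7
text_septets = UDL - header_septets

# Assumption: plain chr() instead of the GSM default alphabet table.
text = "".join(chr(s) for s in unpack_septets(SM, fill, text_septets))
print(fill, header_septets, text_septets, repr(text))
```

With a 10-octet header this gives 4 fill bits and 12 header septets, leaving 37 septets of text.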

If you attempt to decode the entire Header + SM:

31090003FC02020B022505907683E8F737081E96D3E72E90CC4D06C1C3723A081D9E83C2A07699FD26E77520

using the default 7-bit alphabet, then, taking into account the fill bits in the SM, you should get something like:

"£ç@Æ¿/¡¡é$J›@in two parts. 2nd part has a melody: "


Expert Comment by: abel
@bmatumbura: please do not delete a question when there's an answer available. Instead, post the answer and select "Accept As Solution" for your own comment. I have found this an interesting discussion to follow (had similar problem) and would love to see the question archived with the proper solution.

Accepted Solution by: bmatumbura
Thanks abel, here is the solution:

The UDL + UDH + UD PDU (i.e. the entire TPDU payload SM) is:

31090003FC02020B022505907683E8F737081E96D3E72E90CC4D06C1C3723A081D9E83C2A07699FD26E77520

This can be broken into:

31 - UDL
090003FC02020B022505 - UDH (length octet 09 followed by 9 header octets)
907683E8F737081E96D3E72E90CC4D06C1C3723A081D9E83C2A07699FD26E77520 - UD

Now, the UDH (including its length octet) is 10 octets, i.e. 10 * 8 = 80 bits in total. 80 is not a multiple of 7, so the header does not end on a septet boundary, which a GSM 7-bit decoder requires. Fill bits are therefore inserted at the start of the first octet of the UD, and that is what makes the UD difficult to decode on its own. To work around this problem, I decided to decode the entire UDH + UD and discard the first 12 characters of the result, as they really represent the UDH when decoded (80 bits + 4 fill bits to make the UDH end on a septet boundary = 84 bits = 84 / 7 = 12 septets, i.e. 12 GSM 7-bit characters).

So I decoded:

090003FC02020B022505907683E8F737081E96D3E72E90CC4D06C1C3723A081D9E83C2A07699FD26E77520

after taking out the octet: 31 representing the UDL.
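
The workaround described above can be sketched like this: unpack UDH + UD as one run of UDL septets from bit 0, then throw away the septets that cover the header and its fill bits. The names are hypothetical, and `chr()` again stands in for a proper GSM 7-bit lookup table, which is only safe here because the text of this message uses characters whose GSM codes match ASCII.

```python
def unpack_septets(data, count):
    """Read `count` 7-bit values from GSM-packed `data`, starting at bit 0."""
    bits = int.from_bytes(data, "little")    # octets are filled LSB first
    return [(bits >> (7 * i)) & 0x7F for i in range(count)]


def decode_sm_with_udh(udh_and_ud, udl):
    """Decode UDH + UD together and drop the septets that belong to the header."""
    header_octets = udh_and_ud[0] + 1        # UDHL value plus the UDHL octet itself
    header_septets = -(-header_octets * 8 // 7)   # ceiling division
    septets = unpack_septets(udh_and_ud, udl)
    # Assumption: chr() instead of the GSM default alphabet table.
    return "".join(chr(s) for s in septets[header_septets:])


PAYLOAD = bytes.fromhex("090003FC02020B022505907683E8F737081E96D3E72E90CC4D"
                        "06C1C3723A081D9E83C2A07699FD26E77520")
print(repr(decode_sm_with_udh(PAYLOAD, 0x31)))
```

Computing the number of header septets from the UDHL octet, rather than hard-coding 12, keeps the same approach usable for headers of other lengths.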

Expert Comment by: abel
Thanks for the extensive follow-up, that will help others well.