
Encoding information in a string

Tom_Hickerson

I am trying to compress/encode a lot of information into the smallest number of alphanumeric characters.  For example, I would like to record the date and time down to the second in as few characters as possible.  I can use hexadecimal, but that only takes advantage of 0-9 and a-f.  Is there something I can use that will use 0-9 and a-z, so that I can encode more info?

 


In order to get to the minimum size you need to decide how many bits you want to devote to each element.  I used to store dates in three bytes: one for the year, one for the month, and one for the day.  This limited me to 256 years but that is usually enough (I figured I would be dead before it caused a problem).  I also wasted some space by storing the month and day in whole bytes.  There is a trade-off between the time required to encode/decode the date and its size (not to mention the programming effort required).  Here's how I stored dates:

given October 16, 2001

MyDateString = chr$(101) & chr$(10) & chr$(16)

I based my years on 1900 (1900+101=2001).  Note that the date is sortable in this format without any decoding.  The resulting string can be decoded using the Mid$ and Asc functions.  The concept could be extended to include the time.
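
A minimal sketch of the three-byte scheme described above, assuming the year is stored as an offset from 1900 as in the post.  The function names EncodeDate3 and DecodeDate3 are made up for illustration and are not from the original code.

```vb
' Hypothetical helpers for the three-byte date format.
' Character 1 = year - 1900 (0 to 255), character 2 = month, character 3 = day.
Public Function EncodeDate3(ByVal d As Date) As String
    EncodeDate3 = Chr$(Year(d) - 1900) & Chr$(Month(d)) & Chr$(Day(d))
End Function

' Rebuild the date from the three character codes with Mid$ and Asc.
Public Function DecodeDate3(ByVal s As String) As Date
    DecodeDate3 = DateSerial(1900 + Asc(Mid$(s, 1, 1)), _
                             Asc(Mid$(s, 2, 1)), _
                             Asc(Mid$(s, 3, 1)))
End Function
```

For example, DecodeDate3(EncodeDate3(DateSerial(2001, 10, 16))) should give back October 16, 2001.  To rely on the sort order, compare the encoded strings with StrComp(a, b, vbBinaryCompare).  Year offsets above 127 depend on the system code page, so a Byte array may be the safer container for those.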

Commented:
Hi!
Not optimal, but very simple, is to use one character (one byte) for each of month, day, year, hours, minutes and seconds, which means a 6-character string (6 bytes) for each datetime.

Saso

Commented:
Assuming that you do not want to allow anything to get lost, here's what you need:

seconds 0-59 requires 6 bits (0-63)
minutes 0-59 requires 6 bits (0-63)
hours   0-23 requires 5 bits (0-31)
days    1-31 requires 5 bits (0-31)
months  1-12 requires 4 bits (0-15)
years   0-99 requires 7 bits (0-127)
century ???? requires ? bits

This would be a minimum of 33 bits if you omit the century, which rounds up to 5 bytes.  It might be compressible to 4 bytes (32 bits) if you manipulate the leftover combinations of the above or if you limit your years to a 64-year range (6 bits instead of 7).  Also, DOS file timestamps only record every other second, so you could save a bit that way to get it down to 4 bytes.

Alternatively, you can do what most date processors do: store the number of days since a baseline date, with the fractional part representing the fraction of a day, which can be translated into hours/minutes/seconds.
--
Important question:  since memory is so cheap, why would you want to spend a lot of time reprocessing it to save a few bytes here and there?
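
The baseline idea above can be sketched with whole seconds instead of fractional days, so that the result fits in a single 32-bit Long.  This is only an illustration: the baseline of January 1, 2000 and the names BASE_DATE, SecondsSinceBase and DateFromSeconds are assumptions, not anything from the thread.

```vb
' Assumed baseline.  A Long holds up to 2,147,483,647, which is roughly
' 68 years of seconds, so dates must fall within that span of the baseline.
Private Const BASE_DATE As Date = #1/1/2000#

' Number of whole seconds from the baseline to d.
Public Function SecondsSinceBase(ByVal d As Date) As Long
    SecondsSinceBase = DateDiff("s", BASE_DATE, d)
End Function

' Inverse of SecondsSinceBase.
Public Function DateFromSeconds(ByVal n As Long) As Date
    DateFromSeconds = DateAdd("s", n, BASE_DATE)
End Function
```

Hex$(SecondsSinceBase(Now)) would then be at most eight hexadecimal characters for a complete date and time.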

Commented:
Tom,
   Whew!  You have quite a question there!!!  I normally make my proposed answers brief and to the point, however you are asking something that, albeit to the point, cannot be explained briefly...

   In your question, you stated:
   "I am trying to compress/encode a lot of information into the smallest number of alpha numeric characters.
   For example I would like to record the date and time ..."

   I think the KEY words in your question are "...a lot of information..." and "For example..."  I am assuming you have more you want to compress than just dates, correct?  IF this is the case, the algorithm necessary to do this is going to be much more difficult, and I'm not sure if ANYONE will be able to give you a code example of how to do this, especially when you use the word "information" (it leaves open the question, "What KIND of information?  Text file?  Database file?  .jpg file?" etc.).

   I can explain to you how file compression routines work, such as the ones used by WinZip, WinAce, etc.
   I have also taken the time to write you a routine that will EnCode and DeCode dates.  This code is below my explanation of file compression.  It has been tested and works great!  Just copy and paste the code into a form module!

FILE COMPRESSION
   File compression works by coding repetitive data.  For example, if you had a text file with the following text:

"How now brown cow. Show me how to mow grass low."
(48 bytes)

Notice all the "ow" within the two sentences.  These two characters are very repetitive and are good candidates for compression.  A compressed version of that sentence, with "~" standing in for the code character, would look similar to:

"H~ n~ br~n c~. Sh~ me h~ to m~ grass l~."
(40 bytes)

The character "~" would, of course, be decoded as "ow".  You could compress the two sentences even further by coding the ". " between each sentence, and designating a "key code" to the two characters.  
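
The substitution described above can be tried with VB's Replace function.  This is only a toy sketch of the idea, not how WinZip or any real compressor works; "~" is an arbitrary stand-in character chosen because it does not occur in the text, and ShowSubstitution is a made-up name.

```vb
Public Sub ShowSubstitution()
    Const PLAIN As String = "How now brown cow. Show me how to mow grass low."
    Dim packed As String

    ' Swap every "ow" for the one-character stand-in.
    packed = Replace(PLAIN, "ow", "~")

    Debug.Print Len(PLAIN), Len(packed)              ' 48 and 40
    Debug.Print Replace(packed, "~", "ow") = PLAIN   ' True: decoding restores the text
End Sub
```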

Greater compression results can generally be achieved with sound and picture files because there is A LOT of redundant data.

Your requested example of a date is a pretty simple one.  Numbers are the easiest to compress.  Below is a program I wrote (especially for you!) that will encode and decode dates.

-----------------------------
INSTRUCTIONS:  Create a form.  Place 3 TextBoxes and 2 CommandButtons on the form.  Don't worry about naming them, as we will be using the default names generated by VisualBasic.

In the first textbox (Text1), enter your date in the following format:  "1/5/01 4:17:08 PM" (17 bytes)

Click the first command button (Command1).

Your encoded date will appear in the second textbox (Text2).  It is six characters long (6 bytes); most of them are non-printable control characters, so the textbox may show them as boxes or blanks.

Now, click the second command button (Command2).  The third textbox (Text3) will display the decoded date.
If you have any questions or comments, I'll be glad to assist.

Hope this helps!



Private Sub Command1_Click()
Dim strDate As String
Dim strLookUp As String
Dim dteDate As Date
Dim intPos As Integer
Dim intDelimiter As Integer
Dim EncodedDate As String
Dim intVal As Integer
Dim x As Integer

    strDate = Text1.Text    'Your Date Here  (String data-type)
                            'Input date in this format: "10/16/01 3:30:44 PM"
   
    intPos = 1
    intDelimiter = 0
    strLookUp = "/"
    x = 0
   
    'Encode Date
    Do
        If strLookUp = " " Then
            Exit Do
        End If
       
        If intPos > 4 Then
            strLookUp = " "
        End If
       
        intDelimiter = InStr(intPos, strDate, strLookUp)
       
        If intDelimiter = 0 Then
            Exit Do
        End If
       
        intVal = CInt(Mid(strDate, intPos, intDelimiter - intPos))
        If intVal = 0 Then
            intVal = 254
        End If
           
        EncodedDate = EncodedDate & Chr(intVal)
        intPos = intDelimiter + 1
       
        Debug.Print "|" & intVal & "|"
    Loop
 
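'Encode time
    intDelimiter = 0
    intPos = InStr(1, strDate, " ") + 1
    intDelimiter = InStr(intPos, strDate, ":")
   
    intVal = CInt(Mid(strDate, intPos, intDelimiter - intPos))
   
    'Convert the hour to 24-hour time: 12 AM becomes 0, 12 PM stays 12
    If UCase(Right(strDate, 2)) = "PM" Then
        If intVal < 12 Then intVal = intVal + 12
    ElseIf UCase(Right(strDate, 2)) = "AM" And intVal = 12 Then
        intVal = 0
    End If
   
    'A value of 0 is stored as 254, as in the date loop above
    If intVal = 0 Then
        intVal = 254
    End If
    EncodedDate = EncodedDate & Chr(intVal)
   
    Debug.Print "|" & intVal & "|"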
   
    strLookUp = ":"
   
    Do
        x = x + 1
        If strLookUp = " " Then
            Exit Do
        End If
       
        If x = 2 Then
            strLookUp = " "
        End If
   
        intPos = intDelimiter + 1
        intDelimiter = InStr(intPos, strDate, strLookUp)
               
        If intDelimiter = 0 Then
            Exit Do
        End If
       
        intVal = CInt(Mid(strDate, intPos, intDelimiter - intPos))
        If intVal = 0 Then
            intVal = 254
        End If
       
        EncodedDate = EncodedDate & Chr(intVal)
       
        intPos = intDelimiter + 1
       

    Loop

    Text2.Text = EncodedDate

    ''Variable "EncodedDate" is now complete
    ''Place code here to do "your thing" with the encoded date.
    ''You could also make this Sub a function and pass in a date value (just
    ''make sure to convert it to a string prior to executing this sub
    ''i.e. - CStr(YourDateAsDateDataType)
   
End Sub




Private Sub Command2_Click()
Dim strDate As String
Dim tmpDate As String
Dim DecodedDate As Date
Dim intPos As Integer
Dim intDecoded As Integer
Dim strAMPM As String

    strDate = Text2.Text    'Your encoded date here
    intPos = 1
    tmpDate = ""
   
'Decode Date
    For intPos = 1 To 3
        intDecoded = Asc(Mid(strDate, intPos, 1))
        If intDecoded = 254 Then
            intDecoded = 0      'the encoder stores 0 as 254
        End If
        tmpDate = tmpDate & CStr(intDecoded) & "/"
    Next
   
    tmpDate = Left(tmpDate, Len(tmpDate) - 1) & " "
   
'Decode Time
    For intPos = 4 To 6
        intDecoded = Asc(Mid(strDate, intPos, 1))
        If intDecoded = 254 Then
            intDecoded = 0      'the encoder stores 0 as 254
        End If
       
        If intPos = 4 Then
            'The hour is stored in 24-hour form; convert it back to 12-hour
            If intDecoded >= 12 Then
                strAMPM = " PM"
                intDecoded = intDecoded - 12
            Else
                strAMPM = " AM"
            End If
            If intDecoded = 0 Then intDecoded = 12
        End If
       
        tmpDate = tmpDate & CStr(intDecoded) & ":"
    Next
   
    tmpDate = Left(tmpDate, (Len(tmpDate) - 1)) & strAMPM
    DecodedDate = Format(CDate(tmpDate), "mm/dd/yy hh:Nn:Ss AM/PM")
   
    Text3.Text = DecodedDate
           
End Sub

Commented:
Breakdown of the code:

If I haven't taken up enough space with my previous comment, allow me to take more....

EnCoding---
Basically, I have taken each number and created a character with the number's respective ASCII value.  Because each number depicts a whole number within the date, delimiting the encoded date or time is not necessary (I've removed all "/", ":", and spaces).
For the year 2000 (two-digit years), the value would equate to 0, and the character with code 0 is the null character.  Text boxes and many other controls cut a string off at the first null character, so I substituted 0 with 254.
To maintain AM/PM setting without explicitly encoding "AM" or "PM", I convert the hour into 24-hr time (a.k.a. military time), and later decipher it in the DeCoding.
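
As a small aside on the 24-hour conversion described above: if the text is first converted to a Date, VB's built-in Hour function already returns the hour in 24-hour form, including the 12 AM and 12 PM edge cases.  A minimal sketch (Show24HourConversion is a made-up name, and how CDate reads the string depends on the machine's regional settings):

```vb
Public Sub Show24HourConversion()
    Dim d As Date

    ' Parsing of the text depends on regional settings (month/day order).
    d = CDate("1/5/01 4:17:08 PM")

    Debug.Print Hour(d)                        ' 16
    Debug.Print Hour(CDate("12:05:00 AM"))     ' 0
    Debug.Print Format$(d, "h:nn:ss AM/PM")    ' back to 12-hour form, e.g. 4:17:08 PM
End Sub
```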

Author

Commented:
I guess I should have explained things better.  We are creating a tag number that we are putting on our product.  It will need to include the date, time, number of pieces, type of product and possibly a few other details.  It has to consist of alphanumeric characters (sorry CArnold) so that it can be printed out, and it has to be kept as short as possible.

What I was trying to get at was that hex uses 0-9 and A-F, so n digits can represent values from 0 to 16^n-1.  I can use the Hex function to convert back and forth.

Is there any way, other than creating my own code, to use 0-9 and a-z, so that n digits can represent values from 0 to 36^n-1?

 



Commented:
Hmmm...
This seems like an entirely different question altogether.  However, to answer your question, yes, you can.  We'll have to make slight adjustments to the code I've provided you.  I'll give you a new code listing in the morning.
Commented:
Tom_H,
   This is the code you need for what you are requesting.  The dates are coded into an alpha-numeric format.  For instance:

The date:  10/30/01 10:17:08 AM
Encoded date:  K4BKRI

The date:  12/25/00 9:05:08 PM
Encoded date:  MZAVFI

Follow the same instructions I included with my previous code.  Hope this fulfills your needs.

-----------------------------------------------

Option Explicit

'------------ ENCODE -----------------------
Private Sub Command1_Click()
Dim strDate As String
Dim strLookUp As String
Dim dteDate As Date
Dim intPos As Integer
Dim intDelimiter As Integer
Dim EncodedDate As String
Dim intVal As Integer
Dim intCount As Integer
Dim x As Integer

    strDate = Text1.Text    'Your Date Here  (string data-type)
   
    intPos = 1
    intDelimiter = 0
    strLookUp = "/"
    x = 0
   
    'Encode Date
    Do
        If strLookUp = " " Then
            Exit Do
        End If
       
        If intPos > 4 Then
            strLookUp = " "
        End If
       
        intDelimiter = InStr(intPos, strDate, strLookUp)
       
        If intDelimiter = 0 Then
            Exit Do
        End If
       
        intVal = CInt(Mid(strDate, intPos, intDelimiter - intPos)) + 65
        If intVal > 90 Then
            intVal = intVal - 43
        End If
           
        EncodedDate = EncodedDate & Chr(intVal)
        intPos = intDelimiter + 1
    Loop
   
   
'Encode time
    intDelimiter = 0
    intPos = InStr(1, strDate, " ") + 1
    intDelimiter = InStr(intPos, strDate, ":")
   
    intVal = CInt(Mid(strDate, intPos, intDelimiter - intPos))
   
    'Convert the hour to 24-hour time: 12 AM becomes 0, 12 PM stays 12
    If UCase(Right(strDate, 2)) = "PM" Then
        If intVal < 12 Then intVal = intVal + 12
    ElseIf UCase(Right(strDate, 2)) = "AM" And intVal = 12 Then
        intVal = 0
    End If
   
    'Hours 0-23 map to the letters "A" to "X"
    EncodedDate = EncodedDate & Chr(intVal + 65)
   
    strLookUp = ":"
   
    Do
        x = x + 1
        If strLookUp = " " Then
            Exit Do
        End If
       
        If x = 2 Then
            strLookUp = " "
        End If
   
        intPos = intDelimiter + 1
        intDelimiter = InStr(intPos, strDate, strLookUp)
               
        If intDelimiter = 0 Then
            Exit Do
        End If
       
        intVal = CInt(Mid(strDate, intPos, intDelimiter - intPos)) + 65
        If intVal > 90 Then
            intVal = intVal - 43
        End If
       
        EncodedDate = EncodedDate & Chr(intVal)
       
        intPos = intDelimiter + 1
    Loop

    Text2.Text = EncodedDate
End Sub


'-------------- DECODE -------------------------
Private Sub Command2_Click()
Dim strDate As String
Dim tmpDate As String
Dim DecodedDate As Date
Dim intPos As Integer
Dim intDecoded As Integer
Dim strAMPM As String

    strDate = Text2.Text    'Your encoded date here
    intPos = 1
    tmpDate = ""
   
'Decode Date
    For intPos = 1 To 3
        intDecoded = Asc(Mid(strDate, intPos, 1)) - 65
        If intDecoded < 0 Then
            intDecoded = intDecoded + 43
            tmpDate = tmpDate & CStr(intDecoded) & "/"
        ElseIf intDecoded = 0 Then
            tmpDate = tmpDate & "0/"
        Else
            tmpDate = tmpDate & CStr(intDecoded) & "/"
        End If
       
    Next
   
    tmpDate = Left(tmpDate, Len(tmpDate) - 1) & " "
   
'Decode Time
    For intPos = 4 To 6
        intDecoded = Asc(Mid(strDate, intPos, 1)) - 65
        If intDecoded < 0 Then
            intDecoded = intDecoded + 43    'digits "0" to "9" hold 26 to 35
        End If
       
        If intPos = 4 Then
            'The hour is stored in 24-hour form; convert it back to 12-hour
            If intDecoded >= 12 Then
                strAMPM = " PM"
                intDecoded = intDecoded - 12
            Else
                strAMPM = " AM"
            End If
            If intDecoded = 0 Then intDecoded = 12
        End If
       
        tmpDate = tmpDate & CStr(intDecoded) & ":"
    Next
   
    tmpDate = Left(tmpDate, (Len(tmpDate) - 1)) & strAMPM
    DecodedDate = Format(CDate(tmpDate), "mm/dd/yy hh:Nn:Ss AM/PM")
   
    Text3.Text = DecodedDate
           
End Sub
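
One caveat with the one-character-per-field scheme above: a single alphanumeric character can only distinguish 36 values, so minutes, seconds or two-digit years above 35 cannot be stored this way (36 comes out as ":", and 43 comes out as "A", the same as 0).  An alternative, sketched below purely as an illustration, is to treat the whole timestamp as one number and write that number in base 36.  The names DIGITS36, ToBase36 and FromBase36 are made up for this sketch.

```vb
Private Const DIGITS36 As String = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"

' Convert a non-negative Long to a base-36 string.
Public Function ToBase36(ByVal n As Long) As String
    Dim s As String
    If n < 0 Then Err.Raise 5       ' invalid procedure call or argument
    Do
        s = Mid$(DIGITS36, (n Mod 36) + 1, 1) & s
        n = n \ 36
    Loop While n > 0
    ToBase36 = s
End Function

' Convert a base-36 string back to a Long.
' Raises an overflow error if the value does not fit in a Long.
Public Function FromBase36(ByVal s As String) As Long
    Dim i As Long, p As Long, n As Long
    For i = 1 To Len(s)
        p = InStr(1, DIGITS36, UCase$(Mid$(s, i, 1)), vbBinaryCompare)
        If p = 0 Then Err.Raise 5   ' not a base-36 digit
        n = n * 36 + (p - 1)
    Next
    FromBase36 = n
End Function
```

For instance, ToBase36(DateDiff("s", #1/1/2000#, Now)) encodes the date and time to the second, counted from an assumed baseline of January 1, 2000.  Because 36^6 is larger than the biggest Long, the result is never more than six characters, and DateAdd("s", FromBase36(tag), #1/1/2000#) recovers the date.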

Commented:
If you want to convert each character to hex notation, you will end up with virtually the same number of characters.

"What I was trying to get at was that hex uses 0-9 and A-F  or 16^n-1 for each digit.  I can use hex
function to convert back and forth."

IF your goal is to "encode" the dates to take up as little room as possible, writing out the character codes in hex would not be any more efficient than writing them in decimal.  Here are a few examples, showing the character codes of the digits that make up each number:

-------------------------
NUMBER:     9

Decimal:    57
Hex:        39
My routine: J
--------------------------
NUMBER:     30

Decimal:    (51)(48)
Hex:        (33)(30)
My routine: 4
--------------------------

You COULD use hex and later translate it to dec, but it would prove counterproductive to your goal of "be printed out and kept as short as possible".

For quantity of items, just explicitly enumerate it.  No encoding is needed; doing so would be moot.
For item description, why not assign each item an item-number?  Many companies use Part No. and/or Item No. for this.
Create a table with item numbers and their respective item description.  Just reference this table to determine Item No. or Item Description.

Do you want some help setting this up?  I would be happy to assist however I can.

CArnold

Commented:
Tom_H,
  How's it coming along?  Did any of the suggestions help?

You're invited to participate in the discussion at:

http://www.experts-exchange.com/jsp/qManageQuestion.jsp?ta=visualbasic&qid=20207806
