Solved

HEX in varchar to int or bigint conversion

Posted on 2003-10-22
27
1,462 Views
Last Modified: 2012-06-27
Hi all,

Im desperately trying to convert a hex value into either an int or big int.

The snag is that the hex number is bulk inserted and the sourcefile does not stipulate 0x infront of the number so bulk insert blows up if i use varbinary to import the field.


Thus im bulk inserting the 10 million+ records into a varchar field, I then need to convert this to int.

I cant afford any form of cursors or string manupulation to do this.   because the operation nees to complete within a few seconds.


In short....  I want to be able to convert to an integer a hex value stored in a varchar where the varchar looks something like this:

MyField
---------
f575e
524eea

etc.
0
Comment
Question by:ShogunWade
  • 12
  • 10
  • 2
  • +2
27 Comments
 
LVL 69

Expert Comment

by:ScottPletcher
Comment Utility
I suggest a "brute force" method, similar to:

UPDATE yourTable
SET intColumn =
      (CHARINDEX(SUBSTRING(hexColumn, LEN(hexColumn), 1), '0123456789ABCDEF') - 1) +
      CASE WHEN LEN(hexColumn) > 1 THEN
            CHARINDEX(SUBSTRING(hexColumn, LEN(hexColumn) - 1, 1), '0123456789ABCDEF') - 1) * 16
            ELSE 0 END +
      CASE WHEN LEN(hexColumn) > 2 THEN
            (CHARINDEX(SUBSTRING(hexColumn, LEN(hexColumn) - 2, 1), '0123456789ABCDEF') - 1) * 16 * 16
            ELSE 0 END +
... and so on, for up to 8 chars ...


This assumes a length of at least one; if needed, add a CASE check to the first calc as well.

This also assumes that no bad hex chars are present.  If necessary, add a WHERE to the UPDATE to insure that no invalid values are converted.

0
 
LVL 69

Accepted Solution

by:
ScottPletcher earned 100 total points
Comment Utility
Now that I look at it, I think >= is clearer than >, and I finished the full 8, something like this:

UPDATE ...
SET ... =
      (CHARINDEX(SUBSTRING(@hexColumn, LEN(@hexColumn), 1), '0123456789ABCDEF') - 1) +
      CASE WHEN LEN(@hexColumn) >= 2 THEN
            (CHARINDEX(SUBSTRING(@hexColumn, LEN(@hexColumn) - 1, 1), '0123456789ABCDEF') - 1) * 16
            ELSE 0 END +
      CASE WHEN LEN(@hexColumn) >= 3 THEN
            (CHARINDEX(SUBSTRING(@hexColumn, LEN(@hexColumn) - 2, 1), '0123456789ABCDEF') - 1) * 16 * 16
            ELSE 0 END +
      CASE WHEN LEN(@hexColumn) >= 4 THEN
            (CHARINDEX(SUBSTRING(@hexColumn, LEN(@hexColumn) - 3, 1), '0123456789ABCDEF') - 1) * 16 * 16 * 16
            ELSE 0 END +
      CASE WHEN LEN(@hexColumn) >= 5 THEN
            (CHARINDEX(SUBSTRING(@hexColumn, LEN(@hexColumn) - 4, 1), '0123456789ABCDEF') - 1) * 16 * 16 * 16 * 16
            ELSE 0 END +
      CASE WHEN LEN(@hexColumn) >= 6 THEN
            (CHARINDEX(SUBSTRING(@hexColumn, LEN(@hexColumn) - 5, 1), '0123456789ABCDEF') - 1) * 16 * 16 * 16 * 16 * 16
            ELSE 0 END +
      CASE WHEN LEN(@hexColumn) >= 7 THEN
            (CHARINDEX(SUBSTRING(@hexColumn, LEN(@hexColumn) - 6, 1), '0123456789ABCDEF') - 1) * 16 * 16 * 16 * 16 * 16 * 16
            ELSE 0 END +
      CASE WHEN LEN(@hexColumn) >= 8 THEN
            (CHARINDEX(SUBSTRING(@hexColumn, LEN(@hexColumn) - 7, 1), '0123456789ABCDEF') - 1) * 16 * 16 * 16 * 16 * 16 * 16 * 16
            ELSE 0 END
0
 
LVL 69

Expert Comment

by:ScottPletcher
Comment Utility
Obviously you can normalize the 16 * 16 ... to a fixed number; I was just in a hurry :-).
0
 
LVL 18

Author Comment

by:ShogunWade
Comment Utility
Thanks for the suggestion Scott, however I tried a very similar technique but it was to slow.   The thing is that this 10 million records is just from one of many files received to by the server every day.  all the data is denormalised and unvalidates and gets cleaned normalised aggregated and archived all within a very time timeline, I was kind of hoping that there would be some form of cast or convert type method of solving the problem.
0
 
LVL 18

Author Comment

by:ShogunWade
Comment Utility
If there is no faster alternative then I will accept your answer.
0
 
LVL 69

Expert Comment

by:ScottPletcher
Comment Utility
Hmm, given today's CPUs, I wouldn't have that logic would be *that* slow.  You could use C++ (or assembler) to create a .DLL and use that as an extended SP to do the conversion.  That approach seems to follow what MS does when they need more speed :-)
0
 
LVL 18

Author Comment

by:ShogunWade
Comment Utility
I agree that it is not v slow but given the very tight timeline to perform all the processing the task must complete within a few seconds max.
0
 
LVL 18

Author Comment

by:ShogunWade
Comment Utility
Having re-thought over the problem what i actually need is not necesarily a true conversion....

Il try to explain....

Each record in the file consists of a number of fields   the combination of 4 of them must be unique for a "dialogue" (ie a series of records that relate to each other)   one field is a hex value but it is stored purely for "threading the dialogues together" (ie joining) but must be availble or recalculatable for queries later.    Each "dialogue" is assigned a newid() upon initial normalisation and thus the hex reference is purely informational after this stage.

I thought maybe doing CONVERT(varbinary(20),myhexfield)  

myhexfield is max 9 hex characters at present

however the ultimately I need to be able to store the resultant value in an int datatype.   if i do CONVERT(int,CONVERT(varbinary(20),myhexfield)  
)  i will overflow max int.

In other words I need a crude and quick method of taking a hex value in a varchar field and converting it to any int value that is reversable.   It doesntmatter how slow the reversal is so long as it can be done and the initial convertion is lightning fast
0
 
LVL 18

Author Comment

by:ShogunWade
Comment Utility
The chance of the other three fields in a given dialogue colliding is is small  about 0.5%

0
 
LVL 75

Expert Comment

by:Anthony Perkins
Comment Utility
I strongly suspect that I am oversimplifying the problem, but perhaps you could resort to dynamic SQL as in:

exec('select cast(' + myhexfield + ' as bigint) from table1')

Anthony
0
 
LVL 69

Expert Comment

by:ScottPletcher
Comment Utility
When you said you were storing the result in an INT, I assumed the max length was 8, since an INT can only handle up to x'7FFFFFFF'.
0
 
LVL 69

Expert Comment

by:ScottPletcher
Comment Utility
You could use dynamic SQL, but whether or not that would be quicker is problematic, because I think you would have to use sp_executeSQL with an OUTPUT parameter to get the translated value back.
0
 
LVL 18

Author Comment

by:ShogunWade
Comment Utility
Sorry,  im suffereing from brain death today i think......  the hex values i get are unsigned so i planned to offset the value by min int but event still, U are correct my value wont fit into an int so no algorthm applied to it can possibly be truely reversable.

thus i must either use a "lossy" function" or a bigint.

AC, let me thing for a touch on what you are suggesting and see if it will do the trick.
0
Complete VMware vSphere® ESX(i) & Hyper-V Backup

Capture your entire system, including the host, with patented disk imaging integrated with VMware VADP / Microsoft VSS and RCT. RTOs is as low as 15 seconds with Acronis Active Restore™. You can enjoy unlimited P2V/V2V migrations from any source (even from a different hypervisor)

 
LVL 69

Expert Comment

by:ScottPletcher
Comment Utility
You could use a BIGINT.  And it's easy to convert the value from INT back to HEX, you just reverse the process above.  It would be slower, but you said that's not an issue during reversal.
0
 
LVL 69

Expert Comment

by:ScottPletcher
Comment Utility
Oh, btw, BIGINT will slow it down :-(
0
 
LVL 18

Author Comment

by:ShogunWade
Comment Utility
AC, im not quite following,    

Myhexfield is a field within table1.. so i dont entirely follow how this would work.
0
 
LVL 18

Author Comment

by:ShogunWade
Comment Utility
A bit more info and a bit more thought.......


The most significant bits of the hex number can be lossed without to much of a deremental effect in order to squeze the result into an int.  

Perhaps I should have mentioned also that disk space is at a premium,  though cheap, I filled 1.5TB since sepember so saving 4bytes per record is pretty important.
0
 
LVL 69

Expert Comment

by:ScottPletcher
Comment Utility
If you can strip the value down to eight chars, then of course you will be OK going to an int.  You could even convert eight chars to an int, and the remaining char to a tinyint -- 5 bytes instead of 8 and likely just as fast, or faster, than a BIGINT.

0
 
LVL 75

Expert Comment

by:Anthony Perkins
Comment Utility
As Scott has suggested you would have to use sp_executeSQL with OUTPUT, but this is sort of what I had in mind:

declare @X as varchar(20)
Set @X = '0xffffffffff'
exec('select cast(' + @X + ' as bigint)')

Again, I am sure I am not fully understanding what you need, so feel free to ignore this comment <g>

Anthony
0
 
LVL 69

Expert Comment

by:ScottPletcher
Comment Utility
You could also, of course, create a table with hex values between 00 and FF and the corresponding decimal values and use a join to get the values.  I originally thought that this would be slower than the calculation method because it would require a join for every group of two bytes in the hex string, in this case 5 joins (since it's 9 chars, the last join would of course be '0' + the ninth char).  But I guess you never know, in theory I guess it could be quicker, since it would be only one data page, although I'd be very surprised.
0
 
LVL 18

Author Comment

by:ShogunWade
Comment Utility
Hmm,  I must admit the same thought crossed my mind, but i havent tested it.    

Ill give it a try in the morning (UK time) and let you know how it goes.   Thanks for the input so far will pick up on this tomorrow.

Ta
0
 
LVL 26

Expert Comment

by:Hilaire
Comment Utility
Hi,

I've been following this thread with much interest since the beginning.

Beeing very curious about extending SQL server capabilities, and having previous experience in creating open server in Sybase (looks like SQL server's ODS) to add interface with MOMs (IBM MQseries), i tried to write a piece of C code that could do the job.

It should be quite fast since it uses >> bit-shift operators to build and int value from a HEX string (length 0-8 cars)

The problem I'm facing now is that I don't know how to make this function available in SQL server.
From what I have understood so far, xp_* user-defined procedures are supposed to return recordsets, but I'm not sure they can be used like a normal function in a statement like

update mytable set intvalue=xp_HexToInt(strHexValue)

If the workaround is to call the sp with output parameter inside a cursor, it will take ages and It's not compatible with ShogunWade's requirements

ScottPletcher was suggesting to use a dll, any ideas how to implement it ??

If anybody cares, I can post the C code

Hilaire
0
 
LVL 26

Assisted Solution

by:Hilaire
Hilaire earned 25 total points
Comment Utility
Hi,

from what I have read so far,
it sounds like using an extended stored procedures is not a suitable solution in this situation, since the only way to call it would be

exec xp_strHexToInt(@strIn varchar(8), @intValue int output)

for each row .... :((

I also tried to do the job with a function declared as a method in a active X dll. Calling it using sp_OA* is even worse ...

I think (I may be wrong) that creating user-defined external scalar functions is not possible for the moment.

It seems that the new SQL server version (Yukon ?) due in 2004 will allow to extend SQL Server built-in scalar functions with user libraries made with .Net Studio (.net Framework).

The syntax should be something like

Create assembly MyLib from '\myPath\myDll'

Create function ext_udf_MyFunction(@param1 type, ....) returns output_datatype
external name 'MyLib.MyFunction' deterministic

We'll have to wait ...

Nobody seems to care, but here's my code (its' written anyway)
(no #includes needed (just <stdio.h>))

int strHexToInt(char *val) {
      int retval;
      retval=0;
      switch(*val){
            case 'A':
            case 'a':
                  retval=10; break;
            case 'B':
            case 'b':
                  retval=11; break;
            case 'C':
            case 'c':
                  retval=12; break;
            case 'D':
            case 'd':
                  retval=13; break;
            case 'E':
            case 'e':
                  retval=14; break;
            case 'F':
            case 'f':
                  retval=15; break;
            case '0':
            case '1':
            case '2':
            case '3':
            case '4':
            case '5':
            case '6':
            case '7':
            case '8':
            case '9':
                  retval=(*val)-48; break;
            default:
                  return 0;
      }
      while (*(++val))
      {
            retval <<= 4;
            switch(*val){
                  case 'A':
                  case 'a':
                        retval+=10; break;
                  case 'B':
                  case 'b':
                        retval+=11; break;
                  case 'C':
                  case 'c':
                        retval+=12; break;
                  
                  case 'D':
                  case 'd':
                        retval+=13; break;
                  case 'E':
                  case 'e':
                        retval+=14; break;
                  case 'F':
                  case 'f':
                        retval+=15; break;
                  case '0':
                  case '1':
                  case '2':
                  case '3':
                  case '4':
                  case '5':
                  case '6':
                  case '7':
                  case '8':
                  case '9':
                        retval+=(*val)-48; break;
                  default:
                        return 0;
            }
      }
      return retval;
}





I hope my contribution is not completely off-topic.
Forgive me and my crappy english if it is.

Cheers

Hilaire
0
 
LVL 18

Author Comment

by:ShogunWade
Comment Utility
Hi Hilaire,

Sorry i didnt respond a yesterday,   I was off sick.   Im just catching up on a few things and hopefully in a couple of hours or so will get back onto the thread

Regards

Mike
0
 
LVL 34

Expert Comment

by:arbert
Comment Utility
Just a suggestion, if you only really need to return and integer to link the original "hex" values together couldn't you use the CHECKSUM function around the field you want to get the int/bigint for?  Not sure how slow it would be....

Brett
0
 
LVL 18

Author Comment

by:ShogunWade
Comment Utility
Hi All,

I havent forgotten about this thread,  Im just a touch tied up with other stuff at the mo.

Arbert,   as a temporary workaround I have been using checksum, but unfortunately I an getting to many colisions and it takes about 3 mins to run.  

Havent benchmarked all the other suggestions yet,  hopefully may have time to get back onto this later today.
0
 
LVL 18

Author Comment

by:ShogunWade
Comment Utility
Hi All,

ive been doing some benchmarking and am still struggling with the performance issue.  I suspect that I may need to rethink the problem.

There are some good suggestions and help you have given me. Rather than keeping the question open,  I thought I would close it and split the pointsas below
If this is acceptable to you all, I would suggest I give:

100 to Scott
and 25 to Hilaire

0

Featured Post

Why You Should Analyze Threat Actor TTPs

After years of analyzing threat actor behavior, it’s become clear that at any given time there are specific tactics, techniques, and procedures (TTPs) that are particularly prevalent. By analyzing and understanding these TTPs, you can dramatically enhance your security program.

Join & Write a Comment

Suggested Solutions

JSON is being used more and more, besides XML, and you surely wanted to parse the data out into SQL instead of doing it in some Javascript. The below function in SQL Server can do the job for you, returning a quick table with the parsed data.
Ever wondered why sometimes your SQL Server is slow or unresponsive with connections spiking up but by the time you go in, all is well? The following article will show you how to install and configure a SQL job that will send you email alerts includ…
Familiarize people with the process of retrieving data from SQL Server using an Access pass-thru query. Microsoft Access is a very powerful client/server development tool. One of the ways that you can retrieve data from a SQL Server is by using a pa…
Via a live example, show how to extract insert data into a SQL Server database table using the Import/Export option and Bulk Insert.

743 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

12 Experts available now in Live!

Get 1:1 Help Now