Still celebrating National IT Professionals Day with 3 months of free Premium Membership. Use Code ITDAY17

x
Solved

# HEX in varchar to int or bigint conversion

Posted on 2003-10-22
Medium Priority
1,493 Views
Hi all,

Im desperately trying to convert a hex value into either an int or big int.

The snag is that the hex number is bulk inserted and the sourcefile does not stipulate 0x infront of the number so bulk insert blows up if i use varbinary to import the field.

Thus im bulk inserting the 10 million+ records into a varchar field, I then need to convert this to int.

I cant afford any form of cursors or string manupulation to do this.   because the operation nees to complete within a few seconds.

In short....  I want to be able to convert to an integer a hex value stored in a varchar where the varchar looks something like this:

MyField
---------
f575e
524eea

etc.
0
[X]
###### Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

• Help others & share knowledge
• Earn cash & points
• 12
• 10
• 2
• +2

LVL 70

Expert Comment

ID: 9598869
I suggest a "brute force" method, similar to:

UPDATE yourTable
SET intColumn =
(CHARINDEX(SUBSTRING(hexColumn, LEN(hexColumn), 1), '0123456789ABCDEF') - 1) +
CASE WHEN LEN(hexColumn) > 1 THEN
CHARINDEX(SUBSTRING(hexColumn, LEN(hexColumn) - 1, 1), '0123456789ABCDEF') - 1) * 16
ELSE 0 END +
CASE WHEN LEN(hexColumn) > 2 THEN
(CHARINDEX(SUBSTRING(hexColumn, LEN(hexColumn) - 2, 1), '0123456789ABCDEF') - 1) * 16 * 16
ELSE 0 END +
... and so on, for up to 8 chars ...

This assumes a length of at least one; if needed, add a CASE check to the first calc as well.

This also assumes that no bad hex chars are present.  If necessary, add a WHERE to the UPDATE to insure that no invalid values are converted.

0

LVL 70

Accepted Solution

Scott Pletcher earned 300 total points
ID: 9598898
Now that I look at it, I think >= is clearer than >, and I finished the full 8, something like this:

UPDATE ...
SET ... =
(CHARINDEX(SUBSTRING(@hexColumn, LEN(@hexColumn), 1), '0123456789ABCDEF') - 1) +
CASE WHEN LEN(@hexColumn) >= 2 THEN
(CHARINDEX(SUBSTRING(@hexColumn, LEN(@hexColumn) - 1, 1), '0123456789ABCDEF') - 1) * 16
ELSE 0 END +
CASE WHEN LEN(@hexColumn) >= 3 THEN
(CHARINDEX(SUBSTRING(@hexColumn, LEN(@hexColumn) - 2, 1), '0123456789ABCDEF') - 1) * 16 * 16
ELSE 0 END +
CASE WHEN LEN(@hexColumn) >= 4 THEN
(CHARINDEX(SUBSTRING(@hexColumn, LEN(@hexColumn) - 3, 1), '0123456789ABCDEF') - 1) * 16 * 16 * 16
ELSE 0 END +
CASE WHEN LEN(@hexColumn) >= 5 THEN
(CHARINDEX(SUBSTRING(@hexColumn, LEN(@hexColumn) - 4, 1), '0123456789ABCDEF') - 1) * 16 * 16 * 16 * 16
ELSE 0 END +
CASE WHEN LEN(@hexColumn) >= 6 THEN
(CHARINDEX(SUBSTRING(@hexColumn, LEN(@hexColumn) - 5, 1), '0123456789ABCDEF') - 1) * 16 * 16 * 16 * 16 * 16
ELSE 0 END +
CASE WHEN LEN(@hexColumn) >= 7 THEN
(CHARINDEX(SUBSTRING(@hexColumn, LEN(@hexColumn) - 6, 1), '0123456789ABCDEF') - 1) * 16 * 16 * 16 * 16 * 16 * 16
ELSE 0 END +
CASE WHEN LEN(@hexColumn) >= 8 THEN
(CHARINDEX(SUBSTRING(@hexColumn, LEN(@hexColumn) - 7, 1), '0123456789ABCDEF') - 1) * 16 * 16 * 16 * 16 * 16 * 16 * 16
ELSE 0 END
0

LVL 70

Expert Comment

ID: 9598906
Obviously you can normalize the 16 * 16 ... to a fixed number; I was just in a hurry :-).
0

LVL 18

Author Comment

ID: 9598927
Thanks for the suggestion Scott, however I tried a very similar technique but it was to slow.   The thing is that this 10 million records is just from one of many files received to by the server every day.  all the data is denormalised and unvalidates and gets cleaned normalised aggregated and archived all within a very time timeline, I was kind of hoping that there would be some form of cast or convert type method of solving the problem.
0

LVL 18

Author Comment

ID: 9598940
If there is no faster alternative then I will accept your answer.
0

LVL 70

Expert Comment

ID: 9598989
Hmm, given today's CPUs, I wouldn't have that logic would be *that* slow.  You could use C++ (or assembler) to create a .DLL and use that as an extended SP to do the conversion.  That approach seems to follow what MS does when they need more speed :-)
0

LVL 18

Author Comment

ID: 9599081
I agree that it is not v slow but given the very tight timeline to perform all the processing the task must complete within a few seconds max.
0

LVL 18

Author Comment

ID: 9599429
Having re-thought over the problem what i actually need is not necesarily a true conversion....

Il try to explain....

Each record in the file consists of a number of fields   the combination of 4 of them must be unique for a "dialogue" (ie a series of records that relate to each other)   one field is a hex value but it is stored purely for "threading the dialogues together" (ie joining) but must be availble or recalculatable for queries later.    Each "dialogue" is assigned a newid() upon initial normalisation and thus the hex reference is purely informational after this stage.

I thought maybe doing CONVERT(varbinary(20),myhexfield)

myhexfield is max 9 hex characters at present

however the ultimately I need to be able to store the resultant value in an int datatype.   if i do CONVERT(int,CONVERT(varbinary(20),myhexfield)
)  i will overflow max int.

In other words I need a crude and quick method of taking a hex value in a varchar field and converting it to any int value that is reversable.   It doesntmatter how slow the reversal is so long as it can be done and the initial convertion is lightning fast
0

LVL 18

Author Comment

ID: 9599457
The chance of the other three fields in a given dialogue colliding is is small  about 0.5%

0

LVL 75

Expert Comment

ID: 9599657
I strongly suspect that I am oversimplifying the problem, but perhaps you could resort to dynamic SQL as in:

exec('select cast(' + myhexfield + ' as bigint) from table1')

Anthony
0

LVL 70

Expert Comment

ID: 9599669
When you said you were storing the result in an INT, I assumed the max length was 8, since an INT can only handle up to x'7FFFFFFF'.
0

LVL 70

Expert Comment

ID: 9599695
You could use dynamic SQL, but whether or not that would be quicker is problematic, because I think you would have to use sp_executeSQL with an OUTPUT parameter to get the translated value back.
0

LVL 18

Author Comment

ID: 9599817
Sorry,  im suffereing from brain death today i think......  the hex values i get are unsigned so i planned to offset the value by min int but event still, U are correct my value wont fit into an int so no algorthm applied to it can possibly be truely reversable.

thus i must either use a "lossy" function" or a bigint.

AC, let me thing for a touch on what you are suggesting and see if it will do the trick.
0

LVL 70

Expert Comment

ID: 9599829
You could use a BIGINT.  And it's easy to convert the value from INT back to HEX, you just reverse the process above.  It would be slower, but you said that's not an issue during reversal.
0

LVL 70

Expert Comment

ID: 9599832
Oh, btw, BIGINT will slow it down :-(
0

LVL 18

Author Comment

ID: 9599853
AC, im not quite following,

Myhexfield is a field within table1.. so i dont entirely follow how this would work.
0

LVL 18

Author Comment

ID: 9599936

The most significant bits of the hex number can be lossed without to much of a deremental effect in order to squeze the result into an int.

Perhaps I should have mentioned also that disk space is at a premium,  though cheap, I filled 1.5TB since sepember so saving 4bytes per record is pretty important.
0

LVL 70

Expert Comment

ID: 9599982
If you can strip the value down to eight chars, then of course you will be OK going to an int.  You could even convert eight chars to an int, and the remaining char to a tinyint -- 5 bytes instead of 8 and likely just as fast, or faster, than a BIGINT.

0

LVL 75

Expert Comment

ID: 9600079
As Scott has suggested you would have to use sp_executeSQL with OUTPUT, but this is sort of what I had in mind:

declare @X as varchar(20)
Set @X = '0xffffffffff'
exec('select cast(' + @X + ' as bigint)')

Again, I am sure I am not fully understanding what you need, so feel free to ignore this comment <g>

Anthony
0

LVL 70

Expert Comment

ID: 9600145
You could also, of course, create a table with hex values between 00 and FF and the corresponding decimal values and use a join to get the values.  I originally thought that this would be slower than the calculation method because it would require a join for every group of two bytes in the hex string, in this case 5 joins (since it's 9 chars, the last join would of course be '0' + the ninth char).  But I guess you never know, in theory I guess it could be quicker, since it would be only one data page, although I'd be very surprised.
0

LVL 18

Author Comment

ID: 9600359
Hmm,  I must admit the same thought crossed my mind, but i havent tested it.

Ill give it a try in the morning (UK time) and let you know how it goes.   Thanks for the input so far will pick up on this tomorrow.

Ta
0

LVL 26

Expert Comment

ID: 9605393
Hi,

I've been following this thread with much interest since the beginning.

Beeing very curious about extending SQL server capabilities, and having previous experience in creating open server in Sybase (looks like SQL server's ODS) to add interface with MOMs (IBM MQseries), i tried to write a piece of C code that could do the job.

It should be quite fast since it uses >> bit-shift operators to build and int value from a HEX string (length 0-8 cars)

The problem I'm facing now is that I don't know how to make this function available in SQL server.
From what I have understood so far, xp_* user-defined procedures are supposed to return recordsets, but I'm not sure they can be used like a normal function in a statement like

update mytable set intvalue=xp_HexToInt(strHexValue)

If the workaround is to call the sp with output parameter inside a cursor, it will take ages and It's not compatible with ShogunWade's requirements

ScottPletcher was suggesting to use a dll, any ideas how to implement it ??

If anybody cares, I can post the C code

Hilaire
0

LVL 26

Assisted Solution

Hilaire earned 75 total points
ID: 9612850
Hi,

from what I have read so far,
it sounds like using an extended stored procedures is not a suitable solution in this situation, since the only way to call it would be

exec xp_strHexToInt(@strIn varchar(8), @intValue int output)

for each row .... :((

I also tried to do the job with a function declared as a method in a active X dll. Calling it using sp_OA* is even worse ...

I think (I may be wrong) that creating user-defined external scalar functions is not possible for the moment.

It seems that the new SQL server version (Yukon ?) due in 2004 will allow to extend SQL Server built-in scalar functions with user libraries made with .Net Studio (.net Framework).

The syntax should be something like

Create assembly MyLib from '\myPath\myDll'

Create function ext_udf_MyFunction(@param1 type, ....) returns output_datatype
external name 'MyLib.MyFunction' deterministic

We'll have to wait ...

Nobody seems to care, but here's my code (its' written anyway)
(no #includes needed (just <stdio.h>))

int strHexToInt(char *val) {
int retval;
retval=0;
switch(*val){
case 'A':
case 'a':
retval=10; break;
case 'B':
case 'b':
retval=11; break;
case 'C':
case 'c':
retval=12; break;
case 'D':
case 'd':
retval=13; break;
case 'E':
case 'e':
retval=14; break;
case 'F':
case 'f':
retval=15; break;
case '0':
case '1':
case '2':
case '3':
case '4':
case '5':
case '6':
case '7':
case '8':
case '9':
retval=(*val)-48; break;
default:
return 0;
}
while (*(++val))
{
retval <<= 4;
switch(*val){
case 'A':
case 'a':
retval+=10; break;
case 'B':
case 'b':
retval+=11; break;
case 'C':
case 'c':
retval+=12; break;

case 'D':
case 'd':
retval+=13; break;
case 'E':
case 'e':
retval+=14; break;
case 'F':
case 'f':
retval+=15; break;
case '0':
case '1':
case '2':
case '3':
case '4':
case '5':
case '6':
case '7':
case '8':
case '9':
retval+=(*val)-48; break;
default:
return 0;
}
}
return retval;
}

I hope my contribution is not completely off-topic.
Forgive me and my crappy english if it is.

Cheers

Hilaire
0

LVL 18

Author Comment

ID: 9612940
Hi Hilaire,

Sorry i didnt respond a yesterday,   I was off sick.   Im just catching up on a few things and hopefully in a couple of hours or so will get back onto the thread

Regards

Mike
0

LVL 34

Expert Comment

ID: 9623624
Just a suggestion, if you only really need to return and integer to link the original "hex" values together couldn't you use the CHECKSUM function around the field you want to get the int/bigint for?  Not sure how slow it would be....

Brett
0

LVL 18

Author Comment

ID: 9626130
Hi All,

Arbert,   as a temporary workaround I have been using checksum, but unfortunately I an getting to many colisions and it takes about 3 mins to run.

Havent benchmarked all the other suggestions yet,  hopefully may have time to get back onto this later today.
0

LVL 18

Author Comment

ID: 9627919
Hi All,

ive been doing some benchmarking and am still struggling with the performance issue.  I suspect that I may need to rethink the problem.

There are some good suggestions and help you have given me. Rather than keeping the question open,  I thought I would close it and split the pointsas below
If this is acceptable to you all, I would suggest I give:

100 to Scott
and 25 to Hilaire

0

## Featured Post

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

A Stored Procedure in Microsoft SQL Server is a powerful feature that it can be used to execute the Data Manipulation Language (DML) or Data Definition Language (DDL). Depending on business requirements, a single Stored Procedure can return differeâ€¦
What if you have to shut down the entire Citrix infrastructure for hardware maintenance, software upgrades or "the unknown"? I developed this plan for "the unknown" and hope that it helps you as well. This article explains how to properly shut down â€¦
Using examples as well as descriptions, and references to Books Online, show the documentation available for date manipulation functions and by using a select few of these functions, show how date based data can be manipulated with these functions.
Using examples as well as descriptions, and references to Books Online, show the different Recovery Models available in SQL Server and explain, as well as show how full, differential and transaction log backups are performed
###### Suggested Courses
Course of the Month6 days, 22 hours left to enroll