Solved

HEX in varchar to int or bigint conversion

Posted on 2003-10-22
27
1,479 Views
Last Modified: 2012-06-27
Hi all,

Im desperately trying to convert a hex value into either an int or big int.

The snag is that the hex number is bulk inserted and the sourcefile does not stipulate 0x infront of the number so bulk insert blows up if i use varbinary to import the field.


Thus im bulk inserting the 10 million+ records into a varchar field, I then need to convert this to int.

I cant afford any form of cursors or string manupulation to do this.   because the operation nees to complete within a few seconds.


In short....  I want to be able to convert to an integer a hex value stored in a varchar where the varchar looks something like this:

MyField
---------
f575e
524eea

etc.
0
Comment
Question by:ShogunWade
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 12
  • 10
  • 2
  • +2
27 Comments
 
LVL 69

Expert Comment

by:Scott Pletcher
ID: 9598869
I suggest a "brute force" method, similar to:

UPDATE yourTable
SET intColumn =
      (CHARINDEX(SUBSTRING(hexColumn, LEN(hexColumn), 1), '0123456789ABCDEF') - 1) +
      CASE WHEN LEN(hexColumn) > 1 THEN
            CHARINDEX(SUBSTRING(hexColumn, LEN(hexColumn) - 1, 1), '0123456789ABCDEF') - 1) * 16
            ELSE 0 END +
      CASE WHEN LEN(hexColumn) > 2 THEN
            (CHARINDEX(SUBSTRING(hexColumn, LEN(hexColumn) - 2, 1), '0123456789ABCDEF') - 1) * 16 * 16
            ELSE 0 END +
... and so on, for up to 8 chars ...


This assumes a length of at least one; if needed, add a CASE check to the first calc as well.

This also assumes that no bad hex chars are present.  If necessary, add a WHERE to the UPDATE to insure that no invalid values are converted.

0
 
LVL 69

Accepted Solution

by:
Scott Pletcher earned 100 total points
ID: 9598898
Now that I look at it, I think >= is clearer than >, and I finished the full 8, something like this:

UPDATE ...
SET ... =
      (CHARINDEX(SUBSTRING(@hexColumn, LEN(@hexColumn), 1), '0123456789ABCDEF') - 1) +
      CASE WHEN LEN(@hexColumn) >= 2 THEN
            (CHARINDEX(SUBSTRING(@hexColumn, LEN(@hexColumn) - 1, 1), '0123456789ABCDEF') - 1) * 16
            ELSE 0 END +
      CASE WHEN LEN(@hexColumn) >= 3 THEN
            (CHARINDEX(SUBSTRING(@hexColumn, LEN(@hexColumn) - 2, 1), '0123456789ABCDEF') - 1) * 16 * 16
            ELSE 0 END +
      CASE WHEN LEN(@hexColumn) >= 4 THEN
            (CHARINDEX(SUBSTRING(@hexColumn, LEN(@hexColumn) - 3, 1), '0123456789ABCDEF') - 1) * 16 * 16 * 16
            ELSE 0 END +
      CASE WHEN LEN(@hexColumn) >= 5 THEN
            (CHARINDEX(SUBSTRING(@hexColumn, LEN(@hexColumn) - 4, 1), '0123456789ABCDEF') - 1) * 16 * 16 * 16 * 16
            ELSE 0 END +
      CASE WHEN LEN(@hexColumn) >= 6 THEN
            (CHARINDEX(SUBSTRING(@hexColumn, LEN(@hexColumn) - 5, 1), '0123456789ABCDEF') - 1) * 16 * 16 * 16 * 16 * 16
            ELSE 0 END +
      CASE WHEN LEN(@hexColumn) >= 7 THEN
            (CHARINDEX(SUBSTRING(@hexColumn, LEN(@hexColumn) - 6, 1), '0123456789ABCDEF') - 1) * 16 * 16 * 16 * 16 * 16 * 16
            ELSE 0 END +
      CASE WHEN LEN(@hexColumn) >= 8 THEN
            (CHARINDEX(SUBSTRING(@hexColumn, LEN(@hexColumn) - 7, 1), '0123456789ABCDEF') - 1) * 16 * 16 * 16 * 16 * 16 * 16 * 16
            ELSE 0 END
0
 
LVL 69

Expert Comment

by:Scott Pletcher
ID: 9598906
Obviously you can normalize the 16 * 16 ... to a fixed number; I was just in a hurry :-).
0
U.S. Department of Agriculture and Acronis Access

With the new era of mobile computing, smartphones and tablets, wireless communications and cloud services, the USDA sought to take advantage of a mobilized workforce and the blurring lines between personal and corporate computing resources.

 
LVL 18

Author Comment

by:ShogunWade
ID: 9598927
Thanks for the suggestion Scott, however I tried a very similar technique but it was to slow.   The thing is that this 10 million records is just from one of many files received to by the server every day.  all the data is denormalised and unvalidates and gets cleaned normalised aggregated and archived all within a very time timeline, I was kind of hoping that there would be some form of cast or convert type method of solving the problem.
0
 
LVL 18

Author Comment

by:ShogunWade
ID: 9598940
If there is no faster alternative then I will accept your answer.
0
 
LVL 69

Expert Comment

by:Scott Pletcher
ID: 9598989
Hmm, given today's CPUs, I wouldn't have that logic would be *that* slow.  You could use C++ (or assembler) to create a .DLL and use that as an extended SP to do the conversion.  That approach seems to follow what MS does when they need more speed :-)
0
 
LVL 18

Author Comment

by:ShogunWade
ID: 9599081
I agree that it is not v slow but given the very tight timeline to perform all the processing the task must complete within a few seconds max.
0
 
LVL 18

Author Comment

by:ShogunWade
ID: 9599429
Having re-thought over the problem what i actually need is not necesarily a true conversion....

Il try to explain....

Each record in the file consists of a number of fields   the combination of 4 of them must be unique for a "dialogue" (ie a series of records that relate to each other)   one field is a hex value but it is stored purely for "threading the dialogues together" (ie joining) but must be availble or recalculatable for queries later.    Each "dialogue" is assigned a newid() upon initial normalisation and thus the hex reference is purely informational after this stage.

I thought maybe doing CONVERT(varbinary(20),myhexfield)  

myhexfield is max 9 hex characters at present

however the ultimately I need to be able to store the resultant value in an int datatype.   if i do CONVERT(int,CONVERT(varbinary(20),myhexfield)  
)  i will overflow max int.

In other words I need a crude and quick method of taking a hex value in a varchar field and converting it to any int value that is reversable.   It doesntmatter how slow the reversal is so long as it can be done and the initial convertion is lightning fast
0
 
LVL 18

Author Comment

by:ShogunWade
ID: 9599457
The chance of the other three fields in a given dialogue colliding is is small  about 0.5%

0
 
LVL 75

Expert Comment

by:Anthony Perkins
ID: 9599657
I strongly suspect that I am oversimplifying the problem, but perhaps you could resort to dynamic SQL as in:

exec('select cast(' + myhexfield + ' as bigint) from table1')

Anthony
0
 
LVL 69

Expert Comment

by:Scott Pletcher
ID: 9599669
When you said you were storing the result in an INT, I assumed the max length was 8, since an INT can only handle up to x'7FFFFFFF'.
0
 
LVL 69

Expert Comment

by:Scott Pletcher
ID: 9599695
You could use dynamic SQL, but whether or not that would be quicker is problematic, because I think you would have to use sp_executeSQL with an OUTPUT parameter to get the translated value back.
0
 
LVL 18

Author Comment

by:ShogunWade
ID: 9599817
Sorry,  im suffereing from brain death today i think......  the hex values i get are unsigned so i planned to offset the value by min int but event still, U are correct my value wont fit into an int so no algorthm applied to it can possibly be truely reversable.

thus i must either use a "lossy" function" or a bigint.

AC, let me thing for a touch on what you are suggesting and see if it will do the trick.
0
 
LVL 69

Expert Comment

by:Scott Pletcher
ID: 9599829
You could use a BIGINT.  And it's easy to convert the value from INT back to HEX, you just reverse the process above.  It would be slower, but you said that's not an issue during reversal.
0
 
LVL 69

Expert Comment

by:Scott Pletcher
ID: 9599832
Oh, btw, BIGINT will slow it down :-(
0
 
LVL 18

Author Comment

by:ShogunWade
ID: 9599853
AC, im not quite following,    

Myhexfield is a field within table1.. so i dont entirely follow how this would work.
0
 
LVL 18

Author Comment

by:ShogunWade
ID: 9599936
A bit more info and a bit more thought.......


The most significant bits of the hex number can be lossed without to much of a deremental effect in order to squeze the result into an int.  

Perhaps I should have mentioned also that disk space is at a premium,  though cheap, I filled 1.5TB since sepember so saving 4bytes per record is pretty important.
0
 
LVL 69

Expert Comment

by:Scott Pletcher
ID: 9599982
If you can strip the value down to eight chars, then of course you will be OK going to an int.  You could even convert eight chars to an int, and the remaining char to a tinyint -- 5 bytes instead of 8 and likely just as fast, or faster, than a BIGINT.

0
 
LVL 75

Expert Comment

by:Anthony Perkins
ID: 9600079
As Scott has suggested you would have to use sp_executeSQL with OUTPUT, but this is sort of what I had in mind:

declare @X as varchar(20)
Set @X = '0xffffffffff'
exec('select cast(' + @X + ' as bigint)')

Again, I am sure I am not fully understanding what you need, so feel free to ignore this comment <g>

Anthony
0
 
LVL 69

Expert Comment

by:Scott Pletcher
ID: 9600145
You could also, of course, create a table with hex values between 00 and FF and the corresponding decimal values and use a join to get the values.  I originally thought that this would be slower than the calculation method because it would require a join for every group of two bytes in the hex string, in this case 5 joins (since it's 9 chars, the last join would of course be '0' + the ninth char).  But I guess you never know, in theory I guess it could be quicker, since it would be only one data page, although I'd be very surprised.
0
 
LVL 18

Author Comment

by:ShogunWade
ID: 9600359
Hmm,  I must admit the same thought crossed my mind, but i havent tested it.    

Ill give it a try in the morning (UK time) and let you know how it goes.   Thanks for the input so far will pick up on this tomorrow.

Ta
0
 
LVL 26

Expert Comment

by:Hilaire
ID: 9605393
Hi,

I've been following this thread with much interest since the beginning.

Beeing very curious about extending SQL server capabilities, and having previous experience in creating open server in Sybase (looks like SQL server's ODS) to add interface with MOMs (IBM MQseries), i tried to write a piece of C code that could do the job.

It should be quite fast since it uses >> bit-shift operators to build and int value from a HEX string (length 0-8 cars)

The problem I'm facing now is that I don't know how to make this function available in SQL server.
From what I have understood so far, xp_* user-defined procedures are supposed to return recordsets, but I'm not sure they can be used like a normal function in a statement like

update mytable set intvalue=xp_HexToInt(strHexValue)

If the workaround is to call the sp with output parameter inside a cursor, it will take ages and It's not compatible with ShogunWade's requirements

ScottPletcher was suggesting to use a dll, any ideas how to implement it ??

If anybody cares, I can post the C code

Hilaire
0
 
LVL 26

Assisted Solution

by:Hilaire
Hilaire earned 25 total points
ID: 9612850
Hi,

from what I have read so far,
it sounds like using an extended stored procedures is not a suitable solution in this situation, since the only way to call it would be

exec xp_strHexToInt(@strIn varchar(8), @intValue int output)

for each row .... :((

I also tried to do the job with a function declared as a method in a active X dll. Calling it using sp_OA* is even worse ...

I think (I may be wrong) that creating user-defined external scalar functions is not possible for the moment.

It seems that the new SQL server version (Yukon ?) due in 2004 will allow to extend SQL Server built-in scalar functions with user libraries made with .Net Studio (.net Framework).

The syntax should be something like

Create assembly MyLib from '\myPath\myDll'

Create function ext_udf_MyFunction(@param1 type, ....) returns output_datatype
external name 'MyLib.MyFunction' deterministic

We'll have to wait ...

Nobody seems to care, but here's my code (its' written anyway)
(no #includes needed (just <stdio.h>))

int strHexToInt(char *val) {
      int retval;
      retval=0;
      switch(*val){
            case 'A':
            case 'a':
                  retval=10; break;
            case 'B':
            case 'b':
                  retval=11; break;
            case 'C':
            case 'c':
                  retval=12; break;
            case 'D':
            case 'd':
                  retval=13; break;
            case 'E':
            case 'e':
                  retval=14; break;
            case 'F':
            case 'f':
                  retval=15; break;
            case '0':
            case '1':
            case '2':
            case '3':
            case '4':
            case '5':
            case '6':
            case '7':
            case '8':
            case '9':
                  retval=(*val)-48; break;
            default:
                  return 0;
      }
      while (*(++val))
      {
            retval <<= 4;
            switch(*val){
                  case 'A':
                  case 'a':
                        retval+=10; break;
                  case 'B':
                  case 'b':
                        retval+=11; break;
                  case 'C':
                  case 'c':
                        retval+=12; break;
                  
                  case 'D':
                  case 'd':
                        retval+=13; break;
                  case 'E':
                  case 'e':
                        retval+=14; break;
                  case 'F':
                  case 'f':
                        retval+=15; break;
                  case '0':
                  case '1':
                  case '2':
                  case '3':
                  case '4':
                  case '5':
                  case '6':
                  case '7':
                  case '8':
                  case '9':
                        retval+=(*val)-48; break;
                  default:
                        return 0;
            }
      }
      return retval;
}





I hope my contribution is not completely off-topic.
Forgive me and my crappy english if it is.

Cheers

Hilaire
0
 
LVL 18

Author Comment

by:ShogunWade
ID: 9612940
Hi Hilaire,

Sorry i didnt respond a yesterday,   I was off sick.   Im just catching up on a few things and hopefully in a couple of hours or so will get back onto the thread

Regards

Mike
0
 
LVL 34

Expert Comment

by:arbert
ID: 9623624
Just a suggestion, if you only really need to return and integer to link the original "hex" values together couldn't you use the CHECKSUM function around the field you want to get the int/bigint for?  Not sure how slow it would be....

Brett
0
 
LVL 18

Author Comment

by:ShogunWade
ID: 9626130
Hi All,

I havent forgotten about this thread,  Im just a touch tied up with other stuff at the mo.

Arbert,   as a temporary workaround I have been using checksum, but unfortunately I an getting to many colisions and it takes about 3 mins to run.  

Havent benchmarked all the other suggestions yet,  hopefully may have time to get back onto this later today.
0
 
LVL 18

Author Comment

by:ShogunWade
ID: 9627919
Hi All,

ive been doing some benchmarking and am still struggling with the performance issue.  I suspect that I may need to rethink the problem.

There are some good suggestions and help you have given me. Rather than keeping the question open,  I thought I would close it and split the pointsas below
If this is acceptable to you all, I would suggest I give:

100 to Scott
and 25 to Hilaire

0

Featured Post

Salesforce Has Never Been Easier

Improve and reinforce salesforce training & adoption using WalkMe's digital adoption platform. Start saving on costly employee training by creating fast intuitive Walk-Thrus for Salesforce. Claim your Free Account Now

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Title # Comments Views Activity
SQL 2012 Report Builder 3.0 query 2 18
Sql does not recognize null value from Oracle source 8 23
sql trace 4 22
invoke-sqlcmd help 5 27
Introduction In my previous article (http://www.experts-exchange.com/Microsoft/Development/MS-SQL-Server/SSIS/A_9150-Loading-XML-Using-SSIS.html) I showed you how the XML Source component can be used to load XML files into a SQL Server database, us…
In the first part of this tutorial we will cover the prerequisites for installing SQL Server vNext on Linux.
This videos aims to give the viewer a basic demonstration of how a user can query current session information by using the SYS_CONTEXT function
Viewers will learn how to use the SELECT statement in SQL to return specific rows and columns, with various degrees of sorting and limits in place.

737 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question