Solved

UDF for Fastest Possible IP translation

Posted on 2003-11-21
8
381 Views
Last Modified: 2008-02-07
hey i am working on a LARGE analysis services/datamart project where i need to group by different networks of ip addresses.

i have more than 64,000 IP addresses (especially for the destination fields)

i need to:

1) store these the most efficiently (smallest).  isnt integer the smallest way to do this?
2) convert these (to the destination format the most efficiently)
3) roll these up into different networks (ie i think that i want a table where i store:
IP_ADDY           PARENT_IP
192.168.5.112   192.168.5.0
192.168.5.113   192.168.5.0

my preferences are to not use any DTS/vba for this (because i am using bcp/bulk insert)

and the UDF that I have previously used seems WWWWWAAAAAAAAYYYYYYY tooo slow.

i have seen people use an extended stored proc to do this (written in C++ i assume)-- is this utterly the fastest way to do this?

i am talking about 300million new records every day.

thanks!!
0
Comment
Question by:__Holly__
  • 3
  • 2
  • 2
  • +1
8 Comments
 
LVL 34

Expert Comment

by:arbert
ID: 9801939
I assume that you would have 4integer fields to store each part of the IP?  If not, you would have to use a bigint field to be able to store the maximum possible IP address (that, and I'm not sure how you would know whether it was 192.168.5.0 or 192.16.85.0 when you go to convert it back)

As far as DTS, you can also use the Bulk Insert task within DTS :)

Have you thought about simply keeping the IPs stored as an integer and putting a view on top of the data for the users that add the "IP formatting" on the fly?
0
 
LVL 75

Expert Comment

by:Anthony Perkins
ID: 9802634
One approach you can take is to store the IP as a varchar column and use the T-SQL PARSENAME function to pull out individual parts of the IP address.  Yes, I realize this use of the PARSENAME function is not very orthodox as it is not intended for that purpose, but does do the job very well.

Anthony
0
 
LVL 9

Accepted Solution

by:
PaulBarbin earned 400 total points
ID: 9804682
I like the parsename approach.  I have used it and it works well.  

My question would be, do you need the individual octets or do you just need the complete IP address?  You have to be concerned about processing time as well as storage space.  For example, you could use four tinyint fields and store this in 4 bytes (I think that's the most storage efficient), but its definitely more processor intensive and costs time to separate it to insert and combine it to select.  

Another question to consider is: Are you having to compare previous data to see if the row already exists?  Or are you just recording new data every day?  Are you storing each day in one table and then doing a nightly merge?

Also, don't see the point of storing the parent address.  Is it always something you can calcuate?  That is, is the subnet the same?  Or could you just store the subnet value (another tinyint) and calcuate the parent when/if you need it.  If you often need this value, then I would suggest storing it.  I don't know whether it makes sense with this much volume to try and go the strict relational route which would indicate a foreign key to the parent IP address table.  Just thinking out loud...

Final note:
300 million records is a LOT of IP addresses!  Seems like you'd run out of unique IP addresses after a few weeks at that pace!  Can I ask what this is for?

Paul
0
NAS Cloud Backup Strategies

This article explains backup scenarios when using network storage. We review the so-called “3-2-1 strategy” and summarize the methods you can use to send NAS data to the cloud

 
LVL 9

Expert Comment

by:PaulBarbin
ID: 9804695
Sorry, just read your post again and I see that this is for OLAP (should have figured).

Btw, if you go the varchar route, then no UDF is necessary.  Otherwise, you could do something like this:

Insert INTO IPTable   (
     firstOctet,
     secondOctet,
     thirdOctet,
     fourthOctet)

Select
     ParseName(fullIPAddress, 1),
     ParseName(fullIPAddress, 2),
     ParseName(fullIPAddress, 3),
     ParseName(fullIPAddress, 4)
From MyTable

Paul
0
 
LVL 75

Expert Comment

by:Anthony Perkins
ID: 10232037
Do you still need help with this old open question?
0
 
LVL 34

Expert Comment

by:arbert
ID: 10232074
She's "deadly with SQL Server & Access"  but not very good at maintaining her questions....
0
 
LVL 75

Expert Comment

by:Anthony Perkins
ID: 10232120
LOL, but we can only hope.
0
 
LVL 1

Author Comment

by:__Holly__
ID: 10239505
sorry guys.

THANKS A TON that is exactly what i needed; and i am sure it is faster than all that other BS i was writing by hand.

i work for the big M and i report on internal traffic... the destination IPs have a high cardinality, but the source IPs are a lil less dynamic.

i think that i have something like 800k client ip addresses; but im still not too familiar with the real shape of it.
0

Featured Post

NAS Cloud Backup Strategies

This article explains backup scenarios when using network storage. We review the so-called “3-2-1 strategy” and summarize the methods you can use to send NAS data to the cloud

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Nowadays, some of developer are too much worried about data. Who is using data, who is updating it etc. etc. Because, data is more costlier in term of money and information. So security of data is focusing concern in days. Lets' understand the Au…
Having an SQL database can be a big investment for a small company. Hardware, setup and of course, the price of software all add up to a big bill that some companies may not be able to absorb.  Luckily, there is a free version SQL Express, but does …
Via a live example combined with referencing Books Online, show some of the information that can be extracted from the Catalog Views in SQL Server.
Viewers will learn how to use the INSERT statement to insert data into their tables. It will also introduce the NULL statement, to show them what happens when no value is giving for any given column.

807 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question