?
Solved

UDF for Fastest Possible IP translation

Posted on 2003-11-21
8
Medium Priority
?
385 Views
Last Modified: 2008-02-07
hey i am working on a LARGE analysis services/datamart project where i need to group by different networks of ip addresses.

i have more than 64,000 IP addresses (especially for the destination fields)

i need to:

1) store these the most efficiently (smallest).  isnt integer the smallest way to do this?
2) convert these (to the destination format the most efficiently)
3) roll these up into different networks (ie i think that i want a table where i store:
IP_ADDY           PARENT_IP
192.168.5.112   192.168.5.0
192.168.5.113   192.168.5.0

my preferences are to not use any DTS/vba for this (because i am using bcp/bulk insert)

and the UDF that I have previously used seems WWWWWAAAAAAAAYYYYYYY tooo slow.

i have seen people use an extended stored proc to do this (written in C++ i assume)-- is this utterly the fastest way to do this?

i am talking about 300million new records every day.

thanks!!
0
Comment
Question by:__Holly__
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 3
  • 2
  • 2
  • +1
8 Comments
 
LVL 34

Expert Comment

by:arbert
ID: 9801939
I assume that you would have 4integer fields to store each part of the IP?  If not, you would have to use a bigint field to be able to store the maximum possible IP address (that, and I'm not sure how you would know whether it was 192.168.5.0 or 192.16.85.0 when you go to convert it back)

As far as DTS, you can also use the Bulk Insert task within DTS :)

Have you thought about simply keeping the IPs stored as an integer and putting a view on top of the data for the users that add the "IP formatting" on the fly?
0
 
LVL 75

Expert Comment

by:Anthony Perkins
ID: 9802634
One approach you can take is to store the IP as a varchar column and use the T-SQL PARSENAME function to pull out individual parts of the IP address.  Yes, I realize this use of the PARSENAME function is not very orthodox as it is not intended for that purpose, but does do the job very well.

Anthony
0
 
LVL 9

Accepted Solution

by:
PaulBarbin earned 1600 total points
ID: 9804682
I like the parsename approach.  I have used it and it works well.  

My question would be, do you need the individual octets or do you just need the complete IP address?  You have to be concerned about processing time as well as storage space.  For example, you could use four tinyint fields and store this in 4 bytes (I think that's the most storage efficient), but its definitely more processor intensive and costs time to separate it to insert and combine it to select.  

Another question to consider is: Are you having to compare previous data to see if the row already exists?  Or are you just recording new data every day?  Are you storing each day in one table and then doing a nightly merge?

Also, don't see the point of storing the parent address.  Is it always something you can calcuate?  That is, is the subnet the same?  Or could you just store the subnet value (another tinyint) and calcuate the parent when/if you need it.  If you often need this value, then I would suggest storing it.  I don't know whether it makes sense with this much volume to try and go the strict relational route which would indicate a foreign key to the parent IP address table.  Just thinking out loud...

Final note:
300 million records is a LOT of IP addresses!  Seems like you'd run out of unique IP addresses after a few weeks at that pace!  Can I ask what this is for?

Paul
0
 [eBook] Windows Nano Server

Download this FREE eBook and learn all you need to get started with Windows Nano Server, including deployment options, remote management
and troubleshooting tips and tricks

 
LVL 9

Expert Comment

by:PaulBarbin
ID: 9804695
Sorry, just read your post again and I see that this is for OLAP (should have figured).

Btw, if you go the varchar route, then no UDF is necessary.  Otherwise, you could do something like this:

Insert INTO IPTable   (
     firstOctet,
     secondOctet,
     thirdOctet,
     fourthOctet)

Select
     ParseName(fullIPAddress, 1),
     ParseName(fullIPAddress, 2),
     ParseName(fullIPAddress, 3),
     ParseName(fullIPAddress, 4)
From MyTable

Paul
0
 
LVL 75

Expert Comment

by:Anthony Perkins
ID: 10232037
Do you still need help with this old open question?
0
 
LVL 34

Expert Comment

by:arbert
ID: 10232074
She's "deadly with SQL Server & Access"  but not very good at maintaining her questions....
0
 
LVL 75

Expert Comment

by:Anthony Perkins
ID: 10232120
LOL, but we can only hope.
0
 
LVL 1

Author Comment

by:__Holly__
ID: 10239505
sorry guys.

THANKS A TON that is exactly what i needed; and i am sure it is faster than all that other BS i was writing by hand.

i work for the big M and i report on internal traffic... the destination IPs have a high cardinality, but the source IPs are a lil less dynamic.

i think that i have something like 800k client ip addresses; but im still not too familiar with the real shape of it.
0

Featured Post

Concerto Cloud for Software Providers & ISVs

Can Concerto Cloud Services help you focus on evolving your application offerings, while delivering the best cloud experience to your customers? From DevOps to revenue models and customer support, the answer is yes!

Learn how Concerto can help you.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

In this article we will learn how to fix  “Cannot install SQL Server 2014 Service Pack 2: Unable to install windows installer msi file” error ?
When trying to connect from SSMS v17.x to a SQL Server Integration Services 2016 instance or previous version, you get the error “Connecting to the Integration Services service on the computer failed with the following error: 'The specified service …
Using examples as well as descriptions, and references to Books Online, show the documentation available for date manipulation functions and by using a select few of these functions, show how date based data can be manipulated with these functions.
This videos aims to give the viewer a basic demonstration of how a user can query current session information by using the SYS_CONTEXT function

718 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question