Solved

UDF for Fastest Possible IP translation

Posted on 2003-11-21
8
383 Views
Last Modified: 2008-02-07
hey i am working on a LARGE analysis services/datamart project where i need to group by different networks of ip addresses.

i have more than 64,000 IP addresses (especially for the destination fields)

i need to:

1) store these the most efficiently (smallest).  isnt integer the smallest way to do this?
2) convert these (to the destination format the most efficiently)
3) roll these up into different networks (ie i think that i want a table where i store:
IP_ADDY           PARENT_IP
192.168.5.112   192.168.5.0
192.168.5.113   192.168.5.0

my preferences are to not use any DTS/vba for this (because i am using bcp/bulk insert)

and the UDF that I have previously used seems WWWWWAAAAAAAAYYYYYYY tooo slow.

i have seen people use an extended stored proc to do this (written in C++ i assume)-- is this utterly the fastest way to do this?

i am talking about 300million new records every day.

thanks!!
0
Comment
Question by:__Holly__
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 3
  • 2
  • 2
  • +1
8 Comments
 
LVL 34

Expert Comment

by:arbert
ID: 9801939
I assume that you would have 4integer fields to store each part of the IP?  If not, you would have to use a bigint field to be able to store the maximum possible IP address (that, and I'm not sure how you would know whether it was 192.168.5.0 or 192.16.85.0 when you go to convert it back)

As far as DTS, you can also use the Bulk Insert task within DTS :)

Have you thought about simply keeping the IPs stored as an integer and putting a view on top of the data for the users that add the "IP formatting" on the fly?
0
 
LVL 75

Expert Comment

by:Anthony Perkins
ID: 9802634
One approach you can take is to store the IP as a varchar column and use the T-SQL PARSENAME function to pull out individual parts of the IP address.  Yes, I realize this use of the PARSENAME function is not very orthodox as it is not intended for that purpose, but does do the job very well.

Anthony
0
 
LVL 9

Accepted Solution

by:
PaulBarbin earned 400 total points
ID: 9804682
I like the parsename approach.  I have used it and it works well.  

My question would be, do you need the individual octets or do you just need the complete IP address?  You have to be concerned about processing time as well as storage space.  For example, you could use four tinyint fields and store this in 4 bytes (I think that's the most storage efficient), but its definitely more processor intensive and costs time to separate it to insert and combine it to select.  

Another question to consider is: Are you having to compare previous data to see if the row already exists?  Or are you just recording new data every day?  Are you storing each day in one table and then doing a nightly merge?

Also, don't see the point of storing the parent address.  Is it always something you can calcuate?  That is, is the subnet the same?  Or could you just store the subnet value (another tinyint) and calcuate the parent when/if you need it.  If you often need this value, then I would suggest storing it.  I don't know whether it makes sense with this much volume to try and go the strict relational route which would indicate a foreign key to the parent IP address table.  Just thinking out loud...

Final note:
300 million records is a LOT of IP addresses!  Seems like you'd run out of unique IP addresses after a few weeks at that pace!  Can I ask what this is for?

Paul
0
Online Training Solution

Drastically shorten your training time with WalkMe's advanced online training solution that Guides your trainees to action. Forget about retraining and skyrocket knowledge retention rates.

 
LVL 9

Expert Comment

by:PaulBarbin
ID: 9804695
Sorry, just read your post again and I see that this is for OLAP (should have figured).

Btw, if you go the varchar route, then no UDF is necessary.  Otherwise, you could do something like this:

Insert INTO IPTable   (
     firstOctet,
     secondOctet,
     thirdOctet,
     fourthOctet)

Select
     ParseName(fullIPAddress, 1),
     ParseName(fullIPAddress, 2),
     ParseName(fullIPAddress, 3),
     ParseName(fullIPAddress, 4)
From MyTable

Paul
0
 
LVL 75

Expert Comment

by:Anthony Perkins
ID: 10232037
Do you still need help with this old open question?
0
 
LVL 34

Expert Comment

by:arbert
ID: 10232074
She's "deadly with SQL Server & Access"  but not very good at maintaining her questions....
0
 
LVL 75

Expert Comment

by:Anthony Perkins
ID: 10232120
LOL, but we can only hope.
0
 
LVL 1

Author Comment

by:__Holly__
ID: 10239505
sorry guys.

THANKS A TON that is exactly what i needed; and i am sure it is faster than all that other BS i was writing by hand.

i work for the big M and i report on internal traffic... the destination IPs have a high cardinality, but the source IPs are a lil less dynamic.

i think that i have something like 800k client ip addresses; but im still not too familiar with the real shape of it.
0

Featured Post

Learn how to optimize MySQL for your business need

With the increasing importance of apps & networks in both business & personal interconnections, perfor. has become one of the key metrics of successful communication. This ebook is a hands-on business-case-driven guide to understanding MySQL query parameter tuning & database perf

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

For both online and offline retail, the cross-channel business is the most recent pattern in the B2C trade space.
In part one, we reviewed the prerequisites required for installing SQL Server vNext. In this part we will explore how to install Microsoft's SQL Server on Ubuntu 16.04.
This video shows, step by step, how to configure Oracle Heterogeneous Services via the Generic Gateway Agent in order to make a connection from an Oracle session and access a remote SQL Server database table.
Via a live example combined with referencing Books Online, show some of the information that can be extracted from the Catalog Views in SQL Server.

615 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question