Efficient Access Query Design?

So I am working on a database project for my office, and for now, I am using MS Access 2003 for things.  The idea down the road is to upscale it into SQL Server, or redesign it for some other database platform, but at least Access will let me layout a schema of sorts to make sure I'm doing things somewhat sanely.

So the problem I am running into, some of my queries are not executing as fast as I'd like.  Right now, one of my queries, which is one of the core ones, executes one time in about ~0.05 seconds.  I built a profiling function using the QueryPerformanceCounter API to count the ticks in the system while a function executes, hence the granularity of my execution time.

But that's vague, so some background.

The database will store a lot of things, each in their own tables.  Many of the tables are small and simple, maybe 2-3 fields, tops, but they will all contain data that will be combined into a larger dataset.  I chose this design path to avoid repeating data too many places.

The most complex data stored in tables is for networking information, namely, IP addresses and ports.  So I am using these as my example in this write up.  If I can make those run efficiently, the rest should be a piece of cake.  A visual representation of this is below in the code section for reference.

For addresses, I have two tables, one for IPv4 and another for IPv6.  For IPv4, I convert the IP address to a long integer using a custom LongToIP (more on this later), and store that in the database, plus its CIDR block and a autonumber for the primary key.

IPv6 is tougher.  Access, let alone VBA, can't handle an IPv6 address converted to its numeric format (which is massive).  So I opted instead to first expand the IPv6 address out to its full form, eight hextet groups, then split that address into for components.  Each component can be converted to a long integer and stored in an Access table.  I lose the ability to properly sort, but I think that's a fair tradeoff.

Of these two tables, I have standard SELECT queries that format them to return four columns of data: KeyId, Address, CIDR, Type.  Illustrated in code below, these four columns are as follows:

KeyId = Id + three-letter prefix, IP4 for IPv4, IP6 for IPv6, and NUL for my one NULL value.  This is because of a Union query (described below), and because Access cannot guarantee uniqueness of Autonumbers (not even RepIDs will suffice) across different tables.

Address - The address column from IPv4, and the four Address columns from IPv6, concatenated together.

CIDR - Self-explanatory.

Type - The Data type for that record.  IPv4, IPv6, or "NULL".

As noted, I have a hidden table that has one row in it and one column.  The table is called __tblNULL, the one column is called "NULL", and it stores a single Byte, 0.  This is to allow me to create a single "NULL" row in my union query for cases where a given address is not selected, and not have to worry about adding in code to handle real database nulls.  It also allows me to create multi-field primary keys, since the Address columns will always have a value in some of the other tables (which are not described here in detail).  This mimes a capability I read about in Oracle that lets you declare constants using SQL syntax, and a "FROM dual" clause, where 'dual' a hidden Oracle table with exactly one row/one column.

As mentioned, these three queries (IPv4, IPv6, and NULL), are combined using a Union Query.  This Union query allows me to then make available this IP address data to other tables in my database, and other queries to create relationships and such.  The Port tables and data are constructed in a very similar fashion, but my Ports list tab;e has about 600 of the most commonly-used ports, gleaned from IANA's data and my own knowledge of network services seen when I was once a sysadmin.


Now, if anyone is still reading, here's where the fun begins.  I have a query which opens up one of these data tables that links into the Union query.  I use a relationship between the stored value and the Union query's KeyId field to lookup the proper address value.  Using the Type column, I determine which code module to initialize.  I have two class modules, both of which employ IP calculator functionality (one is an IPv4 calculator, and the other an IPv6 one).  The core of both modules have very similar functions (IPv6's is more complex, obviously, though), and one of them is FromNumeric().  It's a Let Property that accepts either a Long integer for IPv4 or a string for IPv6, and converts that to the IP address, stored internally.

From that, we can then do all sorts of things with the address, but I generally just output the address back for display, sometimes with the CIDR appended.

This is where some of the slowdown kicks in, however.  That Union query can, depending on the number of rows it is fetching, take anywhere from 0.002 to 0.05s to execute.  The version for Ports is even longer, usually 0.02-0.05, mostly because the master Ports table is ~600+ records (And that will grow with time).

Now 0.05s seems lighting fast, but that's just for one execution of the query.  If I execute this data query 100 times (assuming 100 rows in my data table), that's 5 seconds.  As the number of IPv4 or IPv6 addresses (or Ports) go up, this will become a larger and larger value, pretty quickly by my estimates.

So I have to wonder, IS a Union query one of the best ways?

I've also tried optimizing things other ways, including forgoing the use of JOINs in the data tables to the Union query, and using small VBA functions to do SQL lookups (using stored queries w/ Parameters) to fetch the values, etc.  But so far, JOINs with the Union queries have been some of the best performers, as reported by my timing function.


Some of the other things I've looked at are Make Table queries to store the result of the Union query, and some kind of mechanism to force-update it when the data in the Union query changes, arcane uses of ADO (and even DAO), optimzing my conditionals and loop structures in my VBA code (I use niether Variants or Objects at all), etc..

And searching Google is frustrating without the right sequence of keywords.  WAY too much advertising and incorrect assumptions returned!  So I'm looking for expert advice here.  Based on the lengthy explanation I've given, can anyone suggest some really off-the-wall ideas for speeding Access up *without* using SQL Server?



tblIPv4List:
(127.0.0.1/32, 203.0.113.0/24, & 172.16.0.0/12)

  Id   |   Address   | CIDR
----------------------------
000001 |  2130706433 |  32
000002 |  -889163520 |  24
000003 | -1408237568 |  12



tblIPv6List:
(3ffe:1900:4545:3:200:f8ff:fe21:67cf/128, 2001:0db8:85a3:0000::/64, ::ffff:0:0/96)

  Id   |   Address1  |   Address2  |   Address3  |   Address4  | CIDR
----------------------------------------------------------------------
000001 |  1073617152 |  1162149891 |    33618175 |   -31365169 | 128
000002 |   536939960 | -2052915200 |           0 |           0 |  64
000003 |           0 |           0 |          -1 |           0 |  96



__tblNULL:
 NULL
------
    0



qryS_Values_IPv4List:
SELECT "IP4" & [Id] AS KeyId, Address, CIDR, "IPv4" AS Type
FROM tblIPv4List

  KeyId   |    Address  | CIDR | Type
--------------------------------------
IP4000001 |  2130706433 |  32  | IPv4
IP4000002 |  -889163520 |  24  | IPv4
IP4000003 | -1408237568 |  12  | IPv4



qryS_Values_IPv6List:
SELECT "IP6" & [Id] AS KeyId, Address1 & ":" & Address2 & ":" Address3 & ":" & Address4 AS Address, 
       CIDR, "IPv6" AS Type
FROM tblIPv6List

  KeyId   |                 Address                  | CIDR  | Type
--------------------------------------------------------------------
IP6000001 | 1073617152:1162149891:33618175:-31365169 |  128  | IPv6
IP6000002 |                536939960:-2052915200:0:0 |   64  | IPv6
IP6000003 |                                 0:0:-1:0 |   96  | IPv6



Public Function Dummy() As Integer
	Const DUMMY_VALUE As Integer = -2
	Dummy = DUMMY_VALUE
End Function

qryS_Values_NULL:
SELECT "NUL" & "000000" AS KeyId, "NULL" AS Address, Dummy() AS CIDR, "NULL" AS Type
FROM __tblNULL

  KeyId   |   Address   | CIDR | Type
--------------------------------------
NUL000000 |     NULL    |  -2  | NULL



qryN_Data_AllIPValues (Union Query):
SELECT * FROM qryS_Values_NULL
UNION ALL
SELECT * FROM qryS_IPv4List
UNION ALL
SELECT * FROM qryS_IPv6List
ORDER BY Type, KeyId;

  KeyId   |                 Address                  | CIDR  | Type
--------------------------------------------------------------------
IP4000001 |                               2130706433 |   32  | IPv4
IP4000002 |                               -889163520 |   24  | IPv4
IP4000003 |                              -1408237568 |   12  | IPv4
IP6000001 | 1073617152:1162149891:33618175:-31365169 |  128  | IPv6
IP6000002 |                536939960:-2052915200:0:0 |   64  | IPv6
IP6000003 |                                 0:0:-1:0 |   96  | IPv6
NUL000000 |                                     NULL |   -2  | NULL


Public Function _
LongToIP4(ByRef ip4_num As Long, ByRef ip4_cidr As Byte) As String
	Dim IP As New clsIPv4

	IP.FromNumeric = ip4_num
	IP.CIDR = ip4_cidr

	If ip4_cidr = 32 Then
		LongToIP4 = IP.Address
	Else
		LongToIP4 = IP.Network
	End If
End Function

Public Function _
LongToIP6(ByRef ip6_num As String, ByRef ip6_cidr As Byte) As String
	Dim IP As New clsIPv6

	IP.FromNumeric = ip6_num
	IP.CIDR = ip6_cidr

	If ip6_cidr = 128 Then
		LongToIP6 = IP.Address
	Else
		LongToIP6 = IP.Network
	End If
End Function

Open in new window

LVL 2
Kumba42Asked:
Who is Participating?
 
aikimarkConnect With a Mentor Commented:
you could break the IPV6 pieces in to four Long Integer variables and store/retrieve these in a database

Replace the BigInt with a VeryBigInt (twice the size of a BigInt).

Depending on how it comes in, you might consider using the CopyMemory API to transfer the bytes into the VeryBigInt data structure.

============
My appologies on suggesting currency.  I misunderstood the protocol and thought it was only six bytes big.
0
 
BitsqueezerConnect With a Mentor Commented:
Hi,

maybe the attached file can help you with this.
First thing is: If you want to stay fast with SELECTs then never use any selfmade VBA functions inside of a query because this costs a lot of time in any selected row.
Next is, you don't need to create a dummy table, you can use any table to create constant values because Access only wants to have a table - but if all columns are constant then it doesn't matter which table you use. So the NULL table can be exchanged on this way:

SELECT TOP 1 "NUL000000" AS KeyId,"NULL" AS Address,-2 AS CIDR, "NULL" AS Type FROM tblIPv4List

where "tblIPv4List" is one of the tables in the demo database, you can use any table you want - nothing is really selected from the table.

Moreover: Why do you calculate so complicate ways just to save some bytes per rows? In the demo below I simply used one byte/one column for any part of the IPv4 table (4 columns) and one long integer/one column for any of the IPv6 pieces (8 columns). Long integer is needed as Integer only ranges to 32767.

With this method your IP is fully sortable and you can simply convert it by using the VB "Hex" function. If you save a record into the IP table you only need to convert any 16 bit part of the IPv6 address into a decimal value.

The queries to get the addresses back are:

SELECT "IP4" & right(string(6,"0") & ID,6) AS KeyId,
Adr1 & "." & Adr2 & "." & Adr3 & "." & Adr4 AS Address,
CIDR, "IPv4" AS Type
FROM tblIPv4List;

and

SELECT "IP6" & right(string(6,"0") & ID,6) AS KeyId,
Hex(Adr1) & ":" & Hex(Adr2) & ":" & Hex(Adr3) & ":" & Hex(Adr4) & ":" & 
Hex(Adr5) & ":" & Hex(Adr6) & ":" & Hex(Adr7) & ":" & Hex(Adr8) AS Address,
CIDR, "IPv6" AS Type
FROM tblIPv6List;

Then you can use the UNION SELECT to assemble the table:

SELECT KeyId,Address,CIDR,Type FROM qryIPv4
UNION ALL
SELECT KeyId,Address,CIDR,Type FROM qryIPv6
UNION ALL
SELECT TOP 1 "NUL000000" AS KeyId,"NULL" AS Address,-2 AS CIDR, "NULL" AS Type FROM tblIPv4List
ORDER BY Type, KeyId;

and in the last step copy the result into a temporary table if the result table is very large and you often need to do any kind of JOIN between the result table and other tables - this is faster than using the UNION SELECT any time. This depends how often your table changes and how up-to-date your data should be.

Cheers,

Christian

IPv6.zip
0
 
Kumba42Author Commented:
Bitsqueezer: Interesting ideas.  I figured storing an IPv4 as a single long integer worked, because should I upscale to any other database platform, such as MySQL, PostgresSQL, or even SQL Server down the road, there's a plethora of Ip2Long/Long2Ip implementations available for IPv4.

I also went with using one long for IPv4 over four individual bytes as a long is x86's native mode, and from some resources I read at this site: http://www.aivosto.com/vbtips/loopopt.html#data , I figured that loading a single long integer into one long variable would be more efficient to VBA than declaring and using four bytes (or an array of four bytes).

For IPv6, a lot of that was custom-designed, using ideas I'd dug up on IPv4.  For IPv6, I do use a bit of logic to combine two hextets into a single long integer and encode/decode them, but I figured that was more efficient than storing all eight hextets as eight long integers.  Also, since technically each hextet is fully representable as a normal unsigned integer, there are tricks you have to do in VBA's Val() function to force it to decode properly as a hex integer (otherwise, you get hex longs and such).  VBA and Access' use of signed integers really makes things frustrating.  I'm really looking forward to when they move VBA into the .NET world so I can have access to unsigned data types and short-circuited logical operators.

My Class modules also mimic my database design somewhat, meaning the code to create an object and load it will be easy to write and maintain, since the table fields for each object will mimic that object's properties.  Admittedly, though, my IPv4 class module DOES store an IPv4 address as an array of four bytes.  But I made that decision long before I designed the table.  It's on my list to switch it to a single long.

As for the database/SQL side, I'm well aware of the performance hit of calling custom VBA functions.  Generally, Office 2003 seems pretty optimized.  I benchmarked, using my profiler code, both a query generating its own constant, and a query calling my Dummy() function, and their execution times are both within a normal margin of error.  The Dummy() function just lets me centralize that -2 hardcode in one place, instead of having dozens of queries scattered around that call it, on the offchance I ever have to change it for some reason.  Maybe Access detects that and optimizes it somehow.  I am going to play with Jet's SHOWPLAN debug mode to study my tables later on.

That said, I've gone to some lengths to optimize my code as much as I can, which is no easy task in VBA, because by its nature, it's a difficult language to optimize.  Pity MS doesn't provide any sane way to to pointer dereferencing...

Regarding SELECT query optimization, does Access' Rushmore stuff know how to optimize queries doing string concatenation and calling internal VBA functions?  String concatenating is pretty expensive, and my class modules avoid it as much as possible, generally preferring to Join() am array together instead of concatenating (though, there is a very fine line between small concatenations and Join()).

Also, Hex() returns a variant.  Have to use Hex$() to get an actual BSTR back.  Ug, variants...

Last inquiry, regarding make table and DDL queries, is there any good info on properly creating those?  I was experimenting with DDL queries, but it looks like CREATE TABLE syntax is only supported half-way.  You can't do CREATE TEMPORARY TABLE or even use WITH COMP via the Access query designer, else you get an error.  Apparently that level of functionality is only available via other Jet interfaces.  So It looks like to properly build a temporary table, I'll either have to do a SELECT INTO or CREATE TABLE, plus several CREATE INDEX DDL queries, and I wasn't sure if spending the time to invoke the various stored queries for these functions was more costly than just pulling from a UNION query all the time.

I anticipate any temp table generated from a UNION query to get updated frequently, and so the decision is, handling the logic of making sure said table gets updated whenever its constituent tables change, or just letting the UNION do all the heavy lifting.

Granted, at this point, it's looking more and more like I might need to move, at least temporarily, to SQL Server.  But I've never worked with that before, and getting the management approval to get such a server stood up will take some time.
0
Cloud Class® Course: MCSA MCSE Windows Server 2012

This course teaches how to install and configure Windows Server 2012 R2.  It is the first step on your path to becoming a Microsoft Certified Solutions Expert (MCSE).

 
Kumba42Author Commented:
I also should add, that while I didn't go into a lot of detail on it, the Address handling code is really pretty quick.  I can definitely optimize it more.

It's my tables and code for ports that seems to be slow.  I was using a table holding ~5800 rows of ports, culled from IANA's master list, but that query was taking almost 2 full seconds for 2 executions (or ~0.2s per execution).  I trimmed that table down to ~680 of the most commonly used ports, and it sped things up by about 400%, but it's still way, way slower than the address stuff by some pretty significant margins.

What it does highlight, though, is that as my tables grow in size, these slowdowns will become more and more apparent.  So optimizing now will keep things under control until I can design something better in a more suitable environment.

And for reference, the ports table is just two columns: Id (also the port #), and the port name.  I do Format$() calls on the Port Id to get it into the PRT000XXX mode similar to IP addresses for their usage, since they are combined with a NULL value via a Union query.  It's THAT Union query that's really sluggish.
0
 
Kumba42Author Commented:
Er, last comment, 2nd paragraph, that's 10 executions, not 2.
0
 
Vadim RappConnect With a Mentor Commented:
I think, if you are trying to squeeze milliseconds, you'll need to implement memory lookups - create arrays and look them up, and with multitasking. Usually optimization of sql queries is about much larger durations, so I think, with these milliseconds there are too many other factors in the game, like computer hardware, to make any reliable conclusions; so I doubt you'll gain much on this path of query optimization, even if you move to sql server. When you want milliseconds, the data must be really close to the application.
0
 
Jim Dettman (Microsoft MVP/ EE MVE)Connect With a Mentor President / OwnerCommented:
Although not a comment directly on your question, I think this will help.  First, keep in mind that with JET, all processing occurs client side, so proper indexing is critical.
Second is that you want to have a look at this:
Access 2002 Desktop Developer's Handbook, Chapter 15: Application Optimization
http://msdn.microsoft.com/en-us/library/aa188211(office.10).aspx
And more specifically, check out:
1. The JET SHOWPLAN tool - this shows you how your queries are being costed and if indexes are used or not for a given query
2. The ISAMStats method - gives you some stats on actual query execution.
3. Read the section "Speeding Up Queries and Recordsets"
4. Look at Configuring the Jet Engine - you may want to try modifying the MaxBuffer setting depending on the size of your tables and available memory.
Last, don't forget to try running the Performance analyzer (Tools/Analyze/Performance) and see what suggestions it makes.  While generally you want to follow them, if you do make changes, understand why it is your making a change.
JimD.
0
 
aikimarkCommented:
You can easily store IPV6 numeric value in a Currency column (and VB data type)
0
 
BitsqueezerCommented:
Hi Kumba42,

sorry, but you write very long texts but indeed you don't say anything about the real problem.
If I understood right what you wrote then you have some IP addresses in two tables and some ports in a further table - with only 600 records. Your demo is of course only three IPs per table but about how many addresses do we really speak here? How many records do you guess you will get in maximum + 20%?

Then you talk about an object structure in VBA representing the data structure in the tables - that's a very good method in .NET where you have LINQ or in other object oriented languages. Here we talk about VBA which is a better macro language with some little object oriented structures, not more. VBA was made as a possibility to automate applications to get a kind of program, but it's far away of being a real program language. It's slow and it's very old, no big development of the language since years. As MS invest the time to create better macros (see Access 2010) than in developing VBA you will wait a very long time to get a .NET similar quality language.
But if you are fit in programming in .NET - why don't you do it? .NET has no problem to work with an Access database.

I for myself also use the method to store some data of some tables in objects so I can access the data with methods and properties in the VBA code. But I use this only in cases of very small amount of data or data which can be hold in memory to avoid extra queries, but the very most things you can better do in SQL. If anything is possible in SQL then always do it there and not in VBA. I often see code of people where they filling a recordset and doing some loops to do anything with the data. And in many cases the same can be done with some SQL commands.

So the question here is: Why do you lay so much work into creating an object layer, creating algorithms to do complicate conversions - when you simply can store a value, get it back and the result can be loaded with simple SQL? No line of code were needed to create the demo database I attached above.
I have no idea why for example you create a alphanumeric key column which costs a lot time to format it that way - a simple numeric column, an autoincrement ID would give you a very much faster ID. Next: Why do you create two tables, one for IPv4 and one for IPv6? You can use one and the same table to store both if you use the table structure of tblIPv6List of the demo database above: Simply only use the columns Adr1-Adr4 to store the IPv4 addresses and leave the rest empty (or set it to 0, whatever) and then store the type of address. Then you have only one table, one numeric autoincrement key which can be simply used to do any kind of JOIN to other tables.
Yes, I know that means that you store a record with 4 unused columns and a long integer type which is used for byte values in the first four columns for any IPv4 addresses which are probably the most you need at the moment - but hey, do you think you will explode a 2GB database limit of Access with this? Not really. You must at least write a database for maintaining all addresses of a big Internet provider to do this...:-)
And if you don't have enough space in 2GB then you REALLY should think about using a SQL Server database which I would recommend - even in SQL Server Express which is for free you can have 4GB. Or a free database like MySQL.
The biggest performance you can get at all is not, how can you store such simple data as an IP address in less bytes by coding it together, the biggest performance is done by storing it in a simple datatype without any conversion. The link you posted about VBA optimization is correct: Using variables which are not necessary costs performance - but that's the same (and a lot more) with using objects in between the data and the frontend. This is not necessary in Access because any form and report are already class objects able to work with the data directly and performant. Any try to create a layer in between slows down performance. VBA is (my opinion) only the way to create a working frontend, to say what should happen if a user clicks a button - but not for heavy data processing.

So back to the question of performance: If you run a query 100 times then maybe here is the first bottleneck. Running one query 100 times means you make this in VBA - so maybe there is a way to create the loop in SQL instead with a subquery and you'll see that there is no performance problem anymore.
I work daily with a database (having a SQL Server 2005 as backend and Access 2007 as frontend) which works with at least one million records and Access is able to handle it very fast, after a little bit performance tuning. And it is not (only) SQL Server which makes this fast, Access itself with it's Rushmore technology which comes from the formerly fastes desktop database FoxPro is very fast with Jet data as long as it is stored in the same file. So performance tuning is a good idea in Access, too. But if you talk about 600 records in one table and let it be 10,000 records in an IP address table - hey, that's NOTHING for Access. But if you use those two tables and process them through VBA or creating a lot complicate and unnecessary data conversions and encoding/decoding you can slow down any table you want.

Keep it as simple as possible and avoid any additional VBA if it can be also done with SQL - that's the story of success with Access. Follow the general well-known performance tips and tweak queries and you get what you want. For anything else it would be helpful to get an explanation why for example you need a key like "IPv4000001" or a record to create NULL values which any database can handle fast and easy on it's own. These things are the real performance killers.

Cheers,

Christian
0
 
Kumba42Author Commented:
vadimrapp1: Memory lookups?, pity there wasn't a way to easily wire memcache into the project.  That'd help a lot.  The whole reason to squeeze as much as I can per-query is because I anticipate that as the database actually starts to hold information, these queries will be called upon in report generation, so they'll be executing in a loop, once per record, to fill out the report(s).  right now, with just the basics of the info, including the address and port data (not including all the info I have yet to design into tables), a single row takes ~0.5-1s to output.  Multiply times say, 100+ rows, and it'll be a slow report.

JDettman: SHOWPLAN is definitely something I need to checkout.  The problem is I can't make registry tweaks to the machines at my office (IT policies).  So that's something I'll have to experiment at home with.  ISAMStats Method...which object is that a part of?  I'll read that developer guide some, too.  Thanks for the link!

aikimark: Not completely...you'll lose precision and thus run into the possibility of an IPv6 address getting truncated or calculated improperly.  A full-on IPv6 address in BigInt form is 39 digits long, to the left of the decimal.  A Currency data type only holds about 14 digits to the left, and even a Decimal maxes out at 28 digits to the left.  I technically need a 64bit app that could properly handle those, both for VBA and Access.  While future versions of Access will probably be 64bit, I imagine MS will update the VBA stuff to match .NET standards at some point.  Which will mean a complete redesign of the code anyways.

Bitsqueezer: Yeah, I tend to be really verbose...It's been a bad habit long since I was a kid.  Old habits die hard, ya know? :)  I'll need to read your post in greater detail tonight/tomorrow, but the reason for the split tables and the text-based key (IP4XXXXXX, IP6XXXXXX, etc..), is because there will be a third table that will be appended eventually that cannot be stored in a numeric field at all.  Call it a variables table.  It'll basically have fields corresponding to ID (VARXXXXXX), a Name column for the variable (i.e., $LOOPBACK_ADDR, minus the '$'), and the type of variable (Address variable, or port variable).  This is where the union query comes into play to link these together.  So yeah, that variables table is basically the red-headed stepchild of the group.  It's pretty simple in its design, hence why I didn't bother to mention it.

About the only way I can think of to fit that into your proposed design is if there is a way to convert a string into a pure numeric format that can be read back as a string.  SoundEx is one I looked at, but even that still incorporates letters.  I'm unsure of any other known algorithms out there that could possibly convert text of arbitrary length into a digit that could fit into a single long integer data type.

And because of that, the ubiquitous NULL table I devised.  One of my data tables will store many combinations of the values from the union query of those tables, but it will NOT be unique per row based only on one column.  It'll be tied into a one-to-many relationship, and so to make that efficient, I created a composite primary key on it.  But that means I cannot have real NULLs in the table, because Access forbids those in a primary key.

And my apologies about leaving out bits of info like that.  It's the drawing of the line about how much I can give away about my design that I'm being careful on.
0
 
Vadim RappCommented:
> pity there wasn't a way to easily wire memcache into the project

I'd think, for the project with these serious requirements to performance, maybe another tool would work better. Access after all is still is looked at as productivity application. How about visual studio.

> so they'll be executing in a loop, once per record

I admit that I only looked at the long description of the thing, but maybe it would be possible to execute one big query that would process all the rows, rather than call the query for each one? this would be quite real in sql server database, where you could write a stored procedure, so even if you can't put everything into one SQL, still per-row execution would be going on in internal code on server side.

0
 
Kumba42Author Commented:
vadimrapp1: I'll have to look into this and see if my logic can maybe grab the entire recordset and parse it into the format I'm going for rather than per-record.  Unsure, though.

aikimark: I actually do this already.  Given the address:
2001:0db8:85a3:0000:0000:8a2e:0370:7334

I split it as such:
20010db8 85a30000 00008a2e 03707334

Convert each to a long, and store it.  It's getting it back out that proves to be fun.  When converting from Hex to aeither a long or and int, you have to pass the symbol for a long datatype to Val on the end of the Hex, or it can sometimes mangle things (i.e., Val("&Hdeadbeef&"))

I'm working through my code modules that handle IPv4 and IPv6 addresses and trying out different optimization ideas.  I already solved the problem with the ports table being slow to changing that database layer code to "convert" any KeyId for a Port number straight into the number, and only invoke a query lookup when pulling in a variable name.

Addresses will be a bit different, though.  Have to see how well I can optimize that code.
0
 
Jim Dettman (Microsoft MVP/ EE MVE)President / OwnerCommented:
Couple of comments:
<<IPv6 is tougher.  Access, let alone VBA, can't handle an IPv6 address converted to its numeric format (which is massive).  So I opted instead to first expand the IPv6 address out to its full form, eight hextet groups, then split that address into for components.  Each component can be converted to a long integer and stored in an Access table.  I lose the ability to properly sort, but I think that's a fair tradeoff.>>
I'm left wondering why you simply didn't store the IPv6 address as a string.  ie.:
3ffe:1900:4545:3:200:f8ff:fe21:67cf
 In the table?
<<The database will store a lot of things, each in their own tables.  Many of the tables are small and simple, maybe 2-3 fields, tops, but they will all contain data that will be combined into a larger dataset.  I chose this design path to avoid repeating data too many places.>>
  I'm also left wondering if you have a proper relational design at this point as it sounds like you've decomposed the tables.  Your going to find out quickly that JET (or any RDBMS for that matter) really needs a proper design in order to function correctly.  Often  poor SQL query performance is a general indicator of an incorrect DB design.
JimD.
 
0
 
aikimarkCommented:
I recommend using the CLNG() function rather than the & character to cast the value as a long integer.  But I'm not sure that this will affect the performance you are experiencing.

I would like some context on this problem.  How often to you need to run this union query and how are you using the query results?

How dynamic is this data?
0
 
Vadim RappCommented:
Guys, you are really trying to make Corolla to adequately compete in Formula 1.
0
 
aikimarkCommented:
@jdettman

Depending on the use of the data, I was also thinking about 8 integer variables, equal to the IPV6 parts.

One might also prepend "0000:0000:0000:0000:" to the IPV4 data to make it the same length.

================
Depending on the responses, I think we might be able to make this perform quite well, even under the heaviest loads.
0
 
Kumba42Author Commented:
Minor update: I've "modified" my two code modules that handle IPv4 and IPv6 and gotten some marginal speed improvements out of them (had a loop that was really necessary that I eliminated in both that really helped).  Did some other tweaks, and can get the queries to run about...~0.4s faster when tested 10 times in a loop.  Not much, but hey, better than nothing.

When I left off last week, I was working on merging the IPv4 and IPv6 tables together to see what that speed improvement was like, but that'll take an additional day or two next week to finish up.
0
 
Kumba42Author Commented:
Err, last comment, necessary == unnecessary.
0
 
aikimarkCommented:
Please answer my questions in this comment: http:#31740881
0
 
Kumba42Author Commented:
Haven't forgotten about this, just been sidelined by other activities at work.  I'll answer the above comments when I get back into the swing of things with this project.  Thanks for waiting!
0
 
Kumba42Author Commented:
This question isn't abandoned.  I just have been sidetracked at work and haven't really gotten back into the project that this question addresses yet.  If needed, I can close & split the point total four ways to each of the commenters (who all have some useful suggestions).
0
 
aikimarkCommented:
@Kumba42

Thank you for responding.  There is no need to close this prematurely.  Do have an estimate of when you will be able to return your attention to this question?
0
 
Kumba42Author Commented:
I'm going to close and split this into multiple point awards.  I picked up a few good ideas from the various comments offered.

Ultimately, I've done the following:
- Merged my IPv4 and IPv6 tables into one table.

- Kept the Variables table separate.

- Split away the use of Union Queries for per-row lookups.  Instead, I made my custom VBA function run a very small, parametrized SQL lookup to the IPAddr or IPVar tables directly when it needs a value.  Doing this (multiple small SQL calls) over looking stuff up once in a large Union is somewhat equal, but it looks like over multiple calls, the small SQL invokes beat out the Union.

- Basically given up allowing the Query to format the data.  Right now, that, like the tables, will present RAW data to a report.  The report will call various VBA functions to format the data for human reading.  It turns out this not only simplifies a few queries (thus making them faster), but it also works better for some of my VBA class modules that (for now) use private arrays top store some pre-defined values.  My next task is to eliminate these and just pull the values from the database at load up and keep them in global variables for later use (the globals might be in the Class level, though.  Haven't decided yet).


All-in-all, This Access design isn't a permanent solution for my needs anyways.  It will, however, let me lay out a foundation that I can migrate to something more robust, such as SQL Server or maybe a Linux/Solaris database format (like MySQL/PostGres), or maybe something entirely new -- Python + Django framework, for example.  Only time will tell in that regard.

Thanks for everyone's comments!
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.