Database Normalization - File Inventory - Change tracking

Rich Rumble
Rich Rumble used Ask the Experts™
on
Let's say I have 1000 computers, all nearly identical builds, users add and change things here an there, but overall very identical boxes. Let's assume I'm collecting the path, name and file version for each file on a users PC. How do I create an efficient index, I don't want to store c:\windows\system32\notepad.exe a thousand times in the db, I would think each machine could have a pointer instead to that files entry (granted the path/version/name match, lets assume they do)
The concept is as follows...
machine01 has files: 1-200, 344, 421-500, 512-566, 578, 590, 600-700 etc...
Or should it go the other way
notepad.exe is on machines: 1-88, 90-95, 100
calc.exe is on machines: 2-41, 42, 44, 48-90, 92-100

Each machine writes a local inventory, ftp's the file to a server that add's the files/path/version to the db, and would then be able to through and update pointers after all the file data is entered.
Well I guess I'm kind of answering my own question, if I want to search for machines with calc.exe on them, the second example is clearly better... I'm going to ask the question anyway in case I have no idea what I'm talking about.
I'm also wanting to track the changes... machine01 had files 44, 67, 99 changed to 1002, 1003, 1004, and files 2, 19 and 55 deleted. I know I'm all over the place with this question, but it should convey the idea.
-rich


Comment
Watch Question

Do more with

Expert Office
EXPERT OFFICE® is a registered trademark of EXPERTS EXCHANGE®
Hi Rich,

Store the path names in one table and the file names in another. When you store a file name, store the primary key of the path name in the filename record.

CREATE TABLE pathnames
(
id integer IDENTITY, -- some syntax omitted
name varchar (3000)
);

CREATE TABLE filenames
(
id integer IDENTITY, -- omitted here, too.
pathid integer,
name varchar (150)
);

Assume two paths are in the database

id path
1 c:\
2 c:\temp

Then store the files in the root directory into the filename table

id pathid name
1 1 config.sys
2 1 autoexec.bat
3 1 command.com

And add the files in c:\temp

id pathid name
4 2 readme.txt
5 3 someotherfile


etc....


Kent
Rich RumbleSecurity Samurai
Top Expert 2006

Author

Commented:
OK, I think I'm getting there... below is what I'm looking to match up.
Filename, path, hash's and associate then with a machine_ID
From a search prospective, what is the best way to store and retrieve...
these file names, with these file hash's are on machine_x in these path locations...

Machine 01 c:\windows\explorer.exe ae7a08c05f72a9242734c03230a5cd7f
Machine 01 c:\windows\system32\notepad.exe d7bfdb6d6acf2d7e0ecb25d2756d8fd6
etc...
-rich
file_name_ID		file_name
0		notepad.exe
1		copy of notepad.exe
2		regedit.exe
3		explorer.exe
 
hash_ID		md5	Sha-1		Sha-256
0		d7bfdb6d6acf2d7e0ecb25d2756d8fd6		36ef330f6fb62034096f32e70fda7748adc8c2cd		65d3e1b526ff60415f716fb419f87acea67d8192234c247569d619b1294ba708
1		872a60b75ce6a09033fbe2461d44e696		86425f0e149458903dca936d8ed65d7ec885fa8b		559b33d2bf40231630c96083833d2ceba3a52a28321078300d42cf936cf86e40
2		ae7a08c05f72a9242734c03230a5cd7f		529439656b329a08a3570703e97d37fc114c4b35		c960594228cd932c7769bcc04b9f74858368081b5941b39f434e1100568204f3
 
file_to_hash_ID		file_name_ID		hash_ID
0		0		0
1		1		0
2		2		1
3		3		2
 
path_ID		path
0		c:\windows\system32\
1		d:\windows\system32\
2		c:\windows\
 
folder_ID		path_ID		file_to_hash_ID
0		0		0
1		1		0
2		2		1
3		2		2

Open in new window

Rich RumbleSecurity Samurai
Top Expert 2006

Author

Commented:
I think I worked it out, I awarded points because I was forgetting to separate the path into it's own table
-rich
Hi Rich,

I apologize for not getting back to this thread.  I don't remember ever seeing a message that the thread had been updated.


Kent
Rich RumbleSecurity Samurai
Top Expert 2006

Author

Commented:
Np, I think I'm on my way, making the keysfor the DB's  as we speak...
-rich

Do more with

Expert Office
Submit tech questions to Ask the Experts™ at any time to receive solutions, advice, and new ideas from leading industry professionals.

Start 7-Day Free Trial