easy question - databases

choks99
choks99 used Ask the Experts™
on
I have a database that is setup as a simple txt file. Each column for the database is separated by a tab.

For example
1 five yellow
2 six  red
3 eight blue

etc...

Im using a windows platform and need to do statistics on the data. (Like whats the diff colors used by users 1-3, avg age , user id's etc) The columns in the txt file had no header names. How do I "load the txt file" and how would I do simple operations on the columns (which have no headers)so that I can compute averages and such......
Comment
Watch Question

Do more with

Expert Office
EXPERT OFFICE® is a registered trademark of EXPERTS EXCHANGE®

Author

Commented:
I also want to be able to check to see if any of the ids (first column) are not unique etc....
open DATA, "data.txt";
$NumberCol = 0;
$ColorCol = 1;

# Read the database file into a hash
while (<DATA>) {
   chomp;
   ($first, @rest) = split /\t/;
   $Data{$first} = [@rest];   # Values of the hash are references to anonymous arrays
   $CheckForDuplicates{$first}++;
   die "Error: Key $first is duplicated\n" if $CheckForDuplicates{$first} > 1;
}

# To use the values of the hash, dereference the array (${.....}) and select the required element ([...])
printf "Row 3 color is %s\n", ${$Data{3}}[$ColorCol];
printf "Row 2 number is %s\n", ${$Data{2}}[$NumberCol];

Commented:
choks99,

Hope PC_User321 provide a good soln.

sureshp
Fundamentals of JavaScript

Learn the fundamentals of the popular programming language JavaScript so that you can explore the realm of web development.

Author

Commented:
alright, I understand how to read the data in. Im a little confused on the stats so let me tell you exactly what I need to do. I'll give you more points if you can help me do these functions...

The database is like this (w/o headers)
 
ID  Education  Age Country
1    Bachelor's Degree      
3
56

Author

Commented:
my bad, it messed up....(I will give alot of points for these answers)

The database is like this (w/o headers)

ID  Education  Age Country
1    Bachelor   23   US
3    Some Highschool 34 Canada
56    None 37 US

I need to do the 1.mean, 2.median, 3.range for all the numeric columns. Also, a 4.freq (Bachelor - 255 entries etc) for all the columns. And I have another db and I want to check if any of the 5. ids from the one db are in the other. and print them if they are....
close DATA;

$EducCol = 0;
$AgeCol = 1;
$CountryCol = 2;

# Check for IDs in another file
open DATA, "another.txt";
while (<DATA>) {
   ($first, @rest) = split /\t/;
   print "Warning: ID $first is in both files\n" if exists $CheckForDuplicates{$first};
}

# Get bachelor frequency
foreach $ID (%Data) {
     $bachelorCount++ if (${$Data{$ID}}[$EducCol] eq Bachelor);
}

Do more with

Expert Office
Submit tech questions to Ask the Experts™ at any time to receive solutions, advice, and new ideas from leading industry professionals.

Start 7-Day Free Trial