Problem with this code?

I'm trying to parse a file to a database. this is the type of information in the file:
DISCID=a50b390c
DTITLE=Various Artists / Put your Tongue to the Rail (Disc 1)
DYEAR=1999
DGENRE=Data
TTITLE0=Nothing Is True - Mae Pang
TTITLE1=It's too Late - Psyclone Rangers
TTITLE2=Three Sisters - Thorazine
TTITLE3=People Who Died - Bootleg Thorazine Nightmare
TTITLE4=Catholic Boy - Bottom
TTITLE5=City Drops into the Night - Stomaboy w/ Marrow
TTITLE6=I Want the Angel - Jen Hess
TTITLE7=Day and Night - Iota
TTITLE8=Differing Touch - Marah
TTITLE9=Crow - Del Pess
TTITLE10=Plain Division - Mia Johnson
TTITLE11=Wicked Gravity - Sesitive Pricks

I have 4 tables as follow which I think are fully normalised:
CREATE TABLE ARTIST (
id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(90) NOT NULL
);

CREATE TABLE SONG (
id  INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(90) NOT NULL,
tracknumber INT UNSIGNED NOT NULL,
artist_id INT UNSIGNED NOT NULL
);

CREATE TABLE ALBUM  (
id  INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(90) NOT NULL,
year INT UNSIGNED NOT NULL,
genre VARCHAR(90) NOT NULL,
disc_id VARCHAR(90) NOT NULL,
UNIQUE KEY (disc_id)
);

CREATE TABLE ALBUM_SONG (
song_id INT UNSIGNED NOT NULL,
album_id INT UNSIGNED NOT NULL,
UNIQUE KEY (song_id, album_id)
);

My problem is the code below. there is something wrong with the $album_id variable below, can anyone see why? Also, how can I populate the ALBUM_SONG table?
#!usr/bin/perl
use DBI;
use strict;
use warnings;
use diagnostics;
use FileHandle;

my $db_handle = DBI->connect("dbi:mysql:database=kilkennp_db2;host=mysql.cs.tcd.ie;user=kilkennp;password=yee5iePh")
    or die "Couldn't connect to database: $DBI::errstr\n";

my $fh = new FileHandle;
$fh->open("<$ARGV[0]") or die "could not open file\n";

open(OUTTRACKS,">makefree") or die "could not open trackbigrams\n";

my($album_id,$artist_id,$title,$year,$genre,$disc_id);
while (<$fh>)
      {
            if(/DTITLE=(.*)/)
            {
            my $length = length ($_);
            if ($length > 8)
                  {
                  if( /DTITLE=(.*) \/ (.*)/ )
                        {
                        my $artist = $1;
                        $title = $2;
                        $db_handle->do('INSERT INTO ARTIST (name) VALUES (?);',undef,$artist);
                        $artist_id=($db_handle->selectrow_array("SELECT id FROM ARTIST WHERE name=?;",undef,$artist))[0];
                        }
                  else
                        {
                        $title = "$1";
                        my $artist = "N/A";
                        $db_handle->do('INSERT INTO ARTIST (name) VALUES (?);',undef,$artist);
                        $artist_id=($db_handle->selectrow_array("SELECT id FROM ARTIST WHERE name=?;",undef,$artist))[0];
                        }

                  }
            elsif ($length == 8)
                  {
                  $title = "N/A";
                  my $artist = "N/A";
                  $db_handle->do('INSERT INTO ARTIST (name) VALUES (?);',undef,$artist);
                  $artist_id=($db_handle->selectrow_array("SELECT id FROM ARTIST WHERE name=?;",undef,$artist))[0];

                  }
            }
            elsif( /DYEAR=(.*)/ )
                  {
                  my $length = length ($_);
                  if ($length > 7)
                        {
                        $year = $1;
                        }
                  elsif ($length == 7)
                        {
                        $year = 1;
                        }
            }
            elsif( m/DGENRE=(.*)/ )
                  {
                  my $length = length ($_);
                  if ($length > 8)
                        {
                        $genre = $1;
                        }
                  elsif ($length == 8)
                        {
                        $genre = "N/A";
                        }
                  }
            elsif( m/DISCID=(.*)/ )
                  {
                  my $length = length ($_);
                  if ($length > 8)
                        {
                        $disc_id = $1;
                        }
                  elsif ($length == 8)
                        {
                        $genre = "N/A";
                        }
                  }
            elsif( /TTITLE(\d*)=(.*)/ )
                  {
                  my $value = length($2);
                  if ($value > 0)
                        {
                   $db_handle->do("INSERT INTO SONG (name,tracknumber,artist_id,album_id) VALUES (?,?,?,?);",undef,$2,$1,$artist_id,$album_id);
                        }
                  else
                        {
                        my $trackname = $2;
                        $trackname = "N/A";
                         $db_handle->do("INSERT INTO SONG (name,tracknumber,artist_id,album_id) VALUES (?,?,?,?);",undef,$trackname,$1,$artist_id,$album_id);
                        }
                  }
            
      if( $title && $year && $genre && $disc_id )
            {
            $db_handle->do("INSERT INTO ALBUM (name,year,genre,disc_id) VALUES (?,?,?,?);",undef,$title,$year,$genre,$disc_id);
            $album_id=($db_handle->selectrow_array("SELECT id FROM ALBUM WHERE name=? AND year=? AND genre=? AND disc_id=?;",undef,$title,$year,$genre,$disc_id))[0];
            undef $_ for($title,$year,$genre,$disc_id);
            }
      }

$fh->close();
$db_handle->disconnect();
close(OUTTRACKS);

kilkennpAsked:
Who is Participating?
 
wlfsConnect With a Mentor Commented:
Hi kilkennp,
this is a late answer, hope you still need it.

Off-topic: Make it a habit to *not* post any passwords. You posted the server address and the password for your DB. Quite a potential security leak. Maybe you are using the same password and username for ftp-login. You don't want anyone to try out, do you?

> there is something wrong with the $album_id variable below, can anyone see why?
Yes :). The problem is, it is never set before you use it. It stays undef right until you insert it into the table.

There are other problems: You will insert the same artist again and again if you have many discs of her/him.
  $db_handle->do('INSERT INTO ARTIST (name) VALUES (?);',undef,$artist);
inserts the artist, regardless of her/him being already existing in the database. You should check whether the artists already exists before inserting again.

Are you sure, you will never process the same file twice (even by accident)? Before writing anything to the database you should check whether the disc ID already exists in the database. If yes, abort or delete the respective entries before re-inserting them.

There is also a problem with the tables SONG and ALBUM_SONG. In table SONG you have a field tracknumber. This is obviously only valid for a specific album. Thus, you need multiple entries of one and the same song in case it is contained on multiple different albums.
The correct (normalized) way to solve this, is to move the field tracknumber to the ALBUM_SONG table.
This, in turn, causes the same problem as stated above for the artist. You shouldn't insert the same song (from different albums) into the database twice.

I would suggest the following (tested and verified):


CREATE TABLE ARTIST (
id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(90) NOT NULL UNIQUE KEY
);

CREATE TABLE SONG (
id  INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(90) NOT NULL,
artist_id INT UNSIGNED NOT NULL,
UNIQUE KEY (name, artist_id)
);

CREATE TABLE ALBUM  (
id  INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(90) NOT NULL,
year INT UNSIGNED NOT NULL,
genre VARCHAR(90) NOT NULL,
disc_id VARCHAR(90) NOT NULL,
UNIQUE KEY (disc_id)
);

CREATE TABLE ALBUM_SONG (
song_id INT UNSIGNED NOT NULL,
album_id INT UNSIGNED NOT NULL,
tracknumber INT UNSIGNED NOT NULL,
UNIQUE KEY (song_id, album_id)
);



#!usr/bin/perl
use strict;
use warnings;
use diagnostics;
use DBI;

my $db_handle = DBI->connect("dbi:mysql:database=kilkennp_db2;host=mysql.cs.tcd.ie;user=kilkennp;password=********")
    or die "Couldn't connect to database: $DBI::errstr\n";

# read in file and assign key value pairs to %data hash
# keys are converted to upper case
# this replaces your while(<$fh>) loop
open my $fh, "<$ARGV[0]" or die "could not open file\n";
my %data = map {m/(\w+)\s*=\s*(.*)/; (uc $1, $2)} <$fh>;
close $fh;

# exit if no disc id or no disc title is provided in file
die "no disc id provided\n" unless $data{DISCID} && $data{DTITLE};

# check whether album already exists in DB
my $sth = $db_handle->prepare("SELECT id FROM ALBUM where disc_id=?;") or die ( $DBI::errstr );
$sth->execute($data{DISCID}) or die ( $DBI::errstr );
if (my $album_id = ($sth->fetchrow_array)[0]) {
    # existing album is deleted from DB
    $db_handle->do("DELETE FROM ALBUM where id=?;",undef,$album_id);
    $db_handle->do("DELETE FROM ALBUM_SONG where album_id=?;",undef,$album_id);
}
$sth->finish;

# process DTITLE and ARTIST
if ($data{DTITLE} =~ m|(.+)\s/\s(.+)|) {
    $data{DTITLE} = $2;
    if ("various artists" eq lc $1) {
      $data{ARTIST} = "N/A";
    }
    else {
      $data{ARTIST} = $1;
    }
}
else {
    $data{ARTIST} = "N/A";
}

# insert album into database
$db_handle->do("INSERT INTO ALBUM (name, year, genre, disc_id) VALUES (?,?,?,?);",undef,$data{DTITLE},$data{DYEAR}||1,$data{DGENRE}||"N/A",$data{DISCID});
my $album_id = ($db_handle->selectrow_array("SELECT id FROM ALBUM WHERE disc_id=?;",undef,$data{DISCID}))[0];

# insert artist into database
# and/or retrieve artist id from DB
$sth = $db_handle->prepare("SELECT id FROM ARTIST where name=?;") or die ( $DBI::errstr );
$sth->execute($data{ARTIST}) or die ( $DBI::errstr );
unless ($data{ARTIST_ID} = ($sth->fetchrow_array)[0]) {
    $db_handle->do("INSERT INTO ARTIST (name) VALUES (?);",undef,$data{ARTIST});
    $data{ARTIST_ID} = ($db_handle->selectrow_array("SELECT id FROM ARTIST WHERE name=?;",undef,$data{ARTIST}))[0];
}
$sth->finish;

# loop through all songs
foreach my $track_key (grep m/^TTITLE\d+$/, keys %data) {
    # extract track number (index + 1)
    my $track_number = ($track_key =~ m/^TTITLE(\d+)$/)[0] + 1;

    # if album artist is n/a, try to extract artist from track title
    my $artist;
    my $artist_id;
    if ($data{ARTIST} eq "N/A" && $data{$track_key} =~ m/(.+)\s-\s(.+)/) {
      # artist extracted from track title
      # set new track title
      $data{$track_key} = $1;
      # insert artist into DB and retrieve artist id
      $artist = $2;
      $sth = $db_handle->prepare("SELECT id FROM ARTIST where name=?;") or die ( $DBI::errstr );
      $sth->execute($artist) or die ( $DBI::errstr );
      unless ($artist_id = ($sth->fetchrow_array)[0]) {
          $db_handle->do("INSERT INTO ARTIST (name) VALUES (?);",undef,$artist);
          $artist_id = ($db_handle->selectrow_array("SELECT id FROM ARTIST WHERE name=?;",undef,$artist))[0];
      }
      $sth->finish;
    }
    else {
      # artist not extracted from track title
      # take artist and artist id of disc
      $artist = $data{ARTIST};
      $artist_id = $data{ARTIST_ID};
      # set track title to n/a if its empty
      $data{$track_key} ||= "N/A";
    }

    # insert into SONG table
    my $song_id;
    $sth = $db_handle->prepare("SELECT id FROM SONG where name=? AND artist_id=?;") or die ( $DBI::errstr );
    $sth->execute($data{$track_key},$artist_id) or die ( $DBI::errstr );
    unless ($song_id = ($sth->fetchrow_array)[0]) {
      $db_handle->do("INSERT INTO SONG (name, artist_id) VALUES (?,?);",undef,$data{$track_key},$artist_id);
      $song_id = ($db_handle->selectrow_array("SELECT id FROM SONG WHERE name=? AND artist_id=?;",undef,$data{$track_key},$artist_id))[0];
    }
    $sth->finish;

    # insert into ALBUM_SONG table
    $db_handle->do("INSERT INTO ALBUM_SONG (song_id, album_id, tracknumber) VALUES (?,?,?);",undef,$song_id,$album_id,$track_number);
}

$db_handle->disconnect();
0
All Courses

From novice to tech pro — start learning today.