Parsing XML file

Hello All,

I need a program that parses an XML file and generates INSERT statements for a database.

For example, the file below should produce MySQL INSERT statements for the book table, where the columns are author, title, publisher, address, etc., and likewise for the article table.

The sample file looks something like this:
-----------------------------------------------
<?xml version="1.0" encoding="utf-8" ?>

<bibtex:file xmlns:bibtex="http://bibtexml.sf.net/">
    <bibtex:entry id="aatre1998-862">
        <bibtex:book>
            <bibtex:author>Aatre, Vasudev K. and Varadan, V. K. and Varadan, V. V. and Engineers., Society of Photo-optical Instrumentation</bibtex:author>
            <bibtex:title>Smart materials, structures, and MEMS : 11-14 December 1996, Bangalore, India</bibtex:title>
            <bibtex:publisher>Spie</bibtex:publisher>
            <bibtex:address>Bellingham, Wash., USA</bibtex:address>
            <bibtex:note>98213336 Vasudev K. Aatre, Vijay K. Varadan, Vasundara V. Varadan, chairs/editors ; sponsored by SPIE--the International Society for Optical Engineering ... [et al.]. SPIE proceedings series ; v. 3321. Includes bibliographical references and index. Proceedings of SPIE--the International Society for Optical Engineering ; v. 3321.</bibtex:note>
            <bibtex:keywords>Smart materials Congresses. Smart structures Congresses. Microelectromechanical systems Materials Congresses. Microactuators Materials Congresses.</bibtex:keywords>
            <bibtex:year>1998</bibtex:year>
        </bibtex:book>

        <bibtex:article>
            <bibtex:author>L. N. Zhang</bibtex:author>
            <bibtex:title>An explicit Moyal product realization of quantum
 deformation</bibtex:title>
            <bibtex:journal>Comm. Theoret. Phys.</bibtex:journal>
            <bibtex:volume>26</bibtex:volume>
            <bibtex:year>1996</bibtex:year>
            <bibtex:pages>207--212</bibtex:pages>
        </bibtex:article>
    </bibtex:entry>
</bibtex:file>



cutie_smily Asked:

ps15 Commented:
use XML::Simple;
use Data::Dumper;

my $file = shift @ARGV;   # path to the XML file, given on the command line
my $data = XMLin($file);
print Dumper $data;

# This shows you the internal Perl structure; how you go on from here depends on exactly how you want to insert it into a database ;p
ozo Commented:
use XML::Simple qw(:strict);
# $dbh is assumed to be an already-connected DBI database handle.
my $ref = XMLin(join('', <DATA>), ForceArray => [qw(bibtex:book bibtex:article)], KeyAttr => []);
for my $table (qw(book article)) {
  # ForceArray makes every bibtex:book / bibtex:article an array reference
  for (map { @$_ } @{ $ref->{'bibtex:entry'} }{"bibtex:$table"}) {
    $dbh->do("INSERT INTO $table (author,title,publisher,address) VALUES (?,?,?,?)", undef, @{$_}{qw(bibtex:author bibtex:title bibtex:publisher bibtex:address)});
  }
}
__DATA__
<?xml version="1.0" encoding="utf-8" ?>

<bibtex:file xmlns:bibtex="http://bibtexml.sf.net/">
    <bibtex:entry id="aatre1998-862">
        <bibtex:book>
            <bibtex:author>Aatre, Vasudev K. and Varadan, V. K. and Varadan, V. V. and Engineers., Society of Photo-optical Instrumentation</bibtex:author>
            <bibtex:title>Smart materials, structures, and MEMS : 11-14 December 1996, Bangalore, India</bibtex:title>
            <bibtex:publisher>Spie</bibtex:publisher>
            <bibtex:address>Bellingham, Wash., USA</bibtex:address>
    <bibtex:note>98213336 Vasudev K. Aatre, Vijay K. Varadan, Vasundara V. Varadan, chairs/editors ; sponsored by SPIE--the International Society for Optical Engineering ... [et al.]. SPIE proceedings series ; v. 3321. Includes bibliographical references and index. Proceedings of SPIE--the International Society for Optical Engineering ; v. 3321.</bibtex:note>
            <bibtex:keywords>Smart materials Congresses. Smart structures Congresses. Microelectromechanical systems Materials Congresses. Microactuators Materials Congresses.</bibtex:keywords>
            <bibtex:year>1998</bibtex:year>
        </bibtex:book>

        <bibtex:article>
            <bibtex:author>L. N. Zhang</bibtex:author>
            <bibtex:title>An explicit Moyal product realization of quantum
 deformation</bibtex:title>
            <bibtex:journal>Comm. Theoret. Phys.</bibtex:journal>
            <bibtex:volume>26</bibtex:volume>
            <bibtex:year>1996</bibtex:year>
            <bibtex:pages>207--212</bibtex:pages>
        </bibtex:article>
    </bibtex:entry>
</bibtex:file>
cutie_smily Author Commented:
Hi,

Thanks very much for the code. I am a real beginner in Perl. Below is the error I got:

Can't locate XML/Simple.pm in @INC (@INC contains: /usr/lib/perl5/5.8.0/i386-linux-thread-multi /usr/lib/perl5/5.8.0 /usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi /usr/lib/perl5/site_perl/5.8.0 /usr/lib/perl5/site_perl /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi /usr/lib/perl5/vendor_perl/5.8.0 /usr/lib/perl5/vendor_perl /usr/lib/perl5/5.8.0/i386-linux-thread-multi /usr/lib/perl5/5.8.0 .) at xml_parser_new.pl line 3.

I hope you can help.

ps15 Commented:
You haven't got the required module XML::Simple. Try looking for a package with a name something like that in your distribution,
or run
perl -MCPAN -e 'install XML::Simple'
(as root if you want it installed globally)
cutie_smily Author Commented:
I don't think I have permission... is there a different way?

$ cd /
$ perl -MCPAN -e 'install XML::Simple'
CPAN: Storable loaded ok
mkdir /root/.cpan: Permission denied at /usr/lib/perl5/5.8.0/CPAN.pm line 2264

thanks
ps15 Commented:
Do you have root access to the system you're working on?
If you do, execute the command as root; if you don't, ask your administrator to do it ;)
cutie_smily Author Commented:
Thanks. I don't have root permissions. I am just curious: what does XML::Simple do? If it is only used to connect to the database, then we could probably write the INSERT statements to a file instead, and I can load that file into the table.

Thanks
ps15 Commented:
XML::Simple is a simple XML parser: a Perl module that reads XML into Perl data structures. It does not connect to the database.
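
To make the file-based idea from the previous post concrete, here is a minimal sketch of turning one parsed record into an INSERT statement as plain text, so the output can be redirected to a file and loaded later. The sub names sql_quote and insert_statement and the sample row are made up for illustration, and the escaping assumes MySQL's default mode, where backslash is an escape character. When a database handle is available, DBI's $dbh->quote or ? placeholders are the safer choice.

```perl
use strict;
use warnings;

# Quote a value as a MySQL string literal: escape backslashes and single
# quotes, and emit NULL for missing values.
# Assumption: the server runs in its default mode (backslash escapes on).
sub sql_quote {
    my ($value) = @_;
    return 'NULL' unless defined $value;
    $value =~ s/\\/\\\\/g;
    $value =~ s/'/''/g;
    return "'$value'";
}

# Build one INSERT statement for $table from a hash of column => value.
# Columns are sorted so the output is stable from run to run.
sub insert_statement {
    my ($table, $row) = @_;
    my @columns = sort keys %$row;
    return sprintf 'INSERT INTO %s (%s) VALUES (%s);',
        $table,
        join(', ', @columns),
        join(', ', map { sql_quote($row->{$_}) } @columns);
}

# Hypothetical row, using values from the sample file.
print insert_statement('article', {
    author => 'L. N. Zhang',
    year   => '1996',
}), "\n";
```

Run as shown, this prints INSERT INTO article (author, year) VALUES ('L. N. Zhang', '1996'); and redirecting the output to a file gives something the mysql client can load.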
ozo Commented:
perldoc CPAN
    5)  I am not root, how can I install a module in a personal directory?
        You will most probably like something like this:
          o conf makepl_arg "LIB=~/myperl/lib \
                            INSTALLMAN1DIR=~/myperl/man/man1 \
                            INSTALLMAN3DIR=~/myperl/man/man3"
          install Sybase::Sybperl

        You can make this setting permanent like all o conf settings with o conf commit.
        You will have to add ~/myperl/man to the MANPATH environment variable and also tell your perl programs to look into ~/myperl/lib, e.g. by including
          use lib "$ENV{HOME}/myperl/lib";

        or setting the PERL5LIB environment variable.
        Another thing you should bear in mind is that the UNINST parameter should never be set if you are not root.
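
As a rough sketch of the "use lib" step in the excerpt above, the script below adds a personal library directory to the module search path and then reports whether a module can be loaded. It assumes the modules were installed under ~/myperl/lib, the directory used in the excerpt; the sub name module_available is made up for illustration.

```perl
use strict;
use warnings;

# Assumption: modules were installed under ~/myperl/lib, the directory
# used in the perldoc excerpt.
use lib "$ENV{HOME}/myperl/lib";

# Return 1 if the named module can be loaded from @INC, 0 otherwise.
sub module_available {
    my ($module) = @_;
    (my $path = "$module.pm") =~ s{::}{/}g;   # XML::Simple -> XML/Simple.pm
    return eval { require $path; 1 } ? 1 : 0;
}

for my $module (qw(XML::Simple Data::Dumper)) {
    printf "%-14s %s\n", $module,
        module_available($module) ? 'found' : 'not found in @INC';
}
```

If XML::Simple is still reported as not found after a personal install, the directory given to "use lib" probably does not match the LIB= setting used when installing.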

cutie_smily Author Commented:
Thanks for the info.

I apologize for asking this:
-----------------------
I am not restricted to write the parsing program in a specific language. So, is there any other way we can do this task?

thanks
ozo Commented:
# if you know the file will always be in exactly that format,
# you may not need a full XML parser, and you may be able to do something like
$/="</bibtex:entry>";
while( <> ){
    while( /<bibtex:(book|article)>(.*?)<\/bibtex:\1>/sg ){
        my($table,$attributes)=($1,$2);
        my %attributes=();
        while( $attributes =~ /<bibtex:(author|title|publisher|address)>(.*?)<\/bibtex:\1/sg ){
           $attributes{$1}=$2;
        }
        $dbh->do("INSERT INTO $table (".
                 join(",",keys %attributes).
                 ") VALUES (".
                 join(",",("?")x keys %attributes).
                 ")",undef,values %attributes;
    }
}

cutie_smily Author Commented:
Thanks very much for the code.

I am getting a syntax error:

syntax error at perl_loop.pl line 14, near "%attributes ;"
Execution of perl_loop.pl aborted due to compilation errors.

Please let me know how I supply my input file.

thanks again
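
The syntax error most likely comes from the $dbh->do( call in the regex-based script above: its opening parenthesis is never closed, so the statement ends at "values %attributes;" where a ")" was expected. As for input, the <> operator reads the files named on the command line, so the script would be run as: perl perl_loop.pl input.xml (input.xml is a placeholder name). Below is a hedged sketch of the same approach with the parenthesis balanced. It prints the SQL and bind values instead of calling the database, because the posted code never creates $dbh; the sub names parse_entries and build_insert are made up for illustration.

```perl
use strict;
use warnings;

# Extract every book/article element from a chunk of BibTeXML text.
# Returns a list of [ $table, \%attributes ] pairs.
sub parse_entries {
    my ($text) = @_;
    my @records;
    while ($text =~ m{<bibtex:(book|article)>(.*?)</bibtex:\1>}sg) {
        my ($table, $body) = ($1, $2);
        my %attributes;
        while ($body =~ m{<bibtex:(author|title|publisher|address)>(.*?)</bibtex:\1>}sg) {
            $attributes{$1} = $2;
        }
        push @records, [ $table, \%attributes ];
    }
    return @records;
}

# Build the placeholder SQL and the matching bind values for one record.
sub build_insert {
    my ($table, $attributes) = @_;
    my @columns = sort keys %$attributes;
    my $sql = "INSERT INTO $table (" . join(',', @columns)
            . ") VALUES (" . join(',', ('?') x @columns) . ")";
    return ($sql, @{$attributes}{@columns});
}

# Run only when input files are named on the command line.
if (@ARGV) {
    local $/ = "</bibtex:entry>";    # read one entry per iteration
    while (my $chunk = <>) {
        for my $record (parse_entries($chunk)) {
            my ($sql, @bind) = build_insert(@$record);
            print "$sql\n";
            print "  bind: $_\n" for @bind;
            # With a connected DBI handle this would be:
            # $dbh->do($sql, undef, @bind);
        }
    }
}
```

Like the original, this only works if the file always has exactly this layout; it is pattern matching, not a real XML parser.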

Question has a verified solution.