bash script to import to mysql

Need a bash script to import two files into mysql.

File 1 is a non formatted text file that looks like;

1 some item description on the rest of this line
2 some item description on the rest of this line
3 some item description on the rest of this line
4 some item description on the rest of this line
5 some item description on the rest of this line
6 some item description on the rest of this line

File 2 looks like;

inv_number 1
inv_number 2
inv_number 3
inv_number 4
inv_number 5
inv_number 6

There is an auto increment ID for the database, then three fields.
The text field is varchar 254.
Both text files sometimes have the two columns separated by one space but sometimes by one tab.
The table looks something like this;

ID       count,       desc,              inventory
01      1                the text         inv_number
 

After file one is imported into the database, file 2 should insert the inv_number for the matching record 'count' previously inserted from file #1.
projectsAsked:
Who is Participating?
 
arnoldCommented:
I do not see what you are dealing with.  All I can do is relying on the information you provide even when it is masqueraded which I understand, to match what the issue is.


To deal with possible appostrophes within the data set, one option is to convert the apostrophe within the assignment to is to try to escape them.
in the file1.pl
or change the the apostrophe's surrounding $data with double quotes.

Or try this
#!/usr/bin/perl
while (<>) {
chomp();
s/^\s+//;
s/\r+//;
($count,$data)=split(/\s+/,$_,2);
$data=~ s/'/\\\'/g; 
print "insert into somedb.sometable (count,desc) values ('$count','$data') on duplicate key update desc='$data';\n";
}

Open in new window

0
 
arnoldCommented:
what part of which item is the data and into what table/column?
0
 
projectsAuthor Commented:
Sorry, let me try to clear that up. I'll change a few things around so please don't bother looking at the first post.

File 1 is a non formatted text file that contains text like the following.

23 this item is big bright and expensive
43 this item is one that we have in stock
***snip***

The first column is the number on the left which goes into the 'count' field in the table.
The second column is the text which goes into the 'desc' field in the table.

Once imported, the DB table would look like;

id    count      inventory          desc
01   23                                      this item is big bright and expensive
02   43                                      this item is one that we have in stock
***snip***

File 2 is also a non formatted text file that contains text like the following.

inv_5473    23
inv_8709-4 43

The inv_xxxx on the right goes into the 'inventory' field in the table.
The second column is the 'count' record created above.
When reading the file 2, we want to update the inv_xxxx into the newly created 'count' which is a record number.

Once inserted into the DB, the final table would look like;

ID       count,      inventory            desc
01      23             inv_5473             this item is big bright and expensive
02      43             inv_8709-4          this item is one that we have in stock
***snip***
0
Cloud Class® Course: Microsoft Office 2010

This course will introduce you to the interfaces and features of Microsoft Office 2010 Word, Excel, PowerPoint, Outlook, and Access. You will learn about the features that are shared between all products in the Office suite, as well as the new features that are product specific.

 
arnoldCommented:
What is the significance of count, it seems unique such that the ID/auto increment column is unnecessary.
If you have the table created, make sure count is unique indexed while inventory and desc either allowed to be nulls or default to '' to avoid getting errors when not all data is available.
For handling the count/desc
#!/usr/bin/perl
while (<>) {
chomp();
($count,$data)=split(/ /,$_,2);
print "insert into tablename (count,desc) values ('$count','$data') on duplicate update desc='$data';\n";
}

Open in new window


To handle the inventory.
#!/usr/bin/perl
while (<>) {
chomp();
($inventory,$count)=split(/ /,$_,2);
print "insert into tablename (inventory,count) values ('$inventory','$count') on duplicate update inventory='$inventory';\n";
}

Open in new window


Perl can be used to get both files and build the dataset.

Pass the first file to create the first set of data with the output passed to mysql -u username --password=password dbname
0
 
projectsAuthor Commented:
I'm not sure if an auto incremented 'id' is actually needed as each record will be its own unique record to begin with. However, the number of records could change from day to day.

'count' is actually a reference inventory number which could change.
Can you give me a command line example? I'm not sure where the file name goes?
0
 
arnoldCommented:
the name of the script (provided it is executable (chmod 700) or use perl filename.pl
make sure you name the script to match the file you will be pushing through it.
i.e. file1 should use the first script i.e. count, description
  cat filename_with_data1.txt | perl import_file1.pl | mysql -u username -ppassword databasename
second script is looking for inventory,count
cat filename_with_data2.txt | perl import_file2.pl | mysql -u username -ppassword databasename

Neither validates the input, so if you feed the wrong file into the wrong script, your DB will likely error out when count if defined as a number is being inserted as a text.
0
 
projectsAuthor Commented:
Not sure I am following.

I created the first script as script1 and made it 700.

The contents of file one are

   23 this item is big bright and expensive
   43 this item is one that we have in stock
***snip***

BUT, there are spaces before the first column.

I then ran;

# cat file1.txt | perl file1.txt | mysql -u root -p somedb

Bareword found where operator expected at asn1.txt line 1, near "23 this"
        (Missing operator before this?)
Number found where operator expected at asn1.txt line 2, near "expensive
   43"
        (Do you need to predeclare expensive?)
Bareword found where operator expected at asn1.txt line 2, near "43 this"
        (Missing operator before this?)
Number found where operator expected at asn1.txt line 3, near "stock
    1"
0
 
arnoldCommented:
You need to run
cat file1.txt | perl script1 | mysql -u root -p somedb

You were running the same input file as though it was a script that is where your bareword error is coming from.
0
 
projectsAuthor Commented:
oh ya.. oops

Does it matter that the count is an int
0
 
projectsAuthor Commented:
# cat file1.txt | perl file1.pl | mysql -u root -p somedb
Enter password:
ERROR 1064 (42000) at line 1: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'update desc='  23 this item is big bright and expensive'' at line 1

In your example, is $count part of the operation code or the db field?

#!/usr/bin/perl
while (<>) {
chomp();
($count,$data)=split(/ /,$_,2);
print "insert into tablename (count,desc) values ('$count','$data') on duplicate update desc='$data';\n";
}
0
 
arnoldCommented:
You need to make sure the scripts phase the correct tablename.  Note the script you pasted is still reflecting tablename versus what your table actually is.

Anything preceded by $ is a variable, the other count is the column name.

First run the script
cat filename1.txt |perl script1 | more look at the insert to make sure the tablename and columns referenced match the definition of your table,.

If you could run in MySQL show tables from somedb.
Then run show create table <tablename> from the list above win to where this data is being inserted/added.
0
 
arnoldCommented:
The split may need to ve replaced

split(/\s+/,$_,2); instead of split(/ /,$_,2)

The difference deals with whether you actually have a single space separating the first column from the second in the text file.
This applies to both scripts. \s+ looks for white space one or more. There are other option it all depends on what the separating between the two.

cat -v file.txt | more may reveal other control characters that could ve adversely impacting the separation attempt.
0
 
projectsAuthor Commented:
Yes, I'm using the correct table and field names :).
I just re-posted your code but I should have pointed that out.

Yes, I mentioned in my question that there is some spacing in front of each line.
I could try editing the white spaces out if that would help.
0
 
projectsAuthor Commented:
I've removed all of the white spaces and any control characters and the result is slightly different.

# cat file1.txt | perl file1.pl | mysql -u root -p <somedb>
Enter password:
ERROR 1064 (42000) at line 1: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use  at line 1
0
 
arnoldCommented:
ok, missed the white space in the beginning of every line.
add the following line after line 3: chomp0; to both scripts.
s/^\s+//;

This will strip out any white space from the begining of every line.
What control did you have ^M characters in there as well?
s/\r+//g;

To deal with the mysql error above,
add "key" after the "on duplicate" to make it "on duplicate key update"
0
 
projectsAuthor Commented:
There aren't any spaces or control characters in the files anymore. I made sure of that.

Tried it again... same thing...

# cat file1.txt | perl file1.pl | mysql -u root -p somedb
Enter password:
ERROR 1064 (42000) at line 1: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'update org='this item is big bright and expensive'' at line 1

The code, just to confirm is;

#!/usr/bin/perl
while (<>) {
chomp();
($count,$data)=split(/ /,$_,2);
print "insert into somedb (count,desc) values ('$count','$data') on duplicate update desc='$data';\n";
}

Open in new window


db table somedb looks like;

Field name          Type                  Allow nulls?          Key          Default value  
count                int(12)               Yes                       None       NULL       
inventory         int(24)               Yes                       None       NULL       
desc                varchar(45)       Yes                       None       NULL
0
 
arnoldCommented:
Just to be clear, lets say your database name is somedb and the table into which you want to insert the data is sometable.

The code above as you post, you are seemingly reflecting the entry as being inserted into a Database versus a table.
you can have a notation that explicitly defines the database and the tablename for the insert/delete which goes in the format of databasename.tablename.

I'll repost the scripts with the adjustments for the somedb and sometable using explicit syntax.

The difference deals with the mysql command line you should but are not required to specify the database name.

file1.pl
#!/usr/bin/perl
while (<>) {
chomp();
s/^\s+//;
s/\r+//;
($count,$data)=split(/\s+/,$_,2);
print "insert into somedb.sometable (count,desc) values ('$count','$data') on duplicate key update desc='$data';\n";
}

Open in new window


file2.pl
#!/usr/bin/perl
while (<>) {
chomp();
s/^\s+//;
s/\r+//;
($inventory,$count)=split(/\s+/,$_,2);
print "insert into somedb.sometable (inventory,count) values ('$inventory','$count') on duplicate key update inventory='$inventory';\n";
}

Open in new window


It is simpler to inspect the insert statement that is coming out of the perl script to see where the error might exists.
I've in the above corrected for the line starts with empty spaces. as well as removing ^M if any deals with mainly transfer of text files from windows systems to ....
I also made the adjustment to the on duplicate key update which I suspect is what is causing the above error.  No org is not one of the columns provided in the example.
0
 
projectsAuthor Commented:
I work with databases every single day, exporting, importing, using command line and GUIs to insert and read data. I'm fine with the names, I know the difference between database, tables and so on. I don't always know the correct terminology however since I'm not really a DB person.

Anyhow, that said, yes, I'm entering the correct database name and am trying to insert into the correct table. I simply use words like somedb because I don't want to post too much information publicly.

Now, this time, the first script worked but it died when it came across a 'desc' which contains an apostrophe.
Some of the descriptions contain characters such as apostrophes, comas, dashes, parenthesis, dots, and even ampersands.
0
 
projectsAuthor Commented:
There are thousands of lines, I could not possibly edit each line. I wonder, might there be a way of telling the script that everything after the first column is text, no matter what it contains, to put it into 'desc'?
0
 
arnoldCommented:
The s/ lines in the example above are to deal with issues previously mentioned.
Please run the cat | perl | more such that you can look.
The issue is that the above presumes that everything needed is a single line no matter how long.

You can redirect the data to a file > file1.sql
if you see that everything there is correct, you can cat file1.sql into mysql.

The script though will not handle multiple-line input

It is possible to use the above script with an additional loop to make sure one is handling the complete line, but a clear terminology of what defines the start of the record such that when lines are loading as long as the record start indicator is not present, the presumption will be that this is a continuation of the preceeding line.  The issue starts with the length of desc which you have as 45 characters. ...... one could add a truncate option to add only the first 45 characters.
0
 
projectsAuthor Commented:
Ok well, it all works fine now so I thank you very much for sticking to the question and finding a solution for me.
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.