Solved

using regex matcher to match an INSERT statement

Posted on 2006-11-23
11
383 Views
Last Modified: 2012-08-14
I have an insert statement that is of the form

INSERT INTO pubs VALUES (16590,'Header','HEADER1995','1995-00-00',1549,NULL,'tp','NOVEL',NULL,NULL,'$5.95',NULL),(16591,'Headhunter','BKTG16613','1993-00-00',11,'422','pb','NOVEL','0450591123',NULL,'L4.99',NULL),(16592,'Headhunter','BKTG03173','1994-00-00',282,'440','hc','NOVEL','0517598272',NULL,'$23.00',NULL);

However, it's something like a million characters long. It's from a mysql export that I need to import into a different database. My system is failing from stack overflows.

To fix it I want to use a regex Matcher to go through and for every 100 values, insert a new INSERT line. So, if we split it for every one, the result would be to convert this:
  INSERT INTO pubs VALUES (1,2,3),(2,3,4),(4,5,6)
to this:
  INSERT INTO pubs VALUES (1,2,3);
  INSERT INTO pubs VALUES (2,3,4);
  INSERT INTO pubs VALUES (4,5,6)

I don't know how to write a regex that will properly match everything including string values that may contain delimeters. For instance, I have no idea how to match against something like:
  INSERT INTO pubs VALUES ('I (by which I mean we), really like Jill''s face.');

Alternatively, I suppose you could tell me how to get derby to not choke on the original megabyte-long insert statement.
0
Comment
Question by:HappyEngineer
  • 3
  • 2
  • 2
  • +3
11 Comments
 
LVL 5

Accepted Solution

by:
gkishoreji earned 250 total points
ID: 18006081
if u r sure that the string inside the insert statements wont contain ),(

you can use it to split the strings

)[ ]*,[ ]*(   --> This will match   ),(
replace it with
); INSERT INTO pubs VALUES (
0
 
LVL 23

Expert Comment

by:Ajay-Singh
ID: 18006094
use loop around the insert statement,
a. create PreparedStatement
b. in a loop bind the parameters
c. commit.
0
 
LVL 23

Expert Comment

by:Ajay-Singh
ID: 18006128
pls ignore the last comment
0
ScreenConnect 6.0 Free Trial

Check out the updates in one game-changing release, ScreenConnect 6.0, based on partner feedback. New features include a redesigned UI that improves session organization and overall user experience. See the enhancements for yourself!

 
LVL 86

Expert Comment

by:CEHJ
ID: 18006191
>>However, it's something like a million characters long.

You mean one single import statement is? You should be able to run the file against the new db given the correct setup of the target db
0
 
LVL 14

Expert Comment

by:hoomanv
ID: 18006192
That insert seems like the output of a database dump
Most dump tools support two ways of generating insert statements
i.e. musqldump.exe by default uses --extended-insert but you can disable it to have single insert for each tuple
0
 
LVL 92

Expert Comment

by:objects
ID: 18006543
agree with hoomanv, change the way you export the data
0
 

Author Comment

by:HappyEngineer
ID: 18006602
I didn't export the data. I don't have control over the way it is exported.
0
 
LVL 92

Expert Comment

by:objects
ID: 18006642
did u try gkishoreji suggestion, looks like you nned to replace all:

),(

with

);
INSERT INTO pubs VALUES (

(not really a java question, you can do the conversion usiong something like awk)
0
 
LVL 14

Expert Comment

by:hoomanv
ID: 18006651
0
 
LVL 14

Assisted Solution

by:hoomanv
hoomanv earned 250 total points
ID: 18006840
It seems that ZQL does not accept multi value insert statements
OK for you a pretty handy solution is to go here http://www.wangz.net/cgi-bin/pp/gsqlparser/sqlpp/sqlformat.tpl
Paste your statement, and the script will beautify it so each tuple appears in a new line
Now its easy to separate them
0
 

Author Comment

by:HappyEngineer
ID: 18009936
I split it up at the ),( boundaries by first using the pattern:

    "INSERT INTO ([a-zA-Z0-9_]+) VALUES .*;\n"

then for each match I did:
    String insertStr = rowMatcher.group(0);
    String tableName = rowMatcher.group(1);
    insertStr = insertStr.replaceAll("\\),\\(", ");\nINSERT INTO "+ tableName +" VALUES (");

However, the tool that hoomanv mentioned also looks interesting. Sadly, it's unlikely it would work to upload a 50MB file to it. But, if necessary I probably could have contacted the site to get the code.
0

Featured Post

Optimizing Cloud Backup for Low Bandwidth

With cloud storage prices going down a growing number of SMBs start to use it for backup storage. Unfortunately, business data volume rarely fits the average Internet speed. This article provides an overview of main Internet speed challenges and reveals backup best practices.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Title # Comments Views Activity
java 8 lambda expresssions exception handling 3 91
hibernate example for saving data 19 43
maven module vs maven project 3 23
CSV file parsing thru Java 13 31
Java had always been an easily readable and understandable language.  Some relatively recent changes in the language seem to be changing this pretty fast, and anyone that had not seen any Java code for the last 5 years will possibly have issues unde…
Introduction This article is the second of three articles that explain why and how the Experts Exchange QA Team does test automation for our web site. This article covers the basic installation and configuration of the test automation tools used by…
Viewers will learn about arithmetic and Boolean expressions in Java and the logical operators used to create Boolean expressions. We will cover the symbols used for arithmetic expressions and define each logical operator and how to use them in Boole…
The viewer will learn how to implement Singleton Design Pattern in Java.

803 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question