Solved

using regex matcher to match an INSERT statement

Posted on 2006-11-23
11
374 Views
Last Modified: 2012-08-14
I have an insert statement that is of the form

INSERT INTO pubs VALUES (16590,'Header','HEADER1995','1995-00-00',1549,NULL,'tp','NOVEL',NULL,NULL,'$5.95',NULL),(16591,'Headhunter','BKTG16613','1993-00-00',11,'422','pb','NOVEL','0450591123',NULL,'L4.99',NULL),(16592,'Headhunter','BKTG03173','1994-00-00',282,'440','hc','NOVEL','0517598272',NULL,'$23.00',NULL);

However, it's something like a million characters long. It's from a mysql export that I need to import into a different database. My system is failing from stack overflows.

To fix it I want to use a regex Matcher to go through and for every 100 values, insert a new INSERT line. So, if we split it for every one, the result would be to convert this:
  INSERT INTO pubs VALUES (1,2,3),(2,3,4),(4,5,6)
to this:
  INSERT INTO pubs VALUES (1,2,3);
  INSERT INTO pubs VALUES (2,3,4);
  INSERT INTO pubs VALUES (4,5,6)

I don't know how to write a regex that will properly match everything including string values that may contain delimeters. For instance, I have no idea how to match against something like:
  INSERT INTO pubs VALUES ('I (by which I mean we), really like Jill''s face.');

Alternatively, I suppose you could tell me how to get derby to not choke on the original megabyte-long insert statement.
0
Comment
Question by:HappyEngineer
  • 3
  • 2
  • 2
  • +3
11 Comments
 
LVL 5

Accepted Solution

by:
gkishoreji earned 250 total points
ID: 18006081
if u r sure that the string inside the insert statements wont contain ),(

you can use it to split the strings

)[ ]*,[ ]*(   --> This will match   ),(
replace it with
); INSERT INTO pubs VALUES (
0
 
LVL 23

Expert Comment

by:Ajay-Singh
ID: 18006094
use loop around the insert statement,
a. create PreparedStatement
b. in a loop bind the parameters
c. commit.
0
 
LVL 23

Expert Comment

by:Ajay-Singh
ID: 18006128
pls ignore the last comment
0
 
LVL 86

Expert Comment

by:CEHJ
ID: 18006191
>>However, it's something like a million characters long.

You mean one single import statement is? You should be able to run the file against the new db given the correct setup of the target db
0
 
LVL 14

Expert Comment

by:hoomanv
ID: 18006192
That insert seems like the output of a database dump
Most dump tools support two ways of generating insert statements
i.e. musqldump.exe by default uses --extended-insert but you can disable it to have single insert for each tuple
0
How to run any project with ease

Manage projects of all sizes how you want. Great for personal to-do lists, project milestones, team priorities and launch plans.
- Combine task lists, docs, spreadsheets, and chat in one
- View and edit from mobile/offline
- Cut down on emails

 
LVL 92

Expert Comment

by:objects
ID: 18006543
agree with hoomanv, change the way you export the data
0
 

Author Comment

by:HappyEngineer
ID: 18006602
I didn't export the data. I don't have control over the way it is exported.
0
 
LVL 92

Expert Comment

by:objects
ID: 18006642
did u try gkishoreji suggestion, looks like you nned to replace all:

),(

with

);
INSERT INTO pubs VALUES (

(not really a java question, you can do the conversion usiong something like awk)
0
 
LVL 14

Expert Comment

by:hoomanv
ID: 18006651
0
 
LVL 14

Assisted Solution

by:hoomanv
hoomanv earned 250 total points
ID: 18006840
It seems that ZQL does not accept multi value insert statements
OK for you a pretty handy solution is to go here http://www.wangz.net/cgi-bin/pp/gsqlparser/sqlpp/sqlformat.tpl
Paste your statement, and the script will beautify it so each tuple appears in a new line
Now its easy to separate them
0
 

Author Comment

by:HappyEngineer
ID: 18009936
I split it up at the ),( boundaries by first using the pattern:

    "INSERT INTO ([a-zA-Z0-9_]+) VALUES .*;\n"

then for each match I did:
    String insertStr = rowMatcher.group(0);
    String tableName = rowMatcher.group(1);
    insertStr = insertStr.replaceAll("\\),\\(", ");\nINSERT INTO "+ tableName +" VALUES (");

However, the tool that hoomanv mentioned also looks interesting. Sadly, it's unlikely it would work to upload a 50MB file to it. But, if necessary I probably could have contacted the site to get the code.
0

Featured Post

What Should I Do With This Threat Intelligence?

Are you wondering if you actually need threat intelligence? The answer is yes. We explain the basics for creating useful threat intelligence.

Join & Write a Comment

Java Flight Recorder and Java Mission Control together create a complete tool chain to continuously collect low level and detailed runtime information enabling after-the-fact incident analysis. Java Flight Recorder is a profiling and event collectio…
Introduction This article is the first of three articles that explain why and how the Experts Exchange QA Team does test automation for our web site. This article explains our test automation goals. Then rationale is given for the tools we use to a…
Viewers learn how to read error messages and identify possible mistakes that could cause hours of frustration. Coding is as much about debugging your code as it is about writing it. Define Error Message: Line Numbers: Type of Error: Break Down…
Viewers will learn about arithmetic and Boolean expressions in Java and the logical operators used to create Boolean expressions. We will cover the symbols used for arithmetic expressions and define each logical operator and how to use them in Boole…

708 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

13 Experts available now in Live!

Get 1:1 Help Now