[Last Call] Learn about multicloud storage options and how to improve your company's cloud strategy. Register Now

x
?
Solved

using regex matcher to match an INSERT statement

Posted on 2006-11-23
11
Medium Priority
?
413 Views
Last Modified: 2012-08-14
I have an insert statement that is of the form

INSERT INTO pubs VALUES (16590,'Header','HEADER1995','1995-00-00',1549,NULL,'tp','NOVEL',NULL,NULL,'$5.95',NULL),(16591,'Headhunter','BKTG16613','1993-00-00',11,'422','pb','NOVEL','0450591123',NULL,'L4.99',NULL),(16592,'Headhunter','BKTG03173','1994-00-00',282,'440','hc','NOVEL','0517598272',NULL,'$23.00',NULL);

However, it's something like a million characters long. It's from a mysql export that I need to import into a different database. My system is failing from stack overflows.

To fix it I want to use a regex Matcher to go through and for every 100 values, insert a new INSERT line. So, if we split it for every one, the result would be to convert this:
  INSERT INTO pubs VALUES (1,2,3),(2,3,4),(4,5,6)
to this:
  INSERT INTO pubs VALUES (1,2,3);
  INSERT INTO pubs VALUES (2,3,4);
  INSERT INTO pubs VALUES (4,5,6)

I don't know how to write a regex that will properly match everything including string values that may contain delimeters. For instance, I have no idea how to match against something like:
  INSERT INTO pubs VALUES ('I (by which I mean we), really like Jill''s face.');

Alternatively, I suppose you could tell me how to get derby to not choke on the original megabyte-long insert statement.
0
Comment
Question by:HappyEngineer
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 3
  • 2
  • 2
  • +3
11 Comments
 
LVL 5

Accepted Solution

by:
gkishoreji earned 1000 total points
ID: 18006081
if u r sure that the string inside the insert statements wont contain ),(

you can use it to split the strings

)[ ]*,[ ]*(   --> This will match   ),(
replace it with
); INSERT INTO pubs VALUES (
0
 
LVL 23

Expert Comment

by:Ajay-Singh
ID: 18006094
use loop around the insert statement,
a. create PreparedStatement
b. in a loop bind the parameters
c. commit.
0
 
LVL 23

Expert Comment

by:Ajay-Singh
ID: 18006128
pls ignore the last comment
0
Industry Leaders: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 
LVL 86

Expert Comment

by:CEHJ
ID: 18006191
>>However, it's something like a million characters long.

You mean one single import statement is? You should be able to run the file against the new db given the correct setup of the target db
0
 
LVL 14

Expert Comment

by:hoomanv
ID: 18006192
That insert seems like the output of a database dump
Most dump tools support two ways of generating insert statements
i.e. musqldump.exe by default uses --extended-insert but you can disable it to have single insert for each tuple
0
 
LVL 92

Expert Comment

by:objects
ID: 18006543
agree with hoomanv, change the way you export the data
0
 

Author Comment

by:HappyEngineer
ID: 18006602
I didn't export the data. I don't have control over the way it is exported.
0
 
LVL 92

Expert Comment

by:objects
ID: 18006642
did u try gkishoreji suggestion, looks like you nned to replace all:

),(

with

);
INSERT INTO pubs VALUES (

(not really a java question, you can do the conversion usiong something like awk)
0
 
LVL 14

Expert Comment

by:hoomanv
ID: 18006651
0
 
LVL 14

Assisted Solution

by:hoomanv
hoomanv earned 1000 total points
ID: 18006840
It seems that ZQL does not accept multi value insert statements
OK for you a pretty handy solution is to go here http://www.wangz.net/cgi-bin/pp/gsqlparser/sqlpp/sqlformat.tpl
Paste your statement, and the script will beautify it so each tuple appears in a new line
Now its easy to separate them
0
 

Author Comment

by:HappyEngineer
ID: 18009936
I split it up at the ),( boundaries by first using the pattern:

    "INSERT INTO ([a-zA-Z0-9_]+) VALUES .*;\n"

then for each match I did:
    String insertStr = rowMatcher.group(0);
    String tableName = rowMatcher.group(1);
    insertStr = insertStr.replaceAll("\\),\\(", ");\nINSERT INTO "+ tableName +" VALUES (");

However, the tool that hoomanv mentioned also looks interesting. Sadly, it's unlikely it would work to upload a 50MB file to it. But, if necessary I probably could have contacted the site to get the code.
0

Featured Post

Free Tool: IP Lookup

Get more info about an IP address or domain name, such as organization, abuse contacts and geolocation.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Java had always been an easily readable and understandable language.  Some relatively recent changes in the language seem to be changing this pretty fast, and anyone that had not seen any Java code for the last 5 years will possibly have issues unde…
Introduction This article is the first of three articles that explain why and how the Experts Exchange QA Team does test automation for our web site. This article explains our test automation goals. Then rationale is given for the tools we use to a…
Viewers will learn about the regular for loop in Java and how to use it. Definition: Break the for loop down into 3 parts: Syntax when using for loops: Example using a for loop:
Viewers will learn how to properly install Eclipse with the necessary JDK, and will take a look at an introductory Java program. Download Eclipse installation zip file: Extract files from zip file: Download and install JDK 8: Open Eclipse and …
Suggested Courses

650 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question