Solved

Tool to validate and generate a DTD or schema from a large XML document

Posted on 2009-05-05
Last Modified: 2012-05-06
Hi experts,
I am looking for a tool that can generate a DTD and a schema from an XML document. Right now I am using XMLSpy, but it runs out of memory as soon as I try to generate an XML schema or DTD.

The XML file I am dealing with is around 160MB.
     
Question by:aman0711
9 Comments
 
LVL 60

Assisted Solution

by:Geert Bormans
Geert Bormans earned 2000 total points
ID: 24308497
Well, the IDE itself is taking up too much of your available memory.

I have used this before:
http://www.thaiopensource.com/relaxng/trang.html

I have not tested it on files that large, but I know it has no problem with files in the 40MB range.
Maybe if you give it enough heap space...

On the other hand, if you have a 160MB file, I bet there is a lot of repetition in there.
Try cutting it down: deriving the schema from a subset of the file will likely give you the same schema.
 
LVL 10

Author Comment

by:aman0711
ID: 24308535
Hi Gertone,
I am very new to all this. Is Trang really hard to use?

                     
 
LVL 60

Accepted Solution

by:Geert Bormans
Geert Bormans earned 2000 total points
ID: 24308592
No, it is easy.

Make sure you have a Java virtual machine installed, and then you can run:
java -jar trang.jar args

It is all in the manual.

I mainly use Trang inside an IDE (www.oxygenxml.com).
That is even easier, but it will give you the same memory problems.
That is why I suggest using the command-line version.

What is not in the manual is how to increase the heap space:
java -Xms1024M -Xmx1024M -jar trang.jar args

Make sure you have at least a gigabyte of RAM free for this.

But seriously...
cut the file into pieces, generate a schema for each piece, and compare them.
That will work a lot better.
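
For the "compare" step, here is a rough sketch in Python (an assumption; nothing in this thread uses Python). It does not run Trang or compare generated schemas. Instead it streams each piece and records which element paths and attribute names occur, which gives a quick hint about whether two pieces would lead to the same schema. The function names `structure` and `compare` and the file names in the comments are hypothetical, and each piece is assumed to be a well-formed XML file on its own.

```python
import xml.etree.ElementTree as ET

def structure(path):
    """Return the set of (element path, attribute names) seen in an XML file."""
    seen = set()
    stack = []  # tags of the currently open elements, root first
    for event, elem in ET.iterparse(path, events=("start", "end")):
        if event == "start":
            stack.append(elem.tag)
            seen.add(("/".join(stack), tuple(sorted(elem.attrib))))
        else:
            stack.pop()
            elem.clear()  # drop finished content to keep memory use low
    return seen

def compare(path_a, path_b):
    """Return (entries only in a, entries only in b)."""
    a, b = structure(path_a), structure(path_b)
    return a - b, b - a

# Hypothetical usage:
#   only_a, only_b = compare("piece1.xml", "piece2.xml")
#   Two empty sets mean both pieces use the same elements and attributes.
```

If both sets come back empty, the pieces use the same element paths and attributes, so a schema derived from either one is likely to match. Differences in data types or in how often an element repeats are not detected by this check.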



 
LVL 10

Author Comment

by:aman0711
ID: 24308604
Yes, you are right.

Cutting data out of the file itself could help. I can delete the tags in XMLSpy, right?
 
LVL 60

Assisted Solution

by:Geert Bormans
Geert Bormans earned 2000 total points
ID: 24308641
You can, although 160MB is a bit much for Spy to edit.
You can try; if it kills Spy, use a text editor that can handle large files.
 
LVL 60

Expert Comment

by:Geert Bormans
ID: 24308649
160MB files are very often dumps from a database. If you have two rows instead of a million rows, you would still generate the same schema.
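
If the file is too big to trim by hand, the trimming can be scripted. Below is a minimal sketch, assuming Python is available (it is not mentioned anywhere in this thread) and assuming the dump is one root element holding many repeated row elements. The function name `write_sample`, its `keep` parameter, and the file names in the comments are hypothetical. It reads only as far as the first `keep` rows and writes them, wrapped in the original root element, to a new file.

```python
import xml.etree.ElementTree as ET

def write_sample(src_path, dst_path, keep=1000):
    """Write the root element plus its first `keep` child elements to dst_path."""
    with open(src_path, "rb") as src:
        events = ET.iterparse(src, events=("start", "end"))
        _, root = next(events)      # the first event is the start of the root
        depth = 0                   # nesting depth below the root
        kept = 0
        for event, _elem in events:
            if event == "start":
                depth += 1
                continue
            if depth == 0:          # end of the root: fewer rows than `keep`
                break
            depth -= 1
            if depth == 0:          # a direct child of the root just closed
                kept += 1
                if kept >= keep:
                    break           # stop reading the rest of the file
    # The parser reads ahead in chunks, so drop anything beyond the kept rows.
    for extra in list(root)[kept:]:
        root.remove(extra)
    ET.ElementTree(root).write(dst_path, encoding="utf-8", xml_declaration=True)
    return kept

# Hypothetical usage:
#   write_sample("dump.xml", "sample.xml", keep=1000)
#   then run the schema generator on sample.xml instead of dump.xml
```

One caveat: the sample only contains the first rows. If some optional elements or attributes only show up later in the dump, they will be missing from the generated schema, so it is worth checking a few pieces from different parts of the file.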
 
LVL 10

Author Comment

by:aman0711
ID: 24308811
Yeah, I tried opening it with Spy, and so far it is working fine.

I have trimmed a lot of lines so far.
 
LVL 10

Author Closing Comment

by:aman0711
ID: 31578189
Thanks :)
 
LVL 60

Expert Comment

by:Geert Bormans
ID: 24339077
welcome