Solved

How to edit (or split) a Text File -> close to 8 GB?

Posted on 2006-11-24
Last Modified: 2013-11-13
Hi,

I have to process the contents of a text file programmatically in VB.NET. Every approach I have tried so far worked up to a file size of approx. 999 MB without crashing the server, but as soon as the file grows to more than 1 GB the system runs for hours and then just goes to sleep. What I have "always" done is split the files (manually!) into sizes that could then be processed easily. Now, however, I have to deal with a file of close to 8 GB, and there is no way of splitting it, processing it, or anything else. I can cut off some chunks, but after the 15th chunk or so I get a "memory leak" message, which is astonishing since that server has 4 GB of physical memory and another 10 GB of virtual memory.

To cut this short: I have found no way to handle this programmatically, nor with the "editor", and not even with UltraEdit32.

So my question is simply this ... ;-)) ... what to do?


Best regards,
Raisor
Question by:Raisor
8 Comments
 
LVL 41

Expert Comment

by:HonorGod
ID: 18009840
 What kind of editing do you need to do?  Can you use sed? http://www.cornerstonemag.com/sed/
 
LVL 41

Expert Comment

by:HonorGod
ID: 18009864
 How well can you describe what needs to be done?
  If not sed, how about perl?  http://www.perl.org/about.html
  You can retrieve a free implementation of it from http://www.activestate.com/store/productdetail.aspx?prdGuid=81fbce82-6bd5-49bc-a915-08d58c2648ca
 
LVL 15

Author Comment

by:Raisor
ID: 18009920
Hi,

Thanks for your suggestion!

To be honest, I'm not really into UNIX and Perl. It's not that I haven't had plenty of experience with both; I would just prefer a solution that gives me a way in from .NET. What I actually need is to import an 8 GB text file with a terrible specification into a SQL Server database. I don't mind the interface; it's the file size that is the problem!


Best regards,
Raisor
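
For readers who want a programmatic route to such an import: the usual way to keep memory flat is to read the file line by line and hand the rows on in small batches, so that the whole 8 GB is never held in memory at once. The following is only a minimal sketch in Python (not the VB.NET asked about; in .NET the same pattern would use StreamReader.ReadLine). The function name iter_batches, the batch size and the UTF-8 encoding are assumptions made for illustration, and the actual database insert is left out.

```python
from typing import Iterator, List


def iter_batches(path: str, batch_size: int = 10_000,
                 encoding: str = "utf-8") -> Iterator[List[str]]:
    """Yield lists of at most batch_size lines, reading the file lazily.

    Hypothetical helper: name, batch size and encoding are illustrative.
    """
    batch: List[str] = []
    # Iterating over the file object reads one line at a time, so memory
    # use depends on batch_size and line length, not on the file size.
    with open(path, "r", encoding=encoding, newline="") as handle:
        for line in handle:
            # Drop the line terminator (LF or CRLF) but keep the content.
            batch.append(line.rstrip("\r\n"))
            if len(batch) >= batch_size:
                yield batch
                batch = []
    # Hand over the final, possibly shorter, batch.
    if batch:
        yield batch
```

Each yielded batch could then be parsed and passed to whatever bulk-insert mechanism the database offers; only one batch is alive at any time.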
 
LVL 8

Accepted Solution

by:
YoungBonzi earned 500 total points
ID: 18009997
 
LVL 15

Author Comment

by:Raisor
ID: 18010015
Hi,

This looks very promising at first sight. It is currently running on the 8 GB file; I will let you know the result!


Thanks a lot so far!
Best regards,
Raisor
 
LVL 15

Author Comment

by:Raisor
ID: 18010042
Hi,

It doesn't just look promising ... it's simply perfect!

I first used the "largest" option. After only eight minutes the first part was done, but even UltraEdit32 had trouble opening it (665.000 MB). I then killed all related processes and restarted with the 1.140 KB option, and checked some of the output files. They are not cut in a "structured" way ... but who cares! ... ;-)) ... The files are readable, they are still in a code page that preserves all the languages they contain (Arabic, Russian, Chinese and all the others), and I can even open them in the "editor".


Excellent answer, excellent hint!
Thanks a lot!!!
Best regards,
Raisor
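
If parts that are not cut in a "structured" way ever do become a problem, a splitter can be made to cut only at line ends. The sketch below is a hedged illustration in Python, not the tool from the accepted answer. It assumes the file uses UTF-8 or another ASCII-compatible encoding, in which the byte 0x0A always means a line feed (this does not hold for UTF-16). The name split_on_lines, the part naming scheme and the default size are made up for the example.

```python
import os
from typing import List


def split_on_lines(path: str, out_dir: str,
                   max_bytes: int = 100 * 1024 * 1024) -> List[str]:
    """Split path into numbered parts of about max_bytes each,
    cutting only at line ends. Hypothetical helper for illustration."""
    os.makedirs(out_dir, exist_ok=True)
    parts: List[str] = []
    out = None
    written = 0
    try:
        # Binary mode copies the bytes untouched, so the text is never
        # decoded or re-encoded (assumes an ASCII-compatible encoding).
        with open(path, "rb") as src:
            for line in src:  # one line at a time, never the whole file
                # Start a new part when this line would exceed the limit.
                # A single over-long line still goes into a part of its own.
                if out is None or (written > 0 and written + len(line) > max_bytes):
                    if out is not None:
                        out.close()
                    name = "part_%04d.txt" % (len(parts) + 1)
                    part_path = os.path.join(out_dir, name)
                    out = open(part_path, "wb")
                    parts.append(part_path)
                    written = 0
                out.write(line)
                written += len(line)
    finally:
        if out is not None:
            out.close()
    return parts
```

Because only one line is held at a time, memory use stays small regardless of the input size, and concatenating the parts in order reproduces the original file byte for byte.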
 
LVL 8

Expert Comment

by:YoungBonzi
ID: 18010075
Very nice. I'm going to download this myself.
 
LVL 15

Author Comment

by:Raisor
ID: 18010147
Hi,

... if you're dealing with large files you surely should ... ;)


Best regards,
Raisor

Question has a verified solution.
