Solved

Try to read a huge text file via C# or Java.

Posted on 2013-01-25
Last Modified: 2013-02-02
Hi there;

This is a hypothetical question. Say that I am trying to read an extremely large text file in C# or Java. First, how can I read a huge file in those languages?

Second, how can I be sure that, for each line I read, there won't be a collapse in the reading process?

I need to extract the content of each line and work on the extracted atoms, which are blank-delimited. I am planning to have a thread for this process for each line, but should I join the thread to the main program, or what strategy should I use? There are no write operations on the file; assume that the file is prepopulated.

Can you give me some strategies for this scenario in Java and C#?

Regards.

P.S. The culprit is the huge file size; I have to be sure that nothing will collapse once I open the file and read it. I could also use C or PHP for this.
Question by:jazzIIIlove
5 Comments
 
LVL 45

Assisted Solution

by:AndyAinscow
AndyAinscow earned 668 total points
ID: 38817851
>>Now, first how can i read a huge file in those languages?
Exactly the same way as a small file - the size makes no difference to the functions you use to read.

>>Second, how can i be sure that for each line I read, there won't be a collapse in the reading process?
What do you mean collapse?

>>I need to extract the content per line and work on the extracted atoms which are blank delimited. I am planning to have a thread for this very process for each line
Ack - do you mean having millions of threads?
 
LVL 29

Accepted Solution

by:Göran Andersson
Göran Andersson earned 668 total points
ID: 38817871
For a small file you could just read all of it at once, but for a large file you would want to use a StreamReader.
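StreamReader is the C# class; in Java the closest counterpart is BufferedReader. As a rough sketch of the line-by-line approach in Java, the helper below (the class and method names are made up for illustration, and UTF-8 encoding is an assumption) streams a file and counts its blank-delimited atoms without ever holding the whole file in memory:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: streams a file line by line and counts blank-delimited atoms.
class LineTokenCounter {

    // Only one line is held at a time, so memory use depends on the
    // longest line, not on the size of the file.
    static long countTokens(Path file) throws IOException {
        long count = 0;
        // Assumption: the file is UTF-8 encoded.
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String trimmed = line.trim();
                if (trimmed.isEmpty()) {
                    continue; // skip blank lines
                }
                // Split on runs of whitespace to get the individual atoms.
                count += trimmed.split("\\s+").length;
            }
        }
        return count;
    }
}
```

The try-with-resources block closes the reader even if an exception is thrown partway through, which is about as much protection against a "collapse" as the reading loop itself can offer.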

If you just read lines and start a new thread for each line, you will quickly start a huge number of threads and congest the system. Instead you should start a limited number of threads and have them poll a synchronised queue to pick up work from. Then you read lines from the file and put them in the queue, and when the queue reaches a certain size you just wait for the threads to pick items from it before continuing to read from the file.
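One way this bounded-queue arrangement might look in Java is sketched below. It is only an illustration: the class name, the sentinel value, and the token-counting "work" are assumptions standing in for whatever per-line processing is really needed. The calling thread reads the file, blocks when the queue is full, and joins the workers at the end.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: a fixed number of workers fed through a bounded queue.
class BoundedLineProcessor {
    // Sentinel telling a worker to stop; compared by identity, so it can
    // never be confused with a line read from the file.
    private static final String POISON = new String("<<end>>");

    static long process(Path file, int workers, int queueCapacity)
            throws IOException, InterruptedException {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(queueCapacity);
        AtomicLong tokens = new AtomicLong();
        List<Thread> threads = new ArrayList<>();

        for (int i = 0; i < workers; i++) {
            Thread t = new Thread(() -> {
                try {
                    while (true) {
                        String line = queue.take(); // waits until work is available
                        if (line == POISON) {
                            return;
                        }
                        String trimmed = line.trim();
                        if (!trimmed.isEmpty()) {
                            // Placeholder work: count the blank-delimited atoms.
                            tokens.addAndGet(trimmed.split("\\s+").length);
                        }
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            t.start();
            threads.add(t);
        }

        try (BufferedReader reader = Files.newBufferedReader(file)) {
            String line;
            while ((line = reader.readLine()) != null) {
                queue.put(line); // blocks while the queue is full
            }
        } finally {
            // One sentinel per worker so every thread terminates.
            for (int i = 0; i < workers; i++) {
                queue.put(POISON);
            }
        }

        for (Thread t : threads) {
            t.join(); // the reading thread waits for all workers to finish
        }
        return tokens.get();
    }
}
```

Because put blocks when the queue is full, the reader can never get more than queueCapacity lines ahead of the workers, so memory stays bounded however large the file is.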
 
LVL 86

Assisted Solution

by:CEHJ
CEHJ earned 664 total points
ID: 38818251
The size of the file is immaterial. What counts is the size of the buffers into which that file is read. For a BufferedReader the default is 8192 characters, but you can make it any size you like.
Since your file is essentially a CSV file, you have absolutely no problems with buffering.
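As a small illustration of choosing the buffer size, the sketch below passes an explicit size to the two-argument BufferedReader constructor. The class and method names are hypothetical, and UTF-8 is assumed for the file encoding.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo: reading with a caller-chosen buffer size.
class BufferSizeDemo {

    // bufferChars is the size of the reader's internal buffer, in characters.
    static long countLines(Path file, int bufferChars) throws IOException {
        long lines = 0;
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(Files.newInputStream(file), StandardCharsets.UTF_8),
                bufferChars)) {
            // readLine works even for lines longer than the buffer;
            // the buffer size only affects how often the file is hit.
            while (reader.readLine() != null) {
                lines++;
            }
        }
        return lines;
    }
}
```

The result is the same whatever size is passed; a larger buffer may just mean fewer underlying reads, so it is worth measuring before tuning.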

>>I am planning to have a thread for this very process for each line but should I join the thread to the main program or what can be the strategy?
If you're right about a multi-threaded approach being a good one - again there's no problem. One thread could read the file into a queue and that queue could be processed by threads from a thread pool. Of course you would have to justify to yourself that such a relatively complex approach was a good strategy.
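A possible shape for the reader-plus-thread-pool idea in Java is sketched below, under some assumptions: the class name and the token-counting task are placeholders, and the pool's own bounded work queue stands in for the queue described above. With CallerRunsPolicy, the reading thread processes a line itself when the queue is full, which throttles reading instead of rejecting work.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: one reading thread handing lines to a fixed-size pool.
class PooledLineProcessor {

    static long process(Path file, int poolSize, int queueCapacity)
            throws IOException, InterruptedException {
        AtomicLong tokens = new AtomicLong();
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                poolSize, poolSize, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueCapacity),      // bounded backlog of pending lines
                new ThreadPoolExecutor.CallerRunsPolicy());   // when full, the reader does the work

        try (BufferedReader reader = Files.newBufferedReader(file)) {
            String line;
            while ((line = reader.readLine()) != null) {
                final String current = line;
                pool.execute(() -> {
                    String trimmed = current.trim();
                    if (!trimmed.isEmpty()) {
                        // Placeholder work: count the blank-delimited atoms.
                        tokens.addAndGet(trimmed.split("\\s+").length);
                    }
                });
            }
        } finally {
            pool.shutdown(); // accept no new tasks; queued ones still run
        }

        // Plays the role of "joining": wait for the queued work to drain.
        if (!pool.awaitTermination(1, TimeUnit.HOURS)) {
            throw new IllegalStateException("workers did not finish in time");
        }
        return tokens.get();
    }
}
```

If the per-line work is as cheap as splitting on blanks, the hand-off overhead may well outweigh any gain, so it is worth timing a plain single-threaded loop first.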
 
LVL 13

Expert Comment

by:Hugh McCurdy
ID: 38835077
I'm guessing this is related to the desire to create a large file with random data that you since posted.

I'm pretty much with the others, just read the file.  

This comment about collapse puzzles me.  Are you worried about running out of RAM?   If so, why?  I'm wondering if this is more about memory leaks than something else.  Can't tell.
 
LVL 12

Author Comment

by:jazzIIIlove
ID: 38847247
Ah,
Thanks for the strategies and comments. I think I resolved this.

Regards.
