Search for a pattern in a large file using Java

The following is my code, taken from a particular function.
 
    Scanner sc = new Scanner(new File("/exports/nos_issues/9518/aaa_.log"));
    String str = "10:37:10.719 [net.jradius.freeradius.FreeRadiusProcessor(29)] DEBUG net.jradius.log.Log4JRadiusLogger - >>> packets";
    Pattern ptr = Pattern.compile(str);
    long toto = 0;
    try {
        while (sc.findWithinHorizon(ptr, 0) != null)
            toto++;
    } finally {
        sc.close();
    }

    System.out.println(toto);

I get the following error:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
      at java.nio.HeapCharBuffer.<init>(HeapCharBuffer.java:39)
      at java.nio.CharBuffer.allocate(CharBuffer.java:312)
      at java.util.Scanner.makeSpace(Scanner.java:816)
      at java.util.Scanner.readInput(Scanner.java:771)
      at java.util.Scanner.findWithinHorizon(Scanner.java:1659)
      at Cl.main(Cl.java:38)

In the class Cl, line 38 points to sc.close();
pvinodpAsked:
CEHJCommented:
This is confusing - you're looking for what is clearly a timestamped logfile entry. So first of all, why would there be any more than one occurrence?
pvinodpAuthor Commented:
The string pattern is just an example; it can be any string.
CEHJCommented:
OK. The file is a text file containing lines?
pvinodpAuthor Commented:
Yes.
What I am trying to do is divide the file into segments and then search each segment.

I am aware of the risk of cutting my pattern when dividing the file into segments.
CEHJCommented:
"OK. The file is a text file containing lines?"
"yes."
In that case, the file makes sense in terms of lines, so I'm wondering why you're trying to span multiple lines?
pvinodpAuthor Commented:
I want to create a job to search each segment, and in future allocate the jobs to be executed on separate systems.
CEHJCommented:
Segments don't actually have anything to do with a pattern spanning multiple lines. You can think of a segment of a text file as simply a file with fewer lines. The problem is the same (even if the problem space is smaller).
pvinodpAuthor Commented:
But I intend not to do a line-by-line search.
I want to divide the whole file into many portions and then search.
CEHJCommented:
"I want to divide the whole file into many portions and then search"
How?
CEHJCommented:
Take the following file. Divide it equally into two and have each worker thread search its segment for the phrase 'THE CAT SAT ON THE MAT'. Neither will find it.

jSx62KM9qB20iMmM1WdlKsM0tKuRbCwEmsyLcZPJ4dSvOHmDovpdoqxe21RpbafryY1
PBga3epRxM3v25usrWwirvVHiNZ28PnguDEFuZo4bKavq2R0T64vi4hnPIUUXMKagtnyKSNUQNyx
URwwKDGcTjfKtcTzw7uHaWSmmZeen51ZuOzmzNe76LnNumzLCkfyIOm9GYA0VsbGD47zkzoku033HSzCtrHHrs0XDBaOHQWL7hXKLfLLLqlpIGYH0kDctda3 lH2XAYfg0J
b THE CAT SAT ON THE MAT
EDGdSmXv18OTJEyeqPFXBCW7ATHsl66SGaFNNYgC5UvtSDPPr4KwDNRYVQDzWsCkzuPKuQ
5y7URHNjEO8eZ1siQUraAvpdZF1WM
ScBL4zBKQwZsrXD
IU
ie7xzxW14C6hV9olbQLHvuO7ZOU3Iva3fa0JqW9UCuK1fpRMeRBCUrQEffbnXDwMP
hzrrG8TJ
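
The effect being described, a phrase cut by the split point, can be sketched with a much smaller input. This is only an illustration: the text, the class name `SplitDemo` and the helper `foundInEitherHalf` are made up for the example and are not from the thread.

```java
class SplitDemo {
    /** Splits text at its midpoint and reports whether either half contains the phrase. */
    static boolean foundInEitherHalf(String text, String phrase) {
        int mid = text.length() / 2;
        String first = text.substring(0, mid);
        String second = text.substring(mid);
        return first.contains(phrase) || second.contains(phrase);
    }

    public static void main(String[] args) {
        String phrase = "THE CAT SAT ON THE MAT";
        // Made-up text in which the phrase straddles the midpoint.
        String text = "xxxx THE CAT SAT ON THE MAT yyyy";
        System.out.println(text.contains(phrase));           // true
        System.out.println(foundInEitherHalf(text, phrase)); // false
    }
}
```

The whole text contains the phrase, but each half only holds part of it, so a worker that sees just one half reports nothing.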

pvinodpAuthor Commented:
I think that is because the string is divided across the two parts.
pvinodpAuthor Commented:
I didn't understand your question.
HOW?
CEHJCommented:

"I think that is because the string is divided across the two parts."

Yes, so:

a. how are you going to divide it?
b. are you going to be looking for patterns such as "MAT.*EDGdSmXv18OT"? And if so - why?
pvinodpAuthor Commented:
In my case the pattern is going to appear many times, and I just need the number of occurrences.
And as it is a log file from hardware, the number can be huge.
I might get a file of 1 GB, and if I split it into 5 segments, I am at risk of losing 10 occurrences.
But in my case the count can go to 100+ in a single segment.
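
One common way to avoid losing matches at segment boundaries is to carry the last (search string length - 1) characters of each chunk over into the next one. Below is a minimal sketch of that idea, assuming a literal (non-regex) search string and a single sequential read; `ChunkCounter`, `count` and `chunkSize` are hypothetical names chosen for the example, not something from the thread.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

class ChunkCounter {
    /** Counts non-overlapping occurrences of a literal, reading fixed-size chunks. */
    static long count(Reader in, String literal, int chunkSize) {
        if (literal.isEmpty() || chunkSize < 1) {
            throw new IllegalArgumentException("need a non-empty literal and chunkSize >= 1");
        }
        char[] buf = new char[chunkSize];
        StringBuilder window = new StringBuilder();
        long count = 0;
        try {
            int n;
            while ((n = in.read(buf, 0, buf.length)) != -1) {
                window.append(buf, 0, n);
                int from = 0;
                int idx;
                while ((idx = window.indexOf(literal, from)) != -1) {
                    count++;
                    from = idx + literal.length();
                }
                // Keep only the unmatched tail, at most (length - 1) characters:
                // too short to hide a full match, long enough to hold one cut by the boundary.
                int keepFrom = Math.max(from, window.length() - (literal.length() - 1));
                window.delete(0, keepFrom);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        // Chunk size 4 forces both occurrences of "ERROR" to be cut by a chunk boundary.
        System.out.println(count(new StringReader("xx ERROR yy ERROR zz"), "ERROR", 4)); // 2
    }
}
```

If the segments were instead handed to separate jobs, the same idea would mean extending each segment by (search string length - 1) characters into the next one.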
CEHJCommented:
You still haven't told me if you're looking for patterns that span lines, and if so, why
pvinodpAuthor Commented:
My pattern is a single word and it cannot span lines.
CEHJCommented:
OK, now we can move on. Firstly, have you tried using a BufferedReader with a larger buffer than the default and searching in each line read? If so, please post the code you used.
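
A minimal sketch of what that suggestion could look like, for illustration only. The class name `LineCounter`, the method `countOccurrences` and the 1 MB buffer size are assumptions, not code from this thread. The sketch also passes the search string through `Pattern.quote`, because the string in the question contains regex metacharacters such as `[`, `(` and `.` that would otherwise not be matched literally.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LineCounter {
    // Hypothetical buffer size; BufferedReader's default is 8192 chars.
    static final int BUFFER_SIZE = 1 << 20;

    /** Counts non-overlapping occurrences of a literal string, one line at a time. */
    static long countOccurrences(Reader source, String literal) {
        // Pattern.quote makes characters such as [ ( . match literally.
        Pattern pattern = Pattern.compile(Pattern.quote(literal));
        long count = 0;
        try (BufferedReader reader = new BufferedReader(source, BUFFER_SIZE)) {
            String line;
            while ((line = reader.readLine()) != null) {
                Matcher m = pattern.matcher(line);
                while (m.find()) {
                    count++;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        // For a real log file, pass e.g. new java.io.FileReader(path) instead.
        String sample = "a [x(1)] b\nno match here\n[x(1)] and [x(1)]\n";
        System.out.println(countOccurrences(new StringReader(sample), "[x(1)]")); // 3
    }
}
```

Only one line is held in memory at a time, so heap use depends on the longest line rather than on the file size.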
pvinodpAuthor Commented:
I think reading line by line gives some improvement over using Scanner.
pvinodpAuthor Commented:
Thanks for your input
CEHJCommented:
:)