
Search text file or long strings through a DLL

cubrovic asked
Last Modified: 2010-03-31
Hi all
I need a free DLL for searching through a text file or string for a substring and/or regular expression, and a small example of using it from Java source code.
Can anyone help me with this?

CERTIFIED EXPERT
Top Expert 2016

Commented:
Why would you need a dll? - this is Java!

Author

Commented:
I need some combination with a DLL to get better performance on Windows platforms, so I can use a freeware DLL to do the actual search and Java for the other parts that aren't speed-critical.
Does this make any sense to you?
CERTIFIED EXPERT

Commented:
Not really. Java can do all that, and by means of threads you can get on with something else meantime.
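As a rough illustration of that point, here is a minimal sketch of a regular-expression search running on a worker thread while the caller carries on. The class and method names (`BackgroundRegexSearch`, `findAll`, `searchInBackground`) are made up for this example, and it assumes the file is UTF-8 text small enough to read into memory in one go.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class name; nothing here comes from the thread itself.
class BackgroundRegexSearch {

    // Returns the start offset of every match of 'regex' in 'text'.
    static List<Integer> findAll(String text, String regex) {
        List<Integer> offsets = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            offsets.add(m.start());
        }
        return offsets;
    }

    // Reads the file (assumed to be UTF-8 text) and searches it on a worker thread.
    static Future<List<Integer>> searchInBackground(ExecutorService pool, Path file, String regex) {
        return pool.submit(() -> {
            String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            return findAll(text, regex);
        });
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("search-demo", ".txt");
        Files.write(file, "java and xml, then Java again".getBytes(StandardCharsets.UTF_8));
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Future<List<Integer>> pending = searchInBackground(pool, file, "(?i)java");
            // The calling thread is free to do other work here.
            System.out.println("Match offsets: " + pending.get());
        } finally {
            pool.shutdown();
            Files.delete(file);
        }
    }
}
```

`pending.get()` blocks only at the point where the result is actually needed.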
CERTIFIED EXPERT
Top Expert 2016

Commented:
>>Does this make any sense to you?

Not a lot to be honest. Firstly, how have you come to find that the performance won't be adequate?
CERTIFIED EXPERT
Top Expert 2016

Commented:
While you're answering that, and assuming your answer is a good one, your question surely should be 'How do I call a DLL containing regular expression functions from Java?', should it not?

Author

Commented:
>>Not a lot to be honest. Firstly, how have you come to find that the performance won't be adequate?
ok.
Maybe I'm mistaken, but as I see it, search code in a DLL has to be faster than the same thing written in pure Java.
Because I have to search a lot of text files, I'm trying to optimize it by finding a DLL with some search functionality that can be used from Java.
So I guess someone here must have been in a similar position once and can give me some pointers in the right direction.
CERTIFIED EXPERT
Top Expert 2016

Commented:
Well, I would not make assumptions without testing first. Of course your point should be *generally* correct, but generality is not the issue. Why not give it a try? You can memory-map the file and the whole thing could be done in memory.
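A minimal sketch of the memory-mapping idea, using `FileChannel.map` from the standard library. The class and method names are invented for this example, the scan is a naive byte-by-byte comparison rather than anything tuned, and it assumes each file is under 2 GB (the limit for a single mapping) and that the search term is encoded as UTF-8.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical class name, for illustration only.
class MappedFileSearch {

    // Naive byte-wise scan: index of the first occurrence of 'needle', or -1.
    static int indexOf(ByteBuffer haystack, byte[] needle) {
        int last = haystack.limit() - needle.length;
        for (int i = 0; i <= last; i++) {
            int j = 0;
            while (j < needle.length && haystack.get(i + j) == needle[j]) {
                j++;
            }
            if (j == needle.length) {
                return i;
            }
        }
        return -1;
    }

    // Maps the whole file read-only and scans it (assumes the file is under 2 GB).
    static int indexOfInFile(Path file, String term) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer mapped = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            return indexOf(mapped, term.getBytes(StandardCharsets.UTF_8));
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("mapped-demo", ".txt");
        file.toFile().deleteOnExit(); // a mapped file may not be deletable immediately on Windows
        Files.write(file, "plain java and xml".getBytes(StandardCharsets.UTF_8));
        System.out.println("First match at byte " + indexOfInFile(file, "xml"));
    }
}
```

Whether this beats a plain buffered read for files coming off a CD is exactly the kind of thing that needs measuring rather than assuming.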
CERTIFIED EXPERT
Top Expert 2016

Commented:
btw, FYI some modern and highly-regarded software (e.g. Dreamweaver) implements its application search facility using Java (Script) REs

Author

Commented:
It has to be done on a large number of files (up to 10000, or a full CD), so it has to be as fast as it can be.
I like using JavaScript for searching files up to a point, but this is out of JavaScript's league and, I'm afraid, out of pure Java's league too, so I want to combine it with some more native code.

CERTIFIED EXPERT
Top Expert 2016

Commented:
>>Anyone disagree?

Not me ;-) The user probably won't be able to perceive any difference
CERTIFIED EXPERT
Top Expert 2016

Commented:
Test before making any assumptions
Did you?
CERTIFIED EXPERT
Top Expert 2016

Commented:
I haven't made any
CERTIFIED EXPERT

Commented:
Let's put it this way :

If you are processing all that amount of data, although 2bit is right, who on earth is going to expect it to be finished within the kind of tolerance that you seem to be implying is important?

twobitadder: I think the questioner means 10K files on a single CD, not 10K CDs.

Commented:
Since we're discussing disk IO here, isn't that the real bottleneck? Is native versus java at this point even really an issue (assuming all things are coded equally).
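One crude way to check where the time goes is to time the read and the search separately. The sketch below does that with `System.nanoTime`; the class name and the `countOccurrences` helper are invented for this example. A single run like this is only indicative: the OS file cache and JIT warm-up both distort the numbers, so it would need repeating on the real data and the real drive.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical class name, for illustration only.
class ReadVersusSearchTiming {

    // Counts non-overlapping occurrences of 'term' in 'text'.
    static int countOccurrences(String text, String term) {
        if (term.isEmpty()) {
            return 0;
        }
        int count = 0;
        for (int at = text.indexOf(term); at >= 0; at = text.indexOf(term, at + term.length())) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Pass a real file path as the first argument; otherwise a small temp file is used.
        Path file;
        if (args.length > 0) {
            file = Paths.get(args[0]);
        } else {
            file = Files.createTempFile("timing-demo", ".txt");
            file.toFile().deleteOnExit();
            Files.write(file, "java xml java".getBytes(StandardCharsets.UTF_8));
        }

        long t0 = System.nanoTime();
        String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        long t1 = System.nanoTime();
        int hits = countOccurrences(text, "java");
        long t2 = System.nanoTime();

        System.out.printf("read+decode: %d us, search: %d us, hits: %d%n",
                (t1 - t0) / 1000, (t2 - t1) / 1000, hits);
    }
}
```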
>>As the bytecode runs it gets translated to native by the interpreter so native will obviously be faster, even when the program is running after the vm 'boots'.

1. You cannot dispute this; it is a fact: native is faster than interpreted Java.

>>Also I think the hardware you're using will be particularly important if you want to search through 6.5 terabytes :)

2. Faster hardware will obviously be important in a search of this magnitude.

>>I don't know much about using jni to run native programs or dll's though sry, never needed to so far.

3. I don't think you're questioning this.

So I can only think you have a problem with:

>>If it's 10,000 * 650 MB, i.e. 6.5 terabytes, then I think I would rather use native too especially since the data is probably unsorted and any small increase per comparison in using java over native will be amplified by the sheer number of comparisons.

Well,
- the calculation is right.
- *I* would rather use native.
- the data is probably unsorted, but I can't test this; I would have to see the data. It is effectively untestable.

Is this the assumption you think I make?
>>any small increase per comparison in using java over native will be amplified by the sheer number of comparisons.
This will always be true.

I'm not sure where I make an assumption any more than saying :
>> The user probably won't be able to perceive any difference

Also, unless you have 6.5 terabytes of data, you cannot test; you can only assume.

Please find a fault in my logic so I can correct it.

Commented:
Most of you are probably tired of hearing the standard claim that Java programs are slower than C programs. In reality, the situation is more complex than that trite assertion. Many Java programs are slow, but that is not necessarily an intrinsic characteristic of programs written in Java. Many Java programs can perform just as fast as comparable programs written in C or C++, but only when the designer and programmer pay careful attention to performance issues throughout the development process.

Read the whole article at http://www.javaworld.com/javaworld/jw-11-2000/jw-1117-performance.html
CERTIFIED EXPERT
Top Expert 2016

Commented:
>>Please find a fault in my logic so I can correct it.

It's not a matter of faulty logic. There's not much data to go on here. There's not much point in an impressionistic debate.

>>1.You cannot dispute this, this is fact, native is faster than interpreted java.

No, true, I'm not disputing it. By extension of this argument, the solution should be written in machine code.
Yes, sorry snapped a bit and misread the count in the question, I have no problems with any of the points, just the line :

>>However, once the byte code has been converted , stored in memory, it should be just as fast as a native app.

I wouldn't agree with that (with a standard JVM and no compiling to native), but it's not important in the overall problem, where instruction execution time becomes negligible, DMA/IO controllers take over, and the whole problem becomes a throughput issue, as pointed out.
Apologies.
CERTIFIED EXPERT
Top Expert 2016

Commented:
...and don't forget JIT. But i don't want to encourage more of this until more specific details have emerged ;-)

Author

Commented:
Ok guys.

Thanks for the help, but this topic has turned into a debate about Java vs native code.
I have about 10,000 files on one CD to be searched.
I need a free DLL that has some basic and/or advanced search functionality, and sample code for using it from Java.
Can someone provide me with that, please?
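For scale, the pure-Java baseline for "search every file under one root" is short. The sketch below is only that, a baseline to measure before reaching for a DLL. The class and method names are made up for this example; it reads each file whole and decodes it as ISO-8859-1 (an assumption chosen so decoding never fails; it matches ASCII terms reliably, not arbitrary Unicode).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical class name, for illustration only.
class DirectorySearch {

    // Returns every regular file under 'root' whose content contains 'term', sorted by path.
    static List<Path> filesContaining(Path root, String term) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths.filter(Files::isRegularFile)
                        .filter(p -> contains(p, term))
                        .sorted()
                        .collect(Collectors.toList());
        }
    }

    // Reads the whole file; ISO-8859-1 is an assumption that maps every byte to one char.
    private static boolean contains(Path file, String term) {
        try {
            return new String(Files.readAllBytes(file), StandardCharsets.ISO_8859_1).contains(term);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: java DirectorySearch <root-dir> <term>");
            return;
        }
        for (Path hit : filesContaining(Paths.get(args[0]), args[1])) {
            System.out.println(hit);
        }
    }
}
```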


CERTIFIED EXPERT

Commented:
As you must know from your advanced programming background, cubrovic, the debate is about native vs java, not only absolutely, but also in this case, because that is exactly the sort of performance issue you introduced to the conversation yourself.

What do you mean by "advanced search functionality "?

Author

Commented:
>>As you must know from your advanced programming background, cubrovic, the debate is about native vs java, not only absolutely, but also in this case, because that is exactly the sort of performance issue you introduced to the conversation yourself.

Yep, but I pointed out that I need an example of a native solution, not advice on which approach is better for this problem...

>>What do you mean by "advanced search functionality "?
Something like this:

Finding a substring in a string/text file = basic
   "java"

Finding a combination of substrings in a string/text file = advanced
   "java" and "xml"

and maybe some other optional functionality.
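Both cases as described can be sketched with plain `String.contains`. The class and method names here are invented for this example, and matching is case-sensitive, which may or may not be what is wanted.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical class name, for illustration only.
class TermSearch {

    // "Basic": true if the text contains the single term.
    static boolean containsTerm(String text, String term) {
        return text.contains(term);
    }

    // "Advanced": true only if the text contains every term (logical AND).
    static boolean containsAll(String text, List<String> terms) {
        for (String term : terms) {
            if (!text.contains(term)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String text = "notes on java and xml parsing";
        System.out.println(containsTerm(text, "java"));
        System.out.println(containsAll(text, Arrays.asList("java", "xml")));
    }
}
```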
 
CERTIFIED EXPERT
Top Expert 2016

Commented:
>>
I need a free DLL that has some basic and/or advanced search functionality, and sample code for using it from Java.
>>

Since you have already decided on how you are going to do this cubrovic, my comment above remains:

>>
your question surely should be 'How do I call a DLL containing regular expression functions from Java?', should it not?
>>

You are in the wrong TA. When you've found that code (ask in C/C++/Delphi etc) come back again and we can tell you how to call a dll from Java
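For orientation, the Java side of a JNI call looks roughly like the sketch below. Everything specific in it is an assumption: `hypotheticalsearch` is a made-up library name, and `nativeIndexOf` is a made-up method that the DLL would have to export under the JNI naming convention (`Java_NativeSearch_nativeIndexOf` for a class in the default package). An arbitrary freeware DLL will not export that, so a small C/C++ wrapper DLL would normally sit in between. The sketch falls back to `String.indexOf` when the library cannot be loaded, so it runs without any DLL present.

```java
// Hypothetical class; the library and native method names are placeholders.
class NativeSearch {

    // True only if the (made-up) native library was found and loaded.
    private static final boolean NATIVE_AVAILABLE = tryLoad("hypotheticalsearch");

    private static boolean tryLoad(String libraryName) {
        try {
            // On Windows this looks for hypotheticalsearch.dll on java.library.path.
            System.loadLibrary(libraryName);
            return true;
        } catch (UnsatisfiedLinkError | SecurityException e) {
            return false;
        }
    }

    // Would be implemented in the DLL as the exported JNI function
    // Java_NativeSearch_nativeIndexOf (an assumption, not an existing library).
    private static native int nativeIndexOf(String text, String term);

    // Uses the native routine when available, otherwise plain Java.
    static int indexOf(String text, String term) {
        return NATIVE_AVAILABLE ? nativeIndexOf(text, term) : text.indexOf(term);
    }

    public static void main(String[] args) {
        System.out.println("native library loaded: " + NATIVE_AVAILABLE);
        System.out.println("index: " + indexOf("java and xml", "xml"));
    }
}
```

Each JNI call also pays for crossing the Java/native boundary and copying string data, so the native routine has to do enough work per call to come out ahead.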

Author

Commented:
I placed this topic here on purpose because I thought that some of you had already used a DLL like the one I'm looking for, and that I would get a pointer to where I can find it and an example of how you used it.
CERTIFIED EXPERT

Commented:
Well, so I don't cloud the issue for you: I won't be able to help you on this, sorry, as it's not my bag (just in case you wonder why I wouldn't chip in any more)! ;)

k.
CERTIFIED EXPERT

Commented:
Cubrovic:

You really wanna go to CS and get a refund for this question, and try again, and try to get hold of orangehead911 or webstorm. (That's not saying that CEHJ can't help, but whilst I don't want to speak for him, it may be that his view differs on it fundamentally, dunno). ;)
CERTIFIED EXPERT
Top Expert 2016

Commented:
8-)

Author

Commented:
Sorry for not closing this earlier.
Thanks for help guys.