• Status: Solved
  • Priority: Medium
  • Security: Public

Java thread issues

Hi,

My program runs in a multi-threaded environment and needs to access the same .xml files and some tables in MySQL. It works fine when run standalone, but runs into concurrency issues in the multi-threaded case.

How can I restrict access to the files and tables to one thread at a time?

Thanks
Asked:
wsyy
 
CEHJCommented:
You want to do as little synchronization as possible. What problems did you have?
 
for_yanCommented:
If you are dealing with a database, it is better to leave these synchronization issues to
the discretion of the database - it is more or less their profession.
Please be specific about what kind of problems you experience.
 
mrjoltcolaCommented:
If your access to the XML files is read-only, threading issues won't be a problem. However, if multiple threads write to the same XML file, you need to use the advisory file locking API in Java.

http://download.oracle.com/javase/6/docs/api/java/nio/channels/FileLock.html

As for concurrent access in MySQL, you need to provide more info. You should be able to design the app so that you don't block threads. If multiple people use the same records simultaneously, you'll need to explain in detail what you mean by "gain access to ... same tables".
 
objectsCommented:
use a separate connection for each thread, i.e. don't share connections between threads
and also avoid sharing any member vars and instead use local vars

you should avoid most threading issues then
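
A minimal sketch of the "local vars instead of member vars" advice. The names below (LocalStateDemo, countTags, runConcurrently) are hypothetical and are not taken from the poster's HtmlParser.java. Because all working state lives in local variables, several threads can call the same method on one shared instance without any synchronization:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for a parser object that is shared by many threads.
final class LocalStateDemo {

    // Counts '<' characters using only local variables. Each call keeps its own
    // 'count' on its own stack, so concurrent calls cannot interfere.
    int countTags(String html) {
        int count = 0;
        for (int i = 0; i < html.length(); i++) {
            if (html.charAt(i) == '<') {
                count++;
            }
        }
        return count;
    }

    // Calls countTags on one shared instance from several threads
    // and collects the result of every call.
    static List<Integer> runConcurrently(String html, int threads) throws Exception {
        LocalStateDemo shared = new LocalStateDemo();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int i = 0; i < threads; i++) {
                Callable<Integer> task = () -> shared.countTags(html);
                futures.add(pool.submit(task));
            }
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> f : futures) {
                results.add(f.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runConcurrently("<html><body></body></html>", 8));
    }
}
```

If 'count' were an instance field instead, concurrent calls on the shared instance would update the same variable and could return wrong totals.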
 
wsyyAuthor Commented:
Hi everyone,

Here is more info:

1. I am running my own HTML parser in Nutch, which is running on Windows;

2. HtmlParser.java doesn't do any thread handling at the moment. For example, I use the JDOM API to open and read local .xml files, and my code doesn't involve any file locking, as I have no experience with that. By the same token, I create a MySQL connection object and use it to read from and write to the same tables. There are no problems if I run my HtmlParser.java standalone, so my Java code for that part is fine except for thread handling.

3. Nutch itself spawns tens of threads, each of which holds an instance that in turn runs HtmlParser.java.

4. When I debugged the process, I was in control of one thread. Exceptions

I need help with some code examples to:

1. use the FileLock API;
2. create multiple connection instances in HtmlParser.java(???)
3. anything else I have missed.

Thanks
 
objectsCommented:
1. http://www.exampledepot.com/egs/java.nio/SetFileLock.html

2. Create a connection when you need it and close it once done
And make sure it's a local var

3. be careful using instance variables as they are shared between all threads that use the same instance
 
wsyyAuthor Commented:
I was reading the file lock example here (http://www.exampledepot.com/egs/java.nio/SetFileLock.html). One issue came to mind: what if a file is locked and other threads are prevented from accessing it?

That would cause IOException for other threads, right? I don't think that is good unless there are some other ways to deal with it.
 
objectsCommented:
> That would cause IOException for other threads, right? I don't think that is good unless there are some other ways to deal with it.

catch the exception and take the required action, for example wait for the file to become available

making sure the threads only hold the file open for a short period of time would be a good idea
 
wsyyAuthor Commented:
objects, could you please be more specific about "catch the exception and take the required action, for example wait for it to become available"?

How do I know when the .xml files or the MySQL connections are available?
 
for_yanCommented:
As to database connections, you should not wait for them to become available:
the important thing is not to share a connection between different threads - open the connection within
each thread as a local variable accessible only within that thread, do whatever is necessary
with respect to access to the database, and then close the connection.
Another thread should open its own connection and close it. Database connections are all independent - you
don't need to wait for one to close in order to open a connection in another thread - the database is of course
capable of maintaining many connections simultaneously. You just don't want to use one connection
in different threads (in general that is conceivable, say if you keep an instance variable for the connection
which is accessible from several threads; that's why you don't want connections to be instance variables, but rather
want them to be local variables).

As to the file, you certainly cannot know when it becomes available,
but what you can do is check at certain intervals whether the file lock
is available, the way the example shows. At the point where
you need to access a file which can potentially be locked by another
thread, you do something like this:

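import java.io.File;
import java.io.RandomAccessFile;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;

File file = new File("filename");
FileChannel channel = new RandomAccessFile(file, "rw").getChannel();

FileLock lock = null;
int count = 0;

// Poll for the lock once per second, for at most 100 attempts
while (lock == null && count < 100) {
    try {
        // tryLock does not block; it returns null if the lock is not available
        lock = channel.tryLock();
    } catch (OverlappingFileLockException e) {
        // File is already locked in this thread or virtual machine
    }
    if (lock == null) {
        Thread.sleep(1000);
        count++;
    }
}

if (lock == null) {
    // report waiting timeout
} else {
    try {
        // go forward, access the file
    } finally {
        lock.release();
    }
}
channel.close();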


As far as I can tell from the API, tryLock
returns null when another program holds the lock
and throws OverlappingFileLockException when this virtual machine
already holds an overlapping lock - the code
above tests both conditions
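
One caveat worth adding: the FileLock documentation states that file locks are held on behalf of the entire Java virtual machine and are not suitable for controlling access to a file by multiple threads within the same virtual machine. If all the competing threads live in one JVM, an in-process lock is the usual alternative. The following is only a sketch under that assumption; FileGuard and its methods are hypothetical names, and the map key is assumed to be a consistent (for example canonical) path string:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Supplier;

// Hypothetical helper: one read/write lock per file path,
// shared by all threads in this JVM.
final class FileGuard {
    private static final ConcurrentMap<String, ReentrantReadWriteLock> LOCKS =
            new ConcurrentHashMap<>();

    // Returns the lock for a path, creating it atomically on first use.
    private static ReentrantReadWriteLock lockFor(String path) {
        return LOCKS.computeIfAbsent(path, p -> new ReentrantReadWriteLock());
    }

    // Many readers may hold the read lock for the same path at once.
    static <T> T read(String path, Supplier<T> action) {
        ReentrantReadWriteLock lock = lockFor(path);
        lock.readLock().lock();
        try {
            return action.get();
        } finally {
            lock.readLock().unlock();
        }
    }

    // A writer waits until no reader or other writer holds the lock for this path.
    static void write(String path, Runnable action) {
        ReentrantReadWriteLock lock = lockFor(path);
        lock.writeLock().lock();
        try {
            action.run();
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

Unlike the polling loop, lock() simply blocks until the lock is free, so no sleep interval or retry counter is needed. This does not protect against other processes touching the file; that is the case FileLock is meant for.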












 
CEHJCommented:
>>4. When I debugged the process, I was in control of one thread. Exceptions

Please post stack trace here
 
objectsCommented:
> How to know when the .xml files or the mysql connections are available?

well you don't want to be waiting until a connection is available
just create one when you need one and close it once done, best to use a connection pool to manage the connections
as regards the file, it depends on your architecture
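
A rough sketch of "create one when you need one and close it once done", using only the standard java.sql interfaces. ConnectionSource, SqlWork, PerCallConnection and lastParsedMillis are hypothetical names, and the table and column names (pages, url, last_parsed) are assumptions, since the thread never shows the schema. In a real application the ConnectionSource would typically wrap DriverManager.getConnection(...) or a connection pool's DataSource.getConnection():

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Timestamp;

// Hypothetical abstraction over "where connections come from".
interface ConnectionSource {
    Connection open() throws SQLException;
}

// Hypothetical callback that does its work on a borrowed connection.
interface SqlWork<T> {
    T apply(Connection conn) throws SQLException;
}

final class PerCallConnection {

    // Opens a connection as a local variable, runs the work, and always closes it.
    // Nothing is stored in a field, so threads never share a connection.
    static <T> T withConnection(ConnectionSource source, SqlWork<T> work) throws SQLException {
        try (Connection conn = source.open()) {
            return work.apply(conn);
        }
    }

    // Example use. Table and column names are assumptions, not from the original post.
    static Long lastParsedMillis(ConnectionSource source, String url) throws SQLException {
        return withConnection(source, conn -> {
            String sql = "SELECT last_parsed FROM pages WHERE url = ?";
            try (PreparedStatement ps = conn.prepareStatement(sql)) {
                ps.setString(1, url);
                try (ResultSet rs = ps.executeQuery()) {
                    if (!rs.next()) {
                        return null; // page has not been parsed before
                    }
                    Timestamp ts = rs.getTimestamp(1);
                    return ts == null ? null : ts.getTime();
                }
            }
        });
    }
}
```

With a connection pool, close() normally hands the connection back to the pool rather than tearing it down, so the same calling code stays cheap even with tens of threads.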
 
wsyyAuthor Commented:
CEHJ, there was no stack trace available when I was debugging. The debugger halted and the following .class file was brought up:


/*
* @(#)FutureTask.java  1.14 06/07/13
*
* Copyright 2006 Sun Microsystems, Inc. All rights reserved.
* SUN PROPRIETARY/CONFIDENTIAL. Use is subject to license terms.
*/

package java.util.concurrent;
import java.util.concurrent.locks.*;
 
CEHJCommented:
Does it run with, or without exceptions?
 
wsyyAuthor Commented:
without exception. the process just stopped at the .class file
 
CEHJCommented:
When i say 'run', i mean as opposed to running in debug mode
 
wsyyAuthor Commented:
In Nutch, no exception was thrown during a normal run.
 
CEHJCommented:
Then why do you think there's a problem?
 
CEHJCommented:
(It sounds to me from what you're describing that you're actually accidentally running in debug mode with at least one breakpoint set)
 
wsyyAuthor Commented:
There were problems, but the JVM didn't throw any exceptions.
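
One possible explanation for "problems but no exceptions", offered as an assumption because the thread does not show how Nutch invokes the parser: when work is run through java.util.concurrent (the FutureTask class the debugger opened), an exception thrown by the task is stored inside the task and only resurfaces, wrapped in an ExecutionException, when Future.get() is called. The sketch below uses a hypothetical failingParse task to show how the real cause and its stack trace can be recovered:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class HiddenExceptionDemo {

    // Hypothetical task standing in for a parse call that fails.
    static String failingParse() {
        throw new IllegalStateException("parser failed on shared state");
    }

    // Runs the task on a worker thread and returns the exception it threw,
    // or null if it completed normally.
    static Throwable runAndCaptureCause() throws InterruptedException {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Callable<String> task = HiddenExceptionDemo::failingParse;
            Future<String> future = executor.submit(task);
            try {
                future.get(); // the worker's exception is rethrown here, wrapped
                return null;
            } catch (ExecutionException e) {
                return e.getCause(); // the original exception with its stack trace
            }
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Throwable cause = runAndCaptureCause();
        if (cause != null) {
            cause.printStackTrace();
        }
    }
}
```

If the caller only logs a generic message such as "can't be parsed successfully", wrapping the body of the parser in a try/catch that logs the full stack trace is a simple way to see the underlying error.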
 
wsyyAuthor Commented:
for_yan,

I am trying to use your file-locking code above to solve my problem.

But in my case, I have the code attached below, which works on a directory. How can I use your code for a directory?

Thanks



File fs = new File(conf.get("xml.template.dir"));
File[] files = fs.listFiles();
for (File file : files) {
    if (file.getName().startsWith(domain)) {
        docs.add(file);
    }
}

 
objectsCommented:
do you even need a FileLock at all?  Nothing you have said would suggest that a FileLock is necessary
 
wsyyAuthor Commented:
If I don't need a FileLock, how can a deadlock be avoided?

I probably need to explain a little more clearly:

I am running a Nutch crawler, which creates a number of threads, each fetching URLs for parsing. The fetch threads parse HTML pages with my own HTMLParser.java, which needs to open and read some shared .xml files and query MySQL to determine whether each page needs to be saved to the local drive.

To be more specific, the parser checks the .xml files to see whether the domain name of a web page is of interest to us. For pages with interesting domains, we save the page; otherwise, we skip it. The .xml files are stored on the local drive, and there is only one copy of each.

For MySQL access, the parser checks when the page was last parsed. If the interval since the last parse is less than a threshold, the page is skipped; otherwise, the parser saves the file to the local drive.

I believe the above scenarios need file locking and a separate MySQL connection for each thread. But for_yan's code only handles locking a single file, while my case needs to handle a folder that contains a number of .xml files, so I don't know how to use his code.

For MySQL access, I think I need to create a connection instance in each of the fetching threads.
 
objectsCommented:
doesn't sound like you need a lock. you're only reading the files.

> For mysql access, I think I need to create each connection instance in each of the fetching processes.

yes, see my earlier comments
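
To illustrate the read-only case, here is a sketch of the directory scan written so that any number of threads can call it without a lock. TemplateFinder and findTemplates are hypothetical names. The sketch also guards against File.listFiles returning null, which it does when the path is not a directory or cannot be read:

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class TemplateFinder {

    // Returns the files in 'dir' whose names start with 'domain'.
    // Only local variables are used and nothing is written, so no lock is needed.
    static List<File> findTemplates(File dir, String domain) {
        // listFiles returns null if 'dir' is not a directory or cannot be read
        File[] files = dir.listFiles((d, name) -> name.startsWith(domain));
        if (files == null) {
            return new ArrayList<>();
        }
        Arrays.sort(files); // listFiles gives no ordering guarantee
        return new ArrayList<>(Arrays.asList(files));
    }
}
```

Each caller gets its own list, so one thread adding to or iterating over its result cannot disturb another thread.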
 
for_yanCommented:
I don't think you need a file lock on the folder.
If one thread is accessing one file in the folder, another can probably access
another file in the same folder.

With the database, I think the situation is clear to you - as long as you keep all connections within separate
threads, close them when you are done, and don't mix them between threads, you
should be fine.

In general, could you restate what it is that you
experience in running your application which drives you to think about these file locks, etc.?

 
for_yanCommented:
The concurrency issues you encounter - how do they manifest?
 
wsyyAuthor Commented:
When I ran HTMLParser.java standalone without threads, it worked perfectly.

When I ran the Nutch processes, the log showed that web pages, such as www.apache.org, couldn't be parsed correctly when HTMLParser.java was called to handle them. Nutch throws exceptions that don't carry sufficient info, only something as simple as "www.apache.org can't be parsed successfully".

So I decided to debug the .java file. I believe I was in control of one of the threads and could step over a few statements; then the debugger stopped at the following .class page:

package java.util.concurrent;
import java.util.concurrent.locks.*;

That is number one.

Number two, when it stopped, some other threads were still running.

Number three, I noticed that the debugger always stopped at the attached code.

Based on these three observations, I believe it is due to a file lock issue.
public XMLFileHandler(Configuration conf, String domain) {
    this.conf = conf;
    this.domain = domain;
    this.docs = new ArrayList<File>();
    File fs = new File(conf.get("xml.template.dir"));
    File[] files = fs.listFiles();
    for (File file : files) {
        if (file.getName().startsWith(domain)) {
            docs.add(file);
        }
    }
}

 
wsyyAuthor Commented:
Some suggestions point in the right direction. For example, a Java FileLock is not necessary. But the suggestions only vaguely point to a clear solution for my case. Thanks