views:

83

answers:

4

Hi, I'm working on a legacy Java 1.4 project, and I have a factory that instantiates a csv file parser as a singleton.

In my csv file parser, however, I have a HashSet that will store objects created from each line of my CSV file. All that will be used by a web application, and users will be uploading CSV files, possibly concurrently.

Now my question is : what is the best way to prevent my list of objects to be modified by 2 users ?

So far, I'm doing the following :

final class MyParser {
    private File csvFile = null;
    private Set myObjects = Collections.synchronizedSet(new HashSet);

    public synchronized void setFile(File file) {
        this.csvFile = file;
    }

    public void parse()
        FileReader fr = null;
        try {
            fr = new FileReader(csvFile);
            synchronized(myObjects) {
                myObjects.clear();
                while(...) { // foreach line of my CSV, create a "MyObject"
                    myObjects.add(new MyObject(...));
                }
            }
        } catch (Exception e) {
          //...
        }
    }    
}    

Should I leave the lock only on the myObjects Set, or should I declare the whole parse() method as synchronized ?

Also, how should I synchronize - both - the setting of the csvFile and the parsing ? I feel like my actual design is broken because threads could modify the csv file several times while a possibly long parse process is running.

I hope I'm being clear enough, because myself am a bit confused on those multi-synchronization issues.

Thanks ;-)

+3  A: 

I think there are multiple issues which are there in this code.

  1. If this class is a singleton, this class should be stateless i.e no state should be present in this class. therefore having setter for the file itself is not the right thing to do. Pass the file object into parse method and let it work on the argument. This should fix your issue of synchronizing across various methods

  2. Though your myObjects Set is private, I am assuming you are not passing this to any other calling classes. In case you are, always return clone of this set to avoid callers making changes to original set.

  3. Synchronized on the object is good enough if all your set changes are within the synchronized block.

Fazal
"If this class is a singleton, this class should be stateless i.e no state should be present in this class". Would you care to elaborate why this is true?
David Soroko
Actually my statement is not totally correct. Let me rephrase.The singleton class should only carry those data members which do not change across multiple threads using this object
Fazal
+2  A: 

Use separate MyParser object for every parse request and you will not have to deal with concurrency (at least not in MyParser). Also, then will you be able to truly service multiple users at a time, not forcing them to wait or erasing the results of previous parsing jobs.

pajton
+2  A: 

Basically you are assuming methods need to setFile first and then call parser. Let us consider this, t1 (with setFile XX) and t2 (with setFile YY) are coming at the same time and t2 set the file to be YY. Then t1 asks for parse() and starts getting records from YY. No amount of synchronised is going to solve this for you and the only way out is to have the parse method take a File parameter or remove the singleton constraint (so that each thread has its own file object). So use a

public void parse(File file) //and add synchronised if you want.
Calm Storm
I thought about that, but my design document says that I must implement this using the Strategy pattern. I have a Strategy interface containing a parse() method with no arguments, and my actual parser is a concrete strategy that implements that interface...
Philippe
@Philippe: Then pass the file object into the constructor of MyParser?
Jeremy
@Jer : MyParser is a singleton... I think I need to speak with the guy that wrote the design :-p
Philippe
Going by the look of it, one of the two 1)constructor or an argument instead of setter 2) singleton has to give up :)
Calm Storm
@Calm Storm : I just sent a message to the person who did the design to ask if I can give up with the singleton...
Philippe
Thanks guys for all your help. I'll just give up with the singleton :-)
Philippe
A: 

The singleton thing is mostly a red herring. Nothing to do with concurrency issues you are considering. As far as synchronisation goes, I think you are ok. Making the method synchronized will also work despite the fact that myObjects is static because it is a singleton.

David Soroko
I don't think this is correct. If two threads use the same Parser instance to parse two files concurrently, then the mutable state in the HashSet will become polluted, whether or not the set is static.
matt b
@matt : is that true despite I clear the hashset at the beginning of my synchronized block ?
Philippe
Didn't see that part, but if you've essentially confined the hashset to within the synchronized block, what need is there for it to be a member level variable? If you want to have the same instance parse multiple files by multiple threads, then the class should have zero state.
matt b
Two threads can not "parse the files concurrently" because of the synchronization block.
David Soroko