tags:

views: 172

answers: 5

Ok, so we have this application we've been forced to install on our servers. It's a single WAR file, and when you first log into the application it creates a folder on the local disk.

For some reason the application stores data there: some kind of statistics and other data. The application also uses a database; God knows why they can't store everything in it.

Running a multi-server cluster of application servers, this annoys me: copying the folder around, backing it up, and the general littering of the application servers.

On a scale from 1 to 10, how bad is this design? And would you allow an application like that on your servers?

+1  A: 

First question, can this actually work at all in a cluster?

Does it make sense for this data to be split across the disks of many servers? Will the "answers" still be correct?

If it's some kind of scratchpad, intermediate results for a single calculation, then I guess it may be non-ideal but not actively harmful.

It clearly doesn't conform to JEE standards; accessing the local disk directly is one of the no-nos, like starting your own threads. But there's many an app in the world that is prepared to violate the letter of those standards.

Do you need to back those files up, or are they just scratch data? If they are formal data, loss of which affects consistency, then I would be very unhappy to have this in production.

djna
It can run on a cluster, but the folders will be out of sync. As far as I know the files created are used by the application, not intermediate results.
Tommy
If they are used long term then I don't "get" how this app can function correctly in a clustered environment. I would want to challenge the authors to explain, and also to ask about the implications of deleting those files. My guess is that, as is not uncommon, the developers didn't consider the implications of working in a cluster. Unless they have good answers, I would veto the deployment if I had the power to do so.
djna
+1  A: 

It's pretty bad. Sadly, I've seen worse.

Assuming you attach all nodes in the cluster to a shared disk area to avoid syncing the folders, you then have to ask yourself: what would happen if the same user logs into two nodes at the same time? What if they log into node a, then node b, then log out from b and go back to node a? Say you decide to use the database to check whether the user is logged in and active. What happens if they time out? They log into node a, use it, go to lunch, and the session times out. Are they then blocked from using node b? From node a as well?

That kind of stuff would never fly in a design review around here. Access to any non-transactional resource from a server is a red flag, and I expect the developer to demonstrate what they did to minimize the risks. Normally, the amount of code you'd have to write to get around design defects like these would make maintenance too expensive.

There are workaround options you might try. As an example, you could store the files on a file share and use the database to map an application-level abstraction of the file name to the physical file location. You could use transactions to lock the rows in the database while the files are accessed. You'd have to hold your nose while implementing such a stinker of a system, but it would get rid of the complicated user file structure while you work on a better fix.
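A rough sketch of what I mean, assuming a plain JDBC DataSource and a made-up APP_FILES table with LOGICAL_NAME and PHYSICAL_PATH columns; SELECT ... FOR UPDATE holds the row lock while the file on the share is read:

    // Sketch only: the database maps an application-level file name to its
    // physical location on the shared store, and the row stays locked while
    // the file is accessed. Table and column names are invented.
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import javax.sql.DataSource;

    public class SharedFileStore {

        private final DataSource dataSource;

        public SharedFileStore(DataSource dataSource) {
            this.dataSource = dataSource;
        }

        /** Reads a file while holding a row lock so no other node touches it concurrently. */
        public byte[] read(String logicalName) throws Exception {
            try (Connection con = dataSource.getConnection()) {
                con.setAutoCommit(false);
                try (PreparedStatement ps = con.prepareStatement(
                        "SELECT PHYSICAL_PATH FROM APP_FILES WHERE LOGICAL_NAME = ? FOR UPDATE")) {
                    ps.setString(1, logicalName);
                    try (ResultSet rs = ps.executeQuery()) {
                        if (!rs.next()) {
                            throw new IllegalArgumentException("Unknown file: " + logicalName);
                        }
                        Path path = Paths.get(rs.getString(1));
                        byte[] data = Files.readAllBytes(path);
                        con.commit(); // releases the row lock
                        return data;
                    }
                } catch (Exception e) {
                    con.rollback();
                    throw e;
                }
            }
        }
    }

The same pattern works for writes; the point is that the database, not the filesystem, decides who may touch a given file at any moment.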

I worked on an awful servlet app that created PDF files. The creation could take anywhere from 20 minutes to an hour because the report was made up of hundreds of pages filled with charts and tables, and each chart and table had a query or three behind it. At first, jobs were created via a thread spawned from the servlet, which wrote a file into the WEB-INF directory. When the user was notified that the file was ready, the download servlet sent the file and then deleted it from disk. It was like a J2EE bad-practice checklist. We got around this by having the web app write a job request to a table and having another process poll the table looking for jobs. Once the PDF file was ready and copied to the FTP location, the URL would be written to a table and a link would show up on the user's screen. Downloads were served by a different server as plain old static content.
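Heavily simplified, the polling side looked something like the sketch below; the REPORT_JOBS table, its columns, and buildPdf() are invented for illustration:

    // Sketch of the job-table pattern: the web tier only inserts a request
    // row, and this separate worker polls for pending jobs, builds the PDF,
    // and records the download URL.
    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.util.LinkedHashMap;
    import java.util.Map;
    import javax.sql.DataSource;

    public class ReportJobPoller implements Runnable {

        private final DataSource dataSource;

        public ReportJobPoller(DataSource dataSource) {
            this.dataSource = dataSource;
        }

        @Override
        public void run() {
            while (!Thread.currentThread().isInterrupted()) {
                try {
                    pollOnce();
                    Thread.sleep(30_000); // poll every 30 seconds
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // shut down cleanly
                } catch (Exception e) {
                    e.printStackTrace(); // log and keep polling
                }
            }
        }

        private void pollOnce() throws Exception {
            Map<Long, String> pending = new LinkedHashMap<>();
            try (Connection con = dataSource.getConnection()) {
                try (PreparedStatement ps = con.prepareStatement(
                        "SELECT JOB_ID, PARAMS FROM REPORT_JOBS WHERE STATUS = 'PENDING'");
                     ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        pending.put(rs.getLong("JOB_ID"), rs.getString("PARAMS"));
                    }
                }
                for (Map.Entry<Long, String> job : pending.entrySet()) {
                    String url = buildPdf(job.getValue()); // the slow part
                    try (PreparedStatement done = con.prepareStatement(
                            "UPDATE REPORT_JOBS SET STATUS = 'DONE', DOWNLOAD_URL = ? WHERE JOB_ID = ?")) {
                        done.setString(1, url);
                        done.setLong(2, job.getKey());
                        done.executeUpdate();
                    }
                }
            }
        }

        private String buildPdf(String params) {
            // placeholder: generate the report, copy it to the FTP/static-content
            // server, and return the public URL
            return "http://static.example.com/reports/" + params + ".pdf";
        }
    }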

sal
A: 

Specifically, your case sounds like a bad design. I wouldn't let a system like that onto a production server. That said, I can see a case where I prefer to put files on the filesystem rather than in the database (much like sal's example): when you generate PDF reports for users and they can leave the files on the server. In that case, I don't like to put the files in the database. If they're lost, it isn't a big deal; they can be rebuilt using the information from the database, and they don't enlarge the database backup.
One could say that the application that generates the PDFs is another process/application, but, IMHO, it is part of the system and so should be considered part of the JEE application. Therefore the scenario above is an example of legitimate filesystem access inside a JEE application.

Leonel Martins
A: 

We are a JEE application, and a pretty successful one at that, and our entire system revolves around files. Of course, we don't store to a local drive; we have a separate machine for storing our files. I don't think there is any harm in dealing with the filesystem, if that is your question. Have a central location for the files (i.e. a shared location for all servers in your cluster) and you should be OK.
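For example, every node can resolve the same shared base directory from configuration instead of writing next to the webapp. Just a sketch; the app.shared.dir property and the default path are made up:

    // Sketch: all nodes read one configured shared directory rather than
    // littering their own local disks.
    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;

    public final class SharedStorage {

        private SharedStorage() {
        }

        /** Returns the shared data directory, the same path on every cluster node. */
        public static Path baseDir() {
            String dir = System.getProperty("app.shared.dir", "/mnt/shared/appdata");
            return Paths.get(dir);
        }

        /** Writes a file under the shared directory so all nodes see it. */
        public static Path write(String relativeName, byte[] content) throws IOException {
            Path target = baseDir().resolve(relativeName);
            Files.createDirectories(target.getParent());
            return Files.write(target, content);
        }
    }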

OpenSource
A: 

Seen this... and deployed it with our backs to the wall :( We used a SAN and mounted it on the various servers in the cluster, using the same mount point name on each of them, and configured the app to use this mount point as its 'free for all' folder.

It worked. It's not great, but it worked... and is still working.
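If you go this route, a small startup check helps catch a node where the mount is missing. Something like the sketch below; the listener class and the app.shared.dir property are illustrative, not what the vendor shipped:

    // Sketch: refuse to start a node whose shared mount is absent or
    // read-only, rather than silently falling back to the local disk.
    // Register it in web.xml with a <listener> element.
    import java.io.File;
    import javax.servlet.ServletContextEvent;
    import javax.servlet.ServletContextListener;

    public class SharedMountCheckListener implements ServletContextListener {

        @Override
        public void contextInitialized(ServletContextEvent sce) {
            File mount = new File(System.getProperty("app.shared.dir", "/mnt/shared/appdata"));
            if (!mount.isDirectory() || !mount.canWrite()) {
                throw new IllegalStateException("Shared folder not available: " + mount);
            }
        }

        @Override
        public void contextDestroyed(ServletContextEvent sce) {
            // nothing to clean up
        }
    }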

To answer your real question:
1. The developer could have a valid reason for designing it this way; see Leonel Martins' answer too.
2. If you are the deployer, it is a mission-critical app, and you just 'must' get this thing onto prod, then it's not impossible.

Ryan Fernandes