tags:

views:

50

answers:

1

What is a proper way of organizing files in a WCM that uses JCR? Let's say the total file count is 100,000+ files and the total file size is about 50-70 GB. Is it better to organize files by file type (and create subdirectories to further group the files by some category)?

What are the advantages? Does it make any difference for the query API, maintenance, or anything else?

Proposal 1:
--shared
------images
------pdf
------movies
--location1
------images
------pdf
------movies
--location2
------images
------pdf
------movies

Proposal 2: 
--pdf
-------shared
-------location1
-------location2
--images
--------shared
--------location1
--------location2
.. etc
A: 

Whatever you do, make sure you don't end up with more than 1,000 child nodes under any given node. Just as in any (real) file system, listing a folder with a lot of files/subfolders in it can take some time. By default, Jackrabbit 2.x now hashes up the user space, i.e.:

/users/s/sa/sandra
/users/s/si/simong
...
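
A minimal sketch of that kind of prefix hashing, assuming the scheme is simply "first character, then first two characters, then the full id" as in the paths above. The class and method names (`HashedPaths`, `userPath`) are made up for illustration and are not part of the Jackrabbit API; Jackrabbit's own layout may differ in detail.

```java
final class HashedPaths {
    private HashedPaths() {}

    /**
     * Builds a hashed path such as /users/s/sa/sandra.
     * Assumption: the bucket names are the first one and first two
     * characters of the id, taken as-is (no case folding or escaping).
     */
    static String userPath(String root, String userId) {
        if (userId == null || userId.isEmpty()) {
            throw new IllegalArgumentException("userId must not be empty");
        }
        String first = userId.substring(0, 1);
        // Ids with a single character reuse that character for both levels.
        String firstTwo = userId.substring(0, Math.min(2, userId.length()));
        return root + "/" + first + "/" + firstTwo + "/" + userId;
    }

    public static void main(String[] args) {
        System.out.println(userPath("/users", "sandra")); // /users/s/sa/sandra
        System.out.println(userPath("/users", "simong")); // /users/s/si/simong
    }
}
```

With two levels of buckets, each level stays small as long as ids are reasonably spread across the alphabet; if many ids share the same two-character prefix, a third level would be needed to stay under the child-node limit.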

I would personally go for your first proposal as it makes more sense. We have a webapp where all our users can upload/delete/modify their files in JCR, and we did it this way:

/_users/s/si/simon/public
/_users/s/si/simon/public/My Pictures
/_users/s/si/simon/public/My Pictures/2010/06/Trip to the US
/_users/s/si/simon/public/My Pictures/2010/06/Trip to the US/DC1001.jpg
/_users/s/si/simon/private/account_details.txt
...

We're loosely following the way home folders are done in UNIX-like systems. We try to hash up all the things we (reasonably) can: for example the user space (/s/si/simong), but also things like messages:

/_users/s/si/simong/messages/2009/12/25/ab34ed87dee
/_users/s/si/simong/messages/2010/03/12/e4f1de3cd48
...
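
Roughly how such a date-bucketed message path could be built, assuming the year/month/day folders come from the message date. `MessagePaths` and `messagePath` are hypothetical names, and the message id is taken as given rather than generated here.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

final class MessagePaths {
    private MessagePaths() {}

    // Zero-padded year/month/day folders, e.g. 2009/12/25.
    private static final DateTimeFormatter DATE_DIRS =
            DateTimeFormatter.ofPattern("yyyy/MM/dd", Locale.ROOT);

    /**
     * Builds e.g. /_users/s/si/simong/messages/2009/12/25/ab34ed87dee.
     * Assumption: userHome is the already-hashed home folder of the user.
     */
    static String messagePath(String userHome, LocalDate date, String messageId) {
        return userHome + "/messages/" + date.format(DATE_DIRS) + "/" + messageId;
    }

    public static void main(String[] args) {
        System.out.println(messagePath(
                "/_users/s/si/simong", LocalDate.of(2009, 12, 25), "ab34ed87dee"));
    }
}
```

Bucketing by day keeps each folder small unless a single user receives more than about a thousand messages in one day, in which case another level (hour, or a prefix of the message id) would be needed.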

However, it's up to the individual user not to have more than 1,000 child files in a given folder (we do warn them, though). Doing it this way also gives you the nice benefit of straightforward access control, i.e. everything under ~/private is only readable and writable by the current user, while ~/public is readable by everybody.

Simon