views: 229
answers: 4

Is this a realistic solution to the problems associated with larger .mdb files:

  • split the large .mdb file into smaller .mdb files
  • have one 'central' .mdb containing links to the tables in the smaller .mdb files

How easy would it be to make this change to an .mdb-backed VB application?

Could the changes to the database be done so that there are no changes required to the front-end application?
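
To make the idea concrete, here is a minimal sketch (DAO, with placeholder paths and table names) of how the 'central' .mdb might link a table that lives in one of the smaller .mdb files:

    ' Sketch only: link a table from a split back-end .mdb into the central .mdb.
    ' The path and table names are placeholders, not from the real application.
    Sub LinkBackEndTable()
        Dim db As DAO.Database
        Dim tdf As DAO.TableDef

        Set db = CurrentDb
        Set tdf = db.CreateTableDef("Orders")                 ' local name of the link
        tdf.Connect = ";DATABASE=C:\Data\Orders_BackEnd.mdb"  ' smaller .mdb holding the data
        tdf.SourceTableName = "Orders"                        ' table name inside that file
        db.TableDefs.Append tdf
    End Sub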

A: 

Hmm, well, if the data is going through this central DB then there is still going to be a bottleneck in there. The only reason I can think of why you would do this is to get around the size limit of an Access .mdb file.

Having said that, if the business functions can be split off into separate applications then that might be a good option, with a central DB containing all the linked tables for reporting purposes. I have used this before to good effect.

Kevin Ross
+2  A: 

If you have more data than fits in a single MDB then you should get a different database engine.

One main issue that you should consider is that you can't enforce referential integrity between tables stored in different MDBs. That should be a show-stopper for any actual database.

If it's not, then you probably don't have a proper schema designed in the first place.

David-W-Fenton
By referential integrity, do you mean foreign keys? If so, how can I quickly check whether the mdb contains any foreign keys?
Craig Johnston
In Access, in the Immediate window, type "?currentdb.Relations.Count" and hit Enter. Any number greater than 0 means you have relationships defined. Or, within Access, when viewing the database window, drop down the Tools menu and choose Relationships. In A2007 and A2010 it will be somewhere else, though, as the ribbons are not structured the same as the menus. And I'd assume from the fact that you don't know whether there are relationships defined that it's not your database -- otherwise, er, um, well, you know....
David-W-Fenton
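
If a simple count isn't enough, a minimal DAO sketch along the same lines will list whatever relationships are defined (run it from a standard module, or call it from the Immediate window):

    ' Sketch only: list the relationships (foreign keys) defined in the current database.
    Sub ListRelationships()
        Dim db As DAO.Database
        Dim rel As DAO.Relation

        Set db = CurrentDb
        Debug.Print db.Relations.Count & " relationship(s) defined"
        For Each rel In db.Relations
            ' rel.Table is the primary-key side, rel.ForeignTable the foreign-key side
            Debug.Print rel.Name & ": " & rel.Table & " -> " & rel.ForeignTable
        Next rel
    End Sub
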
+4  A: 

Edit Start
The short answer is "No, it won't solve the problems of a large database."

You might be able to overcome the DB size limitation (~2GB) by using this trick, but I've never tested it.

Typically, with large MS Access databases, you run into problems with speed and data corruption.

Speed
Is it going to help with speed? You still have the same amount of data to query and search through, and the same algorithm. All you are doing is adding the overhead of having to open multiple files per query, so I would expect it to be slower.

You might be able to speed it up by reducing the time it takes to get the information off of disk. You can do this in a few ways:

  1. faster drives
  2. put the MDB on a RAID (anecdotally, RAID 1+0 may be faster)
  3. split the MDB up (as you suggest) into multiple MDBs, and put them on separate drives (maybe even separate controllers).

(How well this would work in practice vs. theory, I can't tell you - if I were doing that much work, I'd still choose to switch DB engines.)

Data Corruption
MS Access has a well-deserved reputation for data corruption. To be fair, I haven't had it happen to me for some time. This may be because I've learned not to use it for anything big; or it may be because MS has put a lot of work into trying to solve these problems; or, more likely, a combination of both.

The prime culprits in data corruption are:

  1. Hardware: e.g., cosmic rays, electrical interference, iffy drives, iffy memory and iffy CPUs - I suspect MS Access's error handling/correction is not as good as other databases'.
  2. Networks: lots of collisions on a saturated network can confuse MS Access and convince it to scramble important records; as can sub-optimally implemented network protocols. TCP/IP is good, but it's not invincible.
  3. Software: As I said, MS has done a lot of work on MS Access over the years; if you are not up to date on your patches (MS Office and OS), get up to date. Problems typically happen when you hit extremes like the 2GB limit (some bugs are hard to test and won't manifest themselves except at the edge cases, which makes them less likely to have been seen or corrected, unless reported to MS by a motivated user).

All this is exacerbated with larger databases, because larger databases typically have more users and more workstations accessing them. Altogether, the larger database and the greater number of users multiply to provide more opportunity for corruption to happen.

Edit End

Your best bet would be to switch to something like MS SQL Server. You could start by migrating your data over, and then linking one MDB to it. You get the stability of SQL Server and most (if not all) of your code should still work.

Once you've done that, you can then start migrating your VB app(s) over to use SQL Server instead.

CodeSlave
How does this linking process work? Will the queries from the VB app need to change?
Craig Johnston
Basically the same way you would link to another MS Access database. Create an ODBC connection on the workstation (Control Panel -> Administrative Tools -> Data Sources). Then create table links inside the MS Access MDB to the ODBC connection (use a Machine Data Source instead of a File Data Source).
CodeSlave
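
The same linking step can also be done in code; a minimal sketch, assuming a Machine Data Source named "MySqlServerDSN" has already been created (the DSN, database and table names here are placeholders):

    ' Sketch only: link a SQL Server table into the MDB over ODBC using a machine DSN.
    Sub LinkOdbcTable()
        DoCmd.TransferDatabase acLink, "ODBC Database", _
            "ODBC;DSN=MySqlServerDSN;DATABASE=MyDatabase;Trusted_Connection=Yes", _
            acTable, "dbo.Customers", "Customers"
    End Sub
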
@CodeSlave: for simple, short-term accommodation of an MDB database that is between 1GB and 2GB, would splitting the MDB be an easier and more practical solution than migrating to SQL Server?
Craig Johnston
I don't see a point to it. If it worked, you'd see rule-of-thumb statements all over every MS Access forum saying something like "subdivide your database into separate MDBs for each table when it gets larger than 500MB". But we don't. Will it be easier? Slightly. More practical? I doubt it. The easiest way to find out for sure would be to actually try it. It'll probably take you less than an hour to move the tables to separate MDBs and recreate the integrity constraints for the linked tables.
CodeSlave
I don't care for your explanations of corruption. Collisions don't have anything to do with it except if they are so bad that the connection times out and Jet/ACE goes into the "not found" condition that is unrecoverable (same as if you pull the network cable). Secondly, you ameliorate the risk of corruption if you use methods that open write locks only for the shortest possible period of time. There is no software method for reducing the vulnerability to network hiccups of a file that is open across that network connection, as is the case for any Jet/ACE back end stored on a file server.
David-W-Fenton
While I do not know if collisions per se create problems for Jet, I do know that anything less than a pristine network connection will increase the odds of corruption enormously. Add replication to the mix and it goes up further. However, if the OP is just talking about splitting the db to a local drive, then corruption shouldn't be an issue.
Thomas
@Craig Johnston - If the db is over the 1GB mark mostly because of data, then splitting it will only provide a short reprieve, depending on the size of the forms and reports (macros, modules and queries should be small). If the data growth is tiny (a couple of kilobytes per year), then go for it. However, if data growth is steady and you already have 1GB, you are not saving much time by simply splitting the database; it's time to consider a better back-end database product.
Thomas
Craig Johnston
@Craig Johnston: You are right. While you can create relationships, you cannot create integrity constraints between tables that you've linked to. However, if those tables all reside in the same parent database (that you are linking to) you should be able to create integrity constraints there and have them stick, even when you link to them from your child database. Of course, if that parent DB were still MS Access, you wouldn't be any further ahead.
CodeSlave
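
For reference, a constraint inside that parent database can be created with plain Jet DDL; a minimal sketch with made-up table and column names:

    ' Sketch only: enforce referential integrity inside the parent .mdb with DDL.
    Sub AddForeignKey()
        CurrentDb.Execute _
            "ALTER TABLE Orders " & _
            "ADD CONSTRAINT FK_Orders_Customers " & _
            "FOREIGN KEY (CustomerID) REFERENCES Customers (CustomerID)", _
            dbFailOnError
    End Sub
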
@David-W-Fenton: I think I understand your point of view on this. However, funny things start to happen when you've got a very full network. E.g., I remember copying a large MDB between two workstations (A and B) on a very full network. The MDB worked fine on A but was corrupt when it finished copying to B. When I copied it again later in the evening (when the network had calmed down), the second copy worked fine on B. If collisions etc. shouldn't do anything, that would suggest the file copy should have failed with an error message relating to a network timeout.
CodeSlave
However, it didn't. Transmission errors can creep in, and they are more likely on a busy network. IP isn't considered a reliable protocol; therefore TCP (over IP) isn't reliable; therefore SMB (over TCP/IP) isn't reliable.
CodeSlave
You are right, there are things we can do to lessen the impact of network errors, like not running MDBs on network drives or using higher-level protocols to complete transactions (ODBC, SQL*Net, etc.) - where the damage would be limited to the record or field level vs. the table or the whole database.
CodeSlave
I would say that network traffic level is an edge condition not often encountered. Anything that constrains Jet/ACE's ability to connect to the remote file is going to be an issue, but a network at that level of overload is going to be exhibiting all sorts of problems in other applications as well.
David-W-Fenton
+1  A: 

For reasons more adequately explained by CodeSlave, the answer is no, and you should switch to a proper relational database.

I'd like to add that this does not have to be SQL Server. Quite possibly the reason you are reluctant to do this is one of cost, SQL Server being quite expensive to obtain and deploy unless you are in an educational or charitable organisation (in which case it's remarkably cheap and then usually a complete no-brainer).

I've recently had extremely good results moving an Access system from MDB to MySQL. At least 95% of the code functioned without modification, and of the remaining 5% most was straightforward, with only a few limited areas where significant effort was required. If you have sloppy code (not closing connections or releasing objects) then you'll need to fix it, but in general I was pleasantly surprised by how painless this approach was. Certainly, if cost is the reason you are reluctant to move to a server back end, I would highly recommend that you not attempt to manipulate .mdb files and instead go for the more robust database solution.
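
As an illustration of that kind of cleanup (closing and releasing what you open), a minimal DAO sketch with placeholder table and field names:

    ' Sketch only: open, use, and explicitly release DAO objects.
    Sub PrintCustomers()
        Dim db As DAO.Database
        Dim rs As DAO.Recordset

        Set db = CurrentDb
        Set rs = db.OpenRecordset("SELECT CustomerID, CompanyName FROM Customers", dbOpenSnapshot)
        Do While Not rs.EOF
            Debug.Print rs!CustomerID, rs!CompanyName
            rs.MoveNext
        Loop

        rs.Close              ' close explicitly instead of relying on object teardown
        Set rs = Nothing
        Set db = Nothing
    End Sub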

Cruachan
The reason is that I don't want to have to re-code the VB front end.
Craig Johnston
Jet/ACE *is* a proper relational database. But it has capacity limitations that the original poster is running into. That makes it an inappropriate choice for his data store. That is because of the capacity limitation, not because Jet/ACE is not a "proper relational database."
David-W-Fenton
@Fenton: is having multiple MDBs, instead of one, a solution to the 'capacity limitation' of MDBs?
Craig Johnston
@David-W-Fenton by "proper" in this case I'm referring to the fact that an RDBMS such as MSSQL or MySQL runs under its own process which the client communicates with, whereas this is not the case with Jet/ACE.
Cruachan
Having multiple MDBs is a piss-poor workaround. The real solution is using an appropriate database. For an app where MDB file size is a problem, you've chosen the wrong database.
David-W-Fenton
@Cruachan: that is not a "proper" definition of RDBMS. It's only a definition of a database server.
David-W-Fenton