views:

445

answers:

8

Summary

I recently had a conversation with the creator of a framework that one of my applications depends on. During that conversation he mentioned as a sort of aside that it would make my life simpler if I just bundled his framework with my application and delivered to the end user a version that I knew was consistent with my code. Intuitively I have always tried to avoid doing this and, in fact, I have taken pains to segment my own code so that portions of it could be redistributed without taking the entire project (even when there was precious little chance anyone would ever reuse any of it). However, after mulling it over for some time I have not been able to come up with a particularly good reason why I do this. In fact, now that I have thought about it, I'm seeing a pretty compelling case to bundle all my smaller dependencies. I have come up with a list of pros and cons and I'm hoping someone can point out anything that I'm missing.

Pros

  • Consistency of versions means easier testing and troubleshooting.
  • Application may reach a wider audience since there appear to be fewer components to install.
  • Small tweaks to the dependency can more easily be made downstream and delivered with the application, rather than waiting for them to percolate into the upstream code base.

Cons

  • More complex packaging process to include dependencies.
  • User may end up with multiple copies of a dependency on their machine.
  • Per bortzmeyer's response, there are potential security concerns with not being able to upgrade individual components.

Notes

For reference, my application is written in Python and the dependencies I'm referencing are "light", by which I mean small and not in very common use. (So they do not exist on all machines or even in all repositories.) And when I say "package with" my application, I mean distribute under my own source tree, not install with a script that resides inside my package, so there would be no chance of conflicting versions. I am also developing solely on Linux so there are no Windows installation issues to worry about.

All that being said, I am interested in hearing any thoughts on the broader (language-independent) issue of packaging dependencies as well. Is there something I am missing or is this an easy decision that I am just over-thinking?

Addendum 1

It is worth mentioning that I am also quite sensitive to the needs of downstream packagers. I would like it to be as straightforward as possible to wrap the application up in a distribution-specific Deb or RPM.

+1  A: 

Just my experience, take it with a grain of salt.

My preference for a couple of open-source libraries that I author is for independence from additional libs as much as possible. Reason being, not only am I on the hook for distribution of additional libraries along with mine, I'm also obliged to update my application for compatibility as those other libraries are updated as well.

From the libraries I've used from others that carry dependencies of "common" libraries, invariably I end up with requiring multiple versions of the common library on my system. The relative update speed of the niche libraries I'm using just isn't that fast, while the common libraries are updated much more often. Versioning hell.

But that's speaking generically. If I can aid my project and users NOW by incorporating a dependency, I'm always looking at the downstream and later-date effects of that decision. If I can manage it, I can confidently include the dependency.

As always, your mileage may vary.

jro
Maybe I'm misunderstanding your first full paragraph. Doesn't distributing the dependency yourself alleviate this problem, as your *not* obliged to update your copy of the dependency unless/until you see fit? Or are you saying, "don't depend to begin with" (if you can avoid it)?
bouvard
Sorry for the ambiguity. It probably depends on the nature of the dependency. If the dependency maintains backward compatibility, then yes it does alleviate the problem; if backward compatibility is broken, then you have a maintenance issue.
jro
And as a rule, I try to avoiding introducing dependencies. It's a trade-off, and again is context dependent. I never want to write code that I don't have to, but I balance that against the implications of bundling another library with my project.
jro
+3  A: 

Can't you just rely on a certain version of those dependencies? E.g. in Python with setuptools you can specify which exact version it needs or even give some conditions like <= > etc. This of course only applies to Python and on the specifc package manager but I would personally always first try not to bundle everything. With shipping it as a Python egg you will also have all the dependencies installed automatically.

You might of course also use a two-way strategy in providing your own package with just links to the dependencies and nevertheless provide a complete setup in some installer like fashion. But even then (in the python case) I would suggest to simply bundle the eggs with it.

For some introduction into eggs see this post of mine.

Of course this is very Python specific but I assume that other language might have similar packaging tools.

MrTopf
Thank you for the intro to Eggs. I was familiar with them, but it was a nice refresher. I think my primary concern with this method of packaging is that I am unclear how it affects downstream packagers. (e.g. Ubuntu) Do they wrap the Egg or do thy have the sort out the dependencies themselves?
bouvard
I never packaged something myself as deb or rpm (mainly because I usually just install easy_install and then let easy_install handle it as it usually gets more recent packages). But I'd assume that you mirror the dependency tree as rpms/debs. Makes it more modular as well.
MrTopf
+6  A: 

I favor bundling dependencies, if it's not feasible to use a system for automatic dependency resolution (i.e. setuptools), and if you can do it without introducing version conflicts. You still have to consider your application and your audience; serious developers or enthusiasts are more likely to want to work with a specific (latest) version of the dependency. Bundling stuff in may be annoying for them, since it's not what they expect.

But, especially for end-users of an application, I seriously doubt most people enjoy having to search for dependencies. As far as having duplicate copies goes, I would much rather spend an extra 10 milliseconds downloading some additional kilobytes, or spend whatever fraction of a cent on the extra meg of disk space, than spend 10+ minutes searching through websites (which may be down), downloading, installing (which may fail if versions are incompatible), etc.

I don't care how many copies of a library I have on my disk, as long as they don't get in each others' way. Disk space is really, really cheap.

DNS
And if the licenses are compatible.
S.Lott
And if you have a "callback" to the sysadmin when a security bug requires to upgrade the bundled library/framework. It can be a nightmare if framework X suddenly requires an urgent security upgrade and if you have N copies of it bundled in various places.
bortzmeyer
+1  A: 

I always include all dependancies for my web applications. Not only does this make installation simpler, the application remains stable and working the way you expect it to even when other components on the system are upgraded.

Daniel Von Fange
+1  A: 

Beware reproducing the classic Windows DLL hell. By all means minimize the number of dependencies: ideally, just depend on your language and its framework, nothing else, if you can.

After all, preserving hard disk space is hardly the objective any more, so users need not care about having multiple copies. Also, unless you have a minuscule number of users, be sure to take the onus of packaging on yourself rather than requiring them to obtain all dependencies!

Pontus Gagge
+2  A: 

For Linux, don't even think about bundling. You aren't smarter than the package manager or the packagers, and each distribution takes approach their own way - they won't be happy if you attempt to go your way. At best, they won't bother with packaging your app, which isn't great.

Keep in mind that in Linux, dependencies are automatically pulled in for you. It's not a matter of making the user get them. It's already done for you.

For windows, feel free to bundle, you're on your own there.

Vadi
I have appreciated your feedback on my blog and such Vadi, but I'm going to have to go with the consensus here and disagree with you. It doesn't have anything to do with "smart", its about convenience and I have yet to see any compelling argument against bundling uncommon dependencies.
bouvard
And given the relative scarcity and low-priority of the dependencies I'm referring it seems like a boon to the packagers to not have to worry about them. The space-on-disk is minuscule, so the only possible point of contention could be purity of methodology. I won't be bundling GTK...
bouvard
A distribution might fix a bug in one of the dependencies they provide before upstream gets it, and your users won't get the benefits. Not many will want to package it either when it is going against all sensible laws. Otherwise, everyone will start bundling, and it'll simply degrade.
Vadi
That is a perfectly valid point, but I think the chances of that happening are so fanatically remote that it does not require sacrificing the practical benefits that I will gain by including the dependencies. I can always reverse the decision later on if I turn out to be wildly off the mark.
bouvard
FYI, I also replied to your pejorative comments on my blog, because I did not believe think they were appropriate for this venue.
bouvard
+2  A: 

If you're producing software for an end-user, the goal is to let the customer use your software. Anything that stands in the way is counter-productive. If they have to download dependencies themselves, there's a possibility that they'll decide to avoid your software instead. You can't control whether libraries will be backwards compatible, and you don't want your software to stop working because the user updated their system. Similarly, you don't want a customer to install an old version of your software with old libraries and have the rest of the system break.

This means bundling is generally the way to go. If you can ensure that your software will install smoothly without bundling dependencies, and that's less work, then that may be a better option. It's about what satisfies your customers.

Dan Goldstein
+1  A: 

An important point seems to have been forgotten in the Cons of bundling libraries/frameworks/etc with the application: security updates.

Most Web frameworks are full of security holes and require frequent patching. Any library, anyway, may have to be upgraded one day or the other for a security bug.

If you do not bundle, sysadmins will just upgrade one copy of the library and restart depending applications.

If you bundle, sysadmins will probably not even know they have to upgrade something.

So, the issue with bundling is not the disk space, it's the risk of letting old and dangerous copies around.

bortzmeyer
This is an excellent point. In my particular case I don't believe this is relevant (as the dependencies in question are not connected to the web, nor are they granted any permissions on the local machine), but its still something everyone needs to keep in mind. I have updated the OP.
bouvard