I am building an app that scrapes information from web pages. To do that I have chosen an HTML scraper called Jsoup because it is so simple to use. Jsoup also depends on the Apache Commons Lang library. (Together they total 385 kB.) So Jsoup will be used to download the page and parse it.

My question is whether using these simplifying libraries, instead of Android's built-in libraries, will make my app slower (in terms of downloading and parsing data).

I was thinking that the internal libraries would be optimized for Android.

+1  A: 

The external libraries will also use the internal libraries that are optimized for Android. I guess the real question is: would your custom implementation be faster than the generic implementation of these libraries?

In most cases, third-party libraries solve the problem that you want to solve, but also other problems that you might not need to solve, and it's this part that might hurt performance. You have to find the balance between reinventing the wheel and using optimized code just for your basic needs.

Additionally, if these libraries weren't designed with the Android platform in mind, make sure to test them extensively.

hgpc
No, they were not made for the Android platform.
droidgren
+1  A: 

It's the classical build-vs-buy argument.

If run-time performance is really important for your application, then you should consider rolling out your own implementation or optimizing the library (assuming it's open source). However, before you do that you should know how good or bad the performance of the existing library is. You won't know that unless you actually use it and get some data.

As a first step I would recommend using the library and collecting data on its performance, or asking someone who has already used this library on Android for performance numbers. The library may be slow, but if its speed is acceptable then I guess that's better than rolling your own.
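
One way to collect that kind of data is a small timing harness. The following is only a sketch and not part of the original answer: `ParseTimer`, `medianNanos` and `countOccurrences` are hypothetical names, and the tag-counting workload is a stand-in for whatever library call or hand-written parser you want to measure. It assumes a runtime that supports Java 8 lambdas.

```java
import java.util.Arrays;

class ParseTimer {

    /**
     * Runs the task `warmup` times without timing it, then `runs` times with
     * timing, and returns the middle sample in nanoseconds (the upper-middle
     * one when `runs` is even).
     */
    static long medianNanos(Runnable task, int warmup, int runs) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        for (int i = 0; i < warmup; i++) {
            task.run(); // give the runtime a chance to load and compile the code
        }
        long[] samples = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            samples[i] = System.nanoTime() - start;
        }
        Arrays.sort(samples);
        return samples[runs / 2];
    }

    /** Stand-in workload (hypothetical): counts non-overlapping occurrences of a non-empty needle. */
    static int countOccurrences(String text, String needle) {
        int count = 0;
        for (int i = text.indexOf(needle); i != -1;
                 i = text.indexOf(needle, i + needle.length())) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Synthetic document; for real numbers, use a saved copy of a page you scrape.
        StringBuilder sb = new StringBuilder("<html><body>");
        for (int i = 0; i < 1000; i++) {
            sb.append("<p>row ").append(i).append("</p>");
        }
        String html = sb.append("</body></html>").toString();

        // Swap this lambda for the library call or custom parser being compared.
        Runnable task = () -> countOccurrences(html, "<p>");

        long nanos = medianNanos(task, 20, 101);
        System.out.println("median time per run: " + (nanos / 1_000) + " us");
    }
}
```

Run it on the actual target device rather than on a desktop JVM, and time the download separately from the parse, since network latency can easily dominate.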

Keep in mind that creating your own implementation will cost you time and money (design, coding, testing and maintenance). So by using the library you are trading runtime performance for reuse and reduced development cost.

EDIT: Another important point is that performance is a function of many things, for example the hardware, the Android version and the network. If your target device is running Android 2.1 or earlier, you may get a performance boost by moving to 2.2. On the other hand, if you want to target all versions you have to adopt a different strategy.

Soumya Simanta
+3  A: 

The next release of jsoup will not require Apache Commons-Lang or any other external dependencies, which brings down the jar size to around 115K.

Internally, jsoup uses standard Java libraries (URL connection, HashMap, etc.), which are going to be reasonably well optimised for Android.
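
For comparison, a hand-written download that uses only those standard classes might look roughly like the sketch below. This is not jsoup's code; `PlainFetch`, `readAll` and `fetch` are hypothetical names, the 10-second timeouts are arbitrary, and the response is assumed to be UTF-8.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

class PlainFetch {

    /** Reads the whole stream into a String, assuming UTF-8 content. */
    static String readAll(InputStream in) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Downloads a page body with java.net only; timeout values are assumptions. */
    static String fetch(String address) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) new URL(address).openConnection();
        conn.setConnectTimeout(10_000);
        conn.setReadTimeout(10_000);
        try (InputStream in = conn.getInputStream()) {
            return readAll(in);
        } finally {
            conn.disconnect();
        }
    }
}
```

On Android, `fetch` has to be called off the main thread. The result is only raw text, so parsing and element selection would still have to be written by hand.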

I've spent a good amount of time optimising jsoup's parse execution time and data extractor methods, and if you find any ways to improve it, I'm all ears.

Jonathan Hedley
I've just released jsoup 1.3.1, with no external dependencies. It's 131K; a bit bigger because I added a Connection interface to make web-scraping easier.
Jonathan Hedley
+1  A: 

If the question is, "Will external libraries INHERENTLY make my app slower than if I wrote the same code myself?", the answer is generally, "Yes, but not very much."

It will take the JVM some time to load an external library. It's likely that the library has functions or features that you aren't using, and loading these or reading past them will take some time. But in most cases this difference will be trivial, and I wouldn't worry about it unless you are in a highly constrained environment.

If what you mean is, "Can I write code that will do the same function faster than an external library?", the answer is, "Almost certainly yes, but is it worth your time?"

The odds are that any external library you use will have all sorts of features that you don't need but that are included to accommodate the needs of others. The authors of the library don't know exactly what every user is up to, so they have to optimize in a general way. If you wrote your own code, you could make it do exactly what you need and nothing more, and optimize it for exactly what you are up to.

Whether it's worth the trouble in your particular case is the big question.

Jay