We have a data streaming application which uses local caching to reduce future downloads. The full datasets are larger than what is streamed to the end user, who receives only the parts they want to see. The concept is much like a browser, except the streamed data is exclusively JPG and PNG.

The usage patterns are sporadic and unpredictable. There are download spikes on initial usage while the cache is populated. What would be the theoretical and practical/experimental means of modelling and measuring the bandwidth usage of this application? We know the sizes of the source datasets, but have little knowledge of the usage patterns.

A: 

There is not enough information to derive a useful theoretical model for bandwidth usage. If you know something about the rollout pattern, you could attempt to model the distribution of spikes. Is this a closed user group that will all get the app within a short period of time? Will you sell to individual customers that in turn will roll out to a number of employees? Are you selling to consumers? All of these will impact the distribution of peaks.

As for the steady-state bandwidth requirements, that depends a great deal on usage patterns: do users frequently re-use the same data, or frequently seek new data? This is a great thing to determine during a beta program. Log usage patterns locally and/or on the server for beta users, and try to get beta users that are representative of the overall user community.
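As a rough illustration of logging usage locally, here is a minimal Java sketch that counts the bytes actually pulled over the network and compares them with the bytes served from the local cache. The class names `CountingInputStream` and `BandwidthLog` are hypothetical, not part of any existing library, and the sketch assumes the application can wrap its download streams and knows the size of each cache hit.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical wrapper: counts the bytes actually read from a download stream. */
class CountingInputStream extends FilterInputStream {
    private final AtomicLong counter;

    CountingInputStream(InputStream in, AtomicLong counter) {
        super(in);
        this.counter = counter;
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b >= 0) {
            counter.incrementAndGet();
        }
        return b;
    }

    // read(byte[]) in FilterInputStream delegates here, so bytes are counted once.
    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            counter.addAndGet(n);
        }
        return n;
    }
}

/** Hypothetical per-client log of network traffic versus cache traffic. */
class BandwidthLog {
    final AtomicLong networkBytes = new AtomicLong();
    final AtomicLong cacheBytes = new AtomicLong();

    /** Wrap every network download stream so its bytes are counted. */
    InputStream meter(InputStream network) {
        return new CountingInputStream(network, networkBytes);
    }

    /** Call when a request is served from the local cache instead. */
    void recordCacheHit(long bytes) {
        cacheBytes.addAndGet(bytes);
    }

    /** Fraction of all delivered bytes that came from the cache (0.0 if nothing was delivered). */
    double cacheHitRatio() {
        long total = networkBytes.get() + cacheBytes.get();
        return total == 0 ? 0.0 : (double) cacheBytes.get() / total;
    }
}
```

Writing these counters to a file with a timestamp at a fixed interval would give a per-client record of both the spikes and the steady-state rate. Note that bytes passed over with `skip()` are not counted in this sketch.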

Finally, to manage spikes in consumption, consider deploying your content on a service such as Amazon CloudFront. This allows you to pay for the bandwidth you actually use, but scale as needed to handle peaks in demand.

Eric J.
The initial roll-out is to a group of 10 users, each of whom is interested in a certain section of the data but may peek elsewhere out of interest or to advise and collaborate (local managers). I was interested in Java tooling techniques to incorporate into the application for bandwidth usage measurements. We would need to write a custom application to make efficient use of CloudFront; that might be something to look into in the future as usage grows.
whatnick
Do you want to measure on the server side or the client side? Do you want to know how much bandwidth is used by each client, or in total for your app?
Eric J.
Sorry about the late reply, Eric. We require bandwidth usage on the client side, since the end user is a large organization that is considering a roll-out of multiple client applications and is concerned about the load our data-heavy application will put on their network.
whatnick
Is it possible to install caching mechanisms onsite with the client so that only the local network segment will be burdened for a file that has been downloaded once? Caching could be a traditional server-based cache, or a P2P mechanism.
Eric J.