There are two layers to this. We have most of #1 in place for our testing. We're about to start on #2.
Django in isolation. The ordinary Django unit tests work well here. Create some tests that cycle through a few (fewer than 6) "typical" use cases: get this, post that, etc. Collect timing data. This isn't real web performance, but it's an easy-to-work-with test scenario that you can use for tuning.
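Here's a minimal sketch of what that can look like with Django's built-in test client; the URLs and POST payload are placeholders for your own views, and the crude print of timings is just to get numbers you can compare between tuning runs.

    import time
    from django.test import TestCase

    class TypicalUseCaseTimings(TestCase):
        def test_typical_use_cases(self):
            # Placeholder "typical" use cases -- swap in your own URLs and data.
            use_cases = [
                ("home page",   lambda: self.client.get("/")),
                ("detail page", lambda: self.client.get("/items/1/")),
                ("post a form", lambda: self.client.post("/items/new/", {"name": "widget"})),
            ]
            for label, action in use_cases:
                start = time.time()
                response = action()
                elapsed = time.time() - start
                self.assertIn(response.status_code, (200, 302))
                print("%s: %.3f s" % (label, elapsed))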
Your whole web stack. In this case, you need a regular server running Squid, Apache, Django, MySQL, whatever. You need a second computer (or computers) to act as a client, exercising your web site through urllib2 and running a few (fewer than 6) "typical" use cases: get this, post that, etc. Collect timing data. This still isn't "real" web performance, because it isn't going through the internet, but it's as close as you're going to get without a really elaborate setup.
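A minimal sketch of that client-side driver, run on the second machine, using urllib2 as mentioned above (Python 2; in Python 3 the equivalent is urllib.request). The host name and paths are placeholders.

    import time
    import urllib
    import urllib2

    BASE = "http://your-test-server"   # placeholder: the full Squid/Apache/Django/MySQL stack

    # Placeholder "typical" use cases -- swap in your own URLs and form data.
    use_cases = [
        ("home page",   lambda: urllib2.urlopen(BASE + "/")),
        ("detail page", lambda: urllib2.urlopen(BASE + "/items/1/")),
        ("post a form", lambda: urllib2.urlopen(BASE + "/items/new/",
                                                urllib.urlencode({"name": "widget"}))),
    ]

    for label, action in use_cases:
        start = time.time()
        action().read()
        print("%s: %.3f s" % (label, time.time() - start))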
Note that #2 (end-to-end) includes a great deal of caching for performance. If your client scripts are doing similar work each time, caching will be really beneficial. If your client scripts do unique things each time, caching will be less beneficial.
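One hypothetical way to see how much caching is helping: fetch the same URL repeatedly, then fetch it with a unique query string tacked on each time to defeat the caches, and compare the timings. Again, the host and path are placeholders.

    import time
    import urllib2

    BASE = "http://your-test-server"   # placeholder host

    def timed_fetch(defeat_cache=False):
        url = BASE + "/items/1/"
        if defeat_cache:
            url += "?nocache=%d" % int(time.time() * 1000000)   # unique per request
        start = time.time()
        urllib2.urlopen(url).read()
        return time.time() - start

    print("repeatable request: %.3f s" % timed_fetch())
    print("unique request:     %.3f s" % timed_fetch(defeat_cache=True))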
The hardest part is determining what the "typical" workload is. This isn't functional testing, so the workload doesn't have to include everything. Also, the more concurrent sessions your client is running, the slower the client itself becomes. Don't struggle trying to optimize your server when your test client is the slowest part of the processing.
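A minimal sketch of running several concurrent sessions with threads, so you can check whether the client machine (not the server) is becoming the bottleneck; run_use_cases is a placeholder for the driver loop sketched above.

    import threading
    import time

    def run_use_cases():
        pass   # placeholder: exercise the "typical" use cases against the server

    def run_sessions(count):
        threads = [threading.Thread(target=run_use_cases) for _ in range(count)]
        start = time.time()
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        return time.time() - start

    # If elapsed time grows much faster than the session count, the client is
    # probably the slowest part of the processing.
    for sessions in (1, 2, 4, 8):
        print("%d sessions: %.3f s" % (sessions, run_sessions(sessions)))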
Edit
If "line-by-line" means "profiling", well, you've got to get a Python profiler running.
http://www.python.org/doc/2.5.2/lib/profile.html
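A minimal sketch using cProfile (the C implementation of the profiler described at that link) and pstats to find hot spots; typical_operations is a placeholder for whatever driver runs your representative mix of operations.

    import cProfile
    import pstats

    def typical_operations():
        pass   # placeholder: run a representative mix of view calls, queries, etc.

    # Profile the whole run and dump the raw stats to a file.
    cProfile.run("typical_operations()", "profile.out")

    # Sort by cumulative time and show the top 20 hot spots.
    stats = pstats.Stats("profile.out")
    stats.sort_stats("cumulative").print_stats(20)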
Note that there's plenty of caching in the Django ORM layer, so running a single view function a half-dozen times to get a meaningful set of measurements isn't sensible. You have to run a "typical" set of operations and then find the hot spots in the profile.
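A small demonstration of that queryset caching (it assumes DEBUG=True so Django records SQL, and Item is a placeholder model): iterating the same queryset a second time reuses the cached rows and issues no new query, which is why timing one view in a tight loop can be misleading.

    from django.db import connection, reset_queries
    from myapp.models import Item   # placeholder model

    reset_queries()
    items = Item.objects.all()
    list(items)                      # first pass hits the database
    list(items)                      # second pass reuses the queryset's cached rows
    print(len(connection.queries))   # 1, not 2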
Generally, your application is easy to optimize -- you shouldn't be doing much. Your view functions should be short and have no processing to speak of. Your form and model method functions, similarly, should be very short.