There are lots of different ways that people approach development.
Sometimes people follow your three steps and discover that the slow bits are due to the external environment, so rewriting Python in C does not address the problem. That kind of slowness can sometimes be solved on the system side, and sometimes it can be solved in Python by applying a different algorithm. For instance, you can cache network responses so that you don't have to go to the network every time, or in SQL you can offload work into stored procedures, which run on the server and reduce the size of the result set. Generally, when you do have something that needs to be rewritten in C, the first thing to do is to look for a pre-existing C library and just create a Python wrapper for it, if one does not already exist. Lots of people have been down these paths before you.
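As a minimal sketch of the caching idea (the URL is just a placeholder, and an in-memory `functools.lru_cache` is only one of several ways to memoise responses):

```python
# A minimal caching sketch: repeated requests for the same URL are served
# from memory instead of going back out over the network.
# The URL below is a placeholder, not anything from the original answer.
import functools
import urllib.request

@functools.lru_cache(maxsize=128)
def fetch(url):
    """Return the response body for url, caching it after the first call."""
    with urllib.request.urlopen(url) as response:
        return response.read()

first = fetch("https://example.com/")   # goes out over the network
second = fetch("https://example.com/")  # served from the in-process cache
```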
Often step 1 is to thrash out the application architecture, identify the areas where performance is likely to be an issue, and choose a C library (perhaps one already wrapped for Python) to cover them. Step 2 then simply confirms that there are no really big performance issues left to address.
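For example (purely an illustration, not something from the original question), reaching for a C-backed library like NumPy at the design stage often removes the need for any hand-written C later:

```python
# Illustrative only: NumPy is one example of a C library that already ships
# with a Python wrapper, chosen here purely to make the point concrete.
import numpy as np

values = np.random.rand(1_000_000)

# The dot product runs in compiled C inside NumPy; the pure-Python
# equivalent, sum(v * v for v in values), is exactly the kind of loop
# people otherwise end up wanting to rewrite in C by hand.
total = np.dot(values, values)
```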
I would say that it is better for a team with one or more experienced developers to try to predict the performance bottlenecks and mitigate them with pre-existing modules right from the beginning. If you are a beginner with Python, then your three-step process is perfectly valid, i.e. get on with building and testing code, knowing that there is a profiler and the possibility of fast C modules if you need them. And then there is Psyco, and the various tools for freezing an application into a binary executable.
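A minimal sketch of that "there is a profiler" step, using the standard-library `cProfile`; `work()` is just a stand-in for whatever part of your application you suspect is slow:

```python
# Profile a suspect function with the standard-library profiler and show
# the ten most expensive calls. work() is a placeholder for real code.
import cProfile
import pstats

def work():
    return sum(i * i for i in range(200_000))

cProfile.run("work()", "profile.out")           # write raw stats to a file
stats = pstats.Stats("profile.out")
stats.sort_stats("cumulative").print_stats(10)  # top ten by cumulative time
```

For a whole script, `python -m cProfile -s cumulative myscript.py` does the same job from the command line.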
An alternative approach, if you know that you will need some C or C++ modules, is to write the application in C from the start but embed Python to do most of the work. This works well for experienced C or C++ developers because they have a rough idea of the kind of code that is tedious to write in C.