It is important to understand how the threadpool scheduler works. It was designed to fine-tune the number of running threads against the capabilities of your machine. Your machine probably can run only two threads at the same time, dual-core CPUs are the current standard. Maybe four.
So when you dump a bunch of threads in its lap, it starts out by activating only two threads. The rest of them are in a queue, waiting for CPU cores to become available. As soon as one of those two threads completes, it activates another one. Twice a second, it evaluates what's going on with active threads that didn't complete. It makes the rough assumption that those threads are blocking and thus not making progress and allows another thread to activate. You've now got three running threads. Getting up the 500 threads, the default max number of threads, will take 249 seconds.
Clearly, this behavior spells out what a thread should do to be suitable to run as a threadpool thread. It should complete quickly and don't block often. Note that blocking on I/O requests is dealt with separately.
If this behavior doesn't suit you then you can use a regular Thread. It will start running right away and compete with other threads in your program (and the operating system) for CPU time. Creating 30,000 of such threads is not possible, there isn't enough virtual memory available for that. A 32-bit operating system poops out somewhere south of 2000 threads, consuming all available virtual memory. You can get about 50,000 threads on a 64-bit operating system before the paging file runs out. Testing these limits in a production program is not recommended.