What I have done recently is create an interface between the simulation environment and host system. Different hdl simulators have different interfaces, and getting the simulator NOT think in batch mode, the traditional simulation model, instead run for ever like a real design is half of the problem.
Then from the host using C (or whatever) you can create abstractions that may or may not allow you to write your application software for whatever target (depending on what language and compiler capabilities you have). For example you can make a generic poke and peek function and on the final target have those actually poke and peek memory or I/O, but for simulation through the abstraction you talk to a testbench in the simulation that simulates the same memory cycle.
I went one step further and used (Berkeley) sockets between the host and test bench so that the simulation can keep running while the host applications stop and start. Not unlike having a real processor with an OS that you are starting applications and running them to completion and starting another. At least for test applications, for delivery you probably only have one app.
By creating these abstraction layers I can write real applications that will be used on the target when it is built. Along the way you can use software simulation of the logic initially, then if you like build an fpga with an abstraction interface (throw away logic) say a uart for example. Replace the shim between the applications abstraction layer and the simulator with a uart interface, or whatever. Then when you marry the processor and logic in the same chip or on the same board, replace the abstraction layer again with direct calls to whatever interfaces they have always though they were talking to. If something breaks and you have retained the abstraction layer you can take the application back to the simulation model and have access to all of your logic internals.
Specifically this time around I am using a hdl language cyclicity cdl which is on sourceforge, the documentation needs some help but the examples may get you going, and it produces synthesizable verilog, so you get an extra win there. I threw out all the scripting batch stuff other than the bare minimum needed to connect and start a C simulation model. So my test bench is in C (well C++ technically) the sockets layer was done there. The output can be .vcd files which gtkwave uses. Basically you can do the bulk of your HDL design using open source software with no licenses, etc. By adding one or two lines of code to the CDL simulation part I was able to have it run as an infinite loop, which I can say works quite well, there doesnt appear to be any memory leaks, etc.
both modelsim and cadence have standardized ways of connecting host C programs to the simulation world and from there you can use an IPC to get to host applications talking to an abstraction layer api.
this is probably way overkill for a pic, I have given up pics a while ago for the faster and C friendly arm based micros anyway. There is/was an open core pic that you could simply incorporate into your simulation, even though that is not what you are trying to do here.