Due to speed requirements, we need to do some computation in place in internal memory and then DMA the results to an external memory. The application runs on a TI DM355 processor, which is based on an ARM926EJ-S core and a set of TI peripherals (EDMA, video accelerators, etc.).
How cleanly can this be done from the application? Is it as simple as mmap'ing the physical address of that internal memory into the process's virtual address space and doing the calculation there?
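To make the question concrete, here is a rough sketch of what I have in mind. It assumes the internal memory can be reached through `/dev/mem` and that the kernel allows the mapping. `IRAM_PHYS_BASE` and `IRAM_SIZE` are made-up placeholders, not values from the DM355 data sheet, and `map_region` and `increment_in_place` are hypothetical helper names. The DMA step is not shown.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical placeholders: the real base address and size must be
 * taken from the DM355 memory map; the base must be page-aligned. */
#define IRAM_PHYS_BASE 0x00010000UL
#define IRAM_SIZE      0x1000UL

/* Map `len` bytes starting at offset `phys` of the file `path`
 * (for physical memory, path would be "/dev/mem").
 * Returns NULL if open() or mmap() fails. */
static volatile uint8_t *map_region(const char *path, off_t phys, size_t len)
{
    assert(len > 0);

    int fd = open(path, O_RDWR | O_SYNC);
    if (fd < 0)
        return NULL;

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, phys);
    close(fd); /* the mapping stays valid after the descriptor is closed */

    return (p == MAP_FAILED) ? NULL : (volatile uint8_t *)p;
}

/* Stand-in for the real computation: adds 1 to every byte in place. */
static void increment_in_place(volatile uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        buf[i] = (uint8_t)(buf[i] + 1);
}

/* Intended use (needs root and a kernel that permits the mapping):
 *
 *   volatile uint8_t *iram =
 *       map_region("/dev/mem", (off_t)IRAM_PHYS_BASE, IRAM_SIZE);
 *   if (iram != NULL) {
 *       increment_in_place(iram, IRAM_SIZE);
 *       munmap((void *)iram, IRAM_SIZE);
 *   }
 */
```

What I am unsure about is whether this is enough, or whether caching of the mapping and synchronization with the EDMA transfer need extra handling from user space.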
Thanks