Since I didn't get any answers and I did resolve this, here is what I did, after losing a lot of time downloading many files and reading unclear documentation.
The most important tool here is AviSynth. It can load any kind of video through DirectShow and do a lot of processing using its own script language. You then pass that script to the x264.exe encoder, which creates the video stream. Finally you take the mp3 audio (also extracted using an AviSynth plugin) and wrap it together with the video into an mp4 file with mp4box.exe.
Each of these jobs is done by starting a process from .NET and reading back its output.
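The encode and mux steps can be sketched roughly like this. I ran them from .NET, so this Python version is only an illustration, not my actual code. The file names and the helper names (x264_command, mp4box_command, run_tool, encode_and_mux) are made up for the example, and it assumes x264.exe and mp4box.exe are on the PATH and that your x264 build can read .avs input.

```python
import subprocess
import sys


def x264_command(avs_script, video_out, x264_exe="x264.exe"):
    # x264 takes the AviSynth script as its input and writes a raw H.264 stream.
    return [x264_exe, "--output", video_out, avs_script]


def mp4box_command(video_in, audio_in, mp4_out, mp4box_exe="mp4box.exe"):
    # Each -add imports one track; the last argument is the mp4 file to write.
    return [mp4box_exe, "-add", video_in, "-add", audio_in, mp4_out]


def run_tool(command):
    # Start the process, wait for it to exit, and return (exit code, output).
    # stderr is merged into stdout because encoders often log progress there.
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    return result.returncode, result.stdout


def encode_and_mux(avs_script, audio_in, mp4_out):
    # Hypothetical driver: encode the video first, then wrap video + mp3 into mp4.
    video_tmp = mp4_out + ".264"
    steps = (
        x264_command(avs_script, video_tmp),
        mp4box_command(video_tmp, audio_in, mp4_out),
    )
    for command in steps:
        code, output = run_tool(command)
        if code != 0:
            # Stop at the first failing tool and surface what it printed.
            raise RuntimeError(f"{command[0]} failed ({code}):\n{output}")
```

Calling encode_and_mux("movie.avs", "movie.mp3", "movie.mp4") would run both tools in order and stop at the first one that returns a non-zero exit code.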
My list of tools is:
avisynth - best thing for video ever made
ffmpeg - to get images out, but you can use it for other things if you like
x264 - to get H.264 video out from the avs (AviSynth script)
mp4box - to combine the .264 file with the mp3 into an mp4
soundout - AviSynth plugin to extract mp3 sound from the AviSynth video
yadif - AviSynth plugin for deinterlacing
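For the ffmpeg part (getting images out), the command line can be built along these lines. This is only a sketch: the helper name, the output file pattern and the one-frame-per-second rate are assumptions for the example, not the exact settings I used.

```python
def ffmpeg_frames_command(video_in, pattern="frame_%04d.png", fps=1,
                          ffmpeg_exe="ffmpeg.exe"):
    # -i names the input file; the fps video filter keeps `fps` frames per second;
    # the numbered output pattern makes ffmpeg write one image file per kept frame.
    return [ffmpeg_exe, "-i", video_in, "-vf", f"fps={fps}", pattern]
```

The resulting list can be started as a process the same way as the other tools, reading the output back to check for errors.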