I want to embed a command-line utility in my C# application, so that I can grab its bytes as an array and run the executable without ever saving it to disk as a separate file (avoids storing executable as separate file and avoids needing ability to write temporary files anywhere).

I cannot find a method to run an executable from just its byte stream. Does Windows require it to be on a disk, or is there a way to run it from memory? If Windows requires it to be on disk, is there an easy way in the .NET Framework to create a virtual drive/file of some kind and map the file to the executable's memory stream?

A: 

Creating a RAMdisk or dumping the code into memory and then executing it are both possible, but extremely complicated solutions (possibly more so in managed code).

Does it need to be an executable? If you package it as an assembly, you can use Assembly.Load() from a memory stream - a couple of trivial lines of code.
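As a rough sketch of the Assembly.Load() route, assuming the utility is a managed assembly with a normal entry point: the class name `InMemoryAssemblyRunner` and the `Run` helper are made up for illustration, and this does not apply to unmanaged executables.

```csharp
using System;
using System.Reflection;

static class InMemoryAssemblyRunner
{
    // Hypothetical helper: loads a managed assembly from raw bytes and calls its entry point.
    // Returns whatever Main returns (null for a void Main).
    public static object Run(byte[] assemblyBytes, string[] args)
    {
        // Assembly.Load(byte[]) loads the image directly from memory; no file is involved.
        Assembly assembly = Assembly.Load(assemblyBytes);

        MethodInfo entryPoint = assembly.EntryPoint;
        if (entryPoint == null)
            throw new InvalidOperationException("The assembly has no entry point.");

        // Main may be declared with or without a string[] parameter.
        object[] parameters = entryPoint.GetParameters().Length == 0
            ? null
            : new object[] { args };

        return entryPoint.Invoke(null, parameters);
    }
}
```

The code runs inside the calling process rather than in a new one, so anything like Environment.Exit in the loaded assembly would also end the host.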

Or if it really has to be an executable, what's actually wrong with writing a temp file? It'll take a few lines of code to dump it to a temp file, execute it, wait for it to exit, and then delete the temp file - it may not even get out of the disk cache before you've deleted it! Sometimes the simple, obvious solution is the best solution.
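A minimal sketch of the temp file approach, assuming the executable's bytes are already in a byte array (for example, read from an embedded resource). `TempExeRunner`, `WriteToTempFile` and `Run` are illustrative names, not part of the framework.

```csharp
using System;
using System.Diagnostics;
using System.IO;

static class TempExeRunner
{
    // Writes the bytes to a uniquely named .exe in the user's temp directory
    // and returns the full path.
    public static string WriteToTempFile(byte[] exeBytes)
    {
        string path = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString("N") + ".exe");
        File.WriteAllBytes(path, exeBytes);
        return path;
    }

    // Dumps the executable to a temp file, runs it, waits for it to exit,
    // deletes the temp file, and returns the process exit code.
    public static int Run(byte[] exeBytes, string arguments)
    {
        string path = WriteToTempFile(exeBytes);
        try
        {
            var startInfo = new ProcessStartInfo(path, arguments) { UseShellExecute = false };
            using (Process process = Process.Start(startInfo))
            {
                process.WaitForExit();
                return process.ExitCode;
            }
        }
        finally
        {
            // Runs even if starting the process throws.
            File.Delete(path);
        }
    }
}
```

The bytes could come from something like `Assembly.GetExecutingAssembly().GetManifestResourceStream("MyApp.tool.exe")`, where the resource name is hypothetical and depends on how the file is embedded in your project.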

Jason Williams
An executable is just a bunch of bytes, and I see no reason why an OS should care whether the bytes originate from disk or memory, so long as both types of locations can be locked (and they can). The OS seems to DEPEND on an executable being located in a file system, why? If we can memory map files, why not easily do the reverse by registering a unique UNC path to a virtual file backed only by a locked region of memory? Windows 7 still can't even mount an ISO image to a virtual drive natively without 3rd party software, even though all of Microsoft's own stuff is distributed as ISO images.
Triynko
I don't think I can package a third-party unmanaged command-line utility compiled in C++ as an assembly that can be used with Assembly.Load. The only thing wrong with writing temp files is that I don't think it should be necessary, and it requires me to find a temp directory and ensure write access, which is also unnecessary. There are just all these unnecessary steps that could cause problems that I am trying to avoid, but the OS and the .NET Framework are making it difficult to do so.
Triynko
I agree with you. I just think you'll have a hard time finding anything as easy to implement and test as writing a temp file.
Jason Williams
I agree there's nothing easier, it's just exactly what I'm trying to avoid because I don't want to write to the disk for any reason.
Triynko
Not even "because you could have done it 2 months ago and moved on to another problem"? Or "because it's the standard approach and would just work for any executable (managed or unmanaged, etc) because it's specifically supported by .net"?
Jason Williams
No, because it involves writing to a disk, which is exactly what I'm avoiding. In a system with no writable disks, a RAM disk is the only option, and by calling that extremely complicated, you've just reinforced my assertion that requiring executable code to originate on a disk is a burdensome and pointless requirement, but unfortunately that's all the operating system and its ancient underlying DLLs were written to handle.
Triynko
+1  A: 

Take a look at the "In Memory" section of this paper. Realize that it's from a remote DLL injection perspective, but the concept should be the same.

Remote Library Injection

pcorey
Interesting, but I'm looking for a way to use something like Process.Start, passing a byte array rather than a file name. I don't like their justification for not launching processes from memory as "virus scanners can't scan programs that never exist on the disk"... what a cop-out. If they'd build the operating system to run programs securely to begin with, the "must originate on disk" rule would be unnecessary.
Triynko
What do you mean? There is no such rule. It's just a fact of life that virus scanners (which are not written by MS) get suspicious any time a process suddenly starts executing self-modifying code. Sounds like appropriate behavior to me, regardless of what OS the scanner is running on. If you don't want to rely on virus scanners for this kind of protection, well, that's why Windows has had DEP built in for many years.
Richard Berg
The .NET framework enforces an implied rule that "executable code must originate on disk" by providing Process.Start overloads that will ultimately accept only a filename as a pointer to executable code. It will accept neither a managed byte array nor a pointer to unmanaged bytes that contain executable code.
Triynko
A good API would accept bytes primarily, and support filenames as a time-saver, not the other way around. Code ultimately ends up in memory, but the existing API forces one to copy code bytes from memory, to disk, and back to memory, working with a temp file in between. It's unnecessary work, and the framework does it only because it's wrapping an underlying platform API (LoadLibrary, etc.) that's stuck on the idea that executable code always originates from a disk. Code is bytes. The fact that the OS requires a file is really something the framework should abstract away. Think about it.
Triynko
The OS is insecure by design, because its permissions system is backwards. A truly secure OS would assign permissions to virtual instances of executable code, rather than to user accounts, since users only provide input, whereas executable code does everything else. That's why when you download an executable in Windows you must tremble in terror that it can access any files your account is allowed to access, and the only difference in Vista is *when* you tremble (after clicking OK to the prompt, lol).
Triynko
Executable code should access only the folders it's explicitly granted access to by the administrator, and users should be able to execute and provide input only to the appropriate virtual instances assigned by the administrator. For example, user "littlejohn" can run instance "XBOX One", which can access only the "childrens' games" folder or group, and user "bigjohn" can run "XBOX Two", which can access that folder in addition to the "adult games" group. Both instances share the same executable file, but the key is that virtual instances are assigned unique permissions.
Triynko
You can put any data you like in memory, and as long as you call the correct entry point address within it, you can execute it. But this is a very unmanaged thing to want to do, so C# is probably not the language to do it from - it should be pretty easy to achieve by P/Invoking to unmanaged code (C++, C, assembler). Of course, as discussed above, you may have to run with elevated permissions and run the gauntlet of virus checkers.
Jason Williams
Every other major OS bases its security model on user accounts too. And there are very good reasons to do so. Not enough room in the comments to enumerate all the reasons why you're wrong, so I'll let Raymond do the talking: http://blogs.msdn.com/oldnewthing/archive/2006/08/18/705957.aspx
Richard Berg
And that's why every major OS is plagued by viruses and security warnings. I read that blog and it doesn't refute my point. In fact, one of the comments there says the same thing I said: "The problem is that the Windows and UNIX security models are ass-backwards. The operating systems go to great lengths to protect themselves from me, but do nothing to protect my data from the programs I run. Re-architecting Windows with a capability security model would be impracticable, but .Net is an excellent step in the right direction."
Triynko
And for example: "This isn't correct - *most* UNIXes have the user as the center of the security system, but SELinux and AppArmor change this completely. They allow you to assign fine grained privileges to applications and not users. Binaries are tagged with extended attributes identifying which security context they should run it (or in AppArmor they are identified by file paths)."
Triynko
+2  A: 

You are asking for a very low-level, platform-specific feature to be implemented in a high-level, managed environment. Anything's possible...but nobody said it would be easy...

(BTW, I don't know why you think temp file management is onerous. The BCL does it for you: http://msdn.microsoft.com/en-us/library/system.io.path.gettempfilename.aspx )


  1. Allocate enough memory to hold the executable. It can't reside on the managed heap, of course, so like almost everything in this exercise you'll need to PInvoke. (I recommend C++/CLI, actually, so as not to drive yourself too crazy). Pay special attention to the attribute bits you apply to the allocated memory pages: get them wrong and you'll either open a gaping security hole or have your process be shut down by DEP (i.e., you'll crash). See http://msdn.microsoft.com/en-us/library/aa366553%28VS.85%29.aspx

  2. Locate the executable in your assembly's resource library and acquire a pinned handle to it.

  3. Memcpy() the code from the pinned region of the managed heap to the native block.

  4. Free the GCHandle.

  5. Call VirtualProtect to prevent further writes to the executable memory block.

  6. Calculate the address of the executable's Main function within your process' virtual address space, based on the handle you got from VirtualAlloc and the offset within the file as shown by DUMPBIN or similar tools.

  7. Place the desired command line arguments on the stack. (Windows Stdcall convention). Any pointers must point to native or pinned regions, of course.

  8. Jump to the calculated address. Probably easiest to use _call (inline assembly language).

  9. Pray to God that the executable image doesn't have any absolute jumps in it that would've been fixed up by calling LoadLibrary the normal way. (Unless, of course, you feel like re-implementing the brains of LoadLibrary during step #3).

  10. Retrieve the return value from the EAX register.

  11. Call VirtualFree.

Steps #5 and #11 should be done in a finally block and/or use the IDisposable pattern.
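To give a feel for the memory-handling part only (steps 1 to 5 and 11), here is a hedged P/Invoke sketch. It is Windows-only, and `NativeCodeBuffer` and `WithExecutableCopy` are made-up names. It uses Marshal.Copy in place of the explicit pin, memcpy and GCHandle free, and it deliberately does not jump into the block: a raw .exe image copied like this has had no relocations or imports resolved, so it is not a working loader.

```csharp
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;

static class NativeCodeBuffer
{
    const uint MEM_COMMIT = 0x1000;
    const uint MEM_RESERVE = 0x2000;
    const uint MEM_RELEASE = 0x8000;
    const uint PAGE_READWRITE = 0x04;
    const uint PAGE_EXECUTE_READ = 0x20;

    [DllImport("kernel32.dll", SetLastError = true)]
    static extern IntPtr VirtualAlloc(IntPtr lpAddress, UIntPtr dwSize, uint flAllocationType, uint flProtect);

    [DllImport("kernel32.dll", SetLastError = true)]
    static extern bool VirtualProtect(IntPtr lpAddress, UIntPtr dwSize, uint flNewProtect, out uint lpflOldProtect);

    [DllImport("kernel32.dll", SetLastError = true)]
    static extern bool VirtualFree(IntPtr lpAddress, UIntPtr dwSize, uint dwFreeType);

    // Hypothetical helper: copies the bytes into native memory, makes the block
    // read/execute only, hands its address to the callback, then frees it.
    public static void WithExecutableCopy(byte[] image, Action<IntPtr> use)
    {
        UIntPtr size = (UIntPtr)(uint)image.Length;

        // Step 1: allocate writable (not yet executable) native memory.
        IntPtr block = VirtualAlloc(IntPtr.Zero, size, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
        if (block == IntPtr.Zero)
            throw new Win32Exception(Marshal.GetLastWin32Error());

        try
        {
            // Steps 2-4: Marshal.Copy copies from the managed array to the native block.
            Marshal.Copy(image, 0, block, image.Length);

            // Step 5: drop write access and allow execution, so DEP is satisfied
            // without leaving a writable and executable region around.
            uint oldProtect;
            if (!VirtualProtect(block, size, PAGE_EXECUTE_READ, out oldProtect))
                throw new Win32Exception(Marshal.GetLastWin32Error());

            use(block);
        }
        finally
        {
            // Step 11: MEM_RELEASE requires a size of zero.
            VirtualFree(block, UIntPtr.Zero, MEM_RELEASE);
        }
    }
}
```

Everything in steps 6 to 10 (finding the entry point, setting up arguments, fixing up addresses) would still have to happen inside the callback, which is where the real difficulty lies.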


The other main option would be to create a RAMdrive, write the executable there, run it, and cleanup. That might be a little safer since you aren't trying to write self-modifying code (which is tough in any case, but especially so when the code isn't even yours). But I'm fairly certain it will require even more platform API calls than the dynamic code injection option -- all of them requiring C++ or PInvoke, naturally.

Richard Berg
Looking back at all those steps and prayers, I hope it's apparent why the "code must be on disk" rule is so onerous. Since executable code is ultimately loaded into memory, I should be able to run it directly from memory, or load it from disk to memory and then run it. The current .NET API would require me to take code in memory, write it to disk, work with a temporary file management system, only to read it right back into memory, where it already was in the first place! This is just demonstrating what happens when you wrap an old technology in new clothes and give it a new name.
Triynko
Also, this "low-level, platform-specific" feature is already in the high-level managed .NET framework as "Process.Start". The problem is that the only kind of pointer it accepts to executable code bytes is a filename, which implies that a disk is the only place code should originate.
Triynko
I just don't think file-permissions should ever need to come into the mix at all, and unfortunately with temp files they do. See the comment on MSDN for Path.GetTempFilename: in low rights environments e.g. in some ASP.NET configurations, this function will fail with a security exception... because it needs to find the temp directory and that's reading the "Environment"... it requires EnvironmentPermission for unrestricted access to environment variables. Associated enumeration: PermissionState.Unrestricted
Triynko
No. Calling Process.Start is very, very different from merely setting CS:IP to a random offset in an executable memory region and jumping there. ESPECIALLY from managed code. Even in native Win32 apps there is much, much more involved in loading arbitrary .exe/.dll modules than you think. What you see in my answer above is not intended to be burdensome; it's actually a dramatic oversimplification. Here's an article that [starts to] scratch the surface of everything LoadLibrary handles for you: http://msdn.microsoft.com/en-us/magazine/cc301727.aspx
Richard Berg
A: 

This is explicitly not allowed in Vista+. You can use some undocumented Win32 API calls in XP to do this but it was broken in Vista+ because it was a massive security hole and the only people using it were malware writers.

Karl Strings
Well, it's ridiculous. A stream of bytes is a stream of bytes no matter how you look at it or where it comes from. Forcing those bytes to come from FILE rather than MEMORY to run serves no purpose. Malware? Seriously? Give me a break... so they just have to write it to a file first to run it. Are you saying there was a security hole in the API (OK, I guess), or are you saying there's a security hole in the idea of running code from a memory stream of bytes rather than a file stream of bytes (definitely NOT)?
Triynko
It does matter, and it is not ridiculous. For example: how do you properly and securely validate the signature of an executable image if the file is not on disk? How will filter drivers work now that you are executing from a memory image? How will support images work now that you are feeding in a memory stream rather than a file name? These are just a few, but you still have a million corner cases, like dealing with Image File Execution Options, inconsistencies with debugging, etc. It really is trivial to just write out the file; the alternative means dealing with the guts of a very complicated OS.
Karl Strings