The amount of stack space used on a 32-bit x86 processor to pass arguments of various types:
- byte: 4 bytes
- bool: 4 bytes
- enum: 4 bytes
- char: 4 bytes
- short: 4 bytes
- int: 4 bytes
- long: 8 bytes
- float: 4 bytes
- double: 8 bytes
- decimal: 16 bytes
- struct: runtime size of the structure
- string: 4 bytes
- array: 4 bytes
- object: 4 bytes
- interface: 4 bytes
- pointer: 4 bytes
- class instance: 4 bytes
The entries from string down to class instance are references (or pointers), so their size doubles to 8 bytes on a 64-bit processor.
For a static method call, the first 2 arguments that are up to 4 bytes in size are passed through CPU registers, not the stack. For an instance method call only one argument is passed that way, because the hidden this reference takes the other register. The rest are passed on the stack. A 64-bit processor supports passing 4 arguments through registers.
As is clear from the list, the only time you should ever consider passing an argument by ref for size reasons is for structures. The normal guidance is to do so when the structure is larger than 16 bytes. It isn't always easy to guess the runtime size of a structure, but a limit of up to 4 fields is usually accurate. Use fewer if those fields are double, long or decimal. The guidance then usually recommends turning your structure into a class, precisely for this reason.
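To make the 16-byte guideline concrete, here is a minimal sketch. The Small and Large structs and the Sum methods are made up for illustration, and it assumes a recent .NET where System.Runtime.CompilerServices.Unsafe.SizeOf<T>() is available out of the box.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical structs, used only for illustration.
struct Small { public int A, B, C, D; }      // 4 fields x 4 bytes
struct Large { public double A, B, C, D; }   // 4 fields x 8 bytes

public static class Program
{
    // By value: the whole struct is copied for the call.
    public static double SumByValue(Large v) => v.A + v.B + v.C + v.D;

    // By ref: only a pointer-sized reference is passed.
    public static double SumByRef(ref Large v) => v.A + v.B + v.C + v.D;

    public static void Main()
    {
        Console.WriteLine(Unsafe.SizeOf<Small>());   // prints 16
        Console.WriteLine(Unsafe.SizeOf<Large>());   // prints 32

        var big = new Large { A = 1, B = 2, C = 3, D = 4 };
        Console.WriteLine(SumByValue(big));          // prints 10
        Console.WriteLine(SumByRef(ref big));        // prints 10
    }
}
```

Small sits right at the 16-byte limit, so passing it by value is fine. Large is twice that, which makes it a candidate for ref, or for becoming a class.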
Also note that nothing is saved by intentionally passing an argument as a byte or short. An int is the type that a 32-bit processor is happy with. The same goes for currently available 64-bit processors.
A method return value, the real topic of your question, is almost always returned in a CPU register. Most types fit comfortably in the eax or edx:eax registers, with an FPU register used for floating point values. The only exceptions are large structures and decimal, which are too large to fit in a register. They are returned by having the caller reserve space on the stack for the return value and pass a 4-byte pointer to that space as an extra argument to the method.
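As a rough mental model of that last case, returning a decimal behaves much like filling in an out parameter whose storage the caller supplies. The sketch below only shows the analogy in source code, not what the JIT compiler literally emits, and the method names are made up.

```csharp
using System;

public static class ReturnDemo
{
    // What you write: a method returning a 16-byte decimal.
    public static decimal GetPrice() => 19.99m;

    // Roughly what happens at the machine level: the caller supplies
    // the storage and the method fills it in through a pointer.
    public static void GetPriceInto(out decimal result) => result = 19.99m;

    public static void Main()
    {
        decimal a = GetPrice();
        GetPriceInto(out decimal b);
        Console.WriteLine(a == b);   // prints True
    }
}
```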