If by "simple image file" you mean JPEG, GIF or similar, you're out of luck: you'd have to implement all the decoding logic yourself, which is far from simple (take a look here for more info, but only briefly, because you really don't want to go into the details ;)).
After decoding, what you end up with is a matrix (a two-dimensional array) of pixel information (usually three numbers per pixel, for the red, green and blue components, but other options exist). Once you have that, your get_pixel and set_pixel methods are trivial.
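To show how little is left once decoding is done, here is a minimal sketch. It assumes the pixels are already decoded into a two-dimensional array of [r, g, b] triples; the Image class, its constructor arguments and the bounds check are hypothetical names I made up for illustration, not part of any library.

```ruby
# Hypothetical Image class: holds already-decoded pixels as a 2D array.
# It does no file decoding at all; that is the hard part left out here.
class Image
  attr_reader :width, :height

  def initialize(width, height, fill = [0, 0, 0])
    @width = width
    @height = height
    # One row per y coordinate; every pixel gets its own [r, g, b] array.
    @pixels = Array.new(height) { Array.new(width) { fill.dup } }
  end

  # Returns the [r, g, b] triple stored at column x, row y.
  def get_pixel(x, y)
    check_bounds(x, y)
    @pixels[y][x]
  end

  # Stores a copy of the given [r, g, b] triple at column x, row y.
  def set_pixel(x, y, rgb)
    check_bounds(x, y)
    @pixels[y][x] = rgb.dup
  end

  private

  # Raises instead of silently wrapping around on negative indices.
  def check_bounds(x, y)
    unless x.between?(0, @width - 1) && y.between?(0, @height - 1)
      raise IndexError, "pixel (#{x}, #{y}) is outside #{@width}x#{@height}"
    end
  end
end

img = Image.new(4, 3)
img.set_pixel(1, 2, [255, 0, 0])
p img.get_pixel(1, 2)  # prints [255, 0, 0]
```

Rows are indexed by y and columns by x, so the lookup is @pixels[y][x]. The explicit bounds check is there because Ruby arrays accept negative indices, which would otherwise quietly read from the opposite edge of the image.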
What Ruby folks usually do in such cases is wrap an existing C library for image manipulation in a Ruby library, such as RMagick (which wraps ImageMagick).