Hello, I'm a beginner and I was wondering how pictures, video, windows, buttons, and so on are represented visually on the screen. I'm not asking which toolkit they were made with (GTK or wxWidgets, for example); my question is what the fundamental idea is behind making the pixels come up the way they do, and what exactly a GUI library uses to put them on the screen.
Short answer to the last question: GUI libraries call operating system functions to draw UI elements, which in turn call the appropriate driver for the display hardware. The driver sends commands to the hardware by writing into its externally accessible ports, which are mapped to a special area of the computer's memory or I/O space (see Device Driver on Wikipedia).
At its most basic, the operating system exposes a set of base drawing APIs (GDI, GDI+, DirectX, OpenGL), which then call the display driver, which in turn updates the "video memory". Back in the DOS days you could update it manually, but that has become increasingly difficult with the large number of hardware configurations out there, so instead you instruct the video driver to do it for you.
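For instance, on Windows a program can ask GDI to set individual pixels and let the OS and driver take care of the video-memory update. This is only an illustrative sketch (a real application draws into its own window, not the raw screen device context):

```c
/* Minimal sketch (Win32 GDI, for illustration only): ask the OS
   drawing API to set pixels; the OS and the display driver handle
   the actual video-memory update. */
#include <windows.h>

int main(void)
{
    HDC screen = GetDC(NULL);               /* device context for the whole screen */
    if (screen == NULL)
        return 1;

    /* Draw a small red square near the top-left corner. */
    for (int y = 10; y < 20; y++)
        for (int x = 10; x < 20; x++)
            SetPixel(screen, x, y, RGB(255, 0, 0));

    ReleaseDC(NULL, screen);
    return 0;
}
```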
Now once it's in the video memory, the information gets sent sequentially to your monitor, scan line by scan line (i.e., row by row). If you update the video memory while it's being sent to your monitor, you get what's called tearing (the thing the v-sync setting in games avoids).
To avoid tearing, and to avoid locking the video memory during this upload, a technique called double buffering is usually used: there are two "video memory" buffers on your graphics card. While the monitor scans out the buffer you just finished, you write your new frame into the second buffer, and then the two are swapped, thus parallelizing the process.
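Here is a minimal sketch of the idea in C; the buffer size is made up, and the "tell the hardware to scan out the other buffer" step is reduced to a pointer swap:

```c
/* Double-buffering sketch: draw into the back buffer, then swap so
   the display only ever reads a finished frame. */
#include <stdint.h>

enum { WIDTH = 640, HEIGHT = 480 };

static uint32_t buffer_a[WIDTH * HEIGHT];
static uint32_t buffer_b[WIDTH * HEIGHT];

static uint32_t *front = buffer_a;  /* what the display is scanning out */
static uint32_t *back  = buffer_b;  /* what we are drawing into */

static void draw_frame(uint32_t *fb, uint32_t color)
{
    for (int i = 0; i < WIDTH * HEIGHT; i++)
        fb[i] = color;              /* clear the whole frame to one color */
}

static void swap_buffers(void)
{
    uint32_t *tmp = front;
    front = back;
    back = tmp;
    /* On real hardware, this is where the driver would be told to
       scan out the new front buffer, ideally on the next v-sync. */
}

int main(void)
{
    draw_frame(back, 0xFF00FF00u);  /* render the next frame off-screen */
    swap_buffers();                 /* make it visible in one step */
    return 0;
}
```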
Note: this is about the 2D part of it, since that's what you seem to be asking about. The 3D part is similar, but it has an additional layer: once you pass the vertices to your display driver, it projects them into "screen space" and rasterizes them scan line by scan line into video memory, which is later sent to your monitor.
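As a very rough sketch of that extra projection step (ignoring the matrices, clipping, and rasterization details that a real driver/GPU handles), assuming a simple pinhole-style projection with a made-up focal length:

```c
/* Tiny sketch of projecting a vertex from 3D "camera space" into 2D
   screen coordinates. Focal length and screen size are arbitrary. */
#include <stdio.h>

struct vec3 { float x, y, z; };
struct vec2 { float x, y; };

static struct vec2 project(struct vec3 v, float focal,
                           int screen_w, int screen_h)
{
    struct vec2 p;
    /* Perspective divide: points farther away (larger z) end up
       closer to the center of the screen. */
    p.x = (v.x / v.z) * focal + screen_w / 2.0f;
    p.y = (v.y / v.z) * focal + screen_h / 2.0f;
    return p;
}

int main(void)
{
    struct vec3 v = { 1.0f, 0.5f, 4.0f };
    struct vec2 p = project(v, 500.0f, 640, 480);
    printf("screen position: (%.1f, %.1f)\n", p.x, p.y);
    return 0;
}
```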
Overview
Usually, the raw pixel data is grouped into what is called a framebuffer. This is a one-dimensional array in which every value has the same size; how big each value is depends on the color space and color depth used. For example, a 32-bit RGBA framebuffer could be defined like this in C: unsigned int fb[width*height];, given that sizeof(unsigned int) is 4.
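As a small sketch of that layout (using uint32_t instead of unsigned int so the size is guaranteed, and an arbitrary RGBA byte order; real APIs differ on the channel order):

```c
/* Sketch of the framebuffer layout described above: one 32-bit RGBA
   value per pixel, stored row by row in a flat array. */
#include <stdint.h>
#include <stdlib.h>

static uint32_t pack_rgba(uint8_t r, uint8_t g, uint8_t b, uint8_t a)
{
    return ((uint32_t)r << 24) | ((uint32_t)g << 16) |
           ((uint32_t)b << 8)  |  (uint32_t)a;       /* one byte per channel */
}

int main(void)
{
    const int width = 320, height = 240;
    uint32_t *fb = calloc((size_t)width * height, sizeof *fb);
    if (fb == NULL)
        return 1;

    /* Pixel (x, y) lives at index y * width + x. */
    int x = 10, y = 20;
    fb[y * width + x] = pack_rgba(0, 255, 0, 255);   /* opaque green */

    free(fb);
    return 0;
}
```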
Hardware usually uses the RGB color space, both for hardware-accelerated graphics APIs like OpenGL and DirectX and for direct framebuffer access such as /dev/fb on Linux or GDI+ on Windows. Hardware decoding of movies usually relies on the YCbCr color space. Some image file formats store their data in color spaces other than RGB too, but the data is typically converted to RGB before being passed to the API in use.
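For illustration, converting a single YCbCr sample to RGB might look like this; the coefficients below are the common BT.601 full-range ones, and a real decoder would pick the exact coefficients from the stream's metadata:

```c
/* Sketch: convert one YCbCr sample (as used by most video codecs)
   to RGB using BT.601 full-range formulas. */
#include <stdint.h>
#include <stdio.h>

static uint8_t clamp_u8(float v)
{
    if (v < 0.0f)   return 0;
    if (v > 255.0f) return 255;
    return (uint8_t)(v + 0.5f);
}

static void ycbcr_to_rgb(uint8_t y, uint8_t cb, uint8_t cr,
                         uint8_t *r, uint8_t *g, uint8_t *b)
{
    float fy  = (float)y;
    float fcb = (float)cb - 128.0f;   /* chroma is centered on 128 */
    float fcr = (float)cr - 128.0f;

    *r = clamp_u8(fy + 1.402f    * fcr);
    *g = clamp_u8(fy - 0.344136f * fcb - 0.714136f * fcr);
    *b = clamp_u8(fy + 1.772f    * fcb);
}

int main(void)
{
    uint8_t r, g, b;
    ycbcr_to_rgb(76, 85, 255, &r, &g, &b);   /* approximately pure red */
    printf("RGB = (%d, %d, %d)\n", r, g, b);
    return 0;
}
```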
Pixel color
An individual pixel's color can be represented in these ways (the list is not complete), assuming the RGB color space:
Separate color channels (True-color)
True color uses one component for each of the three or four color channels, where each channel value represents the intensity of that color. If there is an alpha channel, it works the same way, but the meaning of the alpha is context-dependent. Usually alpha represents opacity, where the lowest value is 100% translucent and the highest value is 100% opaque. For example, (0,255,0,128) would represent roughly 50% translucent green, given RGBA true-color with 8 bits of color depth per channel.
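A small sketch of how such an alpha value is typically used, blending the example pixel over a background with the usual "source over" formula (this is one common convention, not the only one):

```c
/* Sketch of "source over" blending with an 8-bit alpha channel. */
#include <stdint.h>
#include <stdio.h>

/* Blend one channel: result = src*alpha + dst*(1 - alpha),
   with alpha scaled from 0..255 down to 0..1. */
static uint8_t blend_channel(uint8_t src, uint8_t dst, uint8_t alpha)
{
    return (uint8_t)((src * alpha + dst * (255 - alpha)) / 255);
}

int main(void)
{
    uint8_t src[4] = { 0, 255, 0, 128 };   /* ~50% translucent green */
    uint8_t dst[3] = { 255, 255, 255 };    /* opaque white background */

    uint8_t out[3];
    for (int i = 0; i < 3; i++)
        out[i] = blend_channel(src[i], dst[i], src[3]);

    printf("blended pixel: (%d, %d, %d)\n", out[0], out[1], out[2]);
    return 0;
}
```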
Palette
With palettes, the individual pixel value is an index into an array of colors, usually true-color (which does not imply 8 bits per channel, though). The range of the index determines how many colors can be displayed at once. 8-bit palette indices were common in the past, especially on commodity VGA hardware in the early 1990s: the hardware could display a subset of 256 colors at a time, out of a pool of 2^24 colors.
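A sketch of what the lookup amounts to in code, assuming a 256-entry RGBA palette:

```c
/* Expand indexed pixels into true-color pixels via a color table. */
#include <stdint.h>
#include <stdio.h>

static void depalettize(const uint8_t *indices, int count,
                        const uint32_t palette[256], uint32_t *out)
{
    for (int i = 0; i < count; i++)
        out[i] = palette[indices[i]];   /* each byte is just a table index */
}

int main(void)
{
    uint32_t palette[256] = { 0 };
    palette[0] = 0x000000FFu;           /* opaque black (RGBA) */
    palette[1] = 0xFF0000FFu;           /* opaque red   (RGBA) */

    uint8_t  indexed[4] = { 0, 1, 1, 0 };
    uint32_t truecolor[4];

    depalettize(indexed, 4, palette, truecolor);
    printf("first expanded pixel: 0x%08lX\n", (unsigned long)truecolor[0]);
    return 0;
}
```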
Color spaces
Color spaces are ways to represent colors and combinations of colors. I don't want to make this answer too long, and I think Wikipedia explains this better than I can anyway, so here is a link: Wikipedia: color space.
Which color space do I choose for my application?
It depends on what the application does with the pixel data; different color spaces serve different needs.

YCbCr is used for moving pictures because it defines gamma levels, which in turn are defined by standards such as NTSC and PAL. sRGB does the same thing for computer monitors, where you can select gamma/color profiles for your particular screen. These color spaces are handy when it is important that the color you perceive on screen is as close as possible to the color perceived on the final medium.

RGB is often used when gamma isn't important and the computer screen is the final medium. It is easy to work with, since the color space is linear. So for a computer game you would probably use RGB, but for an image manipulation program like Photoshop or GIMP you would support HSV/HSL/sRGB and CMYK as well.

When working with raw pixels in a framebuffer returned by an API, you can assume RGB unless you figure out otherwise. When working with moving pictures, assume YCbCr. Hardware supports a lot of different ways to encode the data, so make sure you pick a format with respect to hardware support and performance.
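For completeness, here is a sketch of the sRGB transfer function (the "gamma" step mentioned above), using the standard sRGB constants with channel values normalized to the [0,1] range:

```c
/* Sketch: convert a linear-light value to the non-linear sRGB value
   a monitor expects, and back. Compile with -lm for powf. */
#include <math.h>
#include <stdio.h>

static float linear_to_srgb(float c)
{
    return (c <= 0.0031308f) ? 12.92f * c
                             : 1.055f * powf(c, 1.0f / 2.4f) - 0.055f;
}

static float srgb_to_linear(float c)
{
    return (c <= 0.04045f) ? c / 12.92f
                           : powf((c + 0.055f) / 1.055f, 2.4f);
}

int main(void)
{
    float linear  = 0.5f;   /* physically "half as bright" */
    float encoded = linear_to_srgb(linear);
    printf("linear 0.5 -> sRGB %.3f -> linear %.3f\n",
           encoded, srgb_to_linear(encoded));
    return 0;
}
```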