Introduction to OpenGL Theory
This document explains the theory behind OpenGL, since several facets of OpenGL are commonly used with little understanding of why, or worse yet, used for no reason at all. It begins by explaining what a state machine is and why OpenGL is considered one.
A state machine, in the simplest terms, is a machine that stays in a persistent state until it receives a message that tells it to change. This is a very common and basic model of computing. OpenGL can be viewed as a finite state machine, as it has a predetermined and countable number of different states.
An OpenGL state is the exact configuration of the system at any particular time. Anything that changes a persistent option in OpenGL is a state change. You might enable or disable an option, perhaps using glEnable(GL_DEPTH_TEST), or set a specific value, such as with glLineWidth(), where you explicitly indicate a value. If you set a state, you know it will remain that way until you change it. It also means that you know exactly what effect you will have by enabling or disabling options, because they are well documented.
A common misconception is that the primitive drawing functions are state changes. Instead, they are output functions that tell the system to draw something to the screen with all relevant options of the current state applied to it. That state might include the current color, the current point size, or whether depth testing is currently enabled.
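The difference between a state change and an output function can be sketched with a toy model in plain C. This is not the OpenGL API: every name below (ToyState, toy_set_color() and so on) is hypothetical and only mimics the idea that setters change persistent configuration while draw calls merely read it.

```c
#include <stdbool.h>

/* Toy model of a state machine. NOT the OpenGL API; all names are hypothetical. */
typedef struct {
    bool  depth_test;   /* plays the role of glEnable(GL_DEPTH_TEST) */
    float line_width;   /* plays the role of glLineWidth()           */
    float color[3];     /* plays the role of the current color       */
} ToyState;

/* What a draw call "outputs": a snapshot of the state it was drawn with. */
typedef struct {
    float color[3];
    float line_width;
    bool  depth_tested;
} ToyLine;

/* Initial state: depth test off, line width 1, white. */
ToyState toy_default_state(void) {
    ToyState s = { false, 1.0f, { 1.0f, 1.0f, 1.0f } };
    return s;
}

/* State changes: they modify the persistent configuration and draw nothing. */
void toy_enable_depth_test(ToyState *s) { s->depth_test = true; }
void toy_set_line_width(ToyState *s, float w) { s->line_width = w; }
void toy_set_color(ToyState *s, float r, float g, float b) {
    s->color[0] = r; s->color[1] = g; s->color[2] = b;
}

/* Output function: reads the current state and never modifies it. */
ToyLine toy_draw_line(const ToyState *s) {
    ToyLine out;
    out.color[0] = s->color[0];
    out.color[1] = s->color[1];
    out.color[2] = s->color[2];
    out.line_width = s->line_width;
    out.depth_tested = s->depth_test;
    return out;
}
```

In this model, calling toy_set_color() once affects every later toy_draw_line() until the color is set again, while calling toy_draw_line() any number of times leaves the state untouched.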
OpenGL Matrix Stacks
The basic idea is that OpenGL maintains a stack of matrices where the top of the stack is the current matrix, and every time you use a transformation it is applied to the current matrix.
- glTranslatef(x, y, z);
- glRotatef(a, x, y, z);
- glScalef(x, y, z);
The transformation functions apply your specified values to the current matrix. Thus, if you rotate by 30 degrees and draw something, then everything you draw after that will also be rotated by 30 degrees; to counter this you would need to rotate by -30 degrees about the same axis. If you have applied several transformations, then to undo them you would need to apply their inverses in the reverse order of how they were applied.
This does not mean that you need to keep track of all your transformations to undo them back to a previous state. That's where the OpenGL matrix stack functions come in: you can ‘save’ and ‘load’ previous matrices.
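The reverse-order rule can be checked with a small matrix sketch in plain C. It assumes OpenGL's column-major layout and that each transformation is multiplied onto the right of the current matrix; the Mat4 type and the mat4_* and toy_apply() helpers are hypothetical names for this illustration, and the rotation takes a precomputed cosine and sine so that no math library is needed.

```c
/* 4x4 matrix in column-major order: element (row r, column c) is m[c*4 + r]. */
typedef struct { double m[16]; } Mat4;

Mat4 mat4_identity(void) {
    Mat4 r = {{0}};
    r.m[0] = r.m[5] = r.m[10] = r.m[15] = 1.0;
    return r;
}

/* Returns the product a * b. */
Mat4 mat4_mul(Mat4 a, Mat4 b) {
    Mat4 r = {{0}};
    for (int c = 0; c < 4; ++c)
        for (int row = 0; row < 4; ++row)
            for (int k = 0; k < 4; ++k)
                r.m[c*4 + row] += a.m[k*4 + row] * b.m[c*4 + k];
    return r;
}

Mat4 mat4_translation(double x, double y, double z) {
    Mat4 r = mat4_identity();
    r.m[12] = x; r.m[13] = y; r.m[14] = z;
    return r;
}

/* Rotation about the z axis, given the cosine and sine of the angle. */
Mat4 mat4_rotation_z(double c, double s) {
    Mat4 r = mat4_identity();
    r.m[0] = c;  r.m[1] = s;
    r.m[4] = -s; r.m[5] = c;
    return r;
}

/* Mimics a transformation call: current = current * t. */
void toy_apply(Mat4 *current, Mat4 t) { *current = mat4_mul(*current, t); }

/* Returns 1 if every element is within eps of the identity matrix. */
int mat4_near_identity(Mat4 a, double eps) {
    Mat4 id = mat4_identity();
    for (int i = 0; i < 16; ++i) {
        double d = a.m[i] - id.m[i];
        if (d < -eps || d > eps) return 0;
    }
    return 1;
}
```

With cos 30° ≈ 0.8660254 and sin 30° = 0.5, applying a translation by (2, 0, 0), then the rotation, then the inverse rotation (same cosine, negated sine), then a translation by (-2, 0, 0) brings the current matrix back to the identity. Applying the two inverses in the other order leaves a leftover translation behind.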
glPushMatrix() copies the top of the stack, so directly after a push the top two entries of the stack are identical, equating to a ‘save and continue.’
glPopMatrix() is the equivalent of ‘load’. It gets rid of the top matrix in the stack, and makes the matrix underneath it the new current matrix. So a single glPopMatrix() can immediately undo the result of several or even hundreds of transformations.
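The save and load behaviour can be modelled in a few lines of plain C. This is only a sketch of the idea, not how any OpenGL implementation is written: ToyMatrixStack and the toy_* functions are hypothetical, the depth of 32 is an arbitrary choice, and only translation is implemented to keep it short.

```c
#include <string.h>

#define TOY_STACK_DEPTH 32

/* Toy matrix stack: each entry is a column-major 4x4 matrix. */
typedef struct {
    double m[TOY_STACK_DEPTH][16];
    int    top;                    /* index of the current matrix */
} ToyMatrixStack;

/* Overwrites the current matrix with the identity matrix. */
void toy_load_identity(ToyMatrixStack *s) {
    double *c = s->m[s->top];
    memset(c, 0, 16 * sizeof(double));
    c[0] = c[5] = c[10] = c[15] = 1.0;
}

void toy_init(ToyMatrixStack *s) {
    s->top = 0;
    toy_load_identity(s);
}

/* current = current * T(x, y, z): only the last column changes. */
void toy_translate(ToyMatrixStack *s, double x, double y, double z) {
    double *c = s->m[s->top];
    for (int r = 0; r < 4; ++r)
        c[12 + r] += x * c[r] + y * c[4 + r] + z * c[8 + r];
}

/* 'Save': duplicate the top entry. Returns 0 if the stack is full. */
int toy_push(ToyMatrixStack *s) {
    if (s->top + 1 >= TOY_STACK_DEPTH) return 0;
    memcpy(s->m[s->top + 1], s->m[s->top], 16 * sizeof(double));
    s->top++;
    return 1;
}

/* 'Load': discard the top entry. Returns 0 if only one entry is left. */
int toy_pop(ToyMatrixStack *s) {
    if (s->top == 0) return 0;
    s->top--;
    return 1;
}
```

Right after toy_push() the top two entries are identical. Any number of toy_translate() calls then change only the top copy, and one toy_pop() discards all of them at once.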
Here is an example of drawing a robot with a hierarchical stack. To draw the robot, you would first perform a translation to where the robot's torso is supposed to be. You could then draw the body of the robot around this point. From there you save the matrix and translate to the position of the first arm. Here, you draw the arm, and again save the matrix. Then translate to the hand, draw the hand, and save the matrix.
Translate to the first finger (the robot has three fingers, it’s a very advanced robot :P) and draw that. Then load the previous matrix, which puts you back where you drew the hand. From here you save again (loading destroys the save point, so you need to remember this step) and go to the second finger. Repeat for all the fingers on the hand. Then load again and you are back at the arm. Another load and you are back at the body.
From here, you can go on and draw the other arm, the head and the legs with the same procedure, without ever having to apply an ‘undo’ matrix to get back to a previous point. Any time there is a point in your scene that you will want to get back to, you just push to save it, and when you pop the stack you are back there again.
The following illustrates the hierarchical stack method described in the robot example above:
    //...
    glTranslatef(x,y,z);              // 1) position for drawing torso
    DrawTorso();                      // 2) draw torso
    glPushMatrix();                   // 3) save torso matrix
        glTranslatef(x,y,z);          // 4) position for drawing arm
        DrawArm();                    // 5) draw arm
        glPushMatrix();               // 6) save arm matrix
            glTranslatef(x,y,z);      // 7) position for drawing hand
            DrawHand();               // 8) draw hand
            glPushMatrix();           // 9) save hand matrix
                glTranslatef(x,y,z);  // 10) position for drawing first finger
                DrawFinger();         // 11) draw first finger
            glPopMatrix();            // 12) load hand matrix
            glPushMatrix();           // 13) save hand matrix again
                glTranslatef(x,y,z);  // 14) position for drawing second finger
                DrawFinger();         // 15) draw second finger
            glPopMatrix();            // 16) load hand matrix
            glPushMatrix();           // 17) save hand matrix again
                glTranslatef(x,y,z);  // 18) position for drawing third finger
                DrawFinger();         // 19) draw third finger
            glPopMatrix();            // 20) load hand matrix
        glPopMatrix();                // 21) load arm matrix
    glPopMatrix();                    // 22) load torso matrix
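Since the three finger steps repeat the same save, translate, draw, load pattern, they fit naturally in a loop. The sketch below is a stand-in that runs without OpenGL: it assumes only 2D translations, so each ‘matrix’ shrinks to an offset, and it logs where each part would be drawn instead of drawing it. All types, functions and offsets are made up for the example, and bounds checks are left out for brevity.

```c
#define TOY_MAX_DEPTH 32
#define TOY_MAX_PARTS 16

typedef struct { double x, y; } Vec2;

/* Stack of offsets standing in for the modelview matrix stack. */
typedef struct {
    Vec2 entries[TOY_MAX_DEPTH];
    int  top;
} OffsetStack;

/* Record of which part was 'drawn' at which position. */
typedef struct {
    const char *names[TOY_MAX_PARTS];
    Vec2        at[TOY_MAX_PARTS];
    int         count;
} DrawLog;

void toy_push(OffsetStack *s) { s->entries[s->top + 1] = s->entries[s->top]; s->top++; }
void toy_pop(OffsetStack *s)  { s->top--; }

void toy_translate(OffsetStack *s, double dx, double dy) {
    s->entries[s->top].x += dx;
    s->entries[s->top].y += dy;
}

void toy_draw(DrawLog *log, const OffsetStack *s, const char *name) {
    log->names[log->count] = name;
    log->at[log->count] = s->entries[s->top];
    log->count++;
}

/* Torso, one arm, one hand and three fingers, with made-up offsets. */
void toy_draw_robot(OffsetStack *s, DrawLog *log) {
    s->top = 0;                          /* start from a clean stack */
    s->entries[0].x = 0.0;
    s->entries[0].y = 0.0;
    log->count = 0;

    toy_translate(s, 0.0, 10.0);
    toy_draw(log, s, "torso");
    toy_push(s);                         /* save torso */
    toy_translate(s, 3.0, 2.0);
    toy_draw(log, s, "arm");
    toy_push(s);                         /* save arm */
    toy_translate(s, 4.0, 0.0);
    toy_draw(log, s, "hand");
    for (int i = 0; i < 3; ++i) {
        toy_push(s);                     /* save hand */
        toy_translate(s, 1.0, (double)(i - 1));
        toy_draw(log, s, "finger");
        toy_pop(s);                      /* back at the hand */
    }
    toy_pop(s);                          /* back at the arm */
    toy_pop(s);                          /* back at the torso */
}
```

With these offsets the fingers are logged at (8, 11), (8, 12) and (8, 13), and after the last pop the current offset is the torso's (0, 10) again, ready for the next limb.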
That’s the OpenGL matrix stack. Well, almost. You see, OpenGL actually has 3 matrix stacks.
3 Matrix Stacks of OpenGL
The one I have just been manipulating is called the modelview matrix stack, and it holds all the transformations applied to the scene. It is the most commonly used stack, and is also the deepest (more on this later). The other 2 stacks are the projection matrix stack and the texture matrix stack. The projection matrix stack is essentially the camera's lens configuration. If you have the correct matrix, this is where you can set up the camera to look like a fish eye, or to bulge in weird places, and so on. Mostly, though, you just want to set an aspect ratio and a viewing volume, and for this there are helper functions that build the projection matrix for you (gluOrtho2D(), gluPerspective() and so on). Positioning the camera itself is conventionally done on the modelview stack, for example with gluLookAt(). The projection stack can be used for some interesting effects if you know what you are doing, but it is quite a bit shallower than the modelview stack. The final stack is the texture matrix stack, which is applied to texture coordinates to perform interesting texture effects. It is not used very often, and is also shallower than the modelview stack.
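To give a feel for what a projection helper does, here is a sketch in plain C that builds the orthographic matrix documented for glOrtho() with near = -1 and far = 1, which is the volume gluOrtho2D() is documented to use. The function names toy_ortho2d() and toy_project() are hypothetical, and the matrix is stored column-major as OpenGL does.

```c
/* Fills m (16 doubles, column-major) with a 2D orthographic projection,
   using near = -1 and far = 1 as gluOrtho2D() is documented to do. */
void toy_ortho2d(double m[16], double left, double right,
                 double bottom, double top) {
    const double near_z = -1.0, far_z = 1.0;
    for (int i = 0; i < 16; ++i) m[i] = 0.0;
    m[0]  =  2.0 / (right - left);
    m[5]  =  2.0 / (top - bottom);
    m[10] = -2.0 / (far_z - near_z);
    m[12] = -(right + left) / (right - left);
    m[13] = -(top + bottom) / (top - bottom);
    m[14] = -(far_z + near_z) / (far_z - near_z);
    m[15] =  1.0;
}

/* Transforms the point (x, y, 0, 1) and stores the resulting x and y.
   For an orthographic matrix w stays 1, so no divide is needed. */
void toy_project(const double m[16], double x, double y, double out[2]) {
    out[0] = m[0] * x + m[4] * y + m[12];
    out[1] = m[1] * x + m[5] * y + m[13];
}
```

For example, a projection set up with left = 0, right = 800, bottom = 0, top = 600 sends the corner (0, 0) to (-1, -1), the opposite corner (800, 600) to (1, 1), and the center (400, 300) to the origin, which is the range OpenGL then maps onto the viewport.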
When I refer to stack size, I mean depth: the OpenGL specification says that a conforming implementation must support stacks of at least a certain minimum depth. The modelview stack has to be able to hold at least 32 matrices. As long as you remember to always pop after a push, this is plenty. I have personally never gone above about 6 deep. The other 2 stacks are quite a bit shallower, with a guaranteed minimum of only 2 matrices each. Your implementation may allow more, and you can query the actual limits with glGetIntegerv() using GL_MAX_MODELVIEW_STACK_DEPTH, GL_MAX_PROJECTION_STACK_DEPTH and GL_MAX_TEXTURE_STACK_DEPTH (there is some homework for you ;) ).
On a final note, what about the situation where you want to save your current matrix for future use, but also want to do a complete reset of the transformations and go down a different path? Well, that’s where the function glLoadIdentity() comes into play. This simply sets the current matrix to the identity matrix, and counts as a complete reset of all transformations. glPushMatrix() followed by glLoadIdentity() is your save and restart option for matrices ;)
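One way to convince yourself that a drawing routine stays within a given depth is to trace its pushes and pops. The helper below is a hypothetical sketch in plain C, not part of OpenGL: it reads a string in which '[' stands for a push and ']' for a pop, and reports the peak number of stack entries in use.

```c
/* Traces a sequence of pushes ('[') and pops (']').
   Returns the peak number of entries in use, or -1 if the sequence would
   exceed `limit` entries, pop the last remaining entry, or end unbalanced.
   Other characters are ignored. */
int toy_peak_stack_depth(const char *ops, int limit) {
    int depth = 1;   /* the stack always holds the current matrix */
    int peak  = 1;
    for (const char *p = ops; *p != '\0'; ++p) {
        if (*p == '[') {
            if (++depth > limit) return -1;   /* a real push would overflow  */
            if (depth > peak) peak = depth;
        } else if (*p == ']') {
            if (--depth < 1) return -1;       /* a real pop would underflow  */
        }
    }
    return depth == 1 ? peak : -1;
}
```

A routine that pushes three levels deep and pushes and pops three times at the innermost level traces as "[[[][][]]]" and peaks at 4 entries, comfortably inside a limit of 32. The same sequence is rejected with a limit of 2.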
Well, I hope this helps a bit with OpenGL understanding. If you have any questions or notice any errors then feel free to contact me on the forums, and have fun playing around with OpenGL.