PS2 Linux Programming
Simple Vertex Transformation With VU1
Introduction
In this tutorial, VU1 is used to perform some
simple vertex transformation calculations on the vertices of a textured sprite.
This lays the foundation for using VU1
for more advanced vertex transformations.
In the example code, main.cpp is very similar to the
previous two samples, except that instead of putting the vertices in the GS
packet in their final 12:4 fixed point format, they are inserted into
the packet as floating point numbers that represent the untransformed positions
of the vertices. This is achieved with the inline function AddFloatVert() (in
file AddRegisters.h), which packs the vertex data into a qword. (The packing format
is rather strange, but it serves the purposes of this
tutorial.)
The real function of the VU is now beginning to be
uncovered. The VU code which operates on the vertex data is contained in the
file vu1code.vcl and is repeated below for clarity.
StartVert .equ 3
NumVerts .equ 4
.init_vf_all
.init_vi_all
.syntax new
.vu
--enter
--endenter
    iaddiu    iVert, vi00, 0              ; Start vertex counter
    iaddiu    iVertPtr, vi00, StartVert   ; Point to the first vert
    iaddiu    iNumVerts, vi00, NumVerts   ; Load the loop end condition
    loi       2048.0                      ; Put 2048 into the I register

loop:                                     ; For each vertex
--LoopCS 4,0
    lq.xy     fVert, 0(iVertPtr)          ; Load the XY components of the vert
    add.xy    fVert, fVert, i             ; Add 2048 to the XY components
    ftoi4.xy  fVert, fVert                ; Convert to 12:4 fixed point
    sq.xy     fVert, 0(iVertPtr)          ; Store the newly converted data
    iaddiu    iVert, iVert, 1             ; Increment the vertex counter
    iaddiu    iVertPtr, iVertPtr, 3       ; Increment the vertex pointer
    ibne      iVert, iNumVerts, loop      ; Branch back to "loop" label if
                                          ; iVert and iNumVerts are not equal
    xgkick    vi00                        ; Kick the data at VUMEM location 0
                                          ; to the GS
--exit
--endexit
.end
The VU code loops through all 4 vertices, adds 2048
to the X and Y components of each, converts them to 12:4 fixed point and stores the result back into the
correct place in the GS packet. It is necessary to unpack the data every
frame in this sample since the transformed data is being saved over the
original data.
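To make the arithmetic concrete, the loop can be modelled on the host side in C++. This is only a sketch under stated assumptions: QWord, LoadFloat(), StoreFloat(), FloatToFixed4() and TransformVerts() are hypothetical names that do not appear in the sample code, X and Y are assumed to sit in the first two 32-bit fields of each vertex qword, and ftoi4 is assumed to behave as a multiply by 16 followed by truncation toward zero.

```cpp
#include <array>
#include <cstdint>
#include <cstring>
#include <vector>

// One 128-bit quadword of VU data memory, viewed as four 32-bit fields.
using QWord = std::array<std::uint32_t, 4>;

constexpr int kStartVert = 3;  // mirrors "StartVert .equ 3"
constexpr int kNumVerts  = 4;  // mirrors "NumVerts .equ 4"
constexpr int kStride    = 3;  // the VU loop advances iVertPtr by 3 qwords

// Hypothetical helper: write a float's bit pattern into one field of a qword.
void StoreFloat(QWord& q, int field, float v) {
    std::memcpy(&q[field], &v, sizeof v);
}

// Hypothetical helper: read one field of a qword back as a float.
float LoadFloat(const QWord& q, int field) {
    float v;
    std::memcpy(&v, &q[field], sizeof v);
    return v;
}

// Assumed model of ftoi4: scale by 16 (4 fractional bits), truncate toward zero.
std::int32_t FloatToFixed4(float v) {
    return static_cast<std::int32_t>(v * 16.0f);
}

// Host-side model of the VU loop: for each vertex, offset X and Y by 2048,
// convert to 12:4 fixed point and overwrite the float data in place.
void TransformVerts(std::vector<QWord>& packet) {
    int vertPtr = kStartVert;
    for (int vert = 0; vert != kNumVerts; ++vert) {
        for (int field = 0; field < 2; ++field) {  // .xy only
            const float offset = LoadFloat(packet[vertPtr], field) + 2048.0f;
            const std::int32_t fixed = FloatToFixed4(offset);
            std::memcpy(&packet[vertPtr][field], &fixed, sizeof fixed);
        }
        vertPtr += kStride;
    }
}
```

Because the fixed point result overwrites the float in the same field, running TransformVerts() a second time on the same packet would reinterpret integer bit patterns as floats, which is why the original data has to be supplied again every frame.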
The exact operation of the VU code is well documented and
should be fairly clear. It is left to the reader to look up what each
instruction does in the VU Users Manual and piece together the operation of the
code. Note that the instruction loi will not be found there, as it is a pseudo
instruction which operates on the I register of the VU. The I register is a 32-bit
single precision floating point register in which immediate values are stored.
No stalls due to data dependency are generated for the I register. All that
loi does is set the I register to the specified constant
(in this case 2048).
loop: is a label that is used in the ibne instruction
to tell the compiler where to jump to if the condition is true. Just after the
loop: label there is "--LoopCS 4,0". This is a VCL loop unrolling
command: instead of the programmer repeating the same code many times in a row, VCL can
unroll the loop itself. "--LoopCS 4,0" says that a minimum of 4
runs around the loop will be performed. The second parameter is ignored by VCL
and is generally set to zero.
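For readers unfamiliar with the term, loop unrolling in general can be illustrated in C++. This is a generic sketch and not a picture of what VCL emits: the function names AddOffsetRolled() and AddOffsetUnrolled4() are hypothetical, and the unroll factor of four is chosen only to echo the minimum count in "--LoopCS 4,0".

```cpp
#include <cstddef>
#include <vector>

// Rolled form: one element per trip, with a loop test on every trip.
void AddOffsetRolled(std::vector<float>& v, float offset) {
    for (std::size_t i = 0; i < v.size(); ++i) {
        v[i] += offset;
    }
}

// Unrolled by four: the body is repeated so that the loop test and the
// counter update are paid once per four elements instead of once per element.
void AddOffsetUnrolled4(std::vector<float>& v, float offset) {
    const std::size_t n = v.size();
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        v[i]     += offset;
        v[i + 1] += offset;
        v[i + 2] += offset;
        v[i + 3] += offset;
    }
    // Remainder loop for counts that are not a multiple of four.
    for (; i < n; ++i) {
        v[i] += offset;
    }
}
```

Both functions produce the same result. The unrolled form trades code size for fewer branches and gives a scheduler more independent instructions to interleave, which is the kind of freedom an optimiser gains when it is told a guaranteed minimum number of iterations.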
Note that both floating point and integer variables
are being used. All of the floating point variables are prefixed with f and all
the integer variables are prefixed with i. A number of constants are also used, e.g.
"StartVert .equ 3". This is similar to a macro in C/C++:
the pre-processor simply replaces "StartVert" with "3".
It is worth looking at the vsm code that VCL produces (vu1code.vsm). It
is possible to see how VCL has expanded the code into two streams and has tried
to make the loop as efficient as possible at the expense of the parts outside
of the loop.
In this tutorial a simple VU1 micro program has
been used to transform the vertices of a textured sprite made from a triangle
strip.
Dr Henry S Fortuna
University of Abertay
Dundee