First big optimisation in sprite functions.

I have been working for a while on optimising some functions that copy tables of data. I found out that the C compiler wasn't optimizing them at all. Here is the original piece of code:

    counter = data[frame].spriteNum;
    for(i=0; i<counter; i++) {
        //x = data[frame].data[i].nameLow & 0x0f;
        //y = data[frame].data[i].nameLow >> 4;

        spriteData.data[i+offset].HPos =
            data[frame].data[i].HPos + 0x76;
        spriteData.data[i+offset].VPos =
            data[frame].data[i].VPos + 0x80;
        spriteData.data[i+offset].nameLow =
            data[frame].data[i].nameLow;
        spriteData.data[i+offset].priority =
            data[frame].data[i].priority;
        spriteData.data[i+offset].color =
            data[frame].data[i].color;
    }

    // set big sprite
    spriteData.prop[offset].properties = 0xaa;
    spriteData.prop[offset+1].properties = 0xaa;

This was consuming almost 25% of the CPU time between each VBlank. The new code runs about 30 times faster:

    heroPreparedSpriteData *myData;
    OBJECTData *myObjectData;
    OBJECTData *currentSpriteData;
    OBJECTProp *currentSpriteProp;

    // init with base address
    myData = data;
    // set to current frame
    for(i=0; i<frame; i++) {
        myData++;
    }

    // init with base address
    currentSpriteData = (OBJECTData*) &spriteData.data;
    currentSpriteProp = (OBJECTProp*) &spriteData.prop;
    // set to current frame
    for(i=0; i<offset; i++) {
        currentSpriteData++;
    }

    counter = myData->spriteNum;

    // init start of the data array
    // we assume we always start at index 0
    myObjectData = (OBJECTData*) &(myData->data);

    for(i=0; i<counter; i++) {
        currentSpriteData->HPos = myObjectData->HPos + 0x76;
        currentSpriteData->VPos = myObjectData->VPos + 0x80;
        currentSpriteData->nameLow = myObjectData->nameLow;
        currentSpriteData->priority = myObjectData->priority;
        currentSpriteData->color = myObjectData->color;

        // advance both pointers to the next sprite entry
        myObjectData++;
        currentSpriteData++;
    }

    // set big sprite for the first 8 sprites
    currentSpriteProp->properties = 0xaa;
    currentSpriteProp++;
    currentSpriteProp->properties = 0xaa;

The C compiler seems to handle indexed table access really badly.
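To make the comparison easy to try on a desktop compiler, here is a minimal sketch of both variants as standalone functions. The struct layouts, the `MAX_SPRITES` and `SPRITES_PER_FRAME` constants, and the function names `copy_indexed` and `copy_walking` are assumptions for illustration; the real definitions are not shown in this post and most likely use different field widths.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layouts: the post does not show the real definitions. */
#define MAX_SPRITES       128
#define SPRITES_PER_FRAME 8

typedef struct {
    uint8_t HPos, VPos, nameLow, priority, color;
} OBJECTData;

typedef struct {
    uint8_t    spriteNum;
    OBJECTData data[SPRITES_PER_FRAME];
} heroPreparedSpriteData;

/* Indexed form: a naive compiler may recompute
   frame * sizeof(frame struct) + i * sizeof(OBJECTData) on every access. */
void copy_indexed(OBJECTData *dst, const heroPreparedSpriteData *frames,
                  uint8_t frame, uint8_t offset)
{
    uint8_t i;
    uint8_t counter = frames[frame].spriteNum;

    assert(counter <= SPRITES_PER_FRAME && offset + counter <= MAX_SPRITES);
    for (i = 0; i < counter; i++) {
        dst[i + offset].HPos     = (uint8_t)(frames[frame].data[i].HPos + 0x76);
        dst[i + offset].VPos     = (uint8_t)(frames[frame].data[i].VPos + 0x80);
        dst[i + offset].nameLow  = frames[frame].data[i].nameLow;
        dst[i + offset].priority = frames[frame].data[i].priority;
        dst[i + offset].color    = frames[frame].data[i].color;
    }
}

/* Pointer-walking form: addresses only ever advance by one element. */
void copy_walking(OBJECTData *dst, const heroPreparedSpriteData *frames,
                  uint8_t frame, uint8_t offset)
{
    uint8_t i, counter;
    const OBJECTData *src;

    for (i = 0; i < frame; i++)  frames++;   /* step to the current frame */
    for (i = 0; i < offset; i++) dst++;      /* step to the first target sprite */

    counter = frames->spriteNum;
    src = frames->data;
    assert(counter <= SPRITES_PER_FRAME);
    for (i = 0; i < counter; i++) {
        dst->HPos     = (uint8_t)(src->HPos + 0x76);
        dst->VPos     = (uint8_t)(src->VPos + 0x80);
        dst->nameLow  = src->nameLow;
        dst->priority = src->priority;
        dst->color    = src->color;
        src++;
        dst++;
    }
}
```

Both functions should fill the destination identically; only the way the addresses are computed differs. Whether the second form is faster depends entirely on the compiler and target.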

So now I’m going to finish writing the sprite routines, since I now have acceptable performance. Why only acceptable? Because I’m sure there is a much more efficient way to do all this. It’s basically just copying data, so I should keep myData in ROM, DMA it to RAM, and from there just update some values like HPos and VPos.
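The "bulk copy, then patch" idea could look roughly like the sketch below. It is not SNES code: `memcpy` only stands in for the DMA transfer so the logic can be checked on any machine, and the `OBJECTData` layout and the function name `copy_frame_bulk` are assumptions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: the real OBJECTData is not shown in the post. */
typedef struct {
    uint8_t HPos, VPos, nameLow, priority, color;
} OBJECTData;

/* Copy `count` prepared sprites in one block, then fix up the positions.
   On the console, the memcpy below is where a DMA transfer would go. */
void copy_frame_bulk(OBJECTData *dst, const OBJECTData *src, uint8_t count)
{
    uint8_t i;

    assert(dst != NULL && src != NULL);
    memcpy(dst, src, (size_t)count * sizeof(OBJECTData));

    /* Only the two position fields still need per-sprite work. */
    for (i = 0; i < count; i++) {
        dst[i].HPos = (uint8_t)(dst[i].HPos + 0x76);
        dst[i].VPos = (uint8_t)(dst[i].VPos + 0x80);
    }
}
```

The per-sprite loop shrinks from five field copies to two additions, and the rest is left to the block transfer.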

I’ll keep you updated anyway …

See ya, Lint

This entry was posted on Tuesday, September 16th, 2008 at 4:14 pm and is filed under Snes, Software dev.

5 Responses to “First big optimisation in sprite functions.”

sylvainulg Says:
September 17th, 2008 at 10:29 am
Indeed, I’m a bit curious to know why you haven’t used a DMA transfer …
Another trick you could use to boost performance would be to keep a bitvector of which sprites have changed and which have not (unless you mostly have sprites that need to follow a scrolling screen, in which case their coordinates require an update almost every frame and it might not be worth the hassle of flagging updated sprites).
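A minimal sketch of the bitvector idea from the comment above, assuming 128 sprites and a shadow copy of the sprite positions. The `SpritePos` type and the names `mark_dirty`, `is_dirty` and `flush_dirty` are made up for illustration and do not come from the post.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_SPRITES 128

/* Hypothetical, reduced sprite record: positions only. */
typedef struct {
    uint8_t HPos;
    uint8_t VPos;
} SpritePos;

/* One bit per sprite: 1 means "changed since the last flush". */
static uint8_t dirty[MAX_SPRITES / 8];

void mark_dirty(uint8_t index)
{
    assert(index < MAX_SPRITES);
    dirty[index >> 3] |= (uint8_t)(1u << (index & 7));
}

int is_dirty(uint8_t index)
{
    assert(index < MAX_SPRITES);
    return (dirty[index >> 3] >> (index & 7)) & 1;
}

/* Copy only the flagged sprites, clear all flags, return how many were copied. */
uint8_t flush_dirty(SpritePos *dst, const SpritePos *src)
{
    uint8_t i, copied = 0;

    for (i = 0; i < MAX_SPRITES; i++) {
        if (is_dirty(i)) {
            dst[i] = src[i];
            copied++;
        }
    }
    for (i = 0; i < MAX_SPRITES / 8; i++) {
        dirty[i] = 0;
    }
    return copied;
}
```

As the comment notes, this only pays off when few sprites change per frame; if everything scrolls, every bit ends up set and the bookkeeping is pure overhead.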
sylvainulg Says:
September 17th, 2008 at 10:31 am
also, you could easily replace

    // init with base address
    myData = data;
    // set to current frame
    for(i=0; i<frame; i++) {
        myData++;
    }

with

    myData = data+frame;

that’s the very definition of adding an integer to a pointer.
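The equivalence claimed in this comment can be checked with a small sketch. The `Frame` struct and the three function names are placeholders, not types from the post; the last function spells out the multiplication by the element size that `base + frame` implies.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical frame record; only its size matters here. */
typedef struct {
    unsigned char spriteNum;
    unsigned char payload[40];
} Frame;

/* Reach frame N by incrementing the pointer N times (the loop from the post). */
const Frame *frame_by_stepping(const Frame *base, unsigned frame)
{
    unsigned i;

    assert(base != NULL);
    for (i = 0; i < frame; i++) {
        base++;
    }
    return base;
}

/* Reach frame N with pointer arithmetic. */
const Frame *frame_by_addition(const Frame *base, unsigned frame)
{
    assert(base != NULL);
    return base + frame;
}

/* The byte offset hidden inside `base + frame`. */
size_t frame_byte_offset(unsigned frame)
{
    return (size_t)frame * sizeof(Frame);
}
```

Both functions return the same address. The difference is only in how it is computed: `frame` additions of a constant in the first case, one multiplication by `sizeof(Frame)` in the second.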
lint Says:
September 17th, 2008 at 1:47 pm
Yes and no … This really depends on the CPU and the compiler. When I write myData = data+frame; it calls a C function that multiplies frame by the size of the data, since there is no multiplication instruction available on the 65816. Now it’s true there are 3 specialized registers on the SNES CPU that allow multiplication. Maybe I can try that. The multiplication function offered by the compiler is really not efficient at all.
sylvainulg Says:
September 22nd, 2008 at 3:06 pm
even less efficient than adding N times the value X to get N*X ?
lint Says:
September 22nd, 2008 at 3:08 pm
yes really … totally inefficient function.
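For reference, the two software approaches to N*X touched on in this thread can be sketched as follows. The repeated addition is what the pointer-stepping loop amounts to; the shift-and-add routine is a standard alternative, shown here only for comparison. It is not the compiler's library routine (which is not shown in this post) and it does not use the SNES multiplication registers. Both function names are made up.

```c
#include <assert.h>
#include <stdint.h>

/* N*X by adding X to the result N times: cost grows linearly with N. */
uint16_t mul_repeated_add(uint8_t n, uint8_t x)
{
    uint16_t result = 0;

    while (n != 0) {
        result = (uint16_t)(result + x);
        n--;
    }
    return result;
}

/* N*X by shift-and-add: at most 8 iterations for an 8-bit N. */
uint16_t mul_shift_add(uint8_t n, uint8_t x)
{
    uint16_t result = 0;
    uint16_t addend = x;

    while (n != 0) {
        if (n & 1) {
            result = (uint16_t)(result + addend);  /* this bit of N is set */
        }
        addend = (uint16_t)(addend << 1);          /* next power of two times X */
        n >>= 1;
    }
    return result;
}
```

For small N, such as a frame index of a handful of frames, the repeated addition needs very few iterations, which may be part of why the stepping loop performed acceptably here.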