Small display lists?

I have a puzzle game underway in which I draw sprites as OpenGL quads and small polygons. Many of these can be grouped into display lists. I’ve been experimenting with ways to speed this up and have ended up with a desperate, clearly unoptimized scheme: I load each sprite’s vertices (somewhere between 3 and 10 apiece) into a short vertex array, wrap each array in its own display list, and then group those short lists into longer display lists. This causes a mad chain of executions: each long list calls the short lists 24 times, and each short list in turn renders one short vertex array. The long lists also contain quite a few transformation calls, which I suppose benefit from living in a display list…

My question is, what would be a good balance here? I can’t put everything into a single vertex array, but would it be quicker to issue all those vertices directly into the large display list with plain glVertex2f calls? Or should I keep the nested display lists as they are? Is there much overhead in calling a display list that holds only 4 vertices?
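To make the "flat" option concrete, here is a rough sketch in C of one way it might look on the CPU side: each sprite's vertices are copied into one buffer with the sprite's offset already applied, so the nested lists go away. The Sprite struct, MAX_SPRITE_VERTS and flatten_sprites are hypothetical names I made up for this post, and it assumes the only per-sprite transform is a 2D translation, which is not true of all my sprites.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SPRITE_VERTS 10   /* my sprites have 3 to 10 vertices each */

/* Hypothetical sprite record: local-space vertices plus a translation. */
typedef struct {
    float  xy[MAX_SPRITE_VERTS][2];
    size_t count;
    float  tx, ty;            /* assumed to be the sprite's only transform */
} Sprite;

/* Copies every sprite's vertices, already translated, into `out` as
   consecutive x,y pairs. `capacity` is the number of vertices (not floats)
   that `out` can hold. Returns the number of vertices written, or 0 if
   `out` is too small to hold them all. */
size_t flatten_sprites(const Sprite *sprites, size_t n,
                       float *out, size_t capacity)
{
    size_t written = 0;
    for (size_t i = 0; i < n; ++i) {
        const Sprite *s = &sprites[i];
        assert(s->count <= MAX_SPRITE_VERTS);
        if (written + s->count > capacity)
            return 0;
        for (size_t v = 0; v < s->count; ++v) {
            out[2 * written]     = s->xy[v][0] + s->tx;
            out[2 * written + 1] = s->xy[v][1] + s->ty;
            ++written;
        }
    }
    return written;
}
```

The flat buffer could then be compiled into one big list with a loop of glVertex2f calls, or, if every sprite happened to be a quad, drawn with glVertexPointer(2, GL_FLOAT, 0, out) and a single glDrawArrays(GL_QUADS, 0, count). Sprites with differing vertex counts would still need their own glBegin/glEnd pair each.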

Thanks for any thoughts,