I'm trying to work out a good approach to rendering as many disjoint pieces of geometry as possible with a single draw call in OpenGL. The wall I'm up against is how best to do this when each piece has a different translation and maybe rotation, since you don't have the luxury of updating the model-view uniform between individual object draws. I've read a few other questions here and elsewhere, and the directions people are pointed in seem quite varied. It would be nice to list the main methods of doing this and try to isolate which is most common or recommended. Here are the ideas I've considered:

1) Instancing. A new attribute is sent and updated per object (per instance) rather than per vertex. I could then pass varied transformation data efficiently, and within one draw call. The drawback of this technique is that my code would be less portable: as far as I can tell, instancing is not part of core OpenGL ES 2.0, so on most mobile platforms it would be available only where an extension provides it, if at all.

2) Creating matrix transformations in the shader. Here I'd send a translation vector, and maybe a rotation angle or quaternion, as vertex attributes. The advantage is that it would work cross-platform, including mobile. But it seems a bit wasteful to send the exact same transformation data for every single vertex in an object. Without instancing, I'd have to repeat these identical vectors or scalars once per vertex of each object in the VBO as part of the interleaved array, right? The other drawback is that I'm relying on the shader to do the math for every vertex; I don't know if this is wise or not.

3) Similar to 2), but instead of relying on the shader to build the matrix, I do that on the client side and send the final model-view matrix as a stream of 16 floats in the VBO. But as far as I can tell, without instancing, I'd have to repeat this identical stream for every single vertex in the VBO, right? It just seems wasteful. The tradeoff against 2) is that I'm sending more data per vertex (16 floats rather than a 3-float translation vector and maybe a 4-float quaternion) but requiring the shader to do less work.

4) Skip all the above limitations and instead compromise with a separate draw call for each object. This is what is typically "taught" in the books I'm reading, no doubt for simplicity's sake.
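To make the tradeoff between ideas 1) to 3) concrete, here is a minimal CPU-side sketch in Python rather than real OpenGL code. It assumes an interleaved per-vertex layout of position, translation and quaternion for idea 2), emulates the rotate-then-translate math a vertex shader might do, and counts the bytes of transform data each idea would upload. All function names, the attribute layout and the workload figures are hypothetical, chosen only for illustration.

```python
import math

FLOAT_BYTES = 4  # assumes 32-bit float attributes


def quaternion_about_z(angle_radians):
    """Unit quaternion (x, y, z, w) for a rotation about the z axis."""
    half = angle_radians / 2.0
    return (0.0, 0.0, math.sin(half), math.cos(half))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def rotate_by_quaternion(v, q):
    """Rotate v by unit quaternion q: t = 2*(u x v); v' = v + w*t + u x t."""
    u, w = q[:3], q[3]
    t = tuple(2.0 * c for c in cross(u, v))
    ut = cross(u, t)
    return tuple(v[i] + w * t[i] + ut[i] for i in range(3))


def shader_like_transform(position, translation, quat):
    """What a vertex shader for idea 2) might compute: rotate, then translate."""
    r = rotate_by_quaternion(position, quat)
    return tuple(r[i] + translation[i] for i in range(3))


def build_interleaved(objects):
    """Idea 2) layout, per vertex: [px, py, pz, tx, ty, tz, qx, qy, qz, qw].

    `objects` is a list of (positions, translation, quaternion) tuples.
    """
    data = []
    for positions, translation, quat in objects:
        for p in positions:
            data.extend(p)
            data.extend(translation)  # same values repeated for every vertex
            data.extend(quat)         # same values repeated for every vertex
    return data


def transform_bytes(object_count, vertices_per_object,
                    floats_per_transform, per_vertex):
    """Bytes of transform data only (vertex positions are not counted)."""
    copies = object_count * vertices_per_object if per_vertex else object_count
    return copies * floats_per_transform * FLOAT_BYTES
```

For a made-up workload of 1,000 objects with 24 vertices each, `transform_bytes` gives 64,000 bytes for one 4x4 matrix per instance (idea 1), 672,000 bytes for a translation plus quaternion per vertex (idea 2), and 1,536,000 bytes for a full matrix per vertex (idea 3). Note that the instancing figure only applies where the objects share the same mesh.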

Are there other common methods besides these?

As an academic question, I'm curious whether all of the above are feasible and "acceptable", or whether one of them is a clear winner over the others. If I were to use desktop GL exclusively, is instancing the primary way of achieving this?