Thread: Map Texture Objects

  1. #1
    Senior Member OpenGL Pro
    Join Date
    Jan 2007
    Posts
    1,201

    Map Texture Objects

    We've currently got the capability to map buffer objects, which enables loading of data into buffers without needing to use any intermediate system memory arrays or otherwise do a memory copy. We don't have that ability with textures - true, we can do something similar with PBOs but it involves more round-tripping.

    Being able to map a texture object has the following uses:


    • The ability to implement "texture orphaning" for dynamic textures which must be fully replaced each frame, without needing a round-trip through a PBO.
    • The ability to more directly get data into a texture which can provide advantages for both creation and updating use cases, and without needing any intermediate steps (this can facilitate any kind of texture streaming scheme).
    • The ability to read data from a texture without the driver first having to do a copy back to system memory.


    This suggestion is to overload the glMapBuffer, glMapBufferRange, glFlushMappedBufferRange and glUnmapBuffer calls to also accept a texture target as the target parameter. This target can be any that is valid for any of the glTexImage/glTexSubImage calls. Behaviour should otherwise be identical, and the texture must be unmapped before it can be used in a draw call.
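    To illustrate, a rough sketch of how this might look in use (hypothetical usage - glMapBufferRange does not currently accept texture targets; that's the whole suggestion):

        GLuint tex;   /* a previously created 256x256 GL_RGBA8 texture */
        glBindTexture(GL_TEXTURE_2D, tex);

        /* Hypothetical: map the full level-0 image for writing, discarding
           the old contents ("orphaning"). */
        void *texels = glMapBufferRange(GL_TEXTURE_2D, 0, 256 * 256 * 4,
                                        GL_MAP_WRITE_BIT | GL_MAP_INVALIDATE_BUFFER_BIT);
        memcpy(texels, newFrameData, 256 * 256 * 4);  /* write directly, no PBO */
        glUnmapBuffer(GL_TEXTURE_2D);                 /* must unmap before drawing */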

    Issues I can see include:


    • Should mapping be allowed for mip level 0 only, and if not, how is a mip level other than 0 specified? Perhaps via glMapBufferRange with an appropriate offset?
    • glBindBuffer or glBindTexture?
    • What happens if glTexImage or glTexSubImage is called while a texture is mapped?

  2. #2
    Senior Member OpenGL Guru
    Join Date
    May 2009
    Posts
    4,948
    Being able to map a texture object has the following uses:
    OK, so... how do you define what you get when you map it?

    It's easy to say that you just "map a texture", but textures are opaque to allow drivers to hide their particular implementations. That's why pixel transfers are complicated, while buffer uploads are simple byte copies. It hides details like swizzling, the specific bit-pattern of formats, etc.

    So now you want to directly expose the vagaries of the range of OpenGL hardware. There are three ways to go about it:

    1. Force the driver to use a single, specific standard across vastly different hardware. Say goodbye to cross-platform portability, let alone future-proofing.

    2. Extend the query API to tell you how to interpret the data for a particular format, thus allowing different hardware to expose its particular eccentricities to you (see the sketch after this list). Of course, since most existing sources of streamed data (FFmpeg buffer writing, DirectShow, etc.) will export to their own format, you have to use an intermediate buffer: they must write to some memory, and you convert it to the hardware's version in the mapped space. In short: no different from having those APIs write to a mapped PBO (assuming they can).

    Also, I'm just guessing, but I'm fairly sure NVIDIA's not going along with that. They seem really protective of their IP and implementation details.

    3. Split the difference and allow the user to tell OpenGL that a particular texture will need to adhere to a particular structure. That is, it should be mappable. Of course, now you have to make glTexStorageMappable functions for 3 separate types (1D, 2D, 3D). Either that, or you're going to have to create a bunch of new image formats that force the texture to use a specific format. Or some kind of texture parameter or something.

    Even assuming that there would be a particular structure that all hardware could support.
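    To make option 2 concrete, it would mean something like this (glGetInternalformativ is real; the layout pnames here are invented for illustration):

        /* Hypothetical layout queries so the application can interpret the
           bytes it gets back from a mapped texture. These pnames don't exist. */
        GLint rowPitch, tilingMode;
        glGetInternalformativ(GL_TEXTURE_2D, GL_RGBA8,
                              GL_TEXTURE_ROW_PITCH_EXAMPLE, 1, &rowPitch);
        glGetInternalformativ(GL_TEXTURE_2D, GL_RGBA8,
                              GL_TEXTURE_TILING_MODE_EXAMPLE, 1, &tilingMode);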

    The ability to implement "texture orphaning" for dynamic textures which must be fully replaced each frame, without needing a round-trip through a PBO.
    Doesn't glInvalidateTexImage give us that already? You upload to it, use it, invalidate it, and write again. It seems much easier than mapping a texture just to invalidate it.
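    For reference, the pattern I mean (glInvalidateTexImage is from ARB_invalidate_subdata/GL 4.3):

        /* Orphan-and-rewrite with ARB_invalidate_subdata: no mapping needed.
           Assumes tex is bound to GL_TEXTURE_2D. */
        glInvalidateTexImage(tex, 0);   /* level-0 contents are now undefined */
        glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, width, height,
                        GL_RGBA, GL_UNSIGNED_BYTE, newFrame);  /* write again */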

    The ability to read data from a texture without the driver first having to do a copy back to system memory.
    I'm not sure how useful that ability is. For most GPU situations, that memory is across a (relatively) slow bus. You'd be better off issuing a DMA and then doing something else until it's over, which we effectively have with PBOs.

    Also, there has never been a guarantee that mapping means you're talking to GPU memory.
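    The PBO readback path I mean, for comparison - start the transfer, do other work, then map:

        /* Asynchronous readback through a PBO: glGetTexImage returns
           immediately and the copy proceeds via DMA. */
        glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo);
        glGetTexImage(GL_TEXTURE_2D, 0, GL_RGBA, GL_UNSIGNED_BYTE, (void *)0);
        /* ... do unrelated work while the transfer completes ... */
        void *pixels = glMapBuffer(GL_PIXEL_PACK_BUFFER, GL_READ_ONLY);
        /* ... read the pixels ... */
        glUnmapBuffer(GL_PIXEL_PACK_BUFFER);
        glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);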

    This suggestion is to overload the glMapBuffer, glMapBufferRange, glFlushMappedBufferRange and glUnmapBuffer calls to also accept a texture target as the target parameter.
    Absolutely not!

    Ignoring the question of whether this is good and useful functionality, that is not the way to implement it. OpenGL already has enough confusing, overloaded functions that do twenty different things; there's no need to pointlessly add another four to the list.

    glMapBufferRange means glMapBufferRange. We're only recently getting to the point where we don't have to call glVertexAttribPointer with an argument that we have to pretend is a pointer while it really fetches some of its data from somewhere else. There's no need to screw up that important progress just to avoid adding a few new API functions.

    The ARB is not running out of functions. There is not some hard limit that they're coming up on, such that OpenGL can't have more functions. There's absolutely no reason to overload a perfectly good API when you can just have glMapTexSubImage et al.

    If this is going to happen, then it gets its own API. Don't screw up APIs that actually make sense just to shoehorn in new functionality. That road leads to stupidity like AMD_pinned_memory (good functionality, Godawful API).
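    (For reference, AMD_pinned_memory works roughly like this, which is why I call the API Godawful - a glBufferData call whose client pointer becomes the buffer's actual storage:)

        /* AMD_pinned_memory, roughly: the client pointer passed to
           glBufferData becomes the buffer's storage, so it must be
           page-aligned and must outlive the buffer. */
        void *mem = aligned_alloc(4096, size);   /* page-aligned, app-owned */
        glBindBuffer(GL_EXTERNAL_VIRTUAL_MEMORY_BUFFER_AMD, buf);
        glBufferData(GL_EXTERNAL_VIRTUAL_MEMORY_BUFFER_AMD, size, mem,
                     GL_STREAM_READ);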

  3. #3
    Senior Member OpenGL Pro
    Join Date
    Jan 2007
    Posts
    1,201
    Hmmm - as rants go I'd give it maybe a 2. It would have been a 6 or 7 but it blatantly contradicts other things you've ranted in favour of (or against, as appropriate) in the past.

  4. #4
    Senior Member OpenGL Guru
    Join Date
    May 2009
    Posts
    4,948
    Really? Because I don't remember the time I argued that IHVs should put their texture format on display for everyone to see. Or when I said that OpenGL's API needs to get worse and more confusing by senselessly overloading functions. Or when I said that mapping memory assured you of getting GPU access.

    But whatever it takes to avoid addressing the substance of my argument, right? Because the best way to present your case is to dismiss any inconvenient facts.

    Why do you always make things personal? I don't seek you out; I barely take note of the fact that it's mhagain presenting an idea. I'm only "rant"ing at your ideas because you post a lot of them and don't put much thought into them. This idea is nothing more than, "wouldn't it be wonderful if we could map textures?" There's no consideration of the ramifications of such a decision. No explanation for how this could work across hardware. Even your suggested API shows how little actual thought you put into it. The only bit of substance here is the basic idea: map texture memory.

    If you're going to seriously present an idea beyond the basic concept of "let us do this somehow", then put some effort into it. Show that you're better than just throwing ideas against a wall and hoping that one sticks.

  5. #5
    Senior Member OpenGL Pro
    Join Date
    Jan 2007
    Posts
    1,201
    Ok then.

    First of all, the ability to Map (or Lock, in older versions) a texture is something that has been in D3D for an eternity (in D3D11 both textures and buffers even use the very same API call). So far as the hardware vendors are concerned, this is a complete non-issue. There are no deep, dark, proprietary internal representations going on here; textures are just the same as buffers - a stream of bytes.

    Now let's get one thing real clear before continuing. This is not about adding functionality to GL that D3D also has. This is about adding functionality that may be generally useful, irrespective of whether D3D has it or not. D3D is not relevant beyond this point.

    So point 1 is this: the argument that vendors may not want to put their internal texture formats on display is bogus.

    Point 2 is this: even on hardware that may have its own funky internal representation, the whole point of OpenGL as an abstraction layer is to abstract that away from the programmer. This is something that already happens with e.g. a glTexSubImage call. Any hypothetical Map/Unmap of a texture can go through the very same driver code paths as are used for glTexSubImage to accomplish this. So even in such a case, this amounts to using an already-solved problem as an argument against.

    Point 2 also exposes way number 4 to go about it: if the internal representation is already appropriate for exposure to the programmer, then just expose it as is. Give the programmer a pointer and be done with it. This case could be satisfied e.g. where the internal representation matches the internalFormat param used when creating the texture. If the internal representation is not appropriate, then likewise give the programmer a pointer, but add a conversion step that happens in the driver - either at Map time (for reading) or Unmap time (for writing). As I said, this is something that already happens with glGetTexImage/glTexSubImage; the driver already contains code to do it, so arguments against it won't fly.

    Now onto specifics.

    glInvalidateTexImage? No; that just accomplishes one part of the requirement, which is to orphan the texture. It does absolutely nothing about the second part, which is to avoid round-tripping through PBOs or program-allocated system memory in order to perform the update. Mapping a texture solves that; instead of the round-trip and extra memory copies you write directly (or as directly as the driver allows).

    Overloading the buffer calls. Yes, it's ugly; yes, it's confusing; yes, a set of extra entry points would be better. And to head this one off at the pass: there is no need for separate entry points for 1D/2D/3D textures; follow the pattern established by glInvalidateTexSubImage instead - one entry point that works with all types.
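    Following that pattern, a single hypothetical entry point could look like this (name and signature invented for illustration):

        /* Hypothetical, modelled on glInvalidateTexSubImage: one entry point
           covers 1D/2D/3D/array textures, with unused dimensions as 0 or 1. */
        void *glMapTexSubImage(GLuint texture, GLint level,
                               GLint xoffset, GLint yoffset, GLint zoffset,
                               GLsizei width, GLsizei height, GLsizei depth,
                               GLbitfield access);   /* e.g. GL_MAP_WRITE_BIT */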

    Portability? You're going to need to come up with some compelling reasons as to why it's a problem for portability, rather than just waving the word around. No, endianness is not one; we already use floats, unsigned ints, ints, unsigned shorts and shorts in buffer objects; this is another problem that has already been solved; endianness as an argument against is also bogus.

    Specific utility of this suggestion? I thought I'd made it clear, but let's restate it. It's explicitly not a case of "wouldn't it be great if..."; it's to avoid round-tripping through intermediate storage and/or intermediate objects when loading data into a texture - in other words, to serve the same utility as glMapBuffer/glMapBufferRange. No, PBOs don't already provide this, as there is still a requirement for the driver or GPU to copy data from the PBO to the texture. No, this wouldn't invalidate the utility of PBOs, as there are still cases where you may want an asynchronous transfer. The functionality suggested is to enable transfer of data in a similar manner to mapping a PBO, but without the additional intermediate step of transferring from a PBO to the texture.

    And moving on:

    I dismissed the argument as frivolous and vexatious because it read as one made purely for the sake of arguing rather than for constructively discussing pros and cons. No, I don't see substance in it to be addressed; much of it is bogus and can be shown to be so.

    Do I post a lot of ideas? No - there were precisely two others in this section of the forum.

  6. #6
    Advanced Member Frequent Contributor
    Join Date
    Apr 2009
    Posts
    591
    As a side note, it would make texture streaming a lot less painful - a lot less. Also, for those GL implementations that have a unified memory architecture, this would be great. Various platforms have various hacky ways to "write directly to a texture", but they are icky and horribly non-portable (OMAP3/4, I am talking about you).

    My take is that at texture creation (somehow) a texture's representation is specified so that mapping makes sense. One way for that somehow to happen is to create, for each acceptable internal format value GL_FOO, the enumeration GL_FOO_MAPPABLE, where MAPPABLE means that when the texture is mapped its data is in a format specified in the spec (bytes per pixel, pixel format, line padding, etc.). Even compressed textures would be ok. The only place I see this going icky is where GL_RGB8 is stored as GL_RGBA8 (and analogously GL_RGB8UI as GL_RGBA8UI). I am not crazy about querying the GL implementation about this packing and padding, since it would make code that uses the mappable texture jazz hard to test reliably beyond "just try it on several different boxes and hope for the best".
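    Something like this, say (GL_RGBA8_MAPPABLE and glMapTexture are invented for illustration):

        /* Hypothetical: a MAPPABLE internal format pins the mapped layout to
           something spec-defined (say tightly packed RGBA8), so the bytes the
           app sees are predictable on every implementation. */
        glTexStorage2D(GL_TEXTURE_2D, 1, GL_RGBA8_MAPPABLE, w, h);
        unsigned char *texels = glMapTexture(GL_TEXTURE_2D, 0, GL_MAP_WRITE_BIT);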

  7. #7
    Senior Member OpenGL Pro
    Join Date
    Jan 2007
    Posts
    1,201
    Texture streaming was the main use case I had in mind, yes. It would also apply for more general texture loading in console or other limited-memory scenarios where the extra overhead of PBOs (or system memory buffers) may be too onerous.

    Agree, adding *_MAPPABLE internal formats seems the best way to go. The driver does need a way to distinguish between textures that would have this usage pattern and those that wouldn't, and optimize accordingly. If *_MAPPABLE internal formats were added then the list of valid internal formats for this suggestion could be reduced (to exclude e.g. the wacky GL_RGB8-style formats) and a hypothetical spec could be more explicit about internal representations. That would also sidestep the need to query the implementation about padding/etc.

    It would be important not to go down the D3D route of applying seemingly arbitrary restrictions, such as only being able to map with discard ("orphan" behaviour) or not being able to map e.g. a texture array. In a fully general case it's all just data managed by the driver, so there should not need to be any such distinction.

  8. #8
    Advanced Member Frequent Contributor
    Join Date
    Dec 2007
    Location
    Hungary
    Posts
    985
    Alfonse is perfectly right. The internal swizzling/tiling used by the hardware is not something that you should forget about. It can change from vendor to vendor and from GPU generation to GPU generation. Even if there were a meaningful way to expose this layout to the application, the number of different layouts an application might have to handle would be impossible to tackle.

    Also, even once the application knows the swizzling, uploads to these swizzled structures would be non-trivial, so writes would not even reach the best-case scenario of a pure CPU memcpy. GPUs, on the other hand, have DMA engines or other ways to perform copies from linear to swizzled memory directly at full speed, without using any CPU power, so a memcpy to a PBO plus a hardware upload is almost guaranteed to be faster despite the intermediate copy.
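    That is, the path I'm defending is the usual one:

        /* Linear CPU write into a mapped PBO, then a hardware linear-to-tiled
           copy into the texture, performed by the GPU's DMA engine. */
        glBindBuffer(GL_PIXEL_UNPACK_BUFFER, pbo);
        void *dst = glMapBufferRange(GL_PIXEL_UNPACK_BUFFER, 0, w * h * 4,
                                     GL_MAP_WRITE_BIT | GL_MAP_INVALIDATE_BUFFER_BIT);
        memcpy(dst, srcPixels, w * h * 4);          /* pure CPU memcpy */
        glUnmapBuffer(GL_PIXEL_UNPACK_BUFFER);
        glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, w, h, GL_RGBA,
                        GL_UNSIGNED_BYTE, (void *)0); /* sourced from the PBO */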

    As Alfonse pointed out, orphaning can be achieved with ARB_invalidate_subdata.

    Finally, a note on kRogue's "it would make texture streaming a lot less painful, a lot less" comment:
    Believe me, if you had to deal individually with the different tiling mechanisms of three vendors across four GPU generations, you would reconsider your statement.

    The only potential approach is that if the user explicitly requests a mappable texture, the driver could give them texture storage with a linear layout, so that mapping would make sense (not much different from a texture buffer except for the addressing, and actually pretty similar to the APPLE_client_storage extension, which is kind of like AMD_pinned_memory but for textures). However, you would then pay the cost at rendering time: accessing data stored with a linear layout but with the addressing coherency of e.g. a 2D texture would give you poor performance.
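    For reference, APPLE_client_storage works roughly like this; the application's memory backs the texture, so it must stay allocated:

        /* APPLE_client_storage, roughly: GL references the client memory
           directly instead of copying it, so the app must keep it alive. */
        glPixelStorei(GL_UNPACK_CLIENT_STORAGE_APPLE, GL_TRUE);
        glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, w, h, 0,
                     GL_BGRA, GL_UNSIGNED_INT_8_8_8_8_REV, appPixels);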

    Exposing direct texture data mapping might make sense on a console or other fixed hardware as you only have a fixed number of swizzling modes that you have to handle and they don't change. But not for a cross-platform API.
    Disclaimer: This is my personal profile. Whatever I write here is my personal opinion and none of my statements or speculations are anyhow related to my employer and as such should not be treated as accurate or valid and in no case should those be considered to represent the opinions of my employer.
    Technical Blog: http://www.rastergrid.com/blog/

  9. #9
    Advanced Member Frequent Contributor
    Join Date
    Apr 2009
    Posts
    591
    Finally, a note on kRogue's "it would make texture streaming a lot less painful, a lot less" comment:
    Believe me, if you had to deal individually with the different tiling mechanisms of three vendors across four GPU generations, you would reconsider your statement.
    I think that if mapping a texture were in the spec and its format were specified (as mhagain suggested), the issues of dealing with different GPUs' idiosyncrasies would drop.

    For what it is worth, OMAP3's texture_stream extension (which you have to dig through TI's website to find) is a horror of an interface for mapping texture memory for texture streaming: it uses ioctl on a finite number of /dev/foo devices - finite across the whole system, not per process or per context - so it just plain sucks. The platform needs/wants it essentially for presenting the stream coming from the camera with GL. One can argue that getting the bytes directly from the camera and copying them is icky and that an extension providing some kind of texture stream should handle this, but that is not the case (there is a texture stream in EGL land, so on paper it is possible).

    The extension you posted, http://www.opengl.org/registry/specs...nt_storage.txt, is what everyone wants for texture streaming, methinks, and is ideal for unified memory arch gizmos.

    For a unified memory arch, especially where texture-from-pixmap is supported, I'd bet that such GPUs can directly use (at a potential performance loss) texture data that is non-swizzled and non-tiled. For texture streaming that performance loss is small next to the bandwidth and CPU overhead of converting, etc. My thinking is that by saying a texture is mappable one is saying: dude, I am likely to be streaming (be it reading or writing), so the texture data is very dynamic and is not "made by GL".

    Though I must confess that the Apple extension you posted really is all one wants for streaming at the end of the day, I am concerned that the spec does not spell out when changes to that client data are reflected in GL. It is also much more than one wants: since client memory is pageable/virtual etc., the Apple extension looks like it can be a horror to implement, and it is, to some extent, overkill.

    Last edited by kRogue; 10-11-2012 at 12:59 AM.

  10. #10
    Senior Member OpenGL Pro
    Join Date
    Jan 2007
    Posts
    1,201
    Quote Originally Posted by aqnuep View Post
    Alfonse is perfectly right. The internal swizzling/tiling used by the hardware is not something that you should forget about. It can change from vendor to vendor and from GPU generation to GPU generation. Even if there were a meaningful way to expose this layout to the application, the number of different layouts an application might have to handle would be impossible to tackle.

    Also, even once the application knows the swizzling, uploads to these swizzled structures would be non-trivial, so writes would not even reach the best-case scenario of a pure CPU memcpy. GPUs, on the other hand, have DMA engines or other ways to perform copies from linear to swizzled memory directly at full speed, without using any CPU power, so a memcpy to a PBO plus a hardware upload is almost guaranteed to be faster despite the intermediate copy.
    The point was made that even if such internal representations did exist, OpenGL already abstracts them away for glTex(Sub)Image calls; this is a solved problem.

    The point was also made that D3D allows mapping of textures but yet doesn't suffer from any of these hypothetical reasons-for-objection. Operating systems and drivers may differ, but the underlying hardware is still the same. It's great to theorize about the way things might work internally in hardware, but such theories don't really hold up in the face of a working example that refutes them.

    Quote Originally Posted by aqnuep View Post
    As Alfonse pointed out, orphaning can be achieved with ARB_invalidate_subdata.
    The point was made that texture invalidation only satisfies one (small) part of this suggestion. It fails to meet the main part, which is to avoid the need for intermediate copies.
