Raspberry Pi OpenGL issue with Kivy screenmanager



FrankGould
01-25-2018, 11:27 AM
Versions

Python: 2.7.13
OS: Raspbian Jessie
Kivy: v1.10.1.dev0, git-8776f28, 20180123
RPi: 3

Trace-startup-GLerror file (https://github.com/kivy/kivy/files/1660682/Trace-startup-GLerror.txt)

Description: I am migrating a working Kivy app from Android to a Raspberry Pi, using the same resources, and it fails to display images in the Kivy screen manager. I asked on the Kivy forum and was told to use KivyPie, which I did, but I got the same results. My configuration is NOOBS with Kivy manually installed for Raspbian Jessie, on a Raspberry Pi 3. I will also ask this same question on the OpenGL forum.

In the log after starting my app, the following sections appear regarding OpenGL and Shader (see attached 'python main.py -c kivy:log_level:trace' file). It appears these are working and setup.py reports the library files are in the /opt/vc/lib/ folder.

[INFO ] [GL ] Using the "OpenGL ES 2" graphics system
[DEBUG ] [GL ] glShaderBinary is not available
[INFO ] [GL ] Backend used
[INFO ] [GL ] OpenGL version <OpenGL ES 2.0>
[INFO ] [GL ] OpenGL vendor
[INFO ] [GL ] OpenGL renderer
[INFO ] [GL ] OpenGL parsed version: 2, 0
[INFO ] [GL ] Shading version <OpenGL ES GLSL ES 1.00>
[INFO ] [GL ] Texture max size <2048>
[INFO ] [GL ] Texture max units <8>
[DEBUG ] [Shader ] Fragment compiled successfully
[DEBUG ] [Shader ] Vertex compiled successfully

After my app starts displaying images through screenmanager, the following error appears in the same log (invalid operation):

glGetError 0x502

After 7 of these invalid operation error messages, the following error appears (out-of-memory):

Exception: FBO Initialization failed: Incomplete attachment (36054)
Exception Exception: Exception('FBO Initialization failed: Incomplete attachment (36054)',) in 'kivy.graphics.fbo.Fbo.raise_exception' ignored
glGetError 0x505

After 3 of these Exception messages, my app crashes with the following end of process regarding Fbo and Shader:

  File "/home/pi/kivy/kivy/uix/screenmanager.py", line 508, in add_screen
    self.fbo_out = self.make_screen_fbo(self.screen_out)
  File "/home/pi/kivy/kivy/uix/screenmanager.py", line 472, in make_screen_fbo
    fbo = Fbo(size=screen.size, with_stencilbuffer=True)
  File "kivy/graphics/fbo.pyx", line 152, in kivy.graphics.fbo.Fbo.__init__
  File "kivy/graphics/instructions.pyx", line 753, in kivy.graphics.instructions.RenderContext.__init__
  File "kivy/graphics/shader.pyx", line 184, in kivy.graphics.shader.Shader.__init__
  File "kivy/graphics/shader.pyx", line 695, in kivy.graphics.shader.Shader.vs.__set__
  File "kivy/graphics/shader.pyx", line 555, in kivy.graphics.shader.Shader.build_vertex
  File "kivy/graphics/shader.pyx", line 585, in kivy.graphics.shader.Shader.link_program
Exception: Shader didnt link, check info log.

I've searched everywhere I can for clues about what to do. Below are my attempts to find a solution, all unsuccessful so far.

I have updated the rpi gpu memory as displayed below:

/boot/config.txt: gpu_mem=512

One comment suggested generating a listing, as attempted below, but I received an error instead.

pi@goddard-rpi:~/.kivy/logs $ glxinfo -l
Error: unable to open display

I get the following response when running setup.py:

Detected Cython version 0.27.3
Using this graphics system: OpenGL ES 2
GStreamer found via pkg-config
SDL2 found via pkg-config
Found brcmEGL and brcmGLES library filesfor rpi platform at /opt/vc/lib/
SDL2: found SDL header at /usr/include/SDL2/SDL.h
SDL2: found SDL_mixer header at /usr/include/SDL2/SDL_mixer.h
SDL2: found SDL_ttf header at /usr/include/SDL2/SDL_ttf.h
SDL2: found SDL_image header at /usr/include/SDL2/SDL_image.h

Please let me know what to do to solve this error and app crashing.

Dark Photon
01-25-2018, 05:52 PM
[INFO ] [GL ] OpenGL version <OpenGL ES 2.0>
...
After my app starts displaying images through screenmanager, the following error appears in the same log (invalid operation):

glGetError 0x502

After 7 of these invalid operation error messages, the following error appears (out-of-memory):

Exception: FBO Initialization failed: Incomplete attachment (36054)
Exception Exception: Exception('FBO Initialization failed: Incomplete attachment (36054)',) in 'kivy.graphics.fbo.Fbo.raise_exception' ignored
glGetError 0x505

Trace-startup-GLerror file (https://github.com/kivy/kivy/files/1660682/Trace-startup-GLerror.txt)


[INFO ] [reload_config_pix] [0] IMG_1017.JPG
[INFO ] [reload_config_pix] [1] Frank-in-Field.png
[INFO ] [reload_config_pix] [2] IMG_5442.JPG
[INFO ] [reload_config_pix] [3] IMG_5444.JPG
...
[INFO ] [reload_config_pix] [12] gallery-on-first-drag.png
[INFO ] [reload_config_pix] [13] gallery-on-first-chairs.png
[INFO ] RELOADED pix from config file locally. Ordered pix=14
[INFO ] [startup ] Clock.schedule_once(self.next, 10.0) - Now={2018-01-24 08:25:13}
[INFO ] [NEXT Loaded Picture] /home/pi/.masterpics/pix/Frank-in-Field.png
[INFO ] [NEXT-continued] frame_sleeping=False: time={2018-01-24 08:25:14}
[INFO ] [NEXT Loaded Picture] /home/pi/.masterpics/pix/IMG_5442.JPG
glGetError 0x502
[INFO ] [NEXT-continued] frame_sleeping=False: time={2018-01-24 08:25:24}
[INFO ] [NEXT Loaded Picture] /home/pi/.masterpics/pix/IMG_5444.JPG
glGetError 0x502
...
Exception: FBO Initialization failed: Incomplete attachment (36054)
Exception Exception: Exception('FBO Initialization failed: Incomplete attachment (36054)',) in 'kivy.graphics.fbo.Fbo.raise_exception' ignored
glGetError 0x505

...

Please let me know what to do to solve this error and app crashing.

I don't know. You should probably dig in and figure out what's triggering those GL_INVALID_OPERATION errors. In your output, they seem to correspond to loading from JPEGs. Does that correlation hold up in other runs? If so, I'd try removing them.

As far as the GL_OUT_OF_MEMORY errors, sounds like you've got 512MB of GPU memory configured, but it's unspecified exactly how much of that is unallocated when you start and what subset of that you can use. You seem to be loading a lot of images. I'd first start with cutting back and loading as few of those as you can get away with to see if that helps. Also, see if you can figure out how to query the available GPU memory dynamically at runtime. That'll give you a tool to help narrow down how the GPU memory usage is distributed across resources within your application. I'd also look at the resolution, bit depth, and buffer config that is being used for your system framebuffer and any offscreen FBOs, and see if you can reduce the resolution on them to save GPU memory, if only for initial troubleshooting. Your app may be triggering excessive GPU memory consumption through texture ghosting, but let's wait and see what you find before we jump into that possibility.

FrankGould
01-27-2018, 07:06 AM
Thank you Dark Photon for responding with great ideas and suggestions. In a different response and similar to your response, there was a recommendation to test smaller resolution images which I have with both my app and another much simpler and smaller online sample SlideShow app. In both cases, I received the same errors and crash. So, I can only assume it is something inside OpenGL or parameters being passed to or from Kivy screenmanager.

What is new since my original report is the app now displays a glGetError 0x501 error after the first image loads. Your point is also true about JPEG images, specifically phone camera shots. So, I need to dig into the code as you suggested; however, it's a new area for me and I will be searching today to see if I can find some tools to help expose memory distribution across resources within these applications. In the interim, if you have any suggestions how to do this, I would certainly appreciate your guidance.

Again, thanks for your help with this issue. I was hoping Linux would be a superior OS to Android but this issue has made me doubt it will work with my app. Hopefully there are tools I can use to find a solution but the 512MB limitation is making me wonder if it will work.

EDIT: It turned out I had a problem with my resizing code and the phone camera pictures were not resized and thus did not display. After fixing this, pictures showed up on screen but still received the glGetError 0x502 0x505 that eventually crashed the app. Will continue to research this problem.
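
The resizing code itself is not shown in this thread. As a rough sketch only, the helper below (a hypothetical name, fit_within) shows one way to compute a target size that stays within the "Texture max size <2048>" limit reported in the Kivy log while keeping the aspect ratio. The 4032x3024 example size is an assumption standing in for a typical phone camera shot, not a value taken from the logs.

```python
# Hypothetical helper, not the app's actual resizing code.
MAX_TEXTURE_SIZE = 2048  # "Texture max size <2048>" from the Kivy startup log


def fit_within(width, height, limit=MAX_TEXTURE_SIZE):
    """Return (width, height) scaled down to fit `limit`, keeping aspect ratio."""
    longest = max(width, height)
    if longest <= limit:
        return width, height  # already small enough, leave untouched
    # Integer arithmetic avoids floating-point rounding surprises;
    # each side is rounded down and kept at 1 pixel or more.
    new_width = max(1, width * limit // longest)
    new_height = max(1, height * limit // longest)
    return new_width, new_height


# Assumed example: a 12-megapixel landscape photo.
landscape = fit_within(4032, 3024)
```

The resulting size could then be passed to whatever image library does the actual resampling before the picture is handed to Kivy.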

FrankGould
01-28-2018, 12:13 PM
I have completed some more tests to try to isolate this issue, using smaller images and a simple slideshow test app. In all cases, screenmanager appears to be causing the OpenGL glGetError 0x502 and 0x505 errors. In addition, I analyzed the memory usage using free -h -s 1 and found no sign of a GPU memory issue, and I ran the glxgears app to confirm that OpenGL is installed and operational.

To provide these test cases, I have uploaded a copy of the simple screenmanagertest.py and sample images in the link below to a folder containing test files to duplicate the symptoms of this problem.

https://goo.gl/6e7edY

Below is the memory usage while running these tests:
              total        used        free      shared  buff/cache   available
Mem:           479M        223M         53M         11M        203M        194M
Swap:           99M         32M         67M

Dark Photon
01-28-2018, 02:23 PM
In all cases, screenmanager appears to be causing the OpenGL glGetError 0x502 and 0x505 errors.

Ok. GL_INVALID_OPERATION followed by GL_OUT_OF_MEMORY.
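
For reference, the numeric codes in these logs can be decoded with a small lookup table. The sketch below covers the OpenGL ES 2.0 error values and a few framebuffer status values. The function name decode_log_line is made up for illustration, and the "glGetError 0x..." line format is assumed from the log excerpts quoted in this thread.

```python
import re

# OpenGL ES 2.0 error codes as returned by glGetError().
GL_ERRORS = {
    0x0500: "GL_INVALID_ENUM",
    0x0501: "GL_INVALID_VALUE",
    0x0502: "GL_INVALID_OPERATION",
    0x0505: "GL_OUT_OF_MEMORY",
    0x0506: "GL_INVALID_FRAMEBUFFER_OPERATION",
}

# Selected framebuffer status values (36054 decimal is 0x8CD6).
FBO_STATUS = {
    0x8CD5: "GL_FRAMEBUFFER_COMPLETE",
    0x8CD6: "GL_FRAMEBUFFER_INCOMPLETE_ATTACHMENT",
    0x8CD7: "GL_FRAMEBUFFER_INCOMPLETE_MISSING_ATTACHMENT",
    0x8CDD: "GL_FRAMEBUFFER_UNSUPPORTED",
}

_GL_ERROR_LINE = re.compile(r"glGetError (0x[0-9a-fA-F]+)")


def decode_log_line(line):
    """Return the GL error name for a 'glGetError 0x...' line, else None."""
    match = _GL_ERROR_LINE.search(line)
    if match is None:
        return None
    code = int(match.group(1), 16)
    return GL_ERRORS.get(code, "unknown (0x%x)" % code)
```

With this table, "glGetError 0x502" decodes to GL_INVALID_OPERATION, "glGetError 0x505" to GL_OUT_OF_MEMORY, and the 36054 in the FBO exception to GL_FRAMEBUFFER_INCOMPLETE_ATTACHMENT.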

It could be the GL_OUT_OF_MEMORY error is being thrown for either: 1) insufficient GPU memory, or 2) insufficient CPU memory in the GL driver. Depending on which one this is, you might be able to help your problem by either increasing or decreasing the amount of system memory allocated to GPU memory. Note that here I'm assuming that the GPU on the Raspberry Pi is UMA; that is, GPU memory and CPU memory are both carved from system DRAM memory as is common on mobile GPU systems (a few quick websearches appear to confirm this assumption on Raspberry Pi).

Related: Here's someone that hit the same problem with KivyPie on Raspberry Pi and ended up having to both increase the GPU mem allocation as well as reduce the album art sizes.


In addition, using free -h -s 1 I analyzed the memory usage and found there is not a GPU memory issue as well as running the glxgears app to show OpenGL is installed and operational.

I'm not sure that follows. On desktop Linux (with a discrete GPU), free only reports CPU and CPU swap memory usage, not GPU memory usage.


Below is the memory usage while running these tests:
              total        used        free      shared  buff/cache   available
Mem:           479M        223M         53M         11M        203M        194M
Swap:           99M         32M         67M


If these are in the same order as on desktop Linux, that would suggest you've got 479MB total CPU memory, with 223MB used and 53MB free. Another 203MB is tied up in buffers and cache combined, and about 194MB is estimated to be available to new applications.
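
To avoid misreading the columns, the values can be paired with the header names that free prints. This is a minimal sketch, assuming the whitespace-separated layout shown in the quoted output. parse_free is a hypothetical helper name, and the sample text is the output posted above.

```python
def parse_free(text):
    """Map each row label of `free` output to a {column: value} dict."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()  # total, used, free, shared, buff/cache, available
    table = {}
    for ln in lines[1:]:
        label, _, rest = ln.partition(":")
        # zip() stops at the shorter sequence, so the Swap row simply
        # gets the first three columns (total, used, free).
        table[label.strip()] = dict(zip(header, rest.split()))
    return table


SAMPLE = """\
              total        used        free      shared  buff/cache   available
Mem:           479M        223M         53M         11M        203M        194M
Swap:           99M         32M         67M
"""

report = parse_free(SAMPLE)
```

Here report["Mem"]["buff/cache"] is "203M" and report["Mem"]["available"] is "194M". Note that these figures describe CPU-side memory only.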

If you were to flush the buffers and cache (https://unix.stackexchange.com/questions/87908/how-do-you-empty-the-buffers-and-cache-on-a-linux-system) used on your system, you'd probably find that much of the buffer and cache mem would go away, and "used" mem would decrease by around the sum of the reduction in buffer and cache mem. You might try that and see what you get. That'll give you a better picture on how much free CPU memory you really have. However, in the absence of that, I'm going to guess that you've got ~200-300MB free here.

If we assume that the "free" command is just reporting CPU memory, and if you have 1GB of system memory (as it sounds like Raspberry Pi 2s do), then this would support your assertion that the GPU memory split is 512MB for GPU with the remaining 512MB for CPU.

To confirm that, though (caveat: I've never worked on a Raspberry Pi in my life), this post (https://raspberrypi.stackexchange.com/questions/27468/raspberry-pi-2-1024m-increase-gpu-memory-to-512-at-least) suggests you may be able to query your GPU memory allocation at runtime with vcgencmd get_mem gpu. It also mentions several other config vars that might be used to set the amount of GPU memory.
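
A minimal sketch of running that query from Python follows. It assumes vcgencmd is on the PATH and prints a single line of the form gpu=512M. The helper names parse_get_mem and gpu_mem_mb are hypothetical.

```python
import re
import subprocess


def parse_get_mem(output):
    """Parse text such as 'gpu=512M' and return the size in megabytes."""
    match = re.match(r"\s*\w+=(\d+)M\s*$", output)
    if match is None:
        raise ValueError("unexpected vcgencmd output: %r" % (output,))
    return int(match.group(1))


def gpu_mem_mb():
    """Ask the firmware how much memory is assigned to the GPU (Pi only)."""
    out = subprocess.check_output(["vcgencmd", "get_mem", "gpu"])
    return parse_get_mem(out.decode("ascii"))
```

This reports the configured split only, not how much of the GPU memory is currently free.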

So what's left? You have a way to query total and available CPU memory and total GPU memory. Best case, you need to find a way to report the amount of available GPU memory on demand as well. With "free" for CPU memory, this should help you pin down which memory space is running out of memory and triggering the GL_OUT_OF_MEMORY errors. My guess (particularly with your CPU memory usage results) is that it's GPU memory that's running out of space.


I was hoping Linux would be a superior OS to Android but this issue has made me doubt it will work with my app.

So far, I don't know that this has anything to do with Linux vs. Android. It's an app overconsuming memory. I've hit this same thing (GL_OUT_OF_MEMORY) on a mobile GPU/system running Android. It's just a heck of a lot easier to hit it on mobile because:



there's so darn little GPU memory compared to desktop GPUs, and
the architecture of mobile GPUs is so radically different from desktop GPUs (if you drive the GPU poorly -- e.g. like a desktop GPU -- the driver may consume considerable additional GPU memory behind-the-scenes).

Dark Photon
01-28-2018, 04:11 PM
... you need to find a way to report the amount of available GPU memory on demand as well.

Here's a promising link: GPU stats (free mem etc) (https://www.raspberrypi.org/forums/viewtopic.php?t=23185). It shows several methods, and shows how to extract GPU logs.

Also, you might see if you can dump logs for GPU memory allocations like the article shows. That would give you another tool to work with.

FrankGould
01-29-2018, 11:21 AM
Thanks Dark Photon. Very good tools that showed it never gets close to the maximum amount I configured for GPU. I'm going to reach out to the Kivy developers now with this information and see if we can find the code problem that I'm thinking is in the Shader module that calls OpenGL.

Again, I appreciate your time, recommendations, and suggestions.

FrankGould
03-09-2018, 01:58 PM
Greetings Dark Photon. I have been busy testing all kinds of graphic solutions to see if a kivy driver is causing problems. After many RPi OS builds and regression testing, I found a post that may shed light on kivy problem with image modules, sdl2 and OpenGL. I've pasted a link to that issue below FYI. From the forum you provided, I also ran 'sudo vcdbg reloc' (see attached zip file for results) while my kivy test app was reporting out-of-memory errors.

https://github.com/kivy/kivy/issues/5348

From my test results, I am assuming kivy needs to be updated to support glGetFramebufferAttachmentParameteriv correctly to fix this problem (possibly per the issue, link above) that appears to be consuming GPU memory.

Thanks again for your suggestions!

Dark Photon
03-10-2018, 10:53 AM
Ok, hopefully you've got a good line on it. I wouldn't expect fixing an undefined FBO API symbol to reduce your GPU memory consumption radically, but I guess it depends on how the code's written.

The GPU memory consumption report you got while your "GPU out of memory" problem was occurring definitely details clearly what under-the-hood is consuming all that GPU memory. 480MB consumed with only 5.4MB free certainly confirms there's a problem!

Of the GPU memory consumers listed, it looks like the biggest fish are:


310MB tied up in 39 8MB images (KHRN_IMAGE_T.storage)
105MB tied up in 17 ~6.3MB texture blobs
53MB in 4 13MB texture blobs
7MB in ARM FB
3.3MB in Miscellaneous ( shader code/programs, etc.)


So if the kivy update doesn't solve your problem, it sounds like what you need to go after is:


finding out which app is consuming 310MB of GPU memory for those 39 8MB images (probably KHR_image (https://www.khronos.org/registry/EGL/extensions/KHR/EGL_KHR_image.txt) images), and
cutting its usage down to where it fits better in your amount of GPU memory.

Secondary to that would be cutting back on its (or another app's) usage of GL textures, presumably.

Options for cutting back the space consumption for either include:


Using more space-efficient formats (e.g. packed and/or GPU compressed texel formats)
Reducing resolutions
Decreasing the number of images/textures used.
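
To see how the first two options affect the footprint, the uncompressed size of a texture can be estimated as width x height x bytes per texel. The sketch below is illustrative only: the 1920x1080 size is an assumption chosen because it happens to come out near 8MB at 4 bytes per texel; the dump does not state the actual image dimensions. Mipmaps and driver padding are ignored.

```python
def texture_bytes(width, height, bytes_per_texel=4):
    """Uncompressed size of one texture level (no mipmaps, no padding)."""
    return width * height * bytes_per_texel


def mib(num_bytes):
    """Convert bytes to mebibytes."""
    return num_bytes / (1024.0 * 1024.0)


# Assumed example sizes and formats, for comparison only.
rgba8888_full = texture_bytes(1920, 1080, 4)  # 32-bit RGBA
rgb565_full = texture_bytes(1920, 1080, 2)    # 16-bit packed format
rgba8888_half = texture_bytes(960, 540, 4)    # half resolution per axis
```

Under these assumptions, a packed 16-bit format halves the footprint of the 32-bit version, and halving the resolution on each axis cuts it to a quarter.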

FrankGould
03-10-2018, 12:32 PM
Thanks Dark Photon. I'm forwarding your info to the kivy team and asking if a volunteer to help fix this might want a FREE Raspberry Pi 3.

FrankGould
03-18-2018, 10:58 AM
Greetings Dark Photon and others watching this thread. I have spent many hours researching and testing environments to establish the right OS build to isolate this bug that eventually consumes all graphic memory and crashes the Kivy app.

Once I determined that all Raspberry Pi Linux builds exhibit this same problem, which did NOT occur on Mint Linux in an Oracle VM (no glGetError messages), I began adding debug messages in shader.pyx. These allowed me to isolate the actions causing a glGetError 0x502 (Invalid Operation), which I had suspected caused the eventual 0x505 out-of-memory errors.

What I found was a “uniform” named “t” that passed a zero value (i.e. 0) immediately before an opacity named uniform was uploaded. Below is what I found in the logs.

[DEBUG ] [Shader ] uploading uniform t (loc=2, value=0)
[DEBUG ] [Shader ] uploading uniform opacity (loc=1, value=1.0)
glGetError 0x502
[DEBUG ] [Shader ] -> (gl:1282) opacity

That prompted me to change shader.pyx to ignore these invalid “t0” uniforms and the app stopped generating the glGetError 0x502 errors; however, that did not stop the “memory leak” because eventually, like in 10 minutes, the app crashed with the 0x505 and then 0x506 (invalid frame buffer operation - same FBO error in Kivy log entries) error messages. I think this is also why shader.pyx reports 'Exception: Shader didnt link, check info log' because there is no memory to link.

So, now I’ve collected several run logs that show memory usage during the period when 0x505 error messages are displaying on the console. Below are from the results while running 'vcdbg reloc' when these console 0x505 error messages are displayed (see run log named vcdbg reloc - Fresh Boot18Mar-crashingwithGC.txt (https://drive.google.com/open?id=1asI87wDCt_Lge4r2ox0IWztndBqk3fYn) for full snapshot).

9.8M free memory in 9 free block(s)

Of the many entries (like the two below with parenthetical values removed), these two appear to be the majority memory consumption candidates (as Dark Photon mentions above). The amount used varies throughout the log.

[ 336] 0x375bf740: used 6.3M 'Texture blob'
[ 840] 0x37c03760: used 1.5M 'KHRN_IMAGE_T.storage'
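
Entries like these can be tallied per label to see which allocation types dominate a dump. The sketch below is an assumption-laden illustration: the line format is inferred only from the two entries shown above (with or without the parenthetical values), the K and M suffixes are assumed to mean kilobytes and megabytes, and summarize is a hypothetical helper name.

```python
import re
from collections import defaultdict

# Matches e.g.: [ 336] 0x375bf740: used 6.3M 'Texture blob'
# Any parenthetical details between the size and the label are skipped.
_ENTRY = re.compile(r"used\s+([\d.]+)([KM]?)\s.*?'([^']+)'")

_UNIT_TO_MB = {"": 1.0 / (1024 * 1024), "K": 1.0 / 1024, "M": 1.0}


def summarize(dump_text):
    """Return {label: (entry_count, total_megabytes)} for a vcdbg reloc dump."""
    totals = defaultdict(lambda: [0, 0.0])
    for line in dump_text.splitlines():
        match = _ENTRY.search(line)
        if match is None:
            continue  # summary lines, corruption warnings, etc.
        size = float(match.group(1)) * _UNIT_TO_MB[match.group(2)]
        label = match.group(3)
        totals[label][0] += 1
        totals[label][1] += size
    return dict((label, tuple(pair)) for label, pair in totals.items())


SAMPLE = """\
[ 336] 0x375bf740: used 6.3M 'Texture blob'
[ 840] 0x37c03760: used 1.5M 'KHRN_IMAGE_T.storage'
"""

usage = summarize(SAMPLE)
```

Running this over successive dumps would show whether the 'Texture blob' and 'KHRN_IMAGE_T.storage' counts keep growing.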

Then, in one section, I found the following error messages.

0x2c570da0: corrupt trailer (space -256 != 1572961)
0x2c570da0: corrupt trailer (guards[0] = 0xffffff00)
0x2c570da0: corrupt trailer (guards[1] = 0xffffff00)
0x2c570da0: corrupt trailer (guards[2] = 0xffffff00)
0x2c570da0: corrupt trailer (guards[3] = 0xffffff00)
0x2c570da0: corrupt trailer (guards[4] = 0xffffff00)
0x2c6f0e00: corrupt entry (space 0xffffff00)
resynced at 0x2c6f0fe0 (skipped 480 bytes)

I'm not sure what to do next other than read the GL runtime debug log to see if I can find what is going on in OpenGL and how to fix it in Kivy or I can try to figure out what screenmanager is doing when it issues the OpenGL commands (or whatever uniform upload is getting). My problem is not knowing how these graphic layers operate under Kivy and which might be suspect.

Below is a recap of other suggestions from forums:


vcgencmd get_mem gpu returned 512 always, which I defined in RPi config.
There are screenmanager transitions that do not exhibit this same 0x502 error but eventually show glGetError 0x505 with low free memory.
I have attempted these tests with different images with only two jpg files in low resolution but the same results eventually occur with 0x505.
I was not able to free the GPU memory (tested using vcdbg reloc) using echo # > /proc/sys/vm/drop_caches.
Got same results after changing RPi graphics to GL Driver (Full KMS) in raspi-config->Advanced->GL Driver.
Got same results after installing gl4es, libgl1-mesa-dri, and xcompmgr.
Generated a huge (>60K lines) realtime OpenGL Debug Kivy console log that shows many internal GL errors. This was enabled using os.environ['KIVY_GL_DEBUG'] = '1' in Kivy app.



Things to try going forward:
1. Finish setting up QEMU to allow others to see the execution to duplicate and maybe fix.
2. Find a way to 1) free memory, 2) fix shader/screenmanager, 3) get dev support, 4) examine log to find module causing memory leak.

Any other suggestions will be greatly appreciated by anyone and everyone.

FrankGould
03-20-2018, 07:18 AM
Greetings Dark Photon and others watching this thread. I have found a Kivy companion (dolang on #kivy) who is much more familiar with the graphics core in Kivy than I am. He suggested we further isolate the problem to determine which direction to go next. In Kivy, there is a context.pyx module that manages the graphics memory resources where we inserted the following lines of code to display the texture, canvas, and frame buffer object resources it had acquired.

l_texture_count = get_context().texture_len()
print 'l_texture_count,l_canvas_count, l_fbo_count=' + str(l_texture_count)

What we learned from this exercise was that these three counts initialized at (2, 6, 0) then as images are processed in Kivy, they maximize out at (16, 11, 2), as shown in the attached Rasp-Stretch-Kivy-Context-log-20Mar.txt file. Eventually, because of something outside Kivy, the memory is consumed until glGetErrors 0x505 messages begin appearing, indicating no more memory.

While these tests were running, we also checked memory using 'sudo vcdbg reloc' and found that 'KHRN_IMAGE_T.storage' and 'Texture blob' were not releasing their resources. Before running the Kivy test app, free memory was 427M, but during the 0x505 messages free memory was 9.8M and declining. This can be seen in the link (https://drive.google.com/file/d/1asI87wDCt_Lge4r2ox0IWztndBqk3fYn/view) from my post above on 03-18-2018 at 10:58 AM.
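
The decline in free GPU memory could be tracked over time by sampling the summary line of the dump. This is a sketch under assumptions: the "9.8M free memory in 9 free block(s)" format is taken from the single line quoted in my earlier post, the unit suffix is assumed to be K or M, and both helper names are hypothetical.

```python
import re
import subprocess

# Matches e.g.: 9.8M free memory in 9 free block(s)
_FREE_SUMMARY = re.compile(r"([\d.]+)([KM])\s+free memory in (\d+) free block")


def parse_free_summary(dump_text):
    """Return (free_megabytes, free_blocks), or None if no summary is found."""
    match = _FREE_SUMMARY.search(dump_text)
    if match is None:
        return None
    size = float(match.group(1))
    if match.group(2) == "K":
        size /= 1024.0  # normalise kilobytes to megabytes
    return size, int(match.group(3))


def sample_gpu_free():
    """Run 'sudo vcdbg reloc' once and parse its summary (Raspberry Pi only)."""
    out = subprocess.check_output(["sudo", "vcdbg", "reloc"])
    return parse_free_summary(out.decode("ascii", "replace"))
```

Calling sample_gpu_free() every few seconds while the slideshow runs and logging the result next to the Kivy texture, canvas, and FBO counts would show whether free memory falls while those counts stay flat.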

We searched through the Kivy code files and were not able to grep KHRN in any code files but learned from an online search that it stands for Khronos, which we believe is an OpenGL brand name.

What this appears to indicate is that the memory leak appears to be in the OpenGL code and we wanted to report this as bug for correction or learn where we should test next to isolate the problem more specifically.

If possible, is there someone at OpenGL who can help us find a solution to this memory leak? It would be greatly appreciated to learn what we should test or explore next.

Dark Photon
03-20-2018, 06:31 PM
A suggestion: You probably should nail down which application is allocating these images via the RPi GPU driver. At least then you know who to ask about for more info or (if you establish that it's a bug) to report the bug to.

On that note...

Clone this GIT repo (source code for ARM side libraries for interfacing to Raspberry Pi GPU):



git clone https://github.com/raspberrypi/userland


If you search it, you'll find that this contains plenty of references to those "KHRN_IMAGEs". For instance, see egl_khr_image.h, egl_server.h, khrn_image.h, khrn_int_image.h, khrn_int_image.c, etc. It appears it is the underlying representation used for EGL, VG, WFC, etc. images inside the userland system libraries.

If you look at eglCreateImageKHR() in egl_khr_image_client.c, which is one place which operates on KHRN_IMAGE formats, you can see that there is some logging in here:


vcos_log_info("eglCreateImageKHR: dpy %p ctx %p target %x buf %p\n",
dpy, ctx, target, buffer);


If I were you, I'd search through the source code and figure out how to turn on that logging (see vcos_logging.h and vcos_log_set_level()). Even better would be to get it to log the app so you could tell who was allocating what.

Also, you might double-check your GPU memory dump and make sure there isn't anything in there that identifies the app, process, and/or thread that allocated or owns the images. That would help you walk this back to the instigating app/process as well.

A few links down into that repo above:

https://github.com/raspberrypi/userland/blob/master/middleware/khronos/common/khrn_image.h
https://github.com/raspberrypi/userland/blob/master/interface/khronos/common/khrn_int_image.h
https://github.com/raspberrypi/userland/blob/master/interface/khronos/egl/egl_client.c

FrankGould
03-22-2018, 07:08 AM
THANK YOU Dark Photon for your help! You are right. It turned out to be a problem with a kivy module that was fixed by development. I really appreciate having you help us isolate, find, and fix this problem. You spent a valuable amount of time and effort that is greatly appreciated.

Dark Photon
03-22-2018, 05:50 PM
Glad you got it figured out!