Lots of activity lately: Eric just landed a major cleanup to the 2D driver, there’s been lots of bug activity, and I’ve been hacking some more on the page flipping & DRI2 swap buffers support.
We still have way too many bugs open (isn’t that always the case when the bug count is > 0?), though I spent a lot of time over the last couple of weeks closing stuff out, fixing issues and generally troubleshooting configuration problems people have been having with the bleeding edge KMS, UXA and DRI2 code.
One particularly annoying issue was reported by Mateusz Kaduk. While we were debugging it recently, we found that for some reason IER (the interrupt enable reg of the GPU) was getting cleared sometime after the kernel driver was loaded (and had enabled it successfully!). During that time Dariush discovered that running vbetool to save the graphics state seemed to trigger the bug. So the VBIOS (which is ultimately what vbetool ends up running) disables interrupts behind our back, ouch! Take this as yet more evidence that the VBIOS really shouldn’t be run after you’ve booted and loaded a proper driver. Mateusz and I also worked out a workaround for the problem, which I posted here; not sure if we’ll actually ship that yet though, since the distro configs that called vbetool at boot time seem to have been fixed.
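To make the failure mode concrete, here is a minimal sketch of the kind of check that would catch a cleared IER. It is not the workaround that was posted; `irq_model` and `restore_ier` are hypothetical names, and the register is simulated with a plain struct field rather than real MMIO access.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the GPU's interrupt enable register (IER).
 * A real driver would read and write the register through MMIO. */
struct irq_model {
    uint32_t ier;
};

/* Re-enable any interrupt bits the driver expects to be set but that
 * something else (for example a VBIOS call) has cleared.
 * Returns true if a repair was needed. */
bool restore_ier(struct irq_model *gpu, uint32_t expected)
{
    if ((gpu->ier & expected) == expected)
        return false;      /* all expected bits are still enabled */
    gpu->ier |= expected;  /* put the missing bits back */
    return true;
}
```

A check like this only papers over the symptom; the real fix is still not to run the VBIOS once a proper driver is loaded.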
On the page flipping and swap buffers front, I’ve been having heavy discussions with Jakob Bornecrantz and Kristian Høgsberg lately about how things ought to work. I’m actually using one set of my code on the machine I’m typing on now, and it seems solid, but it has a few shortcomings we’d like to address. Overall there were a couple of issues we felt were important:
- no blocking - that is, a call to glXSwapBuffers shouldn’t block until the swap completes. Either the swap should occur immediately (as in the case of a less than full screen swap, which is just a blit) or it should be queued and the process should be allowed to continue rendering. My last patch set was only partially asynchronous: it would return once the front buffer base address had been updated but before the update had taken effect. That had the side effect of flushing any outstanding rendering, which could be quite expensive.
- no double rendering - with page flipping, a glXSwapBuffers call switches the back & front buffers. So the caller (if it continues to render) will perform any new rendering in the old front buffer. That rendering can’t be allowed to actually hit the old front buffer until it’s no longer being displayed. My last patch set waited on flips, but only on the same object, so in certain cases it could have allowed rendering after two quick flips to hit the still displayed front buffer.
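The two requirements above can be sketched as a small model: swaps are queued and return immediately, and rendering is refused for any buffer that is either on screen or queued to go on screen. This is a simplified illustration under my own assumptions, not the DRI2 or kernel implementation; `pipe_model`, `queue_flip`, `vblank` and `can_render_to` are hypothetical names, and buffers are plain integer ids.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PENDING 4

/* Hypothetical model of one display pipe: a single buffer is being
 * scanned out, and zero or more flips wait for later vblanks. */
struct pipe_model {
    int scanout;               /* id of the buffer currently displayed */
    int pending[MAX_PENDING];  /* queued flip targets, oldest first */
    size_t npending;
};

/* "No blocking": queue the flip and return right away.
 * Returns false only if the queue is full. */
bool queue_flip(struct pipe_model *p, int buf)
{
    if (p->npending == MAX_PENDING)
        return false;
    p->pending[p->npending++] = buf;
    return true;
}

/* Called once per vblank: the oldest queued flip takes effect. */
void vblank(struct pipe_model *p)
{
    if (p->npending == 0)
        return;
    p->scanout = p->pending[0];
    for (size_t i = 1; i < p->npending; i++)
        p->pending[i - 1] = p->pending[i];
    p->npending--;
}

/* "No double rendering": a buffer is safe to draw into only if it is
 * neither on screen nor queued to go on screen. The check covers every
 * buffer involved in a flip, not just the most recent flip target. */
bool can_render_to(const struct pipe_model *p, int buf)
{
    if (p->scanout == buf)
        return false;
    for (size_t i = 0; i < p->npending; i++)
        if (p->pending[i] == buf)
            return false;
    return true;
}
```

With two buffers, queuing a flip to buffer 1 while buffer 0 is displayed leaves neither buffer renderable until the next vblank, after which buffer 0 becomes free. A real implementation would stall or defer the rendering rather than simply refusing it.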
Beyond that there are the implementation details of how the new DRI2 protocol looks and what the exact sequence looks like on the display server and client side. Kristian is re-working things to require more of the display server (which should help Wayland), so hopefully we’ll see this work committed soon. It’s about time we had a way to enable tear-free compositing window managers.
Update: forgot to mention which bits I’m using:
- dri2-swapbuffers branches of dri2proto, mesa, xserver and xf86-video-intel
- kms-pageflip from the drm tree
- i915-dri2-swapbuffers-15.patch from the “[RFC] DRI2 swapbuffers (yes yet again)” thread on email@example.com
And since that wasn’t keeping me busy enough, I’ve been hacking on some other features lately, namely GPU reset and framebuffer compression for KMS. The first feature is tied in with some of the error handling improvements we’ve wanted for a long time. Recent GPU hangs (which are just plain hard to figure out) have motivated us to create some tools for dumping GPU command buffers, and to improve our handling of errors in the kernel. So I’ve got some code to capture error state when the GPU detects a failure, and also some code to reset the GPU, which we can use if we encounter a hang or other fatal error. They both need a little more work though: we want to capture an error record right when we receive an error interrupt, so I need to create an error structure and export it through debugfs. Once that works reliably, I should be able to hook up the GPU reset code. That should make GPU hangs non-fatal and, if we’re lucky, not even noticeable to the end user.
Framebuffer compression also needs a little more work. Right now it only supports the 965 and earlier chips; I need to add support for the G4x series and do some more testing to make sure it’s working as expected. Ah the fun never ends.
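As a rough illustration of the error capture and reset plan described above, here is a minimal sketch in which the error interrupt handler snapshots state immediately and the reset path refuses to run until a record exists. Everything here is an assumption for illustration: `gpu_model`, `error_record`, `on_error_irq` and `try_reset` are hypothetical names, the fields are not real register names, and keeping only the first record is my own simplification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical error record; the fields are illustrative only. */
struct error_record {
    bool valid;
    uint32_t error_status;
    uint32_t head_ptr;
};

/* Hypothetical simulated GPU state. */
struct gpu_model {
    uint32_t error_status;
    uint32_t head_ptr;
    bool hung;
    struct error_record first_error;
};

/* Error interrupt handler: snapshot the state right away. Only the
 * first record is kept, so later errors can't overwrite the snapshot
 * closest to the root cause. */
void on_error_irq(struct gpu_model *gpu)
{
    if (gpu->first_error.valid)
        return;
    gpu->first_error.error_status = gpu->error_status;
    gpu->first_error.head_ptr = gpu->head_ptr;
    gpu->first_error.valid = true;
}

/* Reset only a hung GPU, and only once an error record has been
 * captured, so the reset doesn't destroy the evidence. */
bool try_reset(struct gpu_model *gpu)
{
    if (!gpu->hung || !gpu->first_error.valid)
        return false;
    gpu->error_status = 0;
    gpu->head_ptr = 0;
    gpu->hung = false;
    return true;
}
```

In a real driver the captured record would be what gets exported through debugfs, and the reset would reprogram the hardware rather than clear a few fields.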