Well, xf86-video-intel 2.5.0 is out finally. About 3 weeks later than we would have liked, but at least we got most of the blockers (nearly all of them in fact) fixed and stabilized the 3D side of the release. As I mentioned in the release announcement this release includes lots of new code and puts us in a good position to really improve EXA and support kernel mode setting quickly.
GTT mapping & pixmap management
One of the things that GEM enables is improved memory management in the 2D driver. In the past, the 2D driver statically allocated several chunks of memory for use by various GPU structures, 2D pixmaps, the front, back and depth buffers, and 3D textures. Needless to say this was fairly inflexible. With GEM in place, most of those static barriers have been eliminated, with the exception of the EXA offscreen pixmap area. Fortunately, EXA (and its close relative, UXA, which is also part of the 2.5.0 release) allow the driver to manage pixmaps if desired. So following the 2.5.0 release I started hacking on the driver to do its own pixmap management. In order to do this well, however, EXA’s PrepareAccess hook should map pixmaps through the GART using write combined access, since the alternative is to map the actual RAM backing store of the GEM object and flushing the CPU caches at FinishAccess (turns out this is fairly slow). This, in turn, means that the kernel needs to allow mapping of objects through the GTT aperture of the CPU’s address space. The patches to allow that are fairly complete at this point, and even include fence register allocation & setup for mapped objects. This should allow us to tile our offscreen pixmaps at some point (fairly soon on 965, and with a little more work on pre-965). As long as we don’t fall back to writing the objects with the CPU very often (thus putting pressure on the limited number of fence registers available and incurring page faults) this should be a performance win.
Kernel mode setting
All of the above is also required for kernel mode setting to work. In a KMS world, the kernel will be allocating a front buffer and will expect the 2D driver to use it for all memory management and fence register allocation. Now that I’ve got these pre-requisites working fairly well, I’ve been able to revisit the KMS code and fix it up in preparation for merging into 2.6.29. Yesterday I fixed merged the latest bits from the drm-next and drm-rawhide branches into the drm-rawhide-intel branch and got a root weave going with a slightly tweaked 2D driver. I should be able to post the required patches this week and get them cleaned up for merging (hopefully next week, assuming the GTT mapping stuff goes in).
And just when I thought I was done travelling for awhile… Turns out I have to spend a few days in London for top secret meetings (should be good actually but it would be nice to be home for a change).
On the plus side, GEM looks like it’s finally queued up for 2.6.28! This is huge, since we’ve all been waiting for some kind of memory manager to land for many years now. If there’s any justice in the world, Eric should never have to buy beer again in the company of anyone even slightly involved with Linux graphics and desktop development.
With that out of the way, some of the other stuff queued up behind the memory manager can finally land; Dave and I had talked with Linus about merging the kernel mode setting bits, but now that I’m out of the loop for a week or so I’m not sure if I’ll be able to help much with that (not to mention that I’m trying to fix what I hope is the last issue with my GEM GTT mapping patches, which now include fence register management, so that xf86-video-intel with GEM based pixmap management becomes viable). DRI2 should also be coming soon, though I haven’t talked with Kristian about it recently.
This quarter’s release of the Intel driver has been somewhat delayed though, despite all the good news. Trying to stabilize the GEM stuff on the 3D side took a lot of Eric’s time, but it looks like things are in reasonable shape there now. We’ve fixed a few of the more offensive 2D bugs recently too, but there are still more open than I’d like at this point, but I suppose we can always do point releases later as things get fixed up. So when a new libdrm comes out (which should be possible now that GEM is queued up and the interfaces are stable) I should be able to spin the final 2.5.0 package, completing the 2D side of the release.
Well, more later. I have to go dig up a separate machine so I can figure out what’s going on with my GTT mapping code (no kernel mode setting means I can’t see the panics I’m causing yet, sigh).
It’s been a busy couple of weeks since I got back from Plumbers. The merge work seems to be going well; we finally got an ack from Nick on the GEM stuff, so we should be able to push that for 2.6.28. Eric put together a rollup tree with everything we’d like to push (this will probably be the busiest release for i915 since it was initially written), including GEM, several bug fixes, the reworked vblank code (which also contains bug fixes), and support for MSIs, which can save us a lot of CPU time in shared IRQ configurations. The xf86-video-intel 2.5 release is slowly coming together as well, with several bugs fixed, including some of the nastier, EXA related ones. The related 3D and kernel releases are also maturing well, so hopefully this quarter’s release will be stable and feature filled. PCI has been pretty busy too; lots of reviews to do, lots of fixes streaming in and a few new features too (still hoping to get the I/O virtualization support in before the merge window opens too). Finally, the top secret soon-to-be-released platform work is going well; I finally managed to get my bits into a state I’m fairly happy about, so I’m looking forward to seeing it out in the wild.
So the GEM merge, and more generally, some kind of memory manager merge, has been a looong time coming. It looks like the wait will finally be over once the 2.6.28 merge window opens. To recap, having a real memory and execution manager like GEM enables all sorts of good features: reasonable texture-from-pixmap support for our 2D/3D stack, in-kernel mode setting, removal of user level register access from our 2D driver, redirected & composited direct rendering, and probably several more that I’m forgetting. With the kernel bits in place we should be able to merge the kernel mode setting code finally too (it’ll be disabled by default so it should be fairly safe to merge early), and accelerate our pace of integration into the kernel. That means things like improving the panic stuff I demoed at LPC to be even better, and providing some additional features and generally polishing things. Yay!
We’ve been closing bugs at a fairly good rate recently, but I still have some concerns about our stability on G4x machines. I should be getting mine soon though so I can help with that, and Zhenyu will be back from vacation next week to finish off some of the fixes he’s been working on, so I’m confident we’ll be in good shape by the time we release. Carl, meanwhile, has been doing some good work on EXA, closing out the notorious “EXA slower than XAA” bug at long last. With a recent X server, the driver is actually in pretty good shape, performance-wise. Carl is trying to track down a couple of more issues for the release, so if you’re someone who can easily reproduce EXA problems, please add your $0.02 to the various EXA bugs; I’m sure Carl would appreciate it.
On a related note, we’ve finally been able to release the IGD OpRegion specification, which should allow distributors and developers to more easily hack on the driver to support various platform features, like ambient light sensors and hot keys. And for pre-OpRegion platforms, I’ve been trying to add more support to our VBIOS support code to at least document the various structures and handshaking registers, I hope to be able to push that work out soon, though I doubt I’ll have time to writeup a full PRM like we did for the OpRegion spec.
The PCI arena has been fairly busy over the last couple of weeks. Several people got involved in the e1000e NVRAM fire drill, including me (since the X server was initially though to be to blame, and it just so happens that I also wrote the code the X server was using to bang on PCI registers). In the course of debugging the problem, we added several safety checks to various bits of code (the PCI sysfs layer, the I/O resource management layer, the e1000e driver itself, etc.) which turn out to be generally good to have regardless. On the patch management front things have been looking good; we’ve merged several cleanups and fixes (including at least one annoying regression) in the past couple of weeks. The PCI slot naming and SR-IOV support patches aren’t merged yet, but they’ve been getting a lot of review so I hope to be able to pull them in soon (and in time for 2.6.28).
Well that’s it for now. Back to doing ‘git pull’ every few minutes to see if the GEM bits have landed yet. :)