Archives for: 2010


Permalink 10:57:51 am, by jbarnes Email , 41 words, 48874 views   English (US)
Categories: Announcements [A]

Qt on Wayland: first clock!

Qt's analogclock example widget on Wayland

With the Mesa fix Kristian recently posted, the Qt drawing code for Wayland works well enough to draw the analogclock widget! Here’s hoping this is just the first of many Qt-based clocks that will run on Wayland in the future.


Permalink 07:23:06 am, by jbarnes Email , 201 words, 11439 views   English (US)
Categories: Announcements [A]

More XDS

OML extensions

Finally sat down with Mario to review his patches and ideas for making our OML sync implementation more accurate. He needs very accurate time stamping for his applications, and needs it in every configuration. He’s written patches for the DRM core and the radeon and intel drivers that dramatically increase the accuracy and precision of our vblank timestamps. We also discussed a way to get accurate timestamps and the other necessary information for buffer swaps implemented with a blit. In short, we plan to have the kernel generate an event when a swap blit completes, and include info on the current scanline and all the usual timestamp fields. Hopefully Mario will find some time to post his series to dri-devel in the next few weeks so it can get some wider exposure.
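To give a rough idea of why the current scanline is useful alongside a timestamp, here is a minimal sketch of correcting a sampled time back to the start of the most recent vblank. It is an illustration only, not Mario's patches: the `ModeTiming` fields and both function names are made up for this example, and it assumes vblank begins at line `vdisplay` and that every line takes the same time to scan out.

```python
from dataclasses import dataclass

NSEC_PER_SEC = 1_000_000_000


@dataclass(frozen=True)
class ModeTiming:
    """Hypothetical subset of a display mode; field names are illustrative."""
    pixel_clock_hz: int  # dot clock
    htotal: int          # pixels per line, including horizontal blanking
    vtotal: int          # lines per frame, including vertical blanking
    vdisplay: int        # visible lines; vblank is assumed to start here


def lines_since_vblank_start(scanline: int, mode: ModeTiming) -> int:
    """Number of lines scanned since the most recent vblank began."""
    # The modulo wraps correctly when we are already in the next active area.
    return (scanline - mode.vdisplay) % mode.vtotal


def vblank_timestamp_ns(now_ns: int, scanline: int, mode: ModeTiming) -> int:
    """Estimate when the most recent vblank started.

    now_ns is the time at which `scanline` was sampled; the time spent
    scanning lines since vblank start is subtracted from it.
    """
    lines = lines_since_vblank_start(scanline, mode)
    elapsed_ns = lines * mode.htotal * NSEC_PER_SEC // mode.pixel_clock_hz
    return now_ns - elapsed_ns
```

The point of the sketch is that the correction depends only on the mode timings and the sampled scanline, so the estimate does not get worse when the timestamp is taken late (for example after interrupt latency).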


Had a nice night out yesterday (and this morning!) with various XDS attendees. Details omitted to shield the guilty; suffice it to say we met some real characters and had a lot of fun. Hope to go out to a nice restaurant tonight before flying back home tomorrow morning (it will be nice to finally get some sleep on the plane; this has been a busy and mostly sleepless conference!).


Permalink 10:31:40 am, by jbarnes Email , 279 words, 21777 views   English (US)
Categories: Announcements [A]

XDS 2010


I’m here in Toulouse, France (a charming example of an old French city) for XDS 2010, which is going well so far. The venue is good, we have a relatively large number of attendees compared to some of the recent X conferences we’ve had, and there has been good discussion in many of the talks and of course in the hallways.

Kristian and I have been working more on the Qt Wayland port, and have basic apps like analogclock drawing. I’m working on fixing the Qt Wayland code to support more complete drawing so that other apps run; I hope to have sliders and other examples going by next week. At that point we can start the process of building libmeegotouch on top of the Qt Wayland environment and seeing what it will take to get some sample MeeGo apps running.

The ultimate goal of the work is to remove the X server (which acts as an obtrusive middle man in handset configurations especially) from the graphics stack for certain environments. We felt MeeGo was a good target because its apps are relatively new and shouldn’t have many X dependencies, and the philosophy of MeeGo is consistent with what we want to achieve. Overall, we hope for a reduction in the number of context switches (down from 3 or more to get stuff on to the screen with X and a compositor to 2 with Wayland), a reduction in the amount of code and memory needed to run the environment, and a massive simplification of the architecture, which should allow for faster and more robust development.

More to come (off to get drinks with krh, DrJakob, ickle, rib and others now).


Permalink 01:38:01 pm, by jbarnes Email , 445 words, 54292 views   English (US)
Categories: Announcements [A]

using kdb on KMS

KDB support on KMS isn’t quite upstream yet, but there’s enough code out there for it to be useful. And by describing it, maybe a few contributors will step forward to help with the remaining pieces. :)

First off, you’ll need Jason’s KGDB tree with a few fixes from me on top. I’ve collected them all at git:// in the kdb-kms branch.

Go ahead and build that kernel, and make sure you have CONFIG_KGDB_KDB=y and CONFIG_KDB_KEYBOARD=y set in your kernel config. After installing the kernel, add “kgdb=kms,kdb” to your boot command line. This enables the KMS enter/exit hooks and allows KDB to be driven from the locally attached keyboard.

Once you’ve rebooted, you should be able to enter KDB using SysRq-G or “echo g > /proc/sysrq-trigger”, or by hitting a bug or breakpoint. Resuming from where you left off is as simple as typing ‘go’ from the kdb prompt.
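Before rebooting and poking SysRq-G, it can save a round trip to check that the config options and boot parameter described above are really in place. The following is a small sketch of such a check; the helper names are invented for this example, the parameter string is copied as written in this post, and where your kernel config text comes from (a file under /boot, /proc/config.gz, and so on) varies by distro, so the functions just take strings.

```python
# Options and boot parameter as named in the post above.
REQUIRED_OPTIONS = ("CONFIG_KGDB_KDB", "CONFIG_KDB_KEYBOARD")
BOOT_PARAM = "kgdb=kms,kdb"


def missing_options(config_text: str) -> list:
    """Return the required options that are not set to 'y' in a kernel config."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Skip comments such as "# CONFIG_FOO is not set" and blank lines.
        if line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if value == "y":
            enabled.add(name)
    return [opt for opt in REQUIRED_OPTIONS if opt not in enabled]


def has_boot_param(cmdline: str) -> bool:
    """Check a kernel command line (e.g. the contents of /proc/cmdline)."""
    return BOOT_PARAM in cmdline.split()
```

Feeding it the text of the running kernel's config and the contents of /proc/cmdline should tell you whether SysRq-G has any chance of dropping you into KDB on the local console.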

The above should work ok for simple cases today, but there are several outstanding issues:

  1. console unblank support - currently when you enter KDB it will try to unblank the console. Since this path takes locks in the console, fb and drm layers, it can cause problems. Fixing this shouldn’t be too hard though; the kdb enter hook should take care of actually unblanking the console (e.g. if it had been DPMS’d off before), so all KGDB needs to do is make sure console I/O is enabled, which is a smaller console and fb bookkeeping activity. So the console hook needs to be split and the fb enter hook needs to handle preparing fbcon for I/O.
  2. cursor save/restore - right now, when you enter KDB you’ll still see the cursor if it was enabled when you entered. Saving and restoring cursor state should be handled by the fb hooks, but the DRM hooks currently don’t bother.
  3. driver support - I’ve only tested on i915, but adding radeon and other KMS support should be fairly straightforward. Likewise, adding support for plain fb drivers should be pretty easy; they just need a small function to write the scanout base register, ideally to the previously allocated fbcon memory location.
  4. enhance the DRM KMS layer to allow the reservation of a dedicated debug crtc, encoder and connector tuple - this would allow keeping the kernel console active on e.g. the VGA port, making debugging of desktop applications and the graphics stack easier.

That’s it for now, hopefully we can get at least some of this merged for 2.6.36 (the fb and DRM changes in particular are very small).


Permalink 02:37:07 pm, by jbarnes Email , 1575 words, 51173 views   English (US)
Categories: Announcements [A]



Wow, it’s been a while. Life in the land of Linux graphics has been exciting recently, and there have been a few interesting developments on the Linux PCI front as well.

Linux Graphics Maturing

The Linux graphics stack has really been maturing recently. The Intel and radeon KMS drivers are seeing a lot of bug fixing, and nouveau is getting into shape as well. At this point I think the Intel KMS driver is in better shape than the userland driver ever was (though that’s not to say it’s without defects; our serious bug count is still way too high for my liking). It supports more hardware and more features than ever, including power saving, DisplayPort and advanced rendering APIs, and has been shipping in Linux distros for quite a while now.

We recently finished off the page flipping support, and landed it upstream (it’ll be part of 2.6.33). We also landed a new, core, buffer execution interface (creatively named execbuf2), that allows for more flexibility in the way we submit our command buffers. Specifically, it allows us to control whether a given buffer needs to be mapped with a fence register for operations performed by the commands in its parent execution buffer. This allows our command buffers to be larger, since we won’t exhaust our fence registers prematurely by mapping all objects unconditionally, and allows us to enable tiled texture rendering on pre-965 chips, which can improve performance significantly for some types of rendering.
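The fence register point is easier to see with a toy accounting model. This is only a sketch and not the execbuf2 interface itself: `BufferRef`, its `needs_fence` flag and the register count passed in are all names and numbers made up for illustration, and the "unconditional" case assumes every tiled object in the batch gets a fence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BufferRef:
    """Hypothetical description of one object referenced by a command buffer."""
    name: str
    tiled: bool
    needs_fence: bool  # per-object hint, in the spirit of the execbuf2 flag


def fences_unconditional(buffers) -> int:
    """Assumed old behaviour: every tiled object is mapped with a fence."""
    return sum(1 for b in buffers if b.tiled)


def fences_flagged(buffers) -> int:
    """Per-object control: only objects whose commands need a fence get one."""
    return sum(1 for b in buffers if b.tiled and b.needs_fence)


def fits(buffers, num_fence_regs: int, per_object_flag: bool) -> bool:
    """Would this batch fit in the available fence registers?"""
    needed = fences_flagged(buffers) if per_object_flag else fences_unconditional(buffers)
    return needed <= num_fence_regs
```

With, say, a dozen tiled textures of which only a few are touched by commands that need a fence, the same batch that overflows a small register file in the unconditional model fits comfortably once the flag is honoured, which is why larger command buffers become possible.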

To support the page flipping work, I had to extend the DRI2 protocol a bit to include support for a SwapBuffers request. While I was at it, I added support for the SGI_video_sync and OML_sync_control extensions, which meant adding support for a few more requests. The SGI_video_sync addition was an important one, since its absence was a regression relative to DRI1. All this new protocol meant new Mesa and X server code, new DRI2 interfaces between the server and DDX drivers, and a bunch of testing and reworking of the interfaces as I figured things out.
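For readers who haven't used OML_sync_control, the interesting part of a swap request is the (target_msc, divisor, remainder) triple. Below is a small sketch of how I read the extension's rule for picking the MSC (vblank counter value) at which a swap should happen. The function name is made up, it assumes remainder < divisor, and the "already past the target with divisor 0" case is simplified to "no further wait requested".

```python
def next_swap_msc(current_msc: int, target_msc: int, divisor: int, remainder: int) -> int:
    """Pick the MSC at which a requested swap should occur (illustrative only)."""
    if current_msc < target_msc:
        # Target is still in the future: swap when we reach it.
        return target_msc
    if divisor == 0:
        # Target already reached and no periodic constraint:
        # modelled here as "no further wait requested".
        return current_msc
    # Otherwise wait for the next MSC after the current one that
    # satisfies msc % divisor == remainder.
    candidate = current_msc - (current_msc % divisor) + remainder
    if candidate <= current_msc:
        candidate += divisor
    return candidate
```

So a client asking for divisor 2, remainder 0 with a stale target ends up swapping on the next even frame, which is the kind of request the new DRI2 protocol has to carry to the server.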

All these new features have landed now, and should be a part of Linux 2.6.33, Mesa 7.8, X server 1.9 and xf86-video-intel 2.11. See CompositeSwap for an overview of the features and how they’re implemented. With that out of the way I’ve been able to think more about how compositors and clients should interact, so I came up with CNP. It’s not implemented yet, since I’m still gathering feedback on it, but my hope is that it will help us reduce memory consumption and partial frames in composited environments, as well as address some of the undefined behavior of current GLX calls when drawables are redirected.

Finally, after some discussions with toolkit and compositor developers, I worked with Kristian and Ian to come up with the INTEL_swap_event GLX extension (note it’s definitely possible to implement this on non-Intel as well, but only Intel has support at the moment). This extension allows GLX clients to receive X events when previously queued buffer swaps complete. Clients with mainloops can then simply poll their X event queue and do other work if their last swap isn’t done yet, rather than wasting time blocked in the server or queuing another swap and getting too far ahead of the display.
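The mainloop pattern this enables looks roughly like the simulation below. To be clear, nothing here talks to X or GLX: `FakeDisplay` is a stand-in event queue invented for the example, and the string "swap_complete" merely plays the role of the swap completion event.

```python
from collections import deque


class FakeDisplay:
    """Stand-in for an X connection; not a real Xlib or GLX binding."""

    def __init__(self):
        self.events = deque()
        self.swaps_queued = 0

    def queue_swap(self):
        self.swaps_queued += 1

    def complete_swap(self):
        # In real life the server would deliver this when the swap finishes.
        self.events.append("swap_complete")

    def poll_event(self):
        return self.events.popleft() if self.events else None


class Client:
    """Mainloop client that never queues a swap while one is outstanding."""

    def __init__(self, display):
        self.display = display
        self.swap_pending = False
        self.other_work_done = 0

    def iterate(self):
        if self.display.poll_event() == "swap_complete":
            self.swap_pending = False
        if self.swap_pending:
            # Last swap isn't done yet: do something else instead of blocking.
            self.other_work_done += 1
        else:
            self.display.queue_swap()
            self.swap_pending = True
```

Each call to `iterate()` is one trip around the mainloop: the client only queues a new swap once the completion event for the previous one has shown up, and spends the remaining iterations on other work.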

Using it all

One side effect of the new DRI2 code is that glXSwapBuffers calls are now totally asynchronous. Previous versions of DRI1 and DRI2 would either block waiting for vblank, or only return after the blit to implement the swap had completed. With the new code, a DRI2SwapBuffers protocol request ends up in the X server, where it’s scheduled by the DDX driver to occur at some later time (though in some cases it will happen immediately, e.g. if the drawable is offscreen). This leaves more time for clients to do other work while their swap occurs; the INTEL_swap_event extension can help clients take advantage of this extra CPU time.

Some optimizations are present in the new code as well. For instance, if the drawable is the same size as the current root window pixmap and there’s no clipping to worry about, the DDX driver can queue a page flip instead. This saves a tremendous amount of memory bandwidth, and so can really increase performance, especially on high resolution and/or bandwidth starved configurations (e.g. most integrated and embedded graphics platforms). Similarly, if a simple back-to-front copy is requested for a window and the back and front pixmaps are the same size (i.e. the window manager hasn’t reparented the front window to accommodate decorations and the like), the DDX can simply exchange the backing pixmap object pointers rather than blit. Again this is important on low memory bandwidth platforms (though note this code is currently disabled due to lack of testing; however it’s trivial to enable once I have some test cases).
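The decision described above boils down to a short chain of checks. Here is a sketch of that logic; the function and parameter names are invented for illustration and sizes are plain (width, height) tuples, so this is a model of the idea rather than the actual DDX code. The exchange path defaults to off, matching its current disabled state.

```python
def choose_swap_method(drawable_size, root_size, clipped,
                       back_size, front_size, exchange_enabled=False):
    """Pick how a swap could be carried out: 'flip', 'exchange' or 'blit'."""
    if drawable_size == root_size and not clipped:
        # Full-screen, unclipped drawable: queue a page flip, no copy needed.
        return "flip"
    if exchange_enabled and back_size == front_size:
        # Same-sized pixmaps: swap the backing pixmap pointers instead of copying.
        return "exchange"
    # General case: copy back to front.
    return "blit"
```

A full-screen unclipped game would take the flip path, while an ordinary decorated window falls back to a blit until the exchange path is tested and enabled.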

New hardware

With our Core i7 parts launched, I can talk about some of the hardware feature work we’ve been doing. Zhenyu has been doing most of the bringup and hardware support work for this platform, but I’ve been busy with one of the more interesting hardware features in the Core i7-6xx series, called Intelligent Power Sharing (IPS). Core i7-6xx and 7xx chips are MCPs (multi-chip packages); both the CPU and GPU/MCH are in the same physical processor package, but not on the same die. This means they share a thermal and power design domain. In many cases, only one of the components will be very busy, and thus generating much heat or drawing much power, and it would be a waste to let any extra thermal or power headroom go unused. IPS allows one component to use more than its share of the power or thermal budget so long as the other component is idle enough to allow it. One of the key parts of this technology is so-called “graphics turbo”, in other words the capability of the GPU to exceed its default frequency (and therefore thermal and power budget) when possible. I posted support for this around launch time (latest patch here), and hope to be able to post the full IPS driver soon, since the potential graphics performance upside is fairly large (still collecting measurements but I’m hoping for something around 15% or maybe even a little higher). The code also allows the GPU to downclock when idle, saving power. The CPU already has its own opportunistic turbo mode which is very effective, but there may be cases where giving it extra power will be helpful (though I’ve yet to find a benchmark; again, I’m still testing).
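The budget sharing idea can be put into a few lines of arithmetic. This is only a sketch of the concept, not how the IPS driver works: the function name, the notion of fixed per-component shares and the milliwatt numbers used with it are assumptions made for illustration.

```python
def gpu_power_limit_mw(cpu_share_mw: int, gpu_share_mw: int, cpu_draw_mw: int) -> int:
    """How much power the GPU may draw given what the CPU is currently using.

    Each component has a default share of the package budget; whatever the
    CPU leaves unused of its share is lent to the GPU.
    """
    unused_cpu_mw = max(0, cpu_share_mw - cpu_draw_mw)
    return gpu_share_mw + unused_cpu_mw
```

When the CPU is mostly idle the GPU limit rises above its default share (room for graphics turbo), and when the CPU is using its whole share or more the GPU simply falls back to its default. The same reasoning applies in the other direction for lending GPU headroom to the CPU.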


A recent thread highlighted an interesting design choice in Linux. Every platform supporting PCI (indeed pretty much every platform, PCI or no) splits its address space into multiple regions, allowing for memory mapped I/O (MMIO) from the CPU to different devices. Discovering which ranges belong to which devices is done in a number of different ways, from hard coded offsets (as is found on many embedded platforms), to firmware descriptor tables (as found in OpenFirmware or ACPI), to physically reading MMIO routing information from CPU host bridges down through the hierarchy.

There’s a drive in Linux to support the last option. After all, Linux is the operating system driving your hardware, it should do everything itself, right? Well, that’s where we get into trouble. Linux usually runs on platforms designed for Windows (either specifically for Windows or for Windows in addition to Linux). Windows generally uses the second option to make it easier to port to new platforms. For better or for worse (usually the latter) BIOS writers for new platforms generally consider their work done when Windows boots on their new platform and the Windows device manager doesn’t have any dreaded “yellow bang”s next to devices in the device tree. This usually means the ACPI tables used to describe MMIO layout need to be fairly accurate, or Windows may map a device into a location occupied by another or by a host bridge range with decode priority, causing hangs, corruption or the dreaded “yellow bang”.
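To make the failure mode concrete, here is a toy version of the bookkeeping involved: given the windows a host bridge is said to forward and the ranges already claimed, is a proposed device mapping safe? The function names and addresses are made up, and ranges are half-open (start, end) tuples; real resource tracking is considerably more involved.

```python
def overlaps(a, b) -> bool:
    """True if two half-open (start, end) address ranges intersect."""
    return a[0] < b[1] and b[0] < a[1]


def within_window(candidate, windows) -> bool:
    """True if the candidate range lies entirely inside one bridge window."""
    return any(w[0] <= candidate[0] and candidate[1] <= w[1] for w in windows)


def find_conflicts(candidate, claimed) -> list:
    """Names of already-claimed ranges that the candidate would collide with."""
    return [name for name, rng in claimed.items() if overlaps(candidate, rng)]
```

If the window list (whether it comes from ACPI or from reading the bridge) is wrong, `within_window` gives the wrong answer and a device can end up mapped somewhere the bridge never forwards, or on top of something else, which is exactly the hang and corruption scenario described above.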

In October of last year, for arguably good reason, we tried to take Linux down the last path. Yinghai Lu added support for reading root bus resource ranges directly from the host bridge on Intel systems. The thought was that we’d be insulated from firmware bugs this way, and have a more accurate view of the system in general. Unfortunately, due to the above, bridge vendors like Intel have no reason to fully document all the decode windows of a given host bridge, which bits might enable or disable decode for a given region, or generally worry about providing the sort of info we’d need to make this approach tenable. So as of now, we’ve removed the supporting code, and are placing a bet that using the same information Windows does (and hopefully in the same way) will give us the same level of portability. We actually tried using the ACPI information back in 2.6.31, I believe, but had to disable it because our resource tracking code couldn’t handle all the resources handed to us by some ACPI firmware implementations. We (well, Bjorn, hopefully) should fix that limitation for 2.6.34, and we’ll try again, and hopefully fix quite a few resource mapping related bugs in the process.

Virtuous blogs
