Archive for the 'mozilla' Category
Mysterious Aero Peek Bugs (Part 1)
Aero Peek is one of the new Windows 7 features that we are integrating Firefox with. It adds an entry in the taskbar for each tab. Hovering over the entry shows a preview of the tab contents. IE8 and Opera support this (Chrome has it disabled). Though it slipped from the release of 3.6, we intend to ship it in Firefox 4.0.
Browsers can implement this feature with varying levels of polish. IE8 was the flagship use case when it shipped with Windows 7 and Microsoft invited other browser vendors to follow along. One of the noticeable drawbacks in the IE8 implementation is that when you preview a tab, the chrome does not change to reflect the tab the user is previewing (e.g. the URL shown is that of whatever tab was last selected instead of the previewed tab). I found this to be an unsatisfactory and potentially confusing user experience, so in Firefox’s implementation we display the correct chrome when you preview a tab. This has not been without its hardships. I do not know why IE8 does not switch chrome, but there are definitely bugs that affect us and not IE8 because we have chosen to switch the chrome for the previews. One of these is that the previewed tab content is blended over the window as it was last drawn. For semi-transparent parts of our new UI like the location bar and unselected tabs, this causes them to look different in the preview. You can see part of the window’s current tab’s URL when you preview another tab and this is clearly unacceptable. So now we have a choice to make:
- We can stop switching chrome for previews.
- We can disallow semi-transparent parts of our theme.
- Maybe we can control the window contents when the preview is drawn so that we blend with an empty window.
Option 1 is a last resort and option 2 is still pretty undesirable. Option 3 is rather tricky due to the API design. Rather than notify us when preview starts/stops (a “peek session” I’ll call it), the API merely asks for bitmaps for live preview so we can’t just change our painting when we’re in peek. But hey, we have support for custom window previews, not just tab previews. We could set up a custom window preview to draw a transparent black bitmap for the window’s preview and then draw our tab preview as expected. In isolation, the window preview code works but when combined with the tab previews, it mysteriously stops working. We get requests for the window preview and the tab preview and we dutifully return the correct bitmaps for those. Windows will display the tab preview’s bitmap on top of the “waiting” animation for the window preview. It’s really quite strange as all the API calls return success.
This bug is possibly related to another one which I think frequently causes woe. The requests for bitmaps are asynchronous: they are posted to the window’s thread queue. Say you have three tabs: A, B and C. If you move quickly from A to C, stopping at B long enough to see the spinning circle animation but leaving before we finish rendering the preview, Windows will forget for the remainder of the peek session that it ever got our rendered preview. It also never asks for it again for the remainder of the peek session.
If anyone has insight into fixes (either on our end or coming from Microsoft) or workarounds, please let me know.
Help Needed with Aero Peek Tab Previews
Bug 522262 is one of the more important blocking bugs for turning Aero Peek on for the next version of Firefox (with a slight chance of backporting to 3.6). I am having trouble determining the cause and reliably reproducing it on my machine. As some of you may know, I am a full-time graduate student (school takes priority) and my Windows 7 laptop is suffering from intermittent but frequent screen corruption that is becoming increasingly difficult to fix. For both of these reasons, I have not been able to work on the taskbar tab preview feature for the last 2.5 months. If you have experience debugging on Windows and would like to help out by determining the cause (or even a fix!), please feel free to do so! I would be very thankful for any assistance in tracking down the cause. And if there are other Aero Peek bugs that interest you (for example, this bug should be fairly easy to fix), by all means please take a stab at them.
Windows 7 Tab Preview Preview
The screenshot shows the IE 8 preview-per-tab model, but the architecture is very flexible. The patches are almost done; just a few details left to work out. I’ll write a followup post with all the technical details when it’s done.
hg qimport my-bugzilla-patch redux
Some of you may remember my previous post about a little script that will import patches from Bugzilla into your local queue. I haven’t worked on it much since October. Yesterday I took a look into converting it into an extension and you can now see the results. Usage is simpler than before: hg qimportbz 418454 should be all you need to type in most cases.
Measuring performance
TL;DR: Dates are terrible; use timestamps. Don’t expect good Date.now() resolution from Safari/Opera/IE on Windows.
Introduction
A common testing method in benchmarks, including our own Talos tests, is to measure and compare performance by recording how long it takes to run a given set of tests. In JavaScript, this is often done by using Date.now() to record the start time and the end time. The duration is computed by subtracting the start from the end. This seemingly simple method has hidden complexities and imposes undue requirements upon the Date.now() implementation.
Assumptions
Let’s take a step back to consider the assumptions this testing method makes about Date.now(). For clarity and simplicity, I assume that time travel is impossible and relativity doesn’t come into play. Thus if we were to have some function, called continuously, that represents the current time as a number, it would be continuous and increasing with a constant slope. (OK, this is starting to fall into the realm of philosophy and physics.) These are the two properties we want in our function that records the start and end times for our benchmark.
Dates
Dates do not have these properties. In the notation used by humans for dates, the representations are ambiguous and finding the duration between two dates can be complicated in that notation. Even with the notation used internally by the computer, issues still arise.
DST
In the United States, on the days when daylight saving time changes, 2am occurs twice or never. This breaks the continuous and increasing properties. Switching to UTC avoids this, but having to convert from UTC into your local timezone is burdensome. We can have the computer automate this, sure, but local times are still ambiguous if that’s how we look at them. It’s hard to tell duration when you have to take DST into account. This is further complicated by the fact that the laws governing DST change frequently and vary from country to country. Converting from UTC thus depends on having a system that has up-to-date DST information and of course a correct implementation. So if you are looking at performance results on an out-of-date system, you may see different durations than on an up-to-date system. How many Linux/UNIX/Java/PHP users have up-to-date zoneinfo? Microsoft pushes out DST updates via Windows Update and Apple presumably does the same via their Software Update. Older systems like Windows 98, Mac OS X 10.2 and most Linux distributions from more than a few years ago no longer receive updates. So when doing performance tests (or checkins around the DST change), dates can be problematic because you cannot assume that there are 24 hours in a day. Thanks to leap seconds, you cannot even assume 60 seconds per minute. I’m not sure whether any operating system supports them; any box using Unix time certainly does not.
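To make the "2am never occurs" case concrete, here is a minimal sketch. It assumes fixed UTC offsets for US Eastern time (EST as UTC-5, EDT as UTC-4) rather than a real zoneinfo database, and it uses the 2009-03-08 spring-forward change purely as an example date. It compares the duration you would read off the wall clock with the duration that actually elapsed.

```python
from datetime import datetime, timedelta, timezone

# Assumed fixed offsets for US Eastern time; a real program would consult zoneinfo.
EST = timezone(timedelta(hours=-5), "EST")
EDT = timezone(timedelta(hours=-4), "EDT")

# Wall-clock readings on either side of the spring-forward change.
start_local = datetime(2009, 3, 8, 1, 30, tzinfo=EST)
end_local = datetime(2009, 3, 8, 3, 30, tzinfo=EDT)

# Ignoring the offsets, the wall clock appears to have advanced two hours.
naive_duration = end_local.replace(tzinfo=None) - start_local.replace(tzinfo=None)

# Aware datetimes with different offsets are subtracted in UTC,
# which gives the time that actually elapsed.
true_duration = end_local - start_local

print("wall clock says:", naive_duration)
print("actually elapsed:", true_duration)
```

The two results differ by exactly the hour that the local clock skipped, which is the kind of error a benchmark would pick up if it computed durations from local dates.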
This matters why?
But computers don’t use human notation for dates. Windows, OS X, and Linux all represent dates as an offset from an epoch, so why bother concerning ourselves with the human representation? User interface matters. Start/stop dates for the tinderbox are important to get right to track regressions and connect them to checkins (though with hg, figuring out the changeset is much easier). Like DST, timezones are problematic for date representation, but so long as they are clearly marked and the user is aware of the timezone, it is not too problematic. Still, many apps are not aware of timezone changes (Thunderbird included, sadly) so this small burden remains on the user to be aware of which timezone apps are using.
Still not enough
Supposing that the DST problems are overcome, there remains the problem of clock drift. Nearly every system routinely synchronizes time via NTP with some master time servers. This results in small and usually unnoticeable corrections to the system time which break our desired properties. One might argue that such offsets are rare and small and thus hardly noticeable, but they are on the order of milliseconds to 100s of milliseconds which means that they can come up and make a difference. Turning off NTP sync is undesirable since clock sync between computers is important for coordinating distributed tasks and having a reliable clock to look at to find out if you’re late to a meeting or not.
Resolution
Which brings up the next concern: time resolution. Put yourself in an operating system architect’s shoes for a moment. You’re designing an API and internal structures to keep track of the time. Let’s look at some use cases using Windows as an example.
1) The taskbar clock – this updates every minute or so
2) The analog clock in Vista – this updates once a second to draw the second hand
3) A fancy analog clock – this updates at a continuous 60 frames per second to draw a smoothly animating second hand
I cannot think of any other use cases that need to know the whole date at a rate faster than 60 fps (~16 ms). Coincidentally (or maybe not), the Windows system clock updates roughly every 15.6ms on most NT-based systems. Given the use cases discussed above, it is a reasonable rate.
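One rough way to observe a clock's update granularity yourself is to poll it until the reported value changes. The sketch below is only an illustration: the helper name `observed_granularity` is made up for this example, and the numbers it prints depend entirely on your operating system and Python build.

```python
import time

def observed_granularity(clock, samples=5):
    """Return the smallest positive step seen between consecutive clock readings."""
    smallest = None
    for _ in range(samples):
        first = clock()
        second = clock()
        # Spin until the clock reports a later value.
        while second <= first:
            second = clock()
        step = second - first
        if smallest is None or step < smallest:
            smallest = step
    return smallest

wall_step = observed_granularity(time.time)
perf_step = observed_granularity(time.perf_counter)
print(f"time.time step:    {wall_step * 1000:.4f} ms")
print(f"perf_counter step: {perf_step * 1000:.4f} ms")
```

A clock that only advances once per kernel tick will report a step close to the tick length, while a counter-based clock will typically report a much smaller one.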
However, once we start using Date.now() for performance tests, 15.6ms is not fast enough. Some of the SunSpider benchmarks from Apple run in just a few ms. Either these benchmarks need to be made longer or the resolution of Date.now() needs to increase. There are drawbacks to both. Increasing the test length means that the Talos boxes take longer to run and so regressions take longer to find and we need more machines to keep up with the rate of checkins. Also, sitting around for hours waiting to find out if your changesets regressed performance is no fun. Looking at the second option, it’s not immediately clear how to increase the resolution on Windows’ GetSystemTimeAsFileTime API. This was my challenge in bug 363258. OSX and Linux have a 1ms resolution via gettimeofday() so this is a Windows-only issue. Just as Windows/OSX/UNIX store dates as an offset from an epoch, so can we. There are a few methods for measuring durations in Windows: GetTickCount, timeGetTime and QueryPerformanceCounter. There are drawbacks to all of them.
GetTickCount
GetTickCount returns the number of milliseconds since the NT kernel booted. A tick is the timer interrupt from the APIC on x86 systems (and because Mozilla doesn’t run on the other platforms supported by NT I won’t consider them). At each tick, the system time is updated by a certain amount as specified by Get/SetSystemTimeAdjustment or “the system’s internal adjustment mechanisms.” This computed time since boot is stored in a shared read-only page mapped into each process, so the API call is fast since it just reads from that page, but resolution is still limited by the kernel tick rate. Furthermore, the value returned is a DWORD (32-bit int), so every 49.7 days the value will overflow. There is a GetTickCount64 which does not have that problem, but it is available only on Vista and later.
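The 49.7 day figure follows directly from the width of the counter, and a single rollover can be tolerated by subtracting modulo 2^32. A small sketch of both calculations (the function names are made up for illustration):

```python
TICK_MODULUS = 2 ** 32            # GetTickCount returns a 32-bit DWORD of milliseconds
MS_PER_DAY = 24 * 60 * 60 * 1000

def days_until_rollover():
    """How long a 32-bit millisecond counter runs before wrapping to zero."""
    return TICK_MODULUS / MS_PER_DAY

def elapsed_ms(start_tick, end_tick):
    """Elapsed milliseconds between two 32-bit readings, tolerating one rollover."""
    return (end_tick - start_tick) % TICK_MODULUS

print(round(days_until_rollover(), 1))      # 49.7
print(elapsed_ms(TICK_MODULUS - 100, 50))   # 150: the counter wrapped in between
```

The modular subtraction only works if less than one full period passes between the two readings, which is fine for benchmarks but not for measuring uptime.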
timeGetTime
timeGetTime() suffers from the same problems as GetTickCount by default. By calling timeBeginPeriod(1), the kernel tick rate is increased to 1ms sometime after the call returns. In my brief testing with this function, I found it took effect immediately on Vista and after one or two ticks on XP. This also increases the resolution of GetTickCount. Ignoring the overflow problem, why can’t we just call timeBeginPeriod and be done with it? Well, increasing the kernel tick rate increases the system load; more work is being done just to maintain a higher resolution clock. This would mean that leaving Firefox running while your laptop is on battery could reduce battery life by up to 25% (see this Intel article or this Microsoft presentation). There is a corresponding timeEndPeriod call that can restore the old tick rate, but that would require us to know when the last high-performance call to Date.now() is made.
QueryPerformanceCounter and RDTSC
QueryPerformanceCounter returns a 64 bit time stamp so it does not suffer from the overflow problem. This timestamp increases at the rate returned by QueryPerformanceFrequency. The API states that this rate does not change while the system is running. That’s not quite true. On older systems, QPC is implemented using the rdtsc instruction for x86. This returns the number of clock cycles since the processor booted. It is very high resolution and very accurate on those systems if they are uniprocessor. Newer processors such as Intel’s Centrino and Core processors dynamically change the clock speed of the processor in response to the workload and certain HALs (Hardware Abstraction Layer) for certain versions of Windows still use rdtsc to implement QPC. Needless to say, this makes it difficult to obtain accurate timings. Suppose we could force the processor into its highest speed; this can be done by the user fairly easily, especially if we’re running a CPU intensive benchmark. Note that I stated rdtsc returns the number of cycles since the processor booted. On multiprocessor systems (including multi core), the cores don’t boot at the same time. There is sometimes a noticeable difference between the two depending on your OS. Windows attempts to keep the CPU TSCs roughly in sync but the keyword there is roughly. Even in single threaded programs there is the issue of reading different time stamps when the thread switches cores. There is no way to perfectly tell which core you are on when the TSC is read because context switches can happen between any two instructions (in user space). On newer systems, there is a dedicated timer on the motherboard which QPC reads. This results in a system call and a read from the device so it is far more expensive than the TSC read but it is stable.
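As a rough illustration of the two ideas above, the sketch below converts raw counter ticks to microseconds using a frequency, and wraps a raw counter so that a reading taken on a core with a slightly lagging TSC cannot make time appear to run backwards. The clamping approach, the class name `MonotonicGuard`, the example frequency and the sample readings are all assumptions made for this example; this is not the approach taken in bug 363258, and it is not thread-safe as written.

```python
def ticks_to_microseconds(ticks, frequency_hz):
    """Convert a counter delta to microseconds; multiply first to keep integer precision."""
    return ticks * 1_000_000 // frequency_hz

class MonotonicGuard:
    """Wrap a raw counter so that reported values never decrease."""

    def __init__(self, read_counter):
        self._read = read_counter
        self._last = None

    def sample(self):
        value = self._read()
        # If this reading is behind the previous one (e.g. it came from
        # another core), report the previous value instead.
        if self._last is not None and value < self._last:
            value = self._last
        self._last = value
        return value

# Made-up raw readings; the third one is behind the second.
raw_readings = iter([100, 250, 240, 400])
guard = MonotonicGuard(lambda: next(raw_readings))
print([guard.sample() for _ in range(4)])

# One second's worth of ticks at an assumed frequency of 3,579,545 Hz.
print(ticks_to_microseconds(3_579_545, 3_579_545))
```

Clamping hides backwards jumps but does nothing about a counter whose rate changes with the processor speed, so it addresses only part of the problem described here.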
To add to the challenge, neither the TSC nor hardware timer are reliable after suspend/hibernate.
Summary
So to summarize, we have GetTickCount/timeGetTime, which suffer from resolution and rollover issues, and we have QPC, which has superb resolution but lousy consistency in some cases (though it is very fast then). How can we use these to generate a reasonably reliable high performance (multithreaded) Date.now() implementation? Read the bug or the source, or perhaps I’ll write another blog post describing the implementation. If you were hoping for a clever and perfect solution, there is none right now. My point is that calculating the date with high resolution is tricky and sometimes unreliable on Windows. So let’s revisit the duration calculation.
Duration revisited
duration = end - start where start and end are calculated using an offset from some epoch. Expanding the known implementations of gettimeofday() we get
duration = (epoch + offset_end) - (epoch + offset_start)
which reduces to
duration = offset_end - offset_start
where both offsets are measured in seconds. As you probably already knew, the current date doesn’t matter, it’s all relative. Calculating the full date to high precision just wastes time and effort. What we really care about is the offsets and it doesn’t matter what epoch they’re from so long as it’s consistent for the lifetime of the process.
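The cancellation is easy to check numerically, and in practice it means a benchmark can read a monotonic offset clock directly and never build a date. A minimal sketch follows, using Python's `time.monotonic()` as a stand-in for such an offset clock; the helper names and the example numbers are made up for illustration.

```python
import time

def duration_from_dates(epoch, offset_start, offset_end):
    """Duration computed the long way, via two full timestamps."""
    return (epoch + offset_end) - (epoch + offset_start)

# Whatever the epoch is, it cancels out.
print(duration_from_dates(0, 5, 12))
print(duration_from_dates(1_230_768_000, 5, 12))

def timed(fn, *args):
    """Run fn and return (result, elapsed seconds) using offsets only."""
    start = time.monotonic()   # offset from an arbitrary epoch, consistent within the process
    result = fn(*args)
    end = time.monotonic()
    return result, end - start

total, elapsed = timed(sum, range(1_000_000))
print(total, f"{elapsed:.6f} s")
```

Because `time.monotonic()` is not affected by system clock adjustments, the measured duration cannot go negative the way a wall-clock difference can.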
Looking back on our choices in Windows, QPC is the only viable option since we don’t know when Date.now() needs to be high precision and when it doesn’t. It would be much nicer to have a separate API like Timestamp.sample(). To use timeGetTime or GetTickCount, we could extend this to include Timestamp.begin() and Timestamp.end(). This would leave us with fairly fast implementations for Date.now() and the Timestamp functions so that JS users that would like to do performance tests or animations could do so without sacrificing performance. This API is fairly trivial to implement for other systems since they can reuse their gettimeofday() functions for Timestamp.sample() or use higher performance timers (librt for example) the same as Windows does.
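To show the shape such an API might take, here is a hypothetical sketch. The `Timestamp` name and its `sample`, `begin` and `end` methods come from the proposal above; everything else, including the reference counting and the use of Python's `time.perf_counter()` as the underlying timer, is an assumption made for illustration. On Windows, `begin` and `end` are where calls such as timeBeginPeriod(1) and timeEndPeriod(1) could go.

```python
import time

class Timestamp:
    """Hypothetical API sketch; not an existing browser or Python interface."""

    _high_res_users = 0

    @classmethod
    def begin(cls):
        # A real implementation could raise the tick rate here
        # when the first user asks for high resolution.
        cls._high_res_users += 1

    @classmethod
    def end(cls):
        if cls._high_res_users == 0:
            raise RuntimeError("Timestamp.end() called without a matching begin()")
        # A real implementation could restore the tick rate when this reaches zero.
        cls._high_res_users -= 1

    @staticmethod
    def sample():
        """Return seconds from an arbitrary epoch, suitable only for differences."""
        return time.perf_counter()

Timestamp.begin()
start = Timestamp.sample()
sum(range(100_000))
stop = Timestamp.sample()
Timestamp.end()
print(f"took {(stop - start) * 1000:.3f} ms")
```

Pairing begin and end explicitly is what lets the implementation know when high resolution is no longer needed, which is exactly the information Date.now() cannot provide.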
Moving forward
Of course, there should be some concerns with allowing a web page to alter the tick rate of the user’s computer so this should be discussed and more thoroughly thought out before committing to an implementation. And it should be standardized. Doesn’t seem right to have a Mozilla-only API (though for Talos, it wouldn’t be the end of the world).
Points of interest
And for those who are curious about other browsers’ implementations, John Resig had a nice article comparing JS time resolution on Windows.
SpiderMonkey/TraceMonkey implementation
NSPR’s PR_Now()
NSPR’s interval API
Chrome’s implementation seems to be here
V8’s implementation (look for Time::SetToCurrentTime())
Note that Chrome and V8 don’t have the multithread-safe requirement that SpiderMonkey does.
As for Safari, IE and Opera, I don’t know but it looks like they just call GetSystemTimeAsFileTime based on John Resig’s performance results.
Science of the Web
My friend Dan Schultz is currently taking a course called “Science of The Web”. For one of his assignments (see problem 2), he needs as much help as he can get. Here’s where you come in. It’s quick and easy:
1) Go to http://boom.aladdin.cs.cmu.edu/cgi-bin/ipaddy (the server might encounter an error, just refresh and it should work)
2) Enter ‘dschultz’
3) Get as many other people as possible to do the same
hg qimport my-bugzilla-patch
Since I’m only a few days away from the end of my internship, I can’t really start any large projects so I went looking for little projects I can get done. This one originated from my dislike of pushing patches from bugzilla. My steps so far have been to go to the bug, find the attachment, download it, import it into my mozilla-central patch queue and then qpush, qrm and push. Well that’s a small hassle. So today I decided to write a small python script to help me out (I would have made it an hg extension but I can’t build hg on my system due to compiler issues with python’s extension API). So here’s my script.
Usage is pretty simple:
qimportbz.py 418454
This will fetch bugzilla’s xml output for bug 418454, look for patches that are not obsolete, and let you pick which ones to import. It conveniently displays the patch description and any review flags. If there’s only one, it’ll pick it automatically. It then takes the patch and feeds it into hg qimport, automatically generating a patch name from the bug and attachment name.
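The filtering and naming steps can be sketched as follows. This is not the script itself: the XML below is a hand-written sample whose element and attribute names (`attachment`, `isobsolete`, `ispatch`, `attachid`, `desc`) are my assumption about what Bugzilla's XML output looks like, the attachment IDs and descriptions are made up, and the naming scheme in `patch_name` is only one plausible choice.

```python
import re
import xml.etree.ElementTree as ET

# Hand-written sample; attachment ids and descriptions are made up.
SAMPLE_XML = """<bugzilla>
  <bug>
    <bug_id>418454</bug_id>
    <attachment isobsolete="1" ispatch="1">
      <attachid>101</attachid>
      <desc>First attempt</desc>
    </attachment>
    <attachment isobsolete="0" ispatch="1">
      <attachid>102</attachid>
      <desc>Updated patch, v2</desc>
    </attachment>
    <attachment isobsolete="0" ispatch="0">
      <attachid>103</attachid>
      <desc>Screenshot</desc>
    </attachment>
  </bug>
</bugzilla>"""

def current_patches(xml_text):
    """Return (bug_id, attachment_id, description) for each non-obsolete patch."""
    root = ET.fromstring(xml_text)
    patches = []
    for bug in root.iter("bug"):
        bug_id = bug.findtext("bug_id")
        for att in bug.iter("attachment"):
            if att.get("ispatch") == "1" and att.get("isobsolete") == "0":
                patches.append((bug_id, att.findtext("attachid"), att.findtext("desc")))
    return patches

def patch_name(bug_id, desc):
    """Build a file-system friendly patch name from the bug number and description."""
    slug = re.sub(r"[^A-Za-z0-9]+", "-", desc).strip("-").lower()
    return f"bug-{bug_id}-{slug}"

for bug_id, attach_id, desc in current_patches(SAMPLE_XML):
    print(attach_id, patch_name(bug_id, desc))
```

With only one candidate left after filtering, a script like this can pick it automatically and hand the downloaded patch to hg qimport under the generated name.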
I plan to extend this further to automatically generate a commit message and user (preserving any already in the patch), display more flags (like approval) and automatically upgrade to newer versions of the patch posted in the bug.
Initial Glass support
Yesterday, I pushed my patch to add glass support to chrome windows for Vista (see bug for some implementation discussion). For those who don’t use Vista on a physical machine (virtual machines don’t support glass): it is a fancy blurring effect rendered using the system’s graphics card. It is part of Vista’s Aero theme but requires some hardware support beyond Vista’s minimum requirements. The changes overall are mostly trivial, but they required lots of little edits over many files.
How to use it
Start by adding the CSS property -moz-appearance: -moz-win-glass; to your XUL window. For any areas that you want to be glass, be sure to make their backgrounds transparent. This includes the window itself. You can also set the opacity on elements to have them blend with the glass as transparent windows already do. Ok, you’re done.
Well, almost done. The glass effect is possible only when the user has desktop composition enabled, which requires a reasonably modern graphics card. Also, they can toggle it on or off at runtime as with native themes. Oh, and it only seems to work with the Windows Aero theme; Aero Basic users are left behind.
Rather than add some fallback code for when glass is disabled, I left the issue to the theme designers. There’s a new system metric selector, windows-compositor, which detects whether the glass effect is enabled. Now you can set up your CSS rules like this:
window:-moz-system-metric(windows-compositor) {
  background: transparent;
  -moz-appearance: -moz-win-glass;
}
to add glass to your window’s client area (and presumably other UI changes) when the user enables composition.
Here are some example rules for emulating the fallback used by Media Player and Explorer:
window[active="true"]:-moz-system-metric(windows-default-theme) {
  background-color: #b9d1ea;
}
window:not([active="true"]):-moz-system-metric(windows-default-theme) {
  background-color: #d7e4f2;
}
window:not(:-moz-system-metric(windows-default-theme)) {
  background-color: -moz-Dialog;
}
window:-moz-system-metric(windows-compositor) {
  background: transparent !important;
  -moz-appearance: -moz-win-glass;
}
Note: When bug 431666 lands, you’ll want to use windows-classic instead of windows-default-theme in your selectors.
Gotchas
The Desktop Window Manager (DWM) draws a border around the window’s client area, but our method of enabling glass disables that, so if you want to achieve the same look as Media Player or Explorer, you’ll have to do some fancy border work. I hope to fix this in the future so that it is automatically done in most cases.
Text on glass is hard. It’s sometimes hard to read which is why Windows provides the DrawThemeTextEx function which adds a glow behind the text; this is done by the DWM for the window title of unmaximized windows. DrawThemeTextEx takes characters, not glyphs so we can’t really integrate it into our text rendering code. CSS text-shadow can fake the glow, but it doesn’t work on the XUL widgets you’d want to use for your UI. So for now, don’t count on using text on glass.
Content (the XUL browser element specifically) doesn’t render quite properly on glass. The underlying issue seems to be present in Firefox 3 (with transparent windows though) and will probably be fixed when the internal compositor for Gecko is completed. This unfortunately prevents Firefox from adopting IE7’s Vista UI.
Under the hood changes
Previously, windows were either transparent or not, so I added an enum nsTransparencyMode for the three options: opaque, transparent, and glass. Only Windows supports the glass option; the other platforms fall back to opaque. A glass nsWindow calls DwmExtendFrameIntoClientArea to tell the DWM to render the entire window as glass. This has a performance impact for large windows since the entire window has the (already expensive) glass shader applied to it, even though we are probably going to be painting most of the window opaquely. I’m looking into ways to detect which areas of the window are glass and tell the DWM to only render those areas. This would also solve the aforementioned border problem. We also have to render each window with an alpha channel, so there is a rendering performance hit.
Demos
I have two little demos to show off. The first (and its style sheet) was my testcase, which shows opaque, semi-transparent and transparent XUL on a glass window. The second uses an animating CSS transform on a green box with text on a plain glass window. Since CSS transforms haven’t landed yet, you’ll need to do a build yourself with the patch applied.
