13:10 January 24, 2012

Mozilla Automation and Testing - Jetpack Performance Testing

I have a working proof of concept for Jetpack performance testing (JetPerf): http://k0s.org/mozilla/hg/jetperf . JetPerf uses mozharness to run the Talos ts (startup) test with an addon built with the Jetpack addon-sdk, measuring the difference in performance with and without the addon installed.

Playing with Jetpack + Talos performance lets us explore statistics in a more straightforward manner than the production Talos numbers allow. For the Signal from Noise project, which I am also part of, staging even small changes in how we process Talos data takes considerable work, since the system involved has many moving parts (Talos, pageloader, graphserver). By contrast, since JetPerf is a new project, it is much more flexible for exploring the data in ways we haven't tried before.

I made a mozharness script that clones the hg mirror of addon-sdk, builds a sample addon, and runs Talos with it installed.
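
In outline, the script does something like the following (a simplified sketch using plain subprocess calls rather than the actual mozharness actions; the repository URL and the addon name/paths here are placeholders):

    import os
    import subprocess

    # hg mirror of the addon-sdk (placeholder URL)
    ADDON_SDK_HG = 'http://hg.mozilla.org/projects/addon-sdk'

    def build_sample_addon(workdir):
        """Clone the addon-sdk mirror and build a sample addon.

        Returns the path to the generated .xpi; 'sample-addon' is a
        made-up name for the stub addon.
        """
        subprocess.check_call(['hg', 'clone', ADDON_SDK_HG], cwd=workdir)
        addon_dir = os.path.join(workdir, 'sample-addon')
        # cfx is the addon-sdk's build tool; 'cfx xpi' packages the addon
        subprocess.check_call(['cfx', 'xpi'], cwd=addon_dir)
        return os.path.join(addon_dir, 'sample-addon.xpi')

Talos is then pointed at a profile with that .xpi installed, and run again without it for the baseline.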

Looking at raw numbers wasn't very interesting, so I made a parser for Talos's data format. It was pretty quick to get averages with and without the addon installed, but I thought it would be more useful to display the raw data along with the averages.
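
The extraction-and-averaging step amounts to something like this (a sketch: the report markers are an assumption — the real Talos log format differs in detail — and the log file names are made up):

    import re

    # Assumed markers: each ts startup time (ms) appears between
    # __start_report and __end_report in the browser output
    REPORT_RE = re.compile(r'__start_report(.*?)__end_report', re.DOTALL)

    def parse_ts_times(logtext):
        """Return the raw ts startup times found in a Talos log."""
        return [float(m.strip()) for m in REPORT_RE.findall(logtext)]

    def average(times):
        return sum(times) / float(len(times))

    baseline = parse_ts_times(open('talos-baseline.log').read())
    with_addon = parse_ts_times(open('talos-addon.log').read())
    print('baseline:   %s (average %.1f)' % (baseline, average(baseline)))
    print('with addon: %s (average %.1f)' % (with_addon, average(with_addon)))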

https://bug717036.bugzilla.mozilla.org/attachment.cgi?id=591224

These really aren't fair numbers, since the stub Jetpack I currently use prints to a file, but it's at least the start of a methodology.

The reason I'm sharing this isn't just to give a progress report, but to present some ideas about what to do with Talos data. While this was done for JetPerf, much of it also applies to Signal from Noise. You run Talos and get some results. What do you do with them? Currently we just shove them into http://graphs.mozilla.org/ and say that's where you process them, but I think looking at them locally is not only important but necessary if you're doing development work.

A big part of any statistics-heavy project is making it easy for all of the stakeholders to explore the data, apply different filters, and see how things fit together. While it takes a statistician to be rigorous about the process, anyone can play with statistics, and it takes a village to really conceptualize what is being looked at. To this end, I hope developers will use my software so that they can understand what it is doing and provide the valuable feedback I need.
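
As a trivial example of what I mean by playing with filters, even a Python prompt is enough to see how much a single outlier moves different summary statistics (these numbers are invented):

    times = [720.0, 698.0, 705.0, 712.0, 1340.0]  # made-up ts times (ms)

    mean = sum(times) / len(times)
    median = sorted(times)[len(times) // 2]
    # drop the maximum value, one common filter for startup spikes
    trimmed = sorted(times)[:-1]
    trimmed_mean = sum(trimmed) / len(trimmed)

    print('mean=%.1f median=%.1f mean-without-max=%.1f'
          % (mean, median, trimmed_mean))

Here the mean lands at 835 ms while the filtered mean is about 709 ms; which of these "is" the startup time is exactly the sort of question stakeholders should be able to poke at themselves.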

TODO

JetPerf is still very much at the proof-of-concept stage. Ignoring the fact that none of it is in production, there are still many outstanding questions about the basics of what we are doing here. But outside of polishing rough edges, here are some things in the pipeline.

  • test more variations of addons; currently we just load a panel and print something to a file
  • test on checkin (CI): the main point of JetPerf is to get a better idea of which SDK changes cause addon performance regressions, and to be able to quantify them. While, as stated, this is a very open-ended project, one thing that would turn it from a casual exploration into a developer tool is running the tests on checkin. This would tell us in (nearly) real time whether a checkin regresses performance.
  • graphserver: in order to assess Jetpack's performance over time, we will want to send numbers to some sort of graphserver. This will allow us to keep track of the data, view it, and apply various operations to it (a sketch follows below).
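
For the graphserver item, I haven't settled on a submission format; conceptually it is just an HTTP POST of the results, something like this (the endpoint and payload are entirely hypothetical):

    import json
    import urllib2

    def post_results(server, testname, times):
        """POST results to a (hypothetical) graphserver endpoint."""
        payload = json.dumps({'test': testname,
                              'average': sum(times) / float(len(times)),
                              'raw': times})
        request = urllib2.Request(server + '/collect',  # made-up endpoint
                                  payload,
                                  {'Content-Type': 'application/json'})
        urllib2.urlopen(request)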

I may also spin off the (ad hoc) graphing portion and the Talos log parser into their own modules, as they may be useful outside of JetPerf.