Answers and Questions » Blog Archive

Laying Blame

As I mentioned last week, I’ve been resurrecting a project to report on compiler warnings. A basic form of this buildbot is now operational on the Firefox tinderbox tree (look to the far right for the static-analysis-bsmedberg column). It prints a summary of the total number of warnings on the summary page: in the full tinderbox log, it lists each warning and who can be “blamed” for that warning:

/builds/static-analysis-buildbot/slave/full/build/memory/jemalloc/jemalloc.c:177:1: warning: C++ style comments are not allowed in ISO C90 blamed on Taras Glek  in revision hg:36156fbf817d8a0e2d54a271cf0bff94a1c41c13:memory/jemalloc/jemalloc.c
/builds/static-analysis-buildbot/slave/full/build/js/src/jsdbgapi.cpp:712: warning: ISO C++ forbids casting between pointer-to-function and pointer-to-object blamed on brendan@mozilla.org in revision cvs:3.36:js/src/jsdbgapi.c

Assigning blame can be a tricky process. In order to figure out the blame for a warning, the code uses the following steps:

Resolve relative paths against the current working directory, using GNU make “Entering/Leaving directory” markers as a guide.
Dereference symlinks to find the source tree location of an error. For instance, Mozilla headers which produce warnings often do so via paths in dist/include. We have to resolve these to their original source tree location in order to find blame.
Using mercurial APIs (through python), find the mercurial changeset which introduced the line in question.
If the code dates back to Mercurial revision 9b2a99adc05e, which is the original import of CVS code to Mercurial, use a database of CVS blame to find the original CVS checking which was responsible for introducing that line of code.

If you’re interested, take a look at the build log parsing code, or see the scripts which save CVS blame to a database (thanks Ted!).

The current reporting system for warnings is very primitive. I’m currently working on a second version of this code which will provide additional features:

Compare warnings with the previous build and highlight “new” warnings. I do this by recording the error text and the blamed location of the warning. As lines are added and removed from the code, the reported location of the warning changes, but the location of Hg/CVS blame doesn’t. This means it is a stable location which can be used for comparisons across runs. It even works across file renames/moves!
Web frontend to the warning database to allow users to query warnings by user or directory.
Classify warnings by “type”. This is not a simple process, because GCC mixes distinctive error text, such as “may be used uninitialized in this function” with variable names; and the granularity of -fdiagnostic-show-option is low enough that it’s not very useful by itself. Oh, I wish GCC had error codes like MSVC does: C1234 is easy to recognize!

At one point, I thought I could implement all of the warning mechanism on the buildbot server by parsing the BuildStep logs. It quickly became clear that I couldn’t do that, because I couldn’t resolve symlinks, and getting Mercurial blame was difficult or impossible. My new version actually uses a hybrid mechanism where the build log is parsed on the buildbot slave: this parses out warnings, resolves symlinks, and looks up blame. It then sends the results back via stdout to the master. A custom build step on the master parses this log, saves the information to a database, and does the checking for new warnings and prints various results to custom build logs.

This entry was posted on 9 Dec, 2008 at 6:22 pm and is tagged with , build-system, buildbot, gcc, mercurial, warnings. Trackback.

5 Responses to “Laying Blame”

gz Says:
December 9th, 2008 at 6:45 pm
How interesting, I just wrote some very similar get-blame mercurial code the other day, but with an eye to reporting on runtime warnings and assertions, not build issues. I think a lot of your ideas are transferable (I’d not considered going back to CVS history, was just going to give up on hg rv 1).
Victor Bogado Says:
December 10th, 2008 at 10:20 am
Using the line information to put blame is not correct, it is trivial for someone to introduce a warning in a different line. For instance if I were to remove the last use of a variable and not remove the declaration the build will have a warning on the declaration line that is responsibility of another person, but in fact it was I that should have removed the declaration.

Off course that the ideal solution would be to build on every “commit” and check if there were new warnings and blame them on the person who just made the commit, but for large source bases as mozilla this is not practical.
Benjamin Smedberg Says:
December 10th, 2008 at 10:27 am
Victor, you are of course technically correct. But since we don’t use blame to enforce things, it doesn’t really matter. A complete build of mozilla with static checking, without -j, takes about 3.5 hours, and so it’s infeasible to do that on every checkin.

I’m working on the solution to identify new warnings independently, as noted.
Victor Bogado Says:
December 10th, 2008 at 12:52 pm
I think that a database of when such warnings did appear would help a lot.

By the way a complete build takes 3.5 hours, but a progressive build, by this I mean keeping the ‘.o’ files from the previous commit, could be quite faster, or do you already do it like this?

I am just trowing ideas and thoughts, it is not my intention to criticize your work or the work of the mozilla community, please don’t take this the wrong way. :-) I love mozilla and firefox and always tell people to use them and lately is getting harded to do this as people are already using firefox. I feel that I have to say that since it is sometimes hard to tell when one is simply being a troll. :-)
Benjamin Smedberg Says:
December 10th, 2008 at 12:57 pm
The problem with doing progressive (incremental) rebuilds is that you only will get the warnings for files that need to be rebuilt. To actually make this work you’d have to associate the warning with the file being rebuilt, which gets really complicated really quickly (at least in a make-driven system).

Laying Blame

5 Responses to “Laying Blame”

Leave a Reply

Recent Posts

Archives

Categories