Monday, April 3, 2017

On CheckStyle and Coding Standards

Recently I was given the opportunity to work on an open source project. My team had developed a new additional feature and we were eager to get is pushed upstream before a code freeze milestone. Our team was closed source and project was proprietary, so this effort was essentially to push a portion upstream as open source.

We decided to do three pushes due to various time and resource constraints: push the basic feature, then an enhancement, then some automated integration tests. Our team was not familiar with the process or requirements of the upstream project but two thirds of the code was written under contract to us by a former major upstream committer to that project.

The push of the basic feature was rougher than I thought, given the authorship of much of the code. Most of the issues were stylistic and our reliance on the contractor to "do the right thing" had been a bit shakey. After some corrective actions, the first merge happened. Based on code review issues raised on the first push, I revised the second batch of code, and submitted a pull request. As expected, there were a few minor issues raised which I corrected.

So this project has checkstyle enabled as warnings. But the build is so full of noise and warnings (that is deserving of another rant on its own) that developers tend to ignore warnings, especially those involving things like whitespace, which happens a lot with checkstyle.

So I submitted the pull request and sort of expected an approval and merge. This was highly desirable because the code freeze deadline was only a couple of days away. Instead what I got was a request to move the checkstyle setting from warning level (the project default) to error level.

This was a stunning request for several reasons.

* First, making the code freeze deadline was now unlikely.

* Second, this would be a new requirement on our code that was not met by any of the hundreds of thousands of lines of code that already existed. Plus our previous commit was accepted without that obligation.

* Third, the time to enable something like checkstyle is day zero of a project. Turn it on then and let issues be fixed as part of every single commit. If you have never turned on an analyzer like checkstyle on to an unsuspecting code base, it is dismal. Our commit was probably a few thousand lines of code, but checkstyle emitted about 250 "errors". This would be such gems as "there is a space between your method name and the parameter list". Manually fixing all of these would be tedious and expensive - probably several days of work.

There were a half-dozen "functional" issues that I fixed, like "unused import" and "catching naked exceptions" that were entirely reasonable to ask for a fix. Those functional fixes were done and the new pull request has been awaiting approval for three days now. I guess in open source, not everyone has a sense of urgency. Complicating this, your reviewer might actually work for a competing company, so politics might interlope.

Some observations...

I wanted to point out an issue of open versus closed source. In the past, on smaller closed source teams, we never enforced cosmetic coding standards. Everyone's slightly different coding styles acted as fingerprints so you could tell without looking who the developer was. But in an open source context, where there could be large numbers of developers, it makes a lot of sense to have strict coding standards and enforecment of those standards should be enabled by default.

Readability is subjective. The wars of bracket placement, and other whitespace issues, happen all the time and clearly have no real impact to the business. To me, a perfect solution would be that whitespace is always managed by the tools locally to each developer. For example, a developer's whitespace style is applied to the code base upon "get", does not show up as a "diff", and is removed as part of a "push".

The imperfect solution is the best we have today. You can get plug-ins into IntelliJ or Eclipse that will show you violations in "real time" so that you can take instant corrective actions. Unfortunately there is no comprehensive code reformatting tool to manage style violations. Many of them are amenable to automated fixes, but not all of them.

We have some ways to go before we can have our cake and eat it too. Maybe it will happen when we have spent enough expensive developer time chasing down that stray space in our code-under-review.

No comments: