Wednesday, May 29, 2019

The Easiest Developer Productivity Measure of All - Not!


It's 2019 and I can't believe I'm in this position again. I ran across an organization that was trying to find a way to measure the productivity of a team of developers. Their insistence on using lines-of-code (LOC) brought back some long-suppressed PTSD from years ago. Before diving into that subject, let's take a minute to get oriented on the problems of using productivity performance metrics, as laid out in "Against metrics: how measuring performance by numbers backfires".

It's only a five-minute read. I'll wait...


Hopefully your unease has increased. So, let's get right into it.

LOC was probably the earliest attempt at measuring developer productivity. The concept was simple and was borrowed from operations research in the domain of factory production. In such a setting, you might try to measure the productivity of an assembly-line worker by counting how many widgets they processed - and that output widget pile just keeps on growing - go, team.

It's a simple hop to apply this in the software world. Developers produce value by writing code. The more the better. A developer who writes a lot of code is clearly better than a developer who writes less code.

If you are trying to measure a developer's or team's bottom line dollar impact to business, and you are looking covetously at LOC, then let me try to talk you down off that ledge by asking some questions.

* What do you want to use this productivity measure for?
* Are you trying to compare the delivered value among individual developers?
* Are you comparing delivered value of different teams?
* Are you deciding if a project delivered enough value to the business?

You can see from the abstract references to value in the above questions that LOC as a productivity measure is not a good fit. The LOC metric is very, very narrow and carries the following false-isms (and with a few minutes more thought, this list could be extended).

* Non-coding tasks have no value
* Reducing risk via spikes has no value
* Design and architecture discussions have no value
* Concise coding has less value
* Taking more time and getting fewer bugs has less value
* Writing internal documentation has no value
* Performance testing has no value

As alluded to in the cited "backfire" article, your organization may morph under the pressure of productivity measures. Specifically, developers will gravitate to having "higher productivity" aka more LOC, so be careful what you incentivize. The absurdity of the situation is captured nicely in one of my all-time favorite Dilbert cartoons. (Yes, you can get this on a mug to carry your coffee around every workday!)


So, you still insist on using LOC as a metric?

Let's look more closely at the code. What lines do we want to count? Configuration lines like XML? What if those XML lines actually replace code, as in the case of dependency injection? Data files like JSON? What if that is a declarative representation replacing ad hoc coding? Generated code? Should you be counting the sources of that generation instead? You can see that the definition of what counts as code has moved beyond the traditional, simplistic one.

Is a line-of-code a LOC?

Do comments have any value? Ha, ha, yes, it is a trick question. What about comparing across languages? A given bit of logic can be expressed concisely in some languages and more verbosely in others, so how does that affect things? What about lines of test code versus lines of production code? What about test code density? (Spock/Groovy is about 5 times denser than JUnit/Java.)

How do you account for the lifespan of a LOC?

Refactoring can result in negative productivity. One of my long-time assertions is that one of the most valuable things a developer can do is to throw away code. An example would be a developer reducing the damage of "copy-paste-itis" by coalescing the duplicates into a single function, which reduces testing and maintenance costs. Also, how do you handle the splitting and merging of repositories? The effect of this can be very difficult to track against LOC metrics.
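To make that concrete, here is a toy sketch (all names hypothetical): the same validation pasted into two methods collapses into one helper, shrinking the line count while shrinking the testing surface.

// Before: the same check pasted into both methods (duplicated lines).
class OrdersBefore {
  void create(long amountCents) {
    if (amountCents <= 0) {
      throw new IllegalArgumentException("amount must be positive");
    }
    // ... create the order
  }
  void adjust(long amountCents) {
    if (amountCents <= 0) {
      throw new IllegalArgumentException("amount must be positive");
    }
    // ... adjust the order
  }
}

// After: one helper - a net loss of lines, and one place to test and maintain.
class OrdersAfter {
  private static void requirePositive(long amountCents) {
    if (amountCents <= 0) {
      throw new IllegalArgumentException("amount must be positive");
    }
  }
  void create(long amountCents) { requirePositive(amountCents); /* ... */ }
  void adjust(long amountCents) { requirePositive(amountCents); /* ... */ }
}

By a naive LOC metric, the "after" developer just destroyed value.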

What about other measures?

In an agile environment, the team is refocused on the moving "business value" target at the start of each sprint. The sprint stories are vetted by the product owner, who should have marketing's estimated value of each feature. This is a very direct way to assess value. So what if you don't have access to that? Well, you could use the "T-Shirt Sizing" estimates as an abstract measure. If that isn't available, you might use stories/iteration or points/iteration, with the caution that the definition of a point will vary among teams.

Other Resources

You can find Martin Fowler's take on using LOC here... https://martinfowler.com/bliki/CannotMeasureProductivity.html
There is a good article on agile metrics for productivity here... https://www.business.com/articles/agile-team-productivity-metrics/

That's it for now - hopefully you are no longer on that ledge.

Sunday, June 24, 2018

Hercules in the Age of Agility

Intro


I have been in the software development business for well over twenty-five years and have seen many successful and unsuccessful projects. The past ten years have been spent on teams following Agile principles to varying degrees. The teams have always been composed of different levels of experience and expertise, but I have been very fortunate to have teams heavily populated not only with experienced developers, but with open-minded ones.

On these teams, I have seen patterns of behavior that, on the surface, seem beneficial to the team, the product, and the company, but in reality come with a lot of unrecognized costs. These costs are unmanaged; they are not planned for; they are not quantified. Oftentimes they are never historically reconciled.

There is one particular behavioral offender in this developer ecosystem and it might surprise you. I'm talking about the hero developer.

Hercules


Meet Hercules. He is a member of our mythical development team. Hercules fits pretty well into the team. He communicates reasonably. He is a competent developer. He gets his work done consistently. But where he really shines is when there are looming, seemingly impossible deadlines. He is the one who stays up into the wee hours, throwing himself across the finish line just in time.

He is the hero. He is the one who gets the adoration of managers and that good ole back slap with an "atta-boy". Honestly, he really loves the attention and approval from his manager and the awe of the junior developers who hope to grow up to be like Hercules someday.

Managers see the deeds of Hercules and know that this is how successful software is created. If only they had a bunch of developers just like Hercules, their software problems would be solved.

A Deeper Analysis


First, in the context of Hercules, let's acknowledge that being dedicated to your team and working hard are assets, and that these assets are not found in all developers. But when we look at Herculean deeds from an Agile point of view, we start to see issues that can undo all of the gains that came from the heroics of Hercules.

Hercules specializes in cranking out code in a short amount of time. Sometimes it seems the more pressure, the better. Hercules knows that code. Due to time pressure, he didn't get to do a lot of the unit or integration tests, but through sheer ability to focus, he delivered today's functionality, and that's what's important, right?

Hercules sure knows that code, but the other developers don't. The agile developer wants group code ownership. They want group input on design choices. They want the most well-thought-out, flexible code that the company and project can afford. The agile development team wants to de-silo the implementation, meaning that multiple developers are familiar with the code so future changes occur at a reasonable cost. But given the single-source effort, the odds are against this goal.

A good hero sees heroics as an aberration - something to be avoided if possible. But I have seen cases where heroes use heroics as a path to job security. They hoard information by being the sole custodians of said code, such that others recognize their area of expertise. They want to make sure there is a large, painful hole in the development team should they ever leave. Naturally, job security like this is good for them and really bad for the company.

Another issue involves scaling the team. If you have a Hercules, it is almost impossible to clone him/her. You should be able to find developers, but if you need one to fit into a chaotic, barely-managed environment, you need another Hercules, and they are both harder to find and cost more money.

The final drawback is a bit more subtle. Tom DeMarco's book "Slack" (https://www.amazon.com/Slack-Getting-Burnout-Busywork-Efficiency/dp/0767907698) does a good job of discussing the issue. In a nutshell, if heroics are your normal means of delivering software, your heroes are >100% busy - and that means time for creativity and giving back (like mentoring) falls by the wayside. This cost is never recognized by management but can have a very large impact on the business over time.

Parting Advice


Hire as many heroes as you can and avoid heroics at all costs.

Tuesday, June 5, 2018

The Last Measurable Ounce of Quality Can Be Expensive



A Brief Lexicon of the Software Quality Landscape


The quality of a software product is a multi-dimensional measurement, spanning such things as functionality, correctness, performance, documentation, ease of use, flexibility, maintainability, and many others. Many of these qualities are difficult to measure, are difficult to see, and hence are difficult to manage. The result is that they are ignored by all but the most enlightened in management.

The topic of interest for this screed is correctness. This is not correctness in the end-user sense of meeting requirements; it means "is the code doing what we think it should be doing." Since we don't generally have provably correct programs, it is a matter of convincing ourselves, through a lack of contrary evidence, that our programs work as we'd like. This precarious situation is neatly captured by Edsger Dijkstra's observation that program testing can show the presence of bugs, but never their absence.

So we have several levels of ways to convince ourselves of correctness. They are, from most detailed to most abstract: unit testing, integration testing, functional testing, and system testing. Note that there is no industry-wide agreement on these exact terms, but the general concepts are recognized.

In typical object-oriented designs, unit testing involves isolating a given class, driving its state and/or behavior, and verifying that we see what we expect. Integration testing is a layer above that, where we use multiple classes in the tests. Functional testing sits above that, where we deploy our programs in the natural components they would inhabit in production, like a server or a process. Finally, system testing covers the full, production-like environment.

Unit Testing


Our focus will be on the lowest level, namely unit testing. The expectation is that unit tests are both numerous (think many hundreds or even thousands) and extremely fast (think milliseconds). To be effective, these tests should run on every single compile on every developer's machine across the organization. The goal is that the unit tests precisely capture the design intent behind the implementation of the class code, and that any violation of those intents results in immediate feedback to the developer making code changes.
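As a minimal sketch of what that looks like (the class under test is hypothetical), such a test is plain JUnit, touches no file system or network, and runs in milliseconds:

import org.junit.Test;
import static org.junit.Assert.assertEquals;

public class PriceCalculatorTest {
  // Hypothetical class under test: pure logic, no I/O, so the test stays fast.
  static class PriceCalculator {
    long discountedCents(long cents, int percentOff) {
      return cents - (cents * percentOff) / 100;
    }
  }

  @Test
  public void appliesPercentageDiscount() {
    assertEquals(9000, new PriceCalculator().discountedCents(10000, 10));
  }

  @Test
  public void zeroDiscountLeavesPriceUnchanged() {
    assertEquals(10000, new PriceCalculator().discountedCents(10000, 0));
  }
}

Hundreds of tests in this style add only seconds to a compile cycle.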

I'd like to tell you that every developer is doggedly focused on both the quality of the production logic and the thoroughness of the unit tests that back that logic. But through a combination of poor training, lack of emphasis at the management level, and just plain laziness, developers produce tests that range from great all the way down to downright destructive (more on that in another blog entry). One of the easiest ways to externally track this testing is through code coverage.

Code Coverage


Code coverage is a set of metrics that give developers and other project stakeholders a sense of how much of the production logic is exercised by the unit tests. The simplest metric is "covered lines of code", aka line coverage. This is usually a percentage: if a class has 50 lines of code and 60% line coverage, then 30 of those lines are executed as part of running the unit tests for that class. There are other coverage metrics that can help you gauge the goodness of your tests, like branch coverage, class coverage, and method coverage. But here we will focus on line coverage, since it is the most widely used.
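To make the arithmetic concrete, here is a toy method (hypothetical names). If the test suite only ever calls normalize("abc"), three of the five executable lines run, which a coverage tool would report as 60% line coverage:

public class Normalizer {
  public String normalize(String s) {
    if (s == null) {                // covered
      return "";                    // NOT covered: no test passes null
    }
    if (s.isEmpty()) {              // covered
      return s;                     // NOT covered: no test passes ""
    }
    return s.trim().toLowerCase();  // covered
  }
}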

The general, common-sense assumption is that "more is better", so misguided management and deranged architects insist on 100% code coverage, thinking that would give maximum confidence that the quality of the code is high. If we had an infinite amount of time and money to spend on projects, that might be the optimum. Since that luxury has never existed in the last 4 billion years, we have to spend our money wisely. And this changes things drastically.

The truth is that it might cost M dollars+time to achieve, say, 80% line coverage, but it might take M *more* dollars+time to get that last 20%. In some cases, the last few percentage points can be extremely expensive. The reasons for this non-linear cost are several.

First, production logic should be tested through its public interface where possible, rather than through a protected or private interface. It can be laborious to construct the conditions necessary to hit a line of code buried in try/catches and conditional logic behind public interfaces. This cost can be lowered by refactoring the code towards better testability, but that is a continuous struggle as new code is produced. There is a truism among veteran developers that increasing the testability of production logic improves its design.
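Here is a hedged sketch of that refactoring (all names hypothetical): the interesting logic starts out buried behind I/O and exception handling, and extracting it into a pure method lets a unit test drive every case with plain strings.

import java.io.File;
import java.nio.file.Files;

public class ConfigLoader {
  // Before the refactoring, covering the parsing logic meant creating real
  // files and forcing real exceptions through this method.
  public int loadPort(File f) {
    try {
      String text = new String(Files.readAllBytes(f.toPath()));
      return parsePort(text);
    } catch (Exception e) {
      return 8080; // fall back to a default port
    }
  }

  // After: the logic is a pure, package-visible method that tests call directly.
  static int parsePort(String text) {
    String value = text.trim();
    return value.isEmpty() ? 8080 : Integer.parseInt(value);
  }
}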

Second, some code has high cyclomatic complexity. Arguably this code should be refactored, but projects do carry a certain percentage of high-complexity code forward from sprint to sprint.

The third reason is a bit technical. Java source is compiled into byte code, and the coverage tools run off an analysis of the byte code, not the source code. The Java compiler consumes the source and may emit byte code containing extra logic, meaning branches with no counterpart in the source. It might not be possible to control the conditions that would take one path or the other through such an invisible branch. Further complicating this, the invisible logic can change from one Java compiler release to the next, putting a burden on the test logic to reverse engineer the conditions needed to cover it.
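One well-known instance (a hedged illustration; the exact byte code varies by compiler release) is a switch on a String. The compiler rewrites it as a hashCode lookup guarded by equals() calls, and the branch where the hash matches but equals() fails exists only in the byte code, so a coverage tool may report a missed branch that no reasonable test input can reach:

public class Colors {
  // Compiles to roughly:
  //   switch (color.hashCode()) { case /* hash of "red" */: if (color.equals("red")) ... }
  // The hash-collision branch is invisible in this source.
  public static int ordinal(String color) {
    switch (color) {
      case "red":   return 0;
      case "green": return 1;
      default:      return -1;
    }
  }
}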

Summary


Based on the above discussion, achieving 100% line coverage can be very expensive. On teams that I have worked on over the years, a reasonable line coverage has been 70% or more. But you should let the development team determine this limit. If you force your teams to get to 100% line coverage, you are spending money that might be better spent on automated tests. In addition, I have seen cases where developers short-circuit the unit tests by writing tests only for the purpose of increasing coverage. You can readily identify these tests because they contain no assertion or verification check - they just make a call and never check the result.
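The difference is easy to see in a toy sketch (hypothetical names): the first test below inflates the coverage number but can never fail; the second makes the same call and actually checks the result.

import org.junit.Test;
import static org.junit.Assert.assertEquals;

public class TaxTest {
  static long tax(long cents) { return cents / 10; } // hypothetical code under test

  @Test
  public void gamesTheCoverage() {
    tax(1000); // the line gets "covered", but a wrong answer would still pass
  }

  @Test
  public void actuallyVerifies() {
    assertEquals(100, tax(1000)); // fails if the logic regresses
  }
}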

In short, you should be careful what you ask for. Make sure you interact with the development team in making the decision about code coverage. Spending another 50% of scarce testing dollars on that last 10% coverage is unlikely to bring a return on investment.

Sunday, May 21, 2017

Good Reasons to Dockerize Builds

The Problem

On my current team, we use a common tool stack of Jenkins, Maven, Robot Framework, and Java. Our build system is centralized with a couple of dozen projects. We have suffered build and developer friction, namely
  • Our version of Jenkins is older than the Internet and cannot be easily upgraded
  • Upgrading a plugin/tool risks incompatibility with other plugins/tools
  • We have accumulated snippets of shell scripts embedded in POM files
  • Developers sometimes use our continuous integration (CI) system to compile and test code
  • We sometimes have trouble with local builds passing and CI builds failing
These and more are the Usual Suspects on almost any build system that has grown over time. Having been on this team for around two years now, I have identified a direction we should take to modernize our build system.

A Solution


Docker and containerized builds have been around since at least 2013. There are some compelling reasons to consider using containerized builds. This is my attempt to clearly enumerate the advantages and to convince my chain-of-command to make the jump. Hopefully this will help any readers to do the same.

Moving to containers means we can deploy onto a cloud with minimal work. This can address scaling issues effectively. Note that some builds will still depend on lab access to local devices, and these dependencies will not scale.

Containerizing the build pipeline means easier upgrading. For example, running a component in its own container isolates it, so other containers that depend on it are forced to interact through a standard, explicit interface.

Containerizing the build means better partitioning. Instead of making environments that contain all possible tools and libraries, a container includes only those needed for its specific purpose. This has the side effect of reducing defects due to interactions among 3rd-party components.

Containerizing the build means a clean reset state. Instead of writing reset scripts, the container is discarded and resurrected from a fixed image. This is a phoenix (burn-it-down), immutable view of containers, and it forces all build and configuration to be explicit (nothing accumulates in snowflake instances).

Containerizing means 100% consistency between local development and the build system, which should eliminate the but-it-works-on-my-machine problem.
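As a minimal sketch of how that consistency falls out (the image tag and layout are assumptions, not our actual setup), the entire tool chain is pinned inside the image, so a developer's laptop and the CI server run the identical JDK and Maven:

# Hypothetical build image for a Maven/Java project.
FROM maven:3-jdk-8

WORKDIR /build
# Copy the POM first so the downloaded-dependency layer is cached between builds.
COPY pom.xml .
RUN mvn -B dependency:go-offline

COPY src ./src
RUN mvn -B verify

Anyone with Docker then builds the same way (e.g. docker build -t myproject-build .), with no locally installed Maven or JDK involved.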

Containerizing the build means effective postmortems for build failures, potentially leaving failed runs in the exact state of failure, rather than relying solely on extracted log files.

Containerizing the build means building and installing an intermediate, shared artifact once, instead of 52 times, potentially speeding up the build cycle.

Containerizing the build means that some tests can make temporary checkpoints via a temporary image/container and roll back to that state rather than tearing down and rebuilding, affording a quicker build.

Judicious use of containers might help with diagnosing hard-to-reproduce issues in the field. I have seen instances of technical support sending/receiving VM images to/from customers. Containers would be both simpler and potentially a lot smaller.

Containerizing the build is considered a modern best practice and affords access to all kinds of build workflow and management tools.

That's it! Good luck convincing your management when the time comes to modernize your build systems.

Monday, April 3, 2017

On CheckStyle and Coding Standards

Recently I was given the opportunity to work on an open source project. My team had developed a new feature and we were eager to get it pushed upstream before a code freeze milestone. Our team worked closed source on a proprietary project, so this effort was essentially to push a portion of it upstream as open source.

We decided to do three pushes due to various time and resource constraints: push the basic feature, then an enhancement, then some automated integration tests. Our team was not familiar with the process or requirements of the upstream project, but two-thirds of the code was written under contract to us by a former major committer to that upstream project.

The push of the basic feature was rougher than I expected, given the authorship of much of the code. Most of the issues were stylistic, and our reliance on the contractor to "do the right thing" had been a bit shaky. After some corrective actions, the first merge happened. Based on code review issues raised on the first push, I revised the second batch of code and submitted a pull request. As expected, there were a few minor issues raised, which I corrected.

So this project has checkstyle enabled at the warning level. But the build is so full of noise and warnings (deserving of another rant on its own) that developers tend to ignore them, especially warnings about things like whitespace, which checkstyle emits in abundance.

So I submitted the pull request and sort of expected an approval and merge. This was highly desirable because the code freeze deadline was only a couple of days away. Instead what I got was a request to move the checkstyle setting from warning level (the project default) to error level.

This was a stunning request for several reasons.

* First, making the code freeze deadline was now unlikely.

* Second, this would be a new requirement on our code that was not met by any of the hundreds of thousands of lines of code that already existed. Plus our previous commit was accepted without that obligation.

* Third, the time to enable something like checkstyle is day zero of a project. Turn it on then and let issues be fixed as part of every single commit. If you have never turned an analyzer like checkstyle loose on an unsuspecting code base, the results are dismal. Our commit was probably a few thousand lines of code, but checkstyle emitted about 250 "errors" - such gems as "there is a space between your method name and the parameter list" (a toy example follows this list). Manually fixing all of these would be tedious and expensive - probably several days of work.
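To give a flavor of those 250 "errors", here is a toy example of the whitespace class of violation (assuming the stock MethodParamPad check is enabled):

public class Widget {
  // checkstyle flags the space between the method name and '(':
  public void resize (int width) { // MethodParamPad violation
    // ...
  }
}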

There were a half-dozen "functional" issues, like "unused import" and "catching naked exceptions", that were entirely reasonable to be asked to fix. Those functional fixes were done, and the new pull request has been awaiting approval for three days now. I guess in open source, not everyone has a sense of urgency. Complicating this, your reviewer might actually work for a competing company, so politics might intrude.

Some observations...

I wanted to point out an issue of open versus closed source. In the past, on smaller closed source teams, we never enforced cosmetic coding standards. Everyone's slightly different coding style acted as a fingerprint, so you could tell without looking who the developer was. But in an open source context, where there can be large numbers of developers, it makes a lot of sense to have strict coding standards, and enforcement of those standards should be enabled by default.

Readability is subjective. The wars over bracket placement, and other whitespace issues, happen all the time and clearly have no real impact on the business. To me, a perfect solution would be for whitespace to always be managed by the tools, locally to each developer. For example, a developer's whitespace style would be applied to the code base upon "get", would not show up as a "diff", and would be removed as part of a "push".

The imperfect solution is the best we have today. You can get plug-ins for IntelliJ or Eclipse that show violations in "real time" so that you can take instant corrective action. Unfortunately, there is no comprehensive code reformatting tool to manage style violations; many violations are amenable to automated fixes, but not all of them.

We have some ways to go before we can have our cake and eat it too. Maybe it will happen when we have spent enough expensive developer time chasing down that stray space in our code-under-review.

Monday, March 27, 2017

A New Gerrit User

Recently I had to commit work upstream to the public Open Daylight project. As part of that effort, I had to learn some new tools, particularly Gerrit. This is a code review/management tool that works with Git, which I was already familiar with.

Here are some ad hoc things I learned as part of committing work upstream...

Gerrit has a different workflow than you are likely using in downstream coding with other tools. There is a general tutorial and an ODL coding introduction to working with Gerrit.

We followed our own coding guidelines downstream, and those did not match the Google coding conventions used upstream. Differences we ran into were:

• Lower-case method names, even for JUnit test methods (I won't push back with Spock conventions)
• Using the most restrictive visibility for access qualifiers (I won't push back with the philosophy of library design)
• Use Java 8 lambdas everywhere possible

Gerrit (plugin) has some non-obvious controls, namely that you can kick off another build by putting "recheck" as your comment. Others are "rerun" and "verify".

Upstream coders usually add the prefix "WIP: " to their bug report to let other developers know things are not ready for prime time reviews yet.

Mechanically, you can review the issues by using the "Diff against:" drop-down list to pick a revision to start at, then going into the first file and using the green upper-right arrows to navigate between files. (It'd be nice if Gerrit had a "next feedback item" navigation.)

The master branch on ODL can and does break, causing lots of down time with broken builds (days even).

Full builds take 10 hours and verifying-before-pushing is an evolving process.

If you are using "git review" and you forgot the "--amend" option on your commit, you will be prompted to squash and will end up with a new branch. You can recover using "git rebase -i HEAD~2" then using the force option and then abandoning the new branch.
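A hedged sketch of that recovery (the branch name is a placeholder):

# You forgot --amend, so a second commit (and a stray branch) appeared.
git rebase -i HEAD~2       # mark the newer commit as "fixup" to squash it back in
git review                 # re-push; confirm (force) when prompted
git branch -D stray-branch # abandon the accidental branch (placeholder name)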

Along with master breaking, master can also produce stale artifacts, so don't assume that clearing your m2 cache, creating a subproject repo and building will give you up-to-date code.

You can search jar files using find . -type f -name '*.jar' -print0 | xargs -0 -I '{}' sh -c 'jar tf {} | grep "YourClassHere" && echo {}' to help verify your m2 has the latest.

The patch workflow is not very good at all for having multiple people working on an item; traditional git branches and management are superior in this respect so expect some pain.

If your patch won't build because master is broken, you can rebase it to pick up a fix temporarily. When you do a "git review" to push it, you will be prompted because there are two commits (your new patch revision and the fix). You do not want to merge that fix into your patch.

Hopefully, some of these tips will find their way to helping someone else along the way.

Saturday, November 19, 2016

The Cost of Separation of Concerns

One of the things I enjoy doing is looking at other developers' code and dissecting it. You can find a lot of advice on how to refactor code (including this blog). One that caught my eye recently was this one. It shows a small C# example and how to refactor/improve it.

Go ahead and read it. It'll take 5 minutes.

I'll wait.

So I agree that those changes were probably improvements to the code. But when I see code, I see layers and coupling and dependencies. So I thought I'd post a refactoring here that is very different from what you'll see in most places.

First, the original code...


 public class Invoice  
 {  
   public long Amount { get; set; }  
   public DateTime InvoiceDate { get; set; }  
   public void Add()  
   {  
     try  
     {  
       // Code for adding invoice  
       // Once Invoice has been added , send mail   
       MailMessage mailMessage = new MailMessage  
 ("MailAddressFrom","MailAddressTo","MailSubject","MailBody");  
       this.SendEmail(mailMessage);  
     }  
     catch (Exception ex)  
     {  
        System.IO.File.WriteAllText(@"c:\Error.txt", ex.ToString());  
     }  
   }  
   public void Delete()  
   {  
     try  
     {  
       // Code for Delete invoice  
     }  
     catch (Exception ex)  
     {  
       System.IO.File.WriteAllText(@"c:\Error.txt", ex.ToString());  
     }  
   }  
   public void SendEmail(MailMessage mailMessage)  
   {  
     try  
     {  
       // Code for getting Email setting and send invoice mail  
     }  
     catch (Exception ex)  
     {  
       System.IO.File.WriteAllText(@"c:\Error.txt", ex.ToString());  
     }  
   }  
 }  

And now the original refactored code...


 public class Invoice  
 {  
   public long Amount { get; set; }  
   public DateTime InvoiceDate { get; set; }  
   private FileLogger fileLogger;  
   private MailSender mailSender;  
   public Invoice()  
   {  
     fileLogger = new FileLogger();  
     mailSender = new MailSender();  
   }  
   public void Add()  
   {  
     try  
     {  
       fileLogger.Info("Add method Start");  
       // Code for adding invoice  
       // Once Invoice has been added , send mail   
       mailSender.From = "rakesh.girase@thedigitalgroup.net";  
       mailSender.To = "customers@digitalgroup.com";  
       mailSender.Subject = "TestMail";  
       mailSender.Body = "This is a text mail";  
       mailSender.SendEmail();  
     }  
     catch (Exception ex)  
     {  
       fileLogger.Error("Error while Adding Invoice", ex);  
     }  
   }  
   public void Delete()  
   {  
     try  
     {  
       fileLogger.Info("Add Delete Start");  
       // Code for Delete invoice  
     }  
     catch (Exception ex)  
     {  
       fileLogger.Error("Error while Deleting Invoice", ex);  
     }  
   }  
 }  
 public interface ILogger  
 {  
   void Info(string info);  
   void Debug(string info);  
   void Error(string message, Exception ex);  
 }  
 public class FileLogger : ILogger  
 {  
   public FileLogger()  
   {  
     // Code for initialization i.e. Creating Log file with specified   
     // details  
   }  
   public void Info(string info)  
   {  
     // Code for writing details into text file   
   }  
   public void Debug(string info)  
   {  
     // Code for writing debug information into text file   
   }  
   public void Error(string message, Exception ex)  
   {  
     // Code for writing Error with message and exception detail  
   }  
 }  
 public class MailSender  
 {  
   public string From { get; set; }  
   public string To { get; set; }  
   public string Subject { get; set; }  
   public string Body { get; set; }  
   public void SendEmail()  
   {  
     // Code for sending mail  
   }  
 }  

So here is my refactoring. You will see that it is very, very different from the original suggested improvement.

// Minimal Java stand-ins for the FileLogger and MailSender classes from the
// C# example above, so this sketch is self-contained.
class FileLogger
{
  void Info(String info) { /* write info to the log file */ }
  void Error(String message, Exception ex) { /* write error and exception detail */ }
}
class MailSender
{
  String From, To, Subject, Body;
  void SendEmail() { /* send the mail */ }
}
interface Invoice
{
  void Add();
  void Delete();
}
interface BetterInvoice extends Invoice
{
  void Error(String msg, Exception e);
}
class InvoiceBasics implements Invoice
{
  public void Add()
  {
    // code for adding invoice
  }
  public void Delete()
  {
    // code for deleting invoice
  }
}
// Decorator: adds logging around any Invoice, and exposes Error() for the
// error-flow decorator to call.
class InvoiceLogging implements BetterInvoice
{
  private final Invoice inner;
  private final FileLogger fileLogger = new FileLogger();

  InvoiceLogging(Invoice inner)
  {
    this.inner = inner;
  }
  public void Add()
  {
    fileLogger.Info("Add method Start");
    inner.Add();
  }
  public void Delete()
  {
    fileLogger.Info("Delete method Start");
    inner.Delete();
  }
  public void Error(String msg, Exception e)
  {
    fileLogger.Error(msg, e);
  }
}
// Decorator: isolates the try/catch error flow.
class InvoiceErrorFlow implements Invoice
{
  private final BetterInvoice inner;

  InvoiceErrorFlow(BetterInvoice inner)
  {
    this.inner = inner;
  }
  public void Add()
  {
    try
    {
      inner.Add();
    }
    catch (Exception ex)
    {
      inner.Error("Error while Adding Invoice", ex);
    }
  }
  public void Delete()
  {
    try
    {
      inner.Delete();
    }
    catch (Exception ex)
    {
      inner.Error("Error while Deleting Invoice", ex);
    }
  }
}
// Decorator: sends the notification mail after delegating Add.
class InvoiceMailer implements Invoice
{
  private final Invoice inner;
  private final MailSender mailSender = new MailSender();

  InvoiceMailer(Invoice inner)
  {
    this.inner = inner;
  }
  public void Add()
  {
    inner.Add();
    sendMail();
  }
  private void sendMail()
  {
    mailSender.From = "rakesh.girase@thedigitalgroup.net";
    mailSender.To = "customers@digitalgroup.com";
    mailSender.Subject = "TestMail";
    mailSender.Body = "This is a text mail";
    mailSender.SendEmail();
  }
  public void Delete()
  {
    inner.Delete();
  }
}
class InvoiceBuilder
{
  public Invoice create()
  {
    return new InvoiceMailer(
      new InvoiceErrorFlow(
        new InvoiceLogging(
          new InvoiceBasics())));
  }
}
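Usage then collapses to composing the chain once; a quick sketch:

public class Main {
  public static void main(String[] args) {
    Invoice invoice = new InvoiceBuilder().create();
    invoice.Add();    // logs, runs the real add inside a try/catch, then mails
    invoice.Delete();
  }
}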

I bet you weren't expecting that. I have been a developer for over 30 years, and it has taken some time to see software the way I do. So let's evaluate this. Is the code better? It depends.
  • The code above has separation of logging, error flow, and notification and emphasizes composition instead of inheritance.
  • The main business logic (found in InvoiceBasics) is very simple and clean.
  • More functionality can be added without disturbing everything else.
  • It is far more testable than the original.
  • It conforms more to the Single Responsibility Principle than the original.
But there is a cost. There are more classes (that each do less). It is probably easier to understand what happens in the original code, whereas you'll have to think a bit harder about the refactored version. It took more time to write than the original refactoring, but you gain testability, flexibility, etc. So you have to take all of this into account. You, as the designer, have to decide how much money to spend on this code and how long you expect it to live. It is not always easy to decide.