how testing can improve legacy code design

There’s no shortage of articles on how TDD improves the design of new code.  That’s all well and good.  But what about legacy code?

How it came up

This weekend, I had occasion to make a few enhancements to the email sending project at JavaRanch.  The one that got me thinking about the design was when I needed to add some logic to filter the e-mail list.  I was trying to allow specifiying the start and end index so we could resend to just part of the list if the process failed.  After all, you don’t want people getting two copies.

What I did

As the current filtering logic was in the BulkMailerProcess class, I decided to start there.  This class has a bunch of dependencies and isn’t currently tested.  Hello legacy code!  My first thought was that I would make my new filtering method package private and test that.  As I set out to write the test, I realized I needed to get access to the instance variable containing the list of e-mails.  Ugh.  This made me cringe enough to think about an alternate direction.

My next thought was that I really have a separate concept here.  I’m filtering e-mails.  So it one of the existing methods in BulkMailerProcess.  (It makes sure the e-mails are properly formed.)  Time to create a new class.  At this point it was easy.  My new class EmailFilter takes a list of e-mails and runs both filtering/cleaning operations on them.  It’s very focused and gets all that logic out of the main processing class.  I feel like I left the code cleaner than I found it.

The result

It certainly is more tested.  The method to clean the e-mails wasn’t originally tested since it was so embedded in everything else.  Now it is.  In fact now it is tested at 100%.  Without looking at the implementation of the method I copied in, I did write one test to verify the method gets called and at least does what it sounds like.  With this system I got to 96% coverage on the new EmailFilter class.  I could easily get to 100%, but that’s for another day.  The goal here isn’t to be perfect.  It’s to leave things better than they started out.

I left a comment in the code so nobody thinks it is more tested than it is.

/**
* Check valid format for email. This logic was moved verbatim from
* BulkMailerProcess class. Do not change without adding more detailed unit
* tests to verify behavior.
*/

What’s next

The only thing I’m less than happy with is the name of the class.  I started with EmailListUtil because I didn’t have a better name.  Then I changed it to EmailFilter.  But it’s not just a filter.  It’s doing validation too for cleaning the invalid e-mails from the list before filtering by index.  I’ll have to think about the name some more.

The other next step is to improve things a little more next time I touch the code.  I already have some ideas.  The key is to not attempt too much at once.  That would be overwhelming.  A little at a time makes it doable.