Why you shouldn't be using Code Climate's automated refactoring

Written by on

Code Climate released its long-awaited automated refactoring tool today. Our team has been really excited and ripe with anticipation for months now. We put a premium on refactoring bad code, so this is something that's right up our alley.

I won't post any screenshots to spoil the surprise, but I am here to say you shouldn't be using Code Climate's automated refactoring tool.

Here are my top 10 reasons why:

  1. We tried to refactor a god object, but the Code Climate Refactoring Bot simply stopped believing in it.
     

  2. The Code Climate Refactoring Bot is a junior developer in regards to git. Our code base is now littered with vague commit messages like "Deleted".
     

  3. You'll miss out on the fun of shotgun surgery.
     

  4. You won't be able to name-drop "loosely-coupled" anymore.
     

  5. Red commits are good, but in this case, it just broke our code base.
     

  6. We'd like to see better branch names from the bot. You should only refer to magic when talking to your clients.
     

  7. No CoffeeScript support yet.
     

  8. We started to miss face-to-face time in our daily refactoring meetings. Plus, usually someone brings donuts. We can't get any work done when we're apart.
     

  9. I could no longer drive our team into the ground by yelling like a drill sergeant, "Refactor! Refactor! Refactor!"
     

  10. None of the above is true and you should read their blog post about the automated refactoring tool.

 

The importance of playing around

Not everything you do has to be for a purpose.

Sometimes, it's ok to play around. It can even be helpful in the long run. You might not know why it's going to be helpful, or when, but it will come back to you - I promise.

Your life is built on past experience. What you do each day shapes your brain into what it will be tomorrow. As an engineer, you have inspiration all around you. If you read Hacker News or other blogs, it's almost a guarantee you find inspiration on a daily basis. If you walk around your apartment or clear your mind in the shower, you experience moments where your mind is free from distraction and the weight of your ever-present todo list.

That's why playing around is so powerful. Try it sometime. It's a trick to get you inspired or get you in the zone. Here are a few examples where playing around has helped our team:

Bruno, one of our developers, has been playing around with Chef.

Bruno wanted to dive into DevOps and learn Chef. EngineYard uses Chef, so we need to learn it as we go to make tweaks and setup deployment tasks. However, our team lacks quite a bit of knowledge in the Chef department (you know, setting up your own Rails application servers on AWS or Digital Ocean).

Bruno has been playing around with Chef for the last few weeks. In our weekly engineering meetings (I'll have to write a blog post about these), Bruno has been showing us what he's done. The first week he showed us how he spins up a Digital Ocean server via the command line. The next week he showed us the tools he was using to provision the server and setup all the necessary components (nginx, uwf, ssh, etc.). And just last week, he showed us how he setup Capistrano to complete the create, provision, deployment path.

We've also been wanting to build a pair programming gem (a one command pair-programming setup in vim and tmux for two people in different countries). As we were talking through his Chef experiments, we realized that Bruno had already built half of the gem. Part of the gem would provision a $5 Digital Ocean server to serve as a "meeting point" between the two remote developers.

Without somewhat playing around with Chef, we would still be without much progress on our gem.

In my free time, I've been researching Rails 4 and upgrading Tula at the same time.

We set an engineering goal for the year to move Tula to Rails 4. We'd like to take advantage of the Live Controller instead of using web sockets among other things. However, we haven't really set time aside to do it - we've had an overload of client work as of late and we've had higher priority items for Tula.

Instead of waiting until we were able to dedicate the time, I started playing around. I didn't really have a goal in mind nor a timeline of when I needed to get it done. I just wanted to see if I could get things working (at least a little bit).

So, I bumped the version of Rails to 4.0.3 to see what would happen. Of course, things started blowing up. I started reading through the Rails 4 upgrade guides and looking through the logs for deprecation warnings. We had quite a few scopes and queries causing problems due to changes in Rails 4. Since we weren't really on a timeline, that was ok - I slowly started fixing major problems.

Within a day or two, I was able to get things somewhat stable. We aren't done with the upgrade yet (I'm still spending just an hour or two at night), but we have a much better idea of where we're at and how much effort it will take. We're actually almost done.

If I hadn't started playing around, we still wouldn't know much about the Rails 4 situation with Tula.


Finally, here are some tips and ideas to think about if you want to try this:

  1. Let your mind roam free: You can play around with anything. Not everything you do has to ship to production.
  2. If you have 5 minutes, you have enough time to play around: Do you find yourself with 5-10 minutes at the end of your day, but you don't want to start your next big task? Play around.
  3. Play around when you're stuck: If your mind is stuck on a problem, free it by playing around. Go refactor something. Do something else. Come back to your problem later.

I'd love to hear how you play around. Find me on Twitter.

The right way to think about TDD

I've been noticing quite a few posts in the developer community about ways to write better code. In particular, the Code Climate blog has been inspiring. A few days ago, Justin Searls inspired me further with his outstanding post about teaching TDD.

The reason Justin's post resonated with me so much was its relevancy. Just a few days earlier, I had exchanged comments with Bryan from Code Climate about the same thing.

Last week, Bruno (an engineer on our team) and I were talking about the community and the themes you extract from paying attention to the conversation within it. If you're a bystander, you'd probably notice the following rather easily:

If you're not following TDD, you're doing it wrong.

People preach TDD as gospel. It's overwhelming at times.

It's so overwhelming that you can wrongly interpret it. A beginner might think they have to do it all the time to learn programming. Nothing could be further from the truth. When you're first starting out, you're just trying to get things to work. Why should we encourage people to stumble through something that doesn't really help them do that?

The same thing is true for senior developers. If you don't practice TDD, you somehow have a letter tattooed on your chest that labels your code as buggy and awful.

What is our goal as software developers? I hope it's not to have the best test suite. You don't get a special medal for that, and it certainly doesn't imply you've designed and maintained a great code base. If I had a choice between a test suite with 100% coverage or a well-designed code base, I'd choose the latter every time. I can quickly write some high level tests to help me make future changes, but it would take me at least a month to undo years of bad design.

This certainly doesn't mean I don't see the value in testing. I'm simply trying to jump on the bandwagon with these guys - testing needs to be in a different part of the conversation, not at the forefront.

That being said, I've noticed a subset of the community changing the topic of the conversation. No longer is it TDD first. Instead, it's design first. It's writing the code you wish you had. If you design first, nice tests usually follow if you know what you're doing. If you're having trouble writing tests (using too many factories, slow running tests), it's an indication you have poor design.

TDD alone won't save you from violations of the single responsibility principle. It won't save you from poor design choices, and it certainly won't save you from costly refactors. With a little work, I can achieve 100% test coverage and end up with a code base that's messy and not maintainable.

It's bandwagon time. TDD isn't the only way. Don't follow TDD blindly.

Our goal should be writing amazing code. You should feel good about it. Things like the single responsibility principle, the open closed principle, and law of demeter are what you should be studying and applying to your work.

If you're writing code that follows these principles, it's mostly simple. Your classes are responsible for one thing. Your churn is low (your classes don't change often). You don't care much about the rest of the domain (your classes care about themselves). Most of your classes are easy to understand and you can re-use them with ease (they're loosely coupled).

Simply following TDD doesn't give you the beauty of great design. It puts a band-aid on the issue.


Here's how we've been operating as a team.

We don't always write tests first because you don't know what you don't know. The goal is to get nicely written, working code and NOT perfect tests and a badge that says you followed TDD.

We do think about testing as we're designing. If we're creating a new class, a spec file is sure to follow.

We focus on great naming and try to use namespacing. Being able to come up with a good name is a sign you've broken down the problem enough. For example, if we wanted to create a class to notify us about comments on this blog, we'd use "Blog::CommentNotifiers::Email" which would notify us by email. It's simple and it will end up doing one thing: sending an email when a new comment is posted on the blog. Using a callback on the "Comment" model would be a poor design decision and require you to opt-out of the behavior whenever you don't need it.

We focus on unit tests. If we have to stub out all kinds of classes and methods (more than 10 or so), we treat it as a code smell and think about refactoring.

We've learned a lot in the past year about writing better code. When we have to write new code, we write code we wish we had. We don't look for the quickest way to fix the problem (although there are always exceptions). We focus on very clear class names and private methods that read in plain-English. We try to avoid obscure method names. We focus on low-level implementation details last - once we have the design complete.

I highly, highly recommended the following resources to assist you in writing better code:

Three principles for naming things

There's a saying most of us developers should be familiar with:

There are two hard things in computer science: cache invalidation, naming things, and off-by-one errors.

Naming is hard.

Over time, I've realized you can make it easier on yourself. If you're a beginner, you can follow these three general principles.

It's never ok to use a variable named "p"

Storage space is cheap. You don't need to worry about using short variable names. It's not a valid excuse.

Stop being lazy. If your variable is holding a project, call it project - don't call it "p". You should be able to come back to your code six months later, look at that line, and immediately know that "p" is a project.

So you say you have an amazing memory? It's still not an excuse. Some poor guy will have to read your code in the future and he'll think "p" stands for prick.

It's never ok to use abbreviations

Writing code is all about asking the question, "If I leave this project and come back in six months, will I know what this means?"

It's a constant battle. Another way to win this battle is to stop using abbreviations. Have you ever tried to read someone else's writing where they use a lot of abbreviations? Isn't it hard? Why would you want to use abbreviations for the next poor guy working on your project?

The exception is reserved for conventions.

For example, the HTML standard uses "src" quite a bit which everyone knows to mean source. A common abbreviation in JavaScript is "evt" which obviously stands for event (if you're familiar with JavaScript).

These are conventions which most developers have agreed upon through code review and years of experience. Don't try to invent conventions simply because you're lazy.

Don't feel bad about using variables with long names

Don't worry about long variable names. Usually, the more descriptive you are, the better.

Here's an example. Suppose you query for archived projects and need to store them in a variable. Use "archived_projects" as a name for the variable.

If you broke the rules above, you might try "ap" or "arch_proj" or "a_projects". It might seem obvious to you now, but how will you know what these mean six months from now? It's not worth it.


These principles might seem obvious to you if you've been a developer for years, but it's something we should all keep in mind as we review code and teach others. These principles aren't obvious to beginners.

If you think back to when you first started, you weren't thinking about maintaining your code. You just wanted to get the damn thing working. However, you reach a point where you need to make it easier for yourself and others.

At the very least, if you learn how to be descriptive with names, you'll help everyone around you.

Rendering collections

Like anything as a developer, there are many ways to accomplish the same task.

While making some updates to code and doing some refactoring, I couldn't help but notice a smell in our logs. I hadn't really thought much about it before, but it caught my eye this time - for whatever reason.

For a code base that's been alive for two+ years, there are quite a few spots that we just haven't felt the need to touch. We've refactored heavily over the past few months, but we haven't touched everything. Our churn is almost 0 in some files for no particular reason.

This particular code smell showed up in our logs. It's a spot where we render a collection with some partials. More specifically, it's a place where we render comments for a particular user. Looking at the code now, it's obvious what we should do to improve it.

For the record, it's a file we haven't changed in over a year. And better yet, the 2nd time it's ever been changed was in late 2011.

Here's what I'm talking about:

Currently, we're using Example 1 to render comments. I've shown the code smell in the logs. My first question: Why do we need to render all of these partials? There's an obvious reason in that we have many comments. But, there must be a better way.

Use the built-in methods for rendering a collection

Most of you probably know that Rails offers a built-in way to render a collection of items with a partial. It's even optimized for you.

In Example 1, Rails doesn't know you're trying to render a collection. It simply does what you ask of it. Loop and render the item each time. It's not smart enough to tell you there's a better way.

In Example 2, you're simply providing a collection and letting Rails iterate. Not only will it iterate for you, it does it in a smart way.

In partial_renderer.rb above, there's a branching case where it checks for a collection you provide. If it sees one, it takes a slightly different approach.

It caches the partial and maps the results of the collection. It's something you can let Rails worry about.

You can see the results in the logs above. In Example 1, it looks up the partial 3 times and renders it. In Example 2, there's only one look up. You can imagine how this gets cumbersome if you're trying to render hundreds of comments.

Example 1 is O(n) in regards to looking up your partial. Example 2 is O(1). Although, to be fair, Example 2 will still have to render each partial with the proper data. Example 3 is just another way to write Example 2 (if you follow the the proper naming conventions).