"Learning to work on existing codebases."

Working with legacy code is probably the most common part of a developer’s job in the software industry. Unless you’re only going to work in super-early-phase startups or constantly get assigned to greenfield projects, you will encounter legacy code. There are many articles on the internet about how to successfully understand and contribute to existing codebases. I am going to talk about two such instances in my not-so-long career, where I had somewhat different results in attempting to familiarise myself with an existing codebase. But before that, let me share some of my observations.

I would say there are about three factors that help in understanding a repo authored by someone else. Firstly, how well acquainted you are with the programming language and the framework used. Software is built on abstractions, and one of the major layers of abstraction is the programming language itself. If you’re trying to figure out a Ruby-on-Rails project, you have to start with understanding Ruby first (see 1). I know this is super obvious to most programmers, but it may not be for some people working in the software industry, which sometimes results in people getting assigned to projects they are not suited for.

Some languages might be similar enough that if you know X, you can learn Y easily. But in some cases it may not work that way. As a Python developer I can probably pick up Ruby on the job with considerable effort, but I am not sure it’s the same with something like Clojure or Erlang. If you’re working on web development projects, then the framework too plays an important part in grokking the codebase. Transitioning from Flask to Django was almost seamless for me, but something like Tornado or Twisted might have proved more challenging.

Another key factor in grasping the ins and outs of a codebase is having a nifty toolset. Although intuition does help to a certain extent, you have to get your hands dirty at some point if you want to consistently solve problems. Whether it’s an IDE, a debugger, logging frameworks or simply knowing your way around the Linux shell, these tools help you gain a better picture of the application in a huge way. Postman makes a regular appearance in my API debugging encounters. So does poring over git commits and checking git blame. Context matters while debugging a particular piece of code, and if the author of the repo is unreachable (which is the case 9 times out of 10) you will have to gather context in whichever way possible. Your toolset will play a major role in doing that.
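To make the logging part a bit more concrete, here is a minimal sketch of the kind of thing I mean: a small decorator that logs every call to a function you don’t understand yet, along with its arguments and return value. It only uses Python’s standard `logging` and `functools` modules. The names `trace`, `legacy_trace` and `apply_discount` are made up for illustration and don’t come from any real project.

```python
import functools
import logging

# Hypothetical logger name; pick whatever fits the project you are reading.
logger = logging.getLogger("legacy_trace")


def trace(func):
    """Log the arguments and return value of every call to func."""

    @functools.wraps(func)  # keep the original name and docstring
    def wrapper(*args, **kwargs):
        logger.debug("calling %s args=%r kwargs=%r", func.__name__, args, kwargs)
        try:
            result = func(*args, **kwargs)
        except Exception:
            # Record the traceback, then let the error propagate unchanged.
            logger.exception("%s raised", func.__name__)
            raise
        logger.debug("%s returned %r", func.__name__, result)
        return result

    return wrapper


# Stand-in for some function buried in an unfamiliar codebase.
@trace
def apply_discount(price, percent):
    return price - price * percent / 100


if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG)
    apply_discount(200, 10)
```

Wrapping a handful of suspicious functions like this and then reproducing the bug is often enough to see which code path actually runs. It is not a replacement for a proper debugger, just a cheap way to get some context.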

Lastly, there’s the functionality of the application. If you don’t know what your application does, you may never get to figuring out how it does it. However, functional knowledge, in my experience, has been very difficult to communicate. Functional requirements evolve over time, and if they are not constantly documented you are often left with an incomplete and incorrect picture of your project. The only thing that helps here is asking questions and accurately communicating your understanding to avoid misunderstanding or .. uhmm .. miscommunication. But if you do somehow get hold of the functional knowledge, then it is going to be a significant addition to your arsenal in terms of identifying and solving problems.


On a good day, I have sometimes pictured myself as something like Zach Galifianakis playing blackjack in the movie The Hangover. Although a developer figuring out the root cause of a problem might not make an engaging movie scene, it is definitely engaging to some developers in the room. I am sure each one of us can remember at least one instance of being in awe, watching someone solve a problem that seemed mysterious to us. The following experience is probably the exact opposite.

Me in my head while debugging something

I was once asked to go through a PHP repository and quickly pick it up to a level where I could start making changes to it. I had been working with Python for a year at that point, but had never had a chance to learn or try out PHP. I eventually managed to get enough of a hold on it, but I did struggle to do it on time. It was not just a collection of PHP scripts supporting a website that I was asked to master; it was a full-fledged project built on a comprehensive framework.

Oftentimes I hear people complaining about the state of the projects they are assigned to, and justifiably so. But it was the complete opposite in my case. The developers had put the project together in a masterful way, using appropriate design patterns and keeping it almost DRY. I, on the other hand, was used to hacking out quick solutions to most problems without knowing that patterns existed. So I was trying to figure out how this thing worked, when it was built on another thing that I didn’t know, and built in a way that left me befuddled.

Being a novice, I clearly struggled. The lesson I learnt was that I cannot be learning both the language and the framework simultaneously. Although related, they are solving two different problems. It was silly to have underestimated the task, but thankfully I eventually got hold of it.

Recently I started a new job. I got assigned to the Support team. I am supposed to switch between 3-5 projects depending on the tickets being raised, all of them written by someone else and already in production. I might write a different blog post about what it’s like to work as a full-time support developer, but the core of what I do is try to understand existing codebases on a daily basis. This time I have had fairly positive results so far. The tools are somewhat new, but the projects being in Python and Django helped me a lot in ramping up quickly. Django forces you to write code in a certain way (models, views, templates), so it becomes quite straightforward for someone experienced in Django to understand any project written in Django. The same holds for PHP-Symfony or Ruby-on-Rails.

There are probably some really important things that I have missed here, e.g. running the tests, using the debugger, and some other things which I haven’t come across yet. But all in all, I wanted to share some thoughts on what I consider the basics of understanding legacy code. Please do share your thoughts and comments below.

(1): I was once given a take-home interview project. They wanted me to debug a Ruby-on-Rails application. I started by going through a lengthy Ruby-on-Rails tutorial, only to realise about 40% of the way in that I should have been going through a Ruby tutorial first. Needless to say, I didn’t even complete the take-home interview.

PS: Image credit - https://macrebisz.deviantart.com/