Strategies and Tools for rewriting/refactoring of a software application.


Recently the development team and I have been discussing the need to completely rewrite some elements of the software (broken and messy beyond repair) and refactor others (salvageable code). I am looking for a few articles on strategies for approaching this, and for some tools, paid or free, to manage the process.

Any advice is appreciated.

Thank you.
Asked by: Zack (General IT Goto Guy)
gr8gonzo (Consultant) commented:
It's going to be difficult to find anything that is specific to your situation. I've worked with and led several development teams through these types of situations. In fact, I was specifically brought in on my last three positions to rework projects that were in a terrible state. The projects ranged from SaaS applications to mortgage desktop software, and each project was one of those products that allows a lot of customization.

An amusing anecdote: Salesforce is a great product, and yet was probably by far the messiest implementation - not because it was a bad product but because of how it was customized; we had a VP that loved to get his hands dirty and would build hooks and triggers everywhere to do little one-off things. Looking at their setup gave me a headache every day. Moral of the story - it doesn't matter how great the starting product is - you can make it be terrible.

Now, in my opinion, before you can fix the problem effectively, you have to understand how it got to that point.

In pretty much every case I've seen or worked on, the underlying problem was a lack of preparation and a lack of sufficient "resting" time. Each client was so focused on adding new features to their system and progressing through their very ambitious timelines that they never made time to stop and examine where they were and where they were going.

In those projects, the upside AND downside was that they were already using Agile as a project methodology. It was great for their situations, but the client incorrectly interpreted "Agile" to ALSO mean "fast" and "never stopping." So the result was that the previous development team was always in sprint-after-sprint-after-sprint, always adding features onto the product. If the infrastructure wasn't already there to support the new feature, they'd build the infrastructure in order to meet the deadlines.

Then they'd have multiple bits of infrastructure that were all built quickly to meet the immediate needs. The system would grow in usage and those quickly-built components would stagger under the unexpected weight. Here's an example:

One client had a table that stored logs of training data for their employees. A new feature request came along that had a developer creating a report that could be configured by the client (so they could add/remove fields as they wanted). So the developer simply told the script to pull ALL the fields in that table. That way, if the client added a new field, it would already be in the data set. All was good for a while, until a DIFFERENT sprint a few months later had a different developer adding support for embedded files in the training log table so trainers could upload their worksheets and compliance paperwork. By itself, this wasn't a bad feature request, but about a year later, the dynamic report from the first sprint was just dying left and right. The problem was that over time, a lot of attachment data had gone into the system and since the report script was pulling ALL the fields, it was also pulling hundreds of megabytes of attachment data that was never even used in the reports. It put a huge strain on the database and nobody could figure it out. It was basically written off as the product meeting its limitations.
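As a rough illustration of that failure mode, here is a minimal sketch using Python's built-in sqlite3 module. The table name, column names, and the REPORTABLE_COLUMNS allow-list are all made up for the example; the client's real schema and language were different. The idea is simply that the report asks for the columns it can display rather than pulling every field:

```python
import sqlite3

# Hypothetical schema: a training log with a large attachment column.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE training_log ("
    "id INTEGER PRIMARY KEY, employee TEXT, course TEXT, attachment BLOB)"
)
conn.execute(
    "INSERT INTO training_log (employee, course, attachment) VALUES (?, ?, ?)",
    ("alice", "Safety 101", b"\x00" * 1_000_000),  # 1 MB of fake worksheet data
)

# Columns the report may show; anything else (e.g. attachment) is never fetched.
REPORTABLE_COLUMNS = ("id", "employee", "course")


def fetch_report_rows(conn, requested_columns):
    """Fetch only the requested columns that are on the allow-list."""
    columns = [c for c in requested_columns if c in REPORTABLE_COLUMNS]
    if not columns:
        raise ValueError("no reportable columns requested")
    # Column names come from the allow-list, so they are safe to interpolate.
    query = f"SELECT {', '.join(columns)} FROM training_log"
    return conn.execute(query).fetchall()


def row_bytes(row):
    """Rough payload size of a row, counting only str/bytes values."""
    return sum(len(v) for v in row if isinstance(v, (str, bytes)))


# The "pull everything" approach drags the attachment along with each row.
everything = conn.execute("SELECT * FROM training_log").fetchall()
# The allow-list approach silently drops the attachment column.
report = fetch_report_rows(conn, ["employee", "course", "attachment"])
```

The report stays configurable, since new displayable fields only need to be added to the allow-list, but a later sprint that adds a heavy column no longer changes what the report pulls.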

I was brought in to basically start over from scratch, build a new team, and find a "better" platform. So the first thing I did was stop development and have the existing team do a thorough review of what had actually been done to date and start doing some performance testing (cachegrinding can be a useful process for this). We found hundreds of areas that were all cumulatively bringing the system to its knees.

It wasn't any single developer or any single business request that caused the problems. It was a simple accumulation of different small problems and inefficiencies over time, combined with a complete lack of review and maintenance.

So if you have a broken product, take the time to understand how it got to that point. More than likely, it's the same story as my above experiences. All businesses hate overhead costs, and maintenance is often a hard sell. "You want us to pay you to rework the system in a way that will lead to no new features?" If your business has a CTO, you might be in luck, since they sometimes come from a background that will let them understand the importance of maintenance.

If you don't have a CTO, then just try to tell the reigning decision-makers to cut their own personal costs by choosing to never take their car into the shop to get an oil change and all the accompanying maintenance. Then see how they like driving their car after a couple of years with clogged air filters, no windshield wiper fluid, misaligned tires, and a smoking engine.

Maintenance is a fact of life with every product that sees continual use. Software is no different.

Now, every situation is different, so I can't really comment on the exact steps you should take, but I'd recommend some general actions:

1. Don't give up on the current product yet. Sometimes things seem beyond repair but are not AS bad as they seem.

2. Stop any non-critical development and order a full analysis of the current code. Look in particular for any areas of the product that see heavier amounts of interaction (e.g. if you have 20 different features that all deal with a single area, pay attention to ways that those features can work together better instead of working independently of each other).

It may help to think of each feature as a member of a team standing at a toy assembly line. You may have added another person to the assembly line each time you added another part to the toy, but do you really need 20 people to assemble 20 parts? Or can you reduce it down to 5 people assembling 4 parts each, making the assembly line a LOT shorter?

3. Analyze the flow of data through the system as it stands today. Have you built some "quick fix" features to process data, but those features are now causing a domino effect? Are there any data flows that are unnecessarily circular? What flows are repeating processes and events, and can those flows be re-arranged to avoid the repetition?

I've found several times that data flows change over time as the business needs change, but they're rarely examined to see if things are going through in the most efficient manner. On the Salesforce project, the client had a workflow that was creating a record that was used for one particular data flow, and then promptly deleted if the user pursued any of the other data flows. Re-arranging the flow of things allowed the record to ONLY be created when it was actually needed, eliminating the extra database hits to delete the unnecessary records.
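A toy sketch of that kind of re-arrangement, in Python. The Workflow class, the flow name "needs_record", and the in-memory store are hypothetical stand-ins (the real project was Salesforce, not Python), and db_writes just counts the inserts and deletes each approach would cost:

```python
class Workflow:
    """Toy workflow; `store` stands in for a database table."""

    def __init__(self):
        self.store = {}      # record_id -> record
        self.db_writes = 0   # counts inserts and deletes
        self._next_id = 1

    def _create_record(self, user):
        record_id = self._next_id
        self._next_id += 1
        self.store[record_id] = {"user": user}
        self.db_writes += 1
        return record_id

    def _delete_record(self, record_id):
        del self.store[record_id]
        self.db_writes += 1

    def run_eager(self, user, flow):
        """Original flow: always create, then delete if it was not needed."""
        record_id = self._create_record(user)
        if flow != "needs_record":
            self._delete_record(record_id)
            return None
        return record_id

    def run_lazy(self, user, flow):
        """Re-arranged flow: create the record only when the flow needs it."""
        if flow == "needs_record":
            return self._create_record(user)
        return None


flows = ["needs_record", "other_a", "other_b"]

eager = Workflow()
for flow in flows:
    eager.run_eager("zack", flow)

lazy = Workflow()
for flow in flows:
    lazy.run_lazy("zack", flow)
```

Both versions end up with the same single record in the store, but the eager version pays for three inserts and two deletes where the lazy one pays for one insert.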

4. Look for opportunities to separate immediate-need processes from processes that can wait. For example, you may have a process that updates some statistics in a database table whenever a new record gets created. Does this process really need to be done right away, or can it be a process that can be handled later?
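One way to picture this is a small Python sketch where the record is saved right away and the statistics update is only queued. Everything here is an assumption made for illustration: the in-memory lists stand in for database tables, and in a real system flush_pending_stats would be run by a scheduled job or a background worker rather than called by hand:

```python
from collections import Counter, deque

records = []               # stand-in for the main table
stats = Counter()          # stand-in for the statistics table
pending_stats = deque()    # work that can wait


def create_record(category):
    """Immediate path: save the record, queue the statistics update."""
    records.append({"category": category})
    pending_stats.append(category)


def flush_pending_stats():
    """Deferred path: apply all queued statistics updates in one pass."""
    processed = 0
    while pending_stats:
        stats[pending_stats.popleft()] += 1
        processed += 1
    return processed


for category in ["safety", "safety", "compliance"]:
    create_record(category)

stats_before = dict(stats)      # still empty: nothing has been flushed yet
flushed = flush_pending_stats()
```

The trade-off is that the statistics are briefly out of date, so this only fits processes where a short delay is acceptable.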

5. Determine if you have the right underlying hardware and infrastructure. It doesn't matter how fast your application is. If it's running on a 386, it's going to move like molasses. If you don't have load-balancing on a SaaS application, you're going to hit some dead ends at some point. If your product runs on a desktop, then make sure it doesn't try to take on jobs that require the horsepower of a server. Alternatively, if your product runs on a server (thick client setups), can it make use of distributed computing (farming out small processing jobs to the client computers)?

Make sure your infrastructure supports your volume model and make sure you're not trying to make one entity do EVERYTHING. There's really no product out there that can do it all. The Oracle database was the end-all for large enterprises for decades, and now we're in the age of big data, where enterprises are using Oracle for their immediate needs and offloading heavier jobs to separate systems.

These days you need a mix of heavy hitter systems to keep up with everything. You can't just rely on Microsoft Cosmos to be your front-end database or Oracle to be your back-end.

Microsoft is often years ahead of where the market is going to be. They had SBS for small companies and forced larger companies to separate Exchange onto a separate server because they knew that Exchange on SBS couldn't handle the load by itself, regardless of the hardware.

Moral of the story - make sure you're not trying to make one system do too much. For example, if your system does a high volume of email notifications, consider using a 3rd party mail service or product that is designed specifically for mass mailing. It'll probably do a better job, alleviate the load on the original system, and it'll probably come with a lot of features that the original system couldn't offer anyway. It's usually a better investment of your IT budget.

To summarize, an ounce of preparation here will go a long way:
1. Understand how you got to this point.
2. Understand the key problems today.
3. Understand the data flows.
4. Enforce regular maintenance points.

Altogether, that should lead you into the best next steps to take - whether those are to rewrite the system from scratch or build better pieces of infrastructure and migrate the existing system onto those new pieces.
Zack (General IT Goto Guy), author, commented:
Thank you for the essay answer to my question. I really appreciate it; you have given me much to consider.
gr8gonzo (Consultant) commented:
One final thought - all of the above is about strategy, because that's what your question seemed to be asking about. Just in case, here are some quick tips on the more technical aspects.

#1. Make sure you're also using the right code architecture. For example, if you're building a SaaS application, it's probably a good idea to use an MVC (Model View Controller) approach, or even an established framework for your primary programming language. For example, you might consider Phalcon, Laravel, or CodeIgniter for PHP, or Rails for Ruby. There are good frameworks for every language / situation.
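If MVC is new to you, here is a bare-bones sketch of the separation in plain Python, with no framework involved. The class names (TaskModel, TaskView, TaskController) are invented for the example, and real frameworks add routing, templating, and persistence on top of this basic split:

```python
from dataclasses import dataclass, field


@dataclass
class TaskModel:
    """Model: owns the data and the rules for changing it."""
    tasks: list = field(default_factory=list)

    def add(self, title):
        title = title.strip()
        if not title:
            raise ValueError("title must not be empty")
        self.tasks.append(title)


class TaskView:
    """View: turns model data into output, with no business logic."""

    def render(self, tasks):
        return "\n".join(f"{i}. {t}" for i, t in enumerate(tasks, start=1))


class TaskController:
    """Controller: takes a request, updates the model, picks the view."""

    def __init__(self, model, view):
        self.model = model
        self.view = view

    def create_task(self, title):
        self.model.add(title)
        return self.view.render(self.model.tasks)


controller = TaskController(TaskModel(), TaskView())
controller.create_task("Review report queries")
page = controller.create_task("  Archive old attachments ")
```

Because validation lives only in the model and formatting only in the view, either one can be changed or tested without touching the other, which is most of what makes later maintenance easier.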

Frameworks will make it easier to stay within good programming practices. The flip side of that is that it can take some time to adapt to a framework, so you may have some slower development progress during that learning curve, but once you get going, you should have a more consistent development speed over time.

If you weren't using a framework before, then the framework might seem like an unnecessary pain in the ass, but it's like any workshop. Let's say you have a workshop and when you first start, all you have is a hammer and all you're doing is hammering metal. At that point, it would seem SLOW to get a toolchest and put away your hammer each time and try to keep your toolchest organized. I mean, with one tool, you don't need a toolchest or organization, right?

However, once your business develops and you have 1,000 tools and you're doing a large variety of things, you're going to wish that you had that toolchest and that you had adopted good organizational habits.

So using a framework can basically be the equivalent of using good organizational habits. You have to make sure you don't stray too far from the framework's recommended practices. The end result will be worth it, though. It should make both development and future maintenance easier and more consistent.

#2. Also, ensure that you're taking time during your resting periods to deprecate old code and remove it from the system. It might be a pain to figure out where everything is used (again, frameworks will usually make this job easier), but it will greatly reduce the amount of time invested into code research and future maintenance.
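One lightweight way to find out where old code is still used before removing it is to mark it as deprecated and let it warn at runtime. This is a minimal Python sketch using the standard warnings module; the deprecated decorator and the two report functions are hypothetical names for the example, and other languages have their own equivalents:

```python
import functools
import warnings


def deprecated(replacement):
    """Mark a function as deprecated and point callers at its replacement."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            warnings.warn(
                f"{func.__name__} is deprecated; use {replacement} instead",
                DeprecationWarning,
                stacklevel=2,  # report the caller's line, not this wrapper
            )
            return func(*args, **kwargs)
        return wrapper
    return decorator


def build_report(rows):
    """The current implementation that callers should move to."""
    return [row.upper() for row in rows]


@deprecated("build_report")
def legacy_report(rows):
    """Old entry point kept temporarily so existing callers keep working."""
    return build_report(rows)


# Capture warnings to show what a caller of the old function would see.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = legacy_report(["alice"])
```

Once the warnings stop showing up in your logs or test runs for a release or two, the old function can usually be deleted with much more confidence.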

#3. If you're not already using a version control solution (a.k.a. source code repository), get one set up, preferably Git. SVN and CVS are easier to set up and use, but they have serious performance degradation over time. If you don't have a good place to put the repository, use a cloud-based provider like GitHub. There are some excellent free GUIs for Git nowadays, like SourceTree, which can make the process much easier to adopt if you're not already using one.

#4. Make sure you go through some learning materials for your version control system. Learn about branching, merging, and tagging at the bare minimum, and get comfortable with it. You can use your version control system to make deployments easier in your development lifecycle and reduce coding conflicts.
Question has a verified solution.
