Troubleshooting Methodology 1: Intro

Like a lot of people working in IT, I ended up doing this job because of the enjoyment I get out of fixing things and generally tinkering with various bits of technology. Curiosity about how things work has probably been the biggest driver in my career (and life, but that is another topic!). Thinking back on it, my earliest memory of troubleshooting is “helping” my father (who is a telecoms engineer) fix the upstairs phone when I was about 5. This was a great introduction to troubleshooting as it was such a hands-on problem, but more on this later.

Over the years I’ve spent working in technical support, I’ve noticed people tend to troubleshoot things in one of two ways:

  1. Working on intuition and making more or less random changes with no real reason behind them (The Potluck Approach)
  2. Gathering some information, drawing some conclusions and then making changes (The Structured Approach)

The second method is rarer than you would think. Sometimes randomly pushing buttons will fix a problem faster, simply through luck, but overall taking a structured approach is faster and results in fewer disasters caused by pushing the wrong button. It also means you generally learn what caused the problem and thus can prevent it from happening again.

My approach to problems is straightforward enough, but I’ve found it does help. Below is my general step-by-step approach:

  1. Scoping the problem: What is happening and (sometimes) why is this a problem?
  2. Data collection: Varies from problem to problem but usually includes environmental information, software versions, when it last worked, etc.
  3. Analysis: Looking at the data collected and seeing whether it points to where the problem lies.
  4. Result and conclusions: This varies based on where the analysis has led you. Sometimes you’ll have to go back, change what you’re looking for, and take a different tack.

The result of the historic phone troubleshooting? We ran the cables and initially it didn’t work. Then we checked the first junction box and got a signal, so the problem was between the junction box and the terminating socket. Turned out it was the socket. This might be a basic example, but the benefit of a structured approach is that it applies to all problems. If we apply the structure to the steps taken, it would look like this:

  1. Scoped the problem: Phone wasn’t working.
  2. Data collection: Checked how far the signal was getting.
  3. Analysis: Problem was between junction box and socket.
  4. Results: As replacing the socket was easier than ripping the cable out of the wall, we tried that first and, hey presto, it worked without tearing the wall apart. Success!
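
The same narrowing-down can be sketched in code. The Python below is purely illustrative: the function name `isolate_fault`, the checkpoint names and the readings are all made up for the example, and it assumes a single fault on a path that can be tested point by point from the source towards the endpoint.

```python
from typing import Callable, Optional, Sequence, Tuple


def isolate_fault(
    checkpoints: Sequence[str],
    has_signal: Callable[[str], bool],
) -> Optional[Tuple[Optional[str], str]]:
    """Walk the path in order and report where the signal is lost.

    Returns (last_good, first_bad), meaning the fault lies somewhere
    between those two points, or None if the signal reaches every
    checkpoint. last_good is None when even the first checkpoint fails.
    """
    last_good: Optional[str] = None
    for point in checkpoints:
        if has_signal(point):
            last_good = point          # signal got this far
        else:
            return (last_good, point)  # signal lost before this point
    return None


# Hypothetical readings matching the phone story: signal at the
# junction box, nothing at the socket.
readings = {"junction box": True, "socket": False}
fault = isolate_fault(["junction box", "socket"], readings.get)
print(fault)  # ('junction box', 'socket')
```

The function only tells you which stretch to look at. Deciding what to try first within that stretch (here, the socket rather than the cable in the wall) is still a judgment call about which fix is cheapest.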