I did a bit of troubleshooting today for a customer who was experiencing very slow logon times to VMware View desktops running Windows XP. I suspect the problem is a fairly common one, so I thought I would share my troubleshooting methodology and the solution that got the logon time back to normal. Following a rigid methodology may be overkill for many troubleshooting situations. If you strongly suspect the root cause of a problem, check that solution before digging into analytic troubleshooting. A little bit of Googling may eventually get you to an answer for a particular problem, but having a firm troubleshooting process will help in all situations.
I’m going to lay out my troubleshooting methodology for you, with some VMware View specific examples. If you’re not interested in the lesson, scroll to the bottom for the probable causes and solution to my particular issue. If you want to learn a bit about a tried and true methodology for problem solving, read on!
My troubleshooting approach is borrowed from the Kepner-Tregoe process for Analytic Trouble Shooting, as described in their book The New Rational Manager. The Kepner-Tregoe methodology dates back to the 1950s and has been used worldwide by corporate, government and other institutions to solve problems and make sound decisions. The Kepner-Tregoe Analytic Trouble Shooting method was used by NASA to help bring Apollo 13 home, and has been identified by ITIL/ITSM as a recommended problem solving technique.
The first step in the method is to define the trouble statement. That is, what exactly is the problem we are trying to solve? The better your trouble statement, the quicker you can zero in on what or where the problem may be. It may seem simplistic or silly, but a trouble statement that is spoken aloud or written down makes sure everyone involved is actually troubleshooting the same issue, not chasing down tangents, unrelated symptoms, etc. In this case, the opening trouble statement from the customer was pretty simple: “Domain account logons to VMware View desktops are slow and/or don’t complete.”
As this was a new customer to me, the opening trouble statement pretty much covered the extent of my knowledge of their particular environment. I have a decent bit of working knowledge of VMware View that can carry me through most troubleshooting, but a more specific understanding of the problem limits how deep into memory I have to dig (and how hard I have to work already tired neurons) to get to the solution. We get more specifics by asking the right questions. The specifying questions you ask can be generalized across almost any analytic troubleshooting effort (IT, mechanical, relationships, etc.). They attempt to observe the problem (defect) from all dimensions, in order to define a more exact trouble statement that you will use to begin to home in on a root cause. Specifying questions, in and of themselves, do not attempt to identify the root cause. The questions attempt to answer the IS and the IS NOT of the following dimensions:
- WHAT: What is/is not the object, person or unit with the defect? What is/is not the defect on the object?
- WHERE: Where is/is not the object with the defect observed? Where is/is not the defect on the object?
- WHEN: When is/is not the object with the defect first observed? When is/is not the defect observed in the cycle of the object? What is/is not the pattern of the when?
- EXTENT: How much of the object is/is not affected? How many objects have/do not have the defect? How many defects are on the object? What is the trend?
In all four dimensions, the IS NOT in these specifying questions always deals with a closely related object or defect that could be affected, but is not.
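One way to keep the IS and IS NOT answers organized is a small worksheet that records both halves of each question and shows which dimensions are still open. The following is only a minimal sketch in Python, not part of the Kepner-Tregoe material: the `Observation` class, the helper functions and every sample answer are hypothetical and made up for illustration, not findings from the customer's environment.

```python
from dataclasses import dataclass

# The four dimensions of the specifying questions listed above.
DIMENSIONS = ("WHAT", "WHERE", "WHEN", "EXTENT")


@dataclass
class Observation:
    """One specifying question with its IS and IS NOT answers."""
    dimension: str
    question: str
    is_: str = ""      # what IS affected / observed
    is_not: str = ""   # the closely related thing that could be, but is not

    def __post_init__(self):
        if self.dimension not in DIMENSIONS:
            raise ValueError(f"unknown dimension: {self.dimension}")

    @property
    def complete(self):
        # A question is only fully specified when both halves are answered.
        return bool(self.is_.strip()) and bool(self.is_not.strip())


def unanswered(worksheet):
    """Return the observations still missing an IS or an IS NOT."""
    return [o for o in worksheet if not o.complete]


def missing_dimensions(worksheet):
    """Return the dimensions that have no fully answered question yet."""
    covered = {o.dimension for o in worksheet if o.complete}
    return [d for d in DIMENSIONS if d not in covered]


def render(worksheet):
    """Format the worksheet as plain text, one line per question."""
    lines = []
    for dim in DIMENSIONS:
        for o in worksheet:
            if o.dimension == dim:
                lines.append(
                    f"{dim:<6} | IS: {o.is_ or '?'} | IS NOT: {o.is_not or '?'}"
                )
    return "\n".join(lines)


# Hypothetical answers, invented purely to show the shape of the data.
worksheet = [
    Observation("WHAT", "Which object has the defect?",
                is_="Windows XP View desktops",
                is_not="physical Windows XP PCs"),
    Observation("WHERE", "Where is the defect observed?",
                is_="domain account logons",
                is_not="local account logons"),
    Observation("WHEN", "When was the defect first observed?"),
    Observation("EXTENT", "How many objects are affected?",
                is_="all desktops in one pool"),
]

print(render(worksheet))
print("Still open:", missing_dimensions(worksheet))
```

The worksheet deliberately stores no guesses about cause. Like the specifying questions themselves, it only records observations, so any gaps show up as questions to go back and ask.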
Some examples of specifying questions that could be used in troubleshooting the slow logon times for View desktops are (not all will apply to your particular situation, just some seeds to start you along):