For many years, I studied a phenomenon I call "death-march" projects. You may think it's a strange term, and that it has little or nothing to do with your work - but I'm convinced that it has become the norm, not the exception, in the application development field. In this article, I want to discuss the kinds of software tools and software processes that should be used on such projects.
Death-march projects, as I define them, are those whose "project parameters" exceed the norm by at least 50%. This typically means one or more of the following constraints have been imposed upon the project:
- The schedule has been compressed to less than half the amount estimated by a rational estimating process; thus, the project that would normally be expected to take 12 calendar months is now required to deliver its results in 6 months or less. Because of the pressures of business competition in today's global marketplace, this is probably the most common form of a death-march project.
- The staff has been reduced to less than half the number that would normally be assigned to a project of this size and scope; thus, instead of a project team of 10 people, the project manager has been told that only 5 people are available. This may have come about as a result of management's optimistic belief that a new CASE tool or programming language will magically double the team's productivity. More commonly, though, it happens today because of downsizing, reengineering, and various other forms of staff reduction.
- The budget and associated resources have been cut in half. This may be the result of downsizing and other cost-cutting measures, but it can also result from competitive bidding on a fixed-price contract, where the project manager in a consulting firm is informed by the marketing department that "the good news is that we won the contract; the bad news is that we had to cut your budget in half in order to beat out the competitors." This kind of constraint often has an immediate impact on the number of project-team personnel that can be hired, but the consequences are sometimes a little more subtle - e.g., it may lead to a decision to hire relatively inexpensive, inexperienced junior software developers, rather than higher-cost veterans. And it can lead to a pervasive atmosphere of penny-pinching that makes it impossible for the project manager to order pizza for the project team when it spends the entire weekend in the office working overtime.
- The functionality, features, performance requirements, or other technical aspects of the project are twice what they would be normally. Thus, the project team may have been told that it needs to squeeze twice as many features into a fixed amount of RAM or disk space as their competitor; or their system has to handle twice the volume of transactions that any comparable system has ever accomplished.
The immediate consequence of these constraints in most organizations is to ask the project team to work twice as hard, and/or twice as many hours per week as would be expected in a "normal" project. Thus, if the normal work week is 40 hours, then a death-march project team is often found working 13- to 14-hour days, 6 days a week.
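To make the arithmetic concrete, here's a minimal sketch of how a team might check whether its own project parameters cross the death-march threshold. It is illustrative only: the 50% cutoffs follow the rule of thumb above, and the field names are my own assumptions, not a formal metric.

```python
# Illustrative only: the 50% thresholds follow the rule of thumb described above,
# and the field names are hypothetical.
from dataclasses import dataclass

@dataclass
class ProjectParameters:
    estimated_months: float    # what a rational estimating process produced
    imposed_months: float      # what has actually been demanded
    normal_staff: int          # staffing a project of this size would normally get
    actual_staff: int
    normal_budget: float
    actual_budget: float
    normal_features: int       # functionality a comparable system would carry
    required_features: int

def is_death_march(p: ProjectParameters) -> bool:
    """True if any parameter exceeds the norm by roughly 50% or more."""
    return any([
        p.imposed_months <= 0.5 * p.estimated_months,    # schedule cut in half
        p.actual_staff <= 0.5 * p.normal_staff,          # staff cut in half
        p.actual_budget <= 0.5 * p.normal_budget,        # budget cut in half
        p.required_features >= 2 * p.normal_features,    # functionality doubled
    ])

# Example: a 12-month, 10-person project squeezed into 6 months with 5 people.
print(is_death_march(ProjectParameters(12, 6, 10, 5, 1.0, 1.0, 100, 100)))  # True
```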
Another way to characterize such projects is as follows:
A death-march project is one for which an unbiased, objective risk assessment (which includes an assessment of technical risks, personal risks, legal risks, political risks, etc.) determines that the likelihood of project failure is greater than 50%.
Even a project without the schedule, staff, budget, or functionality constraints described above could have a high risk of failure - e.g., because of hostile politics between the IS/IT department and the user community. But most commonly the reason for the high-risk assessment is a combination of the constraints described above.
Categories of Death-March Projects
Not all death-march projects are the same; not only do they involve different combinations of schedule, staff, budget, and functionality constraints, but they come in different sizes, shapes, and flavors. Size is usually the most important characteristic that distinguishes one death-march project from another. Consider four different ranges of projects:
- Small-scale projects - the team consists of 3-6 people who are working against nearly impossible odds to finish a project in 3-6 months.
- Medium-size projects - the team consists of 20-30 people, who are involved in a project expected to take 1-2 years.
- Large-scale projects - the team consists of 100-300 people, and the project schedule is 3-5 years.
- Mind-boggling projects - the project has an army of 1,000-2,000 or more (including, in many cases, consultants and subcontractors), and the project is expected to last 7-10 years.
Small-scale death-march projects are the most common in the organizations I visit around the world; and, happily, they have the greatest chance of succeeding. A tight-knit group of 3-6 people is more likely to stick together through thick and thin, as long as the commitment isn't for more than 6 months or so; and a group of highly motivated people is more likely to be willing and able to sacrifice their personal lives for 3-6 months, as long as they know that the regimen of long nights, wasted weekends, and postponed vacations will come to an end in a matter of months.
The odds of successful completion drop noticeably with the medium-size projects, and disappear almost completely with large-scale projects. With larger numbers of people involved, it's more difficult to maintain a sense of cohesive team spirit; and the statistical odds of someone quitting, being run over by a beer truck, or succumbing to the various perils of modern society increase rapidly. What's crucial here is not just the number of people involved, but the timescale: Working 80-hour weeks for six months may be tolerable, but doing it for two years is much more likely to cause problems.
As for the "mind-boggling" death-march projects, one wonders why they exist at all. Perhaps the systems development efforts associated with the NASA project that landed a man on the moon in 1969 could be considered a successful example of a death-march project, but the vast majority of such projects are doomed from the beginning. Fortunately, most senior managers have figured this out, and most large organizations have banned all such projects. Government organizations, alas, still embark on them from time to time, though even government is downsizing.
In addition to project size, we can characterize the "degree" of a death-march project by the number of user-organizations involved. Things are hard enough when the project team only has to satisfy one user, or one group of homogeneous users within a single department. Enterprise-wide projects are usually an order of magnitude more difficult, simply because of the politics and communication problems involved in cross-functional activities of any kind. As a result, the systems development projects associated with business process reengineering (BPR) initiatives often degenerate into a death-march status; even though the development effort is modest in terms of hardware and software effort, the political battles can paralyze the entire organization and cause endless frustration for the project team.
Finally, we should distinguish between projects that are incredibly difficult, and those that are fundamentally impossible. As John Boddie, author of Crunch Mode (Prentice Hall, 1987), points out:
The combination of excellent technical staff, superb management, outstanding designers, and intelligent, committed customers is not enough to guarantee success for a crunch-mode project. There really are such things as impossible projects. New ones are started every day. Most impossible projects can be recognized as such early in the development cycle. There seem to be two major types: "poorly understood systems" and "very complex systems."
Death-March Processes
If you remember only one word from this issue of ADS, it should be triage. You might have assumed that a discussion of software processes would concentrate on familiar methodologies like structured analysis, or formal process disciplines like the SEI Capability Maturity Model (CMM), or various prototyping approaches. These are all important and relevant ideas, but the most important idea of all is this: You don't have enough time in a death-march project to do everything the users are asking for. If you build your processes and methods around that sobering fact, you have a chance of succeeding; if you begin the project with the notion that coding can't commence until all the structured analysis data flow diagrams have been approved by the user, you'll definitely fail.
This doesn't mean that we should ignore all of the other process-related ideas and strategies. I'll discuss them later, but I believe they should be introduced as part of a strategic corporate decision, rather than foisted upon a death-march project team as a desperate tactical ploy to avoid what would otherwise be a failure. And the concept of triage applies here, too: If pressed, a death-march project team will abandon what it feels is unhelpful or unessential, and devote its resources to whatever it feels is most helpful.
The concept of triage
A good definition of triage is: allocating a scarce commodity in such a way as to derive the greatest benefit from it. Or, as Stephen Covey puts it in First Things First, "the main thing is to make sure that the main thing is the main thing."
Most prototyping and RAD approaches are compatible with triage, and a few even mention the concept explicitly. But the emphasis in most RAD approaches is simply to get something - anything! - working quickly, so that it can be demonstrated to the user in order to (a) prove that tangible progress has been made, and (b) solicit feedback on the functionality of the system and (mostly) on the user interface. That's all very useful, but if the project team has devoted its resources and energy to building initial prototypes with "sexy" but nonetheless nonessential features, then the team and the user are wasting their time. A true triage approach must focus on prioritization.
This issue is fraught with danger because of the subtle, but insidious, assumption made by most software engineering methodologies - whether they are based on the classical "waterfall" life cycle, or the more recent "spiral" and prototyping methodologies. The assumption is, "somehow, we'll get it all done by the time the deadline arrives." The unspoken motto of many project teams seems to be, "we will leave no requirement unfulfilled."
A noble motto indeed, but almost always unachievable in a death-march project. Most death-march projects have "official" requirements that exceed the team's resources - specifically, people-resources and time-resources - by 50%-100%. The naive death-march project team's response is to hope that by working double overtime, the deficit can somehow be overcome; a cynical "suicide-mission" team's response is to assume that the project will be 50%-100% behind schedule, just like every other project. But even the cynical team is usually wrong, for they still assume that sooner or later (usually much later!) they'll eventually implement all of the functionality requested by the user.
The key point about death-march projects is that not only will some requirements remain unfulfilled when the official deadline arrives, but some of them will never be implemented. Assuming that the familiar "80-20" rule holds true, the project team might be able to deliver 80% of the "benefit" of the system by implementing 20% of the requirements - if they implement the right 20%. And since the user is often desperate to put the system into operation far earlier than the project team thinks reasonable, the user might take that 20%, begin using it, and never bother asking for the remaining 80% of the system's functionality.
This is extreme and simplistic, of course, but in virtually all of the death-march projects that I've been involved with, it made enormously good sense to separate the system requirements, triage-style, into "must-do," "should-do," and "could-do." The meaning of these three terms is obvious, and the fact that there are only three prevents any irrelevant squabbles as to whether a specific requirement should be categorized as a "priority-6" or "priority-7" requirement. Having performed such a triage, the obvious project strategy is to focus on the "must-do" requirements first; if there is time left over, then focus on the "should-do" requirements; and if a miracle occurs, then work on the "could-do" requirements.
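As a concrete illustration of what such a triage might look like, here is a minimal sketch; the requirement names, the benefit scores, and the idea of tallying benefit per category are illustrative assumptions of mine, not part of any formal method.

```python
# Minimal triage sketch: requirement names and benefit scores are hypothetical.
from collections import defaultdict

requirements = [
    # (name, triage category, rough benefit score agreed with the stakeholders)
    ("process daily orders",       "must-do",   40),
    ("print exception report",     "must-do",   25),
    ("archive old transactions",   "should-do", 10),
    ("configurable screen colors", "could-do",   2),
]

buckets = defaultdict(list)
for name, category, benefit in requirements:
    buckets[category].append((name, benefit))

# Work the buckets in strict priority order; "could-do" items only if a miracle occurs.
for category in ("must-do", "should-do", "could-do"):
    total_benefit = sum(benefit for _, benefit in buckets[category])
    print(f"{category}: {len(buckets[category])} requirement(s), "
          f"~{total_benefit} units of benefit")
```

Even a toy tally like this makes the 80-20 argument visible: a short list of "must-do" items typically carries most of the benefit the users actually care about.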
Failure to follow such a strategy from the beginning of the project usually leads to an "ugly crisis" toward the end of the project; in addition to the nasty politics, it also produces what my colleague Dean Leffingwell of Requisite Inc. refers to as "wasted inventory." To understand why, consider the way things unfold in a typical death-march project.
When the project begins, nobody is willing to admit that the schedule is unrealistic - least of all the user and senior management! The project manager and the team members may have a bad feeling in the pits of their stomachs that they've stumbled into a suicide mission, but if they're optimistic they may believe that it will be a "mission-impossible" project where a miracle saves them later on. The key point here is that the deadline is far enough away - typically six months or a year - that nobody has to face up to the reality that the objectives are impossible.
Indeed, the political pressures and the team's naiveté may even prevent a reassessment midway through the project. Ironically, the problem is often compounded if the project team has been following some form of RAD/prototyping approach, for it's probably demonstrated one or more prototype versions of the system to the user, which can prolong the illusion that everything will be done on time. But by now, the project team members are probably beginning to realize that they're in over their heads; and if it's the manager's first death-march project, he or she often has the naive belief that senior management and the user will eventually come to their senses.
Alas, things don't usually work out that way. An "ugly crisis" finally occurs when the user and/or senior management finally has to face the undeniable reality that despite the demands and the sincere promises from the project manager, the system is not going to be delivered on time. This often occurs a month before the deadline, sometimes a week, sometimes the day before the official deadline! Depending on how the political battles have been proceeding up to this point, and depending on how exhausted and frustrated the project manager has become, there are several possible outcomes. But what often happens is that senior management concludes that the entire problem is the fault of the project manager; that hapless individual is summarily fired (if he hasn't quit already!), and a new project manager is brought in with blunt instructions to "clean this mess up, and get the system delivered."
The replacement manager may be a battle-scarred veteran from within the organization, or perhaps a consultant from the outside. And sometimes the new manager does find that his predecessor made a number of basic management mistakes; sometimes, the new manager's 20-20 hindsight concludes that the original manager was doing the right things, but couldn't avoid becoming the sacrificial scapegoat when senior management finally had to accept the fact that their original demands were impossible to achieve.
But whatever the assessment, one thing is almost certain: The replacement project manager has to address the fact that the complete set of project requirements cannot be finished in time for the original deadline - if that weren't the case, the original project manager probably wouldn't have been fired in the first place. So what does the replacement manager do? The two most obvious options are:
- Renegotiate the deadline; and
- Renegotiate the requirements for the system.
The first option might be the most "rational" one, but it's unlikely to be acceptable in a death-march project: After all, the reason the users were asking for an unreasonable schedule in the first place is that they desperately need the system to cope with some business demand. And since the negotiation being carried out by the replacement project manager is taking place at a point in time relatively close to the original deadline, there's a good chance that the user community has already begun making plans to put the new system into operation. The last thing they want to hear is that it's going to be delayed another 6-12 months.
Thus, the most common - and successful - negotiating ploy involves a triage of the original requirements. Note that the replacement project manager is negotiating from a position of strength: It's not his fault that the project is in such a mess, and there's an unspoken awareness that management and the users were pretty stupid to have gotten themselves into this situation in the first place. The new project manager may even base his acceptance of the assignment on a successful outcome of the negotiations - e.g., with a statement like, "If you want me to take over this disastrous project, then you're going to have to accept the fact that we can only deliver a small percentage of the original functionality in time for your deadline. That's the situation; take it or leave it."
So far, all of this is fairly straightforward - even though it's discouraging, as a consultant, to see it happening over and over again. But this is where Leffingwell asks the question, "What about the inventory?" That is, what about all of the work-in-progress created by the project team before the "ugly crisis" occurred and the new project manager took over? Chances are that the project team had written a lot of code, and maybe some test cases; they might have had some documentation, some design models, and some structured analysis models. What happens to all of that "inventory" of partially completed work? The sobering answer: Most of it gets thrown away.
This might seem like an unnecessarily pessimistic statement. After all, why not simply put all of that partly finished work aside, and return to it later on? In the best of all worlds, this is exactly what happens; but it presumes the existence of a good set of tools and processes for version control, configuration management, source-code control, etc. - much of which is often abandoned in the heat of battle, when the team is trying to produce as much output as possible.
But the real reason why all of this partially completed work ends up being wasted inventory is that no one will ever have time to come back to it. Assuming that the project team (now under the control of a new manager, whom it may or may not respect) is able to deliver the "bare minimum" of critical functionality, it's usually so exhausted that half the staff quits. And the users are so disgusted with the project that they never bother asking for the rest of the unfinished functionality; or conversely, they're so satisfied with the minimal functionality that they never bother asking for the rest of the system. Even if they do, and even if the original team is still intact, there's a good chance that so many architectural changes were made in the attempt to deliver a "bare-bones" system that the half-finished pieces of work (which relate to noncritical requirements) can no longer be used.
Note that none of this discussion has anything to do with structured analysis, or the SEI-CMM, or any of the other "textbook" forms of methodologies and software processes. It's just common sense; but it's critical common sense in a death-march project. For it to work, all of the project stakeholders have to agree as to which requirements fall into the "must-do" category, which ones are "should-do," and which ones are "could-do" requirements. Obviously, if the project owner categorically insists that all of the requirements are "must-do" items, and that nothing falls into the other two categories, this whole discussion is a waste of time. And if the various stakeholders cannot reach a consensus about the triage items, then the project team will be paralyzed, attempting to do everything for everybody when they lack the resources to do so.
Unfortunately, the "ultimate reality" is that most organizations lack the discipline, experience, or political strength to deal with these issues at the beginning of the project. Nothing that I've described in the preceding paragraphs is rocket science, and even the most technologically illiterate manager or business user can understand the issues. But even though everyone understands the issues intellectually, the political battles surrounding death- march projects make it very difficult to reach a consensus on a reasonable triage. It's only when the "ugly crisis" occurs that the various parties finally agree on something that they should have agreed on when the project began.
The exception to this gloomy prognosis is the organization that has adopted death-march projects as a way of life - i.e., every project is consciously intended to be a death-march project. Obviously, users and senior managers are not stupid, and they usually do learn from their experiences - even if it takes three or four disasters for the lessons to sink in. As mentioned above, the original death-march project manager is usually a victim of the inability to perform an early triage, but the survivors gradually figure out what it's all about.
Formal vs. informal processes
Some project managers might read the preceding paragraphs and complain, "Wow! That sounds much more formal than anything we've ever done!" Having encountered such a reaction in some consulting engagements, I'm often stymied. On the one hand, I believe that the prioritization and management of project requirements is essential (regardless of what tools or techniques are used to accomplish the task). On the other hand, I worry that if an entirely new, alien process is introduced into a project team that already has more than it can handle, then any new concept - e.g., requirements management - may turn out to be the straw that breaks the camel's back.
Indeed, I don't have a good answer for this dilemma, other than hoping that perhaps the project team will be able to manage one new idea among their collection of tools and processes. But I worry even more when I see teams embarking on a death-march project with the decision (or more commonly, the edict foisted upon them by the "methodology police") that they must embrace a formal process approach such as the SEI-CMM or ISO-9000. Formal processes are great if you know what you're doing, and if you've used the processes before. But the reality is that such formal processes typically haven't been used at all in the organization; the death-march project is the pilot project for structured analysis or ISO-9000.
What insanity! It really is the straw that breaks the camel's back, in many cases; after all, the typical death-march project is trying to do something that's never been done before, and the team often consists of people who have never worked together before. As if that wasn't enough, now they have to learn how to use an unfamiliar methodology or process, one which they're not sure they believe in, and one which they're convinced will slow them down. Why is it that the methodology police are so surprised to see resistance in circumstances like these?
To succeed, the death-march project team needs to agree on which processes will be formalized - perhaps source-code control and change management - and which processes will be carried out on a completely ad hoc basis (e.g., user-interface design). There's no point mandating a particular software process if it's not going to be followed; the methodology police are wasting their time if they try to impose a rigid methodology, and it will cause the project team to waste its time, which is far more valuable.
This means that the death-march project manager must impose the processes that he or she feels essential, in a dictatorial fashion - e.g., "Anyone who modifies our source code without going through the change-management process will be fired summarily!" Or the project team must sincerely agree to adopt the process, because the team believes that it will be cost-effective. This is more likely to occur if the project team has worked together before, so that it shares a common experience with various software development processes. It's less likely to occur if one team member stands up and says, "I deeply believe that structured analysis is critical to the success of our project," when the other team members have no idea what he's talking about. Another corollary of this principle: It's usually a disaster to introduce a new, unfamiliar process into a death-march project, even if the team collectively believes that it will help. The learning curve, and the inevitable confusion and bickering over the details of the process, will usually outweigh its benefits.
And this means that such formal approaches as SEI-CMM, ISO-9000, or the introduction of new analysis and design methodologies should be done somewhere outside the realm of death-march projects. The sensible thing is to introduce these processes as part of a long-term corporate strategy, to experiment first with a pilot project (which should not be a death-march project), and then support it with appropriate training. If all of these things have been done, and if all other development projects are already operating at level 3 on the SEI-CMM scale, then it becomes interesting to ask whether such processes should also be used on a death-march project. As Watts Humphrey once remarked in a conference speech about SEI-CMM, "If a process can't be used in a crisis, it shouldn't be used at all."
I'm not sure how many project managers would agree with Humphrey's assertion, particularly if the death-march project is viewed as a once-in-a-lifetime exception to the norm. If indeed this is the case, then perhaps it does make sense to abandon the formal processes, and let the death-march team use whatever ad hoc techniques they feel appropriate. But remember my earlier assertion: Death-march projects are becoming the norm, not the exception. If this is the case, then the official corporate processes should be amended, as necessary, to make them suitable for a death-march project. Then, and only then, does Humphrey's statement make sense.
Good-enough software
The triage prioritization discussed above can go a long way toward making a death-march project "rational" in its behavior. Success does not require implementing all of the requirements; the project is "good enough" if the team can implement the "must-do" requirements and a reasonable number of the "should-do" requirements.
But there's another aspect of software development that causes difficulty in death-march projects: The implicit demand for absolute quality. This is usually expressed in terms of defects (bugs), but it may also be expressed in terms of portability, platform independence, flexibility, maintainability, and a few dozen other "-ilities." It's hard enough to achieve these objectives in normal projects; it's almost impossible to do so in death-march projects. Instead, the project team has to decide - and if at all possible, get concurrence from the shareholders and stakeholders - what's good enough.
The reason this is so important is that the achievement of absolute "-ilities" consumes project resources - especially time. If you want to develop a certifiably bug-free program, complete with a mathematical proof of correctness, it's going to take time; it may also require a higher level of talent than the project team can provide. It's also going to consume some of the energy of one or more people on the project team, which means that those people won't be available to work on other requirements. In short, achieving such "-ilities" as reliability, portability, and maintainability requires a tradeoff, and it has to be taken into account as part of the triage prioritization discussed previously.
Death-march project teams have to confront this unpleasant reality, because the alternative is usually "perfect" software that isn't finished when the all-important deadline arrives. It's better if the team is aware of the pragmatics of good-enough software at the beginning of the project, but my experience has been that many traditional software developers accept the notion of good-enough software development only when their backs are against the wall - e.g., when they're facing the "ugly crisis" discussed earlier, a month or two before the deadline.
Up to that point, they'll complain, "How would you like it if we used your 'good-enough' approach for the software in a nuclear reactor or an air-traffic control system?" The answer, of course, is that I wouldn't like it at all; and if someone proposed a death-march project for those kinds of high-reliability applications, I would stop flying on airplanes, and I would move as far away as possible from nuclear power plants. But we don't usually see death-march projects of this kind; it's more likely to be the payroll system for the nuclear power facility, or the airline reservation system used by the airline. Payroll systems and airline reservation systems aren't supposed to fail either, but the immediate consequences of a failure aren't quite as serious.
In any case, perfect reliability, maintainability, portability, etc. are not necessary, practical, or even desirable in most death-march projects. Indeed, perfection isn't possible even in normal projects - it's just that we can afford to set our standards much higher because we usually have fewer constraints on time, budget, or personnel resources. For death-march projects, though, what the users really want is a system that's cheap enough, fast enough, feature-rich enough, stable enough, and available soon enough. That's their definition of "good enough."
In the January 1996 issue of ADS, I discussed many of the issues and strategies associated with "good-enough" software. You may want to review that issue - and remember that it's especially applicable for these kinds of projects.
Best and worst practices
In the preceding discussion, I've warned about the dangers of allowing the methodology police to impose a set of rigid methodologies or software processes upon a death-march project team. The same advice holds for external consultants, gurus, witch doctors, faith healers, snake-oil salesmen, and textbooks. That includes ADS; if I've recommended something that doesn't make sense and that the project team can't carry out with enthusiasm and sincerity, then ignore it!
But this is particularly true of methodologies and software processes. Rather than following a set of practices that somebody else has recommended - or even worse, a set of practices imposed in a top-down fashion by managers and methodology committees who usually don't know what they're talking about - it's far better to follow a set of practices that the team itself regards as "best" for the circumstances. That's the essence of the "best-practices" approach that has been gaining popularity in the past couple of years: a grass-roots approach to identifying, documenting, and evangelizing the software development practices that real developers have found successful.
Unfortunately, death-march project teams often don't have much to go on, because theirs is often regarded as the first such project within the organization. Or even if it isn't the first one, it's still regarded as an exception - so nobody has bothered cataloging the techniques that worked and the techniques that didn't. Even worse, the death-march projects tend to have a high mortality rate (otherwise, they wouldn't be called "death-march" in the first place!). Thus, the people who would be most likely to provide useful advice for the next such project have quit, been fired, committed suicide, suffered a nervous breakdown, or withdrawn into a shell of cynicism.
If you are indeed embarking on the first death-march project the organization has seen, then it's likely that the best you can do is document whatever practices and processes have worked in your project, for the benefit of the next death march that follows. One way of doing this is by conducting a "project audit" at the end of the project - but this rarely occurs, and the results are usually so boring that nobody bothers reading them. The reasons are obvious: As mentioned earlier, the project team is so exhausted, frustrated, and frazzled by the end of the project that the notion of documenting its experiences is likely to be met with hoots of scorn. Furthermore, many of the most valuable contributors have long since disappeared by the end of the project.
Thus, what you should consider as an alternative is a series of "mini-audits" throughout the project. If you have mini-milestones (sometimes known as "inch-pebbles"), such as delivering a new version of a prototype to the user, then schedule a half-day mini-audit immediately after the inch-pebble. What practices worked well, and which ones were a disaster? What should be emphasized more heavily for the next milestone, and what should be abandoned? This kind of self-reflection is useful for the project team itself; the fact that it will also be helpful to future death-march project teams is icing on the cake. Also, the team is usually in better spirits during these intermediate inch-pebble meetings, and their comments are likely to be fresher, more candid, and yet less cynical.
For the organizations that have no best-practices material available, I'll recommend a few sources. I covered the topic in one chapter of my book, Rise and Resurrection of the American Programmer. You should also look at the World Wide Web site maintained by consultant Christine Comaford at http://www.christine.com for another collection of best-practices material. Perhaps the most ambitious project under way today is the Airlie Council's efforts within the U.S. Defense Department; I discussed the Airlie Council's work in the May 1995 issue of ADS, and you can find this information on the Web at http://spmn.com.
One of the most important contributions of the Airlie Council is the notion of worst practices; this is especially important for death-march projects, where it's often more important to avoid disasters than it is to find the best possible way to do things. The Airlie Council's list is summarized below:
- Don't expect schedule compression of >10% compared to the statistical norm for similar projects. (Of course, if you really believed this one, you wouldn't even start a death-march project!)
- Don't justify new technology by the need for schedule compression - you've got enough problems in a death-march project without debugging new tools and technology by using beta versions of software from your friendly tool vendor.
- Don't force customer-specific implementation solutions on the project - useful advice for any project.
- Don't advocate the use of silver-bullet approaches - something worth remembering when your managers propose (right after they've been visited by a persuasive vendor!) that your project can be "saved" by some new-fangled tool or development methodology.
- Don't miss an opportunity to move items that are under external control off the critical path - if your project team can't control it, then having it on the critical path makes it all the more risky. This applies to things like vendor tools, hardware boxes, software packages, and other components from external vendors; but it also applies to tangible deliverables and political decisions made by various project stakeholders surrounding the project.
- Don't expect to achieve an accurate view of project health from a formal review attended by a large number of unprepared, vocal reviewers - the project team doesn't have to worry about this, for they already know that such review sessions are political rituals. The advice is aimed more at the senior managers who watch the death-march project from afar, trying to find out whether it's in trouble.
- Don't expect to recover from a schedule slip of >10% without a >10% reduction in software functionality to be delivered - this is crucial advice for a death-march team, because there's a good chance that the schedule will slip by more than 10% during the course of the project. Indeed, even a 10% slippage is dangerous in a death-march project, for the team is probably already working so much overtime that they don't have the additional capacity to work 10% more each day. But the point of this suggestion from the Airlie Council is to remind us that people-time and software functionality are not exchangeable in a linear fashion.
During the past year, I've posed two questions to several hundred software managers in seminar audiences around the world: "If a colleague of yours was about to embark on a death-march project, what is the one thing that you would advise him or her to do in order to succeed? And, what is the one thing you would advise him or her not to do?" I've been intrigued to see that nobody has ever identified tools or technology as the "one most important thing," nor has anyone mentioned formal methods or techniques such as structured analysis or object-oriented design. A few people have recommended peopleware strategies (e.g., "hire good people," and "make sure that the team is really committed to succeed"), but most of the recommendations have centered on the issue of negotiations, scope management (which is handled well by the triage concept discussed earlier), and risk management (which I'll discuss below).
One last concept from the Airlie Council might be useful for death-march projects, though it's likely to be used more by the managers outside the project than the manager or team members inside the project. It's called the "breathalyzer test": What questions would you ask a death-march project team to quickly determine whether it was so out of touch with reality that it should be shut down? These are also the sort of questions that consultants often ask when they are asked by senior management to review the status of a project. I've been in that position myself, and I can usually tell the project is in trouble when I see the glazed eyes of the project manager, who looks like a deer caught in the headlights of an onrushing car. Here's the breathalyzer test from the Airlie Council:
- Do you have a current, credible activity network supported by a work breakdown structure (WBS)?
- Do you have a current, credible schedule and budget?
- Do you know what software you are responsible for delivering?
- Can you list the top 10 project risks?
- Do you know your schedule compression percentage?
- What is the estimated size of your software deliverable? How was it derived?
- Do you know the percentage of external interfaces that are not under your control?
- Does your staff have sufficient expertise in the project domain?
- Have you identified adequate staff to allocate to the scheduled tasks at the scheduled time?
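For a team that wants to apply the test to itself periodically, here is a minimal self-check sketch; the questions paraphrase the Airlie Council list above, but the scoring rule (every "no" counts as a warning sign) is a simplifying assumption of mine.

```python
# Self-check sketch: questions paraphrase the list above; treating each "no"
# as a warning sign is a simplifying assumption.
BREATHALYZER_QUESTIONS = [
    "Current, credible activity network supported by a WBS?",
    "Current, credible schedule and budget?",
    "Do you know what software you are responsible for delivering?",
    "Can you list the top 10 project risks?",
    "Do you know your schedule compression percentage?",
    "Do you know the estimated size of the deliverable, and how it was derived?",
    "Do you know the percentage of external interfaces not under your control?",
    "Does the staff have sufficient expertise in the project domain?",
    "Is adequate staff allocated to the scheduled tasks at the scheduled time?",
]

def warning_signs(answers: list[bool]) -> int:
    """Count the questions the team cannot answer with a confident 'yes'."""
    return sum(1 for ok in answers if not ok)

# Example: a team that can honestly answer only the first four questions.
answers = [True, True, True, True, False, False, False, False, False]
print(f"{warning_signs(answers)} warning signs out of {len(BREATHALYZER_QUESTIONS)}")
```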
As mentioned earlier, the reason the breathalyzer test is administered is that someone in the organization - usually not the project manager, but someone much higher in the management ranks - has a "gut feeling" that the death-march project is in trouble. For their own political survival, the project manager and the entire team should ask the same questions of one another periodically. And the project manager should be on the lookout for other signs that the project is in trouble, even when things look okay on the official PERT chart:
- Key project-team members are quitting - this may occur for a number of reasons, but it's important to get a sense of whether team members are losing faith in their ability to finish the project. If key members begin quitting, others may follow.
- The "inverse Dilbert correlation factor" - the more Dilbert cartoons pasted on office doors and bulletin boards, the less well off the project is.
- Excessive gallows humor - if the project team begins wearing black shirts to the office, or piping funeral dirges through the Muzak system, you're in trouble.
- New names for the project, e.g., "Project Titanic" - another form of gallows humor, but usually a more serious indication that the project team has lost faith, lost respect, and lost any real interest in whether the project will ever succeed.
- An ominous silence from end users and senior management, who used to ask on a daily basis how the project was coming along - by the time you recognize this, it may be too late to recover, but you should have at least a few days to update your resume.
- Thrashing - lots of activity but no sign of forward progress. Avoiding this is what the "inch-pebble" idea and the "daily build" strategy are all about.
The "daily build" concept
In the discussion about prototyping, milestones, and inch-pebbles above, there was an unspoken assumption that the incremental "deliverables" produced by the project team would appear at intervals measured in months or weeks. That's what most of us are accustomed to from our past experience with "normal" projects, and it's consistent with the usual pace of business life - e.g., weekly staff meetings, monthly status reviews, quarterly presentations to senior management, etc.
But death-march projects, as we've seen throughout this issue, typically need a different approach. When it comes to prototyping and incremental development, it often makes sense to organize the entire project around the concept of a "daily build." By this, we mean: Compile, link, install, and test the entire collection of code produced by the team every day, as if this was the last day before the deadline and you had to ship whatever you've got to the user tomorrow morning.
Realistically, you can't start the daily build on the first day of the project. And while it might be possible to build the equivalent of a "Hello World" subroutine on the second day of the project, it won't impress anyone unless everything about the project involves completely new technology (e.g., many of the Java-based projects that are currently under way). But there's usually a point well before the first "official" demonstration or delivery of a prototype version of the system when the software developers have a reasonable collection of components, subroutines, or modules - at least a few hundred lines of code, perhaps even a few thousand - that actually accepts real input, does real calculations or processing, and produces real output. That's the point when the daily build should begin, and a new (and hopefully better) version of the system should be built every day thereafter.
Why is this so important? As Jim McCarthy, Microsoft's Visual C++ product manager and author of Dynamics of Software Development (Microsoft Press, 1995), likes to say, "The daily build is the heartbeat of the project. It's how you know you're alive." And there can hardly be a more important priority for the manager of a death-march project. If a week goes by while everyone is spinning their wheels and nobody has quite had the nerve to tell the project manager that they just can't manage to get the newfangled object-oriented database to communicate properly with the client-server application they're working on, the project may have fallen hopelessly behind schedule. As long as the project manager hears status reports delivered in a verbal fashion, or documented in written memos (or with data flow diagrams), it's all too easy to confuse motion with progress, and effort with achievement. But if the project manager insists on physically observing the behavior of each day's "daily build," it's much more difficult to hide whatever problems are plaguing the project.
Some project managers will nod their heads and confirm that this is how they've always done it; but most will admit that they've settled for weekly builds, or monthly builds, or semiannual releases of a system. While nobody can rightly claim to have "invented" the daily-build concept, many feel that Dave Cutler should be given the credit for popularizing it during the development of the Windows NT operating system at Microsoft. It's also interesting to note that the Windows 95 development project also used the daily build concept; the final beta version before the production system was released in August 1995 was known as "Build 951."
It's important to recognize that an approach like this effectively becomes part of the project team's process for developing the system: Imagine what it must be like to be part of a team that has to demonstrate a working version of its software on 951 consecutive days! Furthermore, in order to be effective, the daily build should be automated, and should run unattended in the middle of the night, when all of the programmers have gone home (or have climbed under their desks and into their sleeping bags!). This implies the existence of automated configuration-management and source-code control mechanisms, as well as automated "scripts" of some kind to carry out the compiling and linking activities. But most important, it implies the existence of an automated test-management system that can run all night long, pounding away on the new version of the code to see if it still runs yesterday's test cases properly. Thus, the daily build concept almost certainly requires that a reasonable set of tools already be in place.
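As a sketch of what this automation might look like, here is a minimal nightly-build driver. Everything in it is a placeholder assumption: the checkout, compile, and test commands, the paths, and the e-mail addresses would all be replaced by whatever source-control, build, and test-management tools the team has actually agreed on.

```python
# Minimal nightly "daily build" driver; all commands, addresses, and hosts are
# hypothetical placeholders for the team's real build and test tooling.
import datetime
import smtplib
import subprocess
from email.message import EmailMessage

STEPS = [
    ("checkout", ["cvs", "checkout", "project"]),   # pull the latest sources
    ("compile",  ["make", "all"]),                  # compile and link everything
    ("install",  ["make", "install"]),              # install into the test area
    ("test",     ["make", "regression"]),           # rerun yesterday's test cases
]

def report(body: str) -> None:
    """Post the result where the team will see it first thing in the morning."""
    msg = EmailMessage()
    msg["Subject"] = f"Daily build {datetime.date.today()}"
    msg["From"], msg["To"] = "build@project.example", "team@project.example"
    msg.set_content(body)
    with smtplib.SMTP("localhost") as smtp:         # assumes a local mail relay
        smtp.send_message(msg)

def run_daily_build() -> bool:
    """Run each step in order; stop and report the first failure."""
    for name, cmd in STEPS:
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            report(f"Daily build FAILED at step '{name}':\n{result.stderr[-2000:]}")
            return False
    report("Daily build succeeded; yesterday's test cases still pass.")
    return True

if __name__ == "__main__":
    raise SystemExit(0 if run_daily_build() else 1)
```

Scheduled to run unattended in the middle of the night (from cron or any other scheduler), a script along these lines is what turns the daily build from a good intention into the project's heartbeat.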
A few small tricks can add even more value to the daily build concept:
- The project manager should move his or her office to the test site, or operation center, once the daily build process begins. Dave Cutler did this at Microsoft, and there are apocryphal stories of the tantrums that he threw when he arrived at the office and found that the daily build had crashed in the middle of the night. Tantrums or not, the point is that the project manager wants to be very visible and very involved in the daily build process, rather than being the commanding general at the rear of the army, receiving daily reports on a battle taking place miles away.
- Since it's likely that the daily build will require at least a small amount of manual supervision while it runs in the middle of the night, it may help to establish the following policy: Any programmer whose buggy code causes the daily build to crash gets the honor of supervising the operation of the (nightly) daily build until the next victim causes a crash. At the very least, it makes the whole concept of the daily build much more "real" to the project team!
- Assign one of the programmers who normally comes into the office early in the morning the task of checking whether the daily build ran successfully, and then posting the results in a visible place. If nobody is willing or able to show up early, then hire a college student. One company instructed the student to plant a flag outside the building to warn everyone whether it was going to be a good day or a bad day when they arrived: A green flag meant the daily build had succeeded, while a red flag meant it had failed.
Risk management
If requirements management - particularly the triage prioritization of requirements - is the most important process within a death-march project, then the second most important is risk management. (If "risk" weren't such a critical issue, we wouldn't apply the adjective "death-march" to the project in the first place.) It's interesting to note that one of the breathalyzer test questions identified by the Airlie Council concerns identification of project risks; and while that question might draw a blank stare from the manager of a "normal" project (even if that normal project has gotten into terrible trouble), it's one that can typically be answered fairly crisply by the manager of a death-march project. A manager would be a naive fool if he initiated a death march without giving some serious thought to the primary risks and how they might be mitigated.
Alas, things sometimes get out of hand as the death-march project continues. That is, because the risk-management activity is addressed in terms of ad hoc emotions and instinct rather than as a formal process, the manager often misses the emergence of new risks as the project continues. In the best case, the risks that were visible at the beginning of the project will be eliminated; in the normal case, they continue to be worrisome risks throughout the project (e.g., the risk that a key team member will quit). But entirely new risks - things that nobody anticipated - can suddenly emerge, and because the team typically has very little "slack" or "reserve" capacity in terms of schedule, budget, and resources, these new risks can be killers.
My biggest concern is for the project manager who has survived several "normal" projects with an intuitive, ad hoc risk-management approach; that usually won't work in a death-march project. Indeed, it's the existence of an effective, formal software risk-management (SRM) process that makes some organizations willing to "go out on a limb" and consciously take on a death-march project that would otherwise be certain suicide.
There is a substantial body of literature on risk management, and it's beyond the scope of this issue of ADS to cover it all. At the simple end of the spectrum, some death-march project managers have the team identify and monitor the top-10 risks in the project; these can be printed on a one-page form, and their status can be reviewed on a weekly basis.
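Here is a minimal sketch of such a one-page, top-10 risk form; the fields, the example risks, and the probability-times-impact scoring are illustrative assumptions rather than a prescribed format.

```python
# Sketch of a top-10 risk register; fields, entries, and scoring are hypothetical.
from dataclasses import dataclass

@dataclass
class Risk:
    description: str
    probability: float   # 0.0-1.0, the team's current estimate
    impact: int          # 1 (minor annoyance) to 10 (project-killing)
    mitigation: str
    owner: str

risks = [
    Risk("Key developer quits mid-project", 0.3, 9,
         "cross-train a second developer now", "project manager"),
    Risk("New database won't talk to the client-server layer", 0.5, 8,
         "build a throwaway spike this week", "tech lead"),
    Risk("User sign-off on the must-do list keeps slipping", 0.4, 7,
         "escalate to the project owner", "project manager"),
]

def weekly_review(risks: list[Risk], top_n: int = 10) -> None:
    """Print the top risks by exposure (probability x impact) for the weekly review."""
    ranked = sorted(risks, key=lambda r: r.probability * r.impact, reverse=True)
    for r in ranked[:top_n]:
        print(f"{r.probability * r.impact:4.1f}  {r.description} "
              f"[owner: {r.owner}; mitigation: {r.mitigation}]")

weekly_review(risks)
```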
Obviously, other approaches can work just as well; but the key is to ensure that it's one that will be understood, accepted, and followed by everyone on the project team - for it's the peons at the bottom of the hierarchy who are usually the first to see the emergence of new risks. In a death-march project, we don't have time to let the information trickle up to the top of the management hierarchy by whatever antiquated communication mechanisms are used to convey other forms of political information; the risks have to be pounced on and attacked by the team as a whole in order to prevent them from getting out of control.
The word "control" is crucial here, for the project team has to distinguish between risk assessment, risk control, and risk avoidance. In the worst case, the project team reacts to risks as they occur - e.g., by allocating additional resources for additional testing in order to alleviate the consequences of a bug. This kind of a "fix-on-failure" approach, where the risks are addressed after they have surfaced, often leads to a crisis-mode form of "fire fighting" that can result in the utter collapse of the death-march project team. Risk prevention is better, and it means that the team agrees to follow a formal process of assessment and control in order to preclude potential risks from occurring.
An even more proactive form of risk management seeks to eliminate the root causes of failure and risk; this is often the focus of quality-management initiatives within an organization. It tends to expand the scope of risk assessment to a broader horizon, in order to allow anticipation of risks; and it can lead to a very aggressive management culture, which incorporates a "risk-taking" ethic by engineering the degree of risk that the organization can tolerate. I'm all in favor of such an approach, but it's a more strategic issue that ought to be discussed and implemented outside the context of a death-march project. The death-march project team has a very tactical perspective: it's not trying to change the culture of the organization, but merely to survive and finish the project.
Death-March Tools and Technology
Obviously, we don't expect a death-march project team to survive with only one tool. Most teams - even for normal projects - have a wide variety of tools, and quite an assortment of technology, to accomplish their day-to-day work. But sometimes they have too much, and sometimes they have technology that's too new, and sometimes they have tools they don't want foisted upon them by Dilbertesque managers. And, in some cases, they're prevented - for financial, political, or cultural reasons - from getting the one tool they believe is critical for accomplishing their objectives.
I'm not going to advocate esoteric, advanced software tools that somehow communicate telepathically with the programmer in order to generate well-structured code from disorganized thoughts. But I do believe in the concept of a "minimal toolset" for death-march projects. I also want to emphasize the critical relationship between tools and processes, especially since the processes in a death-march project are likely to be different from those used in the rest of the organization. And finally, I want to issue a warning against introducing completely new tools, of any kind, into a death-march project team environment.
The minimal toolset
The concept of triage also applies to tools and technology for the project team: There are some tools the team "must have"; some they "should have"; and a bewildering variety of tools they "could have." And there are some obvious reasons for applying the triage prioritization in a conscious, cold-blooded fashion at the beginning of the project.
The most obvious reason is economics; even if the tools worked and everyone was familiar with them, it would cost too much money to acquire them. And it would take too long to order them: by the time the procurement process in a normal corporate bureaucracy was finished, the project would be finished. In many death-march projects, it's important to focus on a few critical tools, and then try to persuade senior management (or the "tools police") to acquire them.
But suppose the team is operating in a large environment that already has hundreds of different tools that have been acquired over the years. Should they all be used? Obviously not! Even if they all work, the mental effort required to remember how they work - and the additional effort to make them all work together - usually exceeds the incremental benefit obtained. Exactly which tools are critical, and which should be ignored, is a decision the death-march project should be allowed to make on its own - regardless of whether it conforms to organizational standards. I'm staggered by the number of organizations I visit where the death-march project manager tells me sadly that there's an organizational mandate that all projects must be done in COBOL (or, in other organizations, Visual Basic, or Oracle, or whatever ...), even though that technology is utterly inappropriate for his project. Baloney! Throw it out! Use the tools and technology that make sense! To do otherwise is roughly analogous to someone telling the leader of a Mount Everest mountain-climbing team, "Our committee has decided that your project team should take along a detailed map of the New York City subway system, because most projects have found it very helpful."
However, it's essential that the team members agree on common tools within the project; otherwise, chaos will ensue. Obviously, this has to be interpreted with a certain degree of common sense; it probably doesn't matter which word processor the team members use to write their documentation, but it probably is important that they all use the same compiler for their C++ code. One of the problems with a death-march project is that the software developers believe that it creates a license for complete anarchy at the individual level: If they want to use an obscure C++ compiler they downloaded from a university Web site, they believe it's their inalienable right. Not so: it's the team that has the inalienable right, and the project manager must enforce this strictly in any area where incompatible tools could make a significant difference.
This means that unless the team members have worked together on several previous death-march projects, they will have to come up with a "minimal" toolset that everyone agrees to use. Thus, triage emerges again: The "must-have" toolset is also the "must-use" toolset. Once a consensus has emerged on that set of tools, then the team can discuss the "should-have" tools, where the problems are likely to be a combination of consensus-building within the team, and management approval for the purchase of new tools. Beyond that, there may or may not be sufficient time and energy to discuss the merits of the nearly infinite number of "could-have" tools that various team members might be interested in.
I suggested above that the project manager has to be prepared to enforce the consensus; indeed, this could be one of the criteria used by the manager to select potential members of the team. Note that the same could be said about the software processes discussed earlier - and the point goes even further, because tools and processes are intimately related to one another. With all of these caveats in mind, it's impossible to casually enumerate a universally applicable list of tools for a death-march project. But this is the list of tools I would normally look for:
- E-mail, groupware, Internet/Web tools - This is at the top of my list, because electronic-interaction tools not only let the team communicate far more efficiently than memos and faxes, but also facilitate coordination and collaboration. Basic e-mail and access to the Internet are things I would insist upon, though I would be happy to negotiate as to which vendors and products should be used.
- Prototyping/RAD development tools - Almost all death-march projects use some form of prototyping or incremental development approach; consequently, they need tools to support this effort. It's hard to find a popular development environment today that describes itself as anything but a RAD environment, and the majority of such tools have a visual, drag-and-drop user interface to help the programmer get more code developed more quickly. Whether the tool should be based on Delphi, C++, Visual Basic, or Smalltalk (or a dozen other possible choices) is something I can't recommend on any kind of global basis. But remember the comment above: it's not sufficient to have a consensus that we're all going to use a language like C++ or Smalltalk; we have to agree on a common toolset from a common vendor. To have part of the team using ParcPlace-Digitalk's VisualWorks environment while the others use IBM's VisualAge for Smalltalk product may be technologically feasible, but it's still downright stupid.
- Configuration management/version control - Several of my colleagues feel that this should be at the top of the list. Wherever it ranks, there is an obvious benefit to having the configuration management (CM) tools well integrated with the other primary development tools. Thus, Microsoft's SourceSafe may or may not be the best version-control software, but the fact that it's well integrated with Visual Basic is a big argument in its favor. Similarly, many other development tools are integrated with InterSolv's PVCS, IBM's ENVY/Developer, or other comparable CM tools.
- Testing, debugging tools - Many of us would automatically include these with the "basic" development tools that allow us to create code, compile it, and run it. But as we moved from mainframe online applications to GUI-oriented client-server systems, we gradually realized that an entirely new set of testing tools was not only appropriate, but often essential; and tools from vendors like SQA and Mercury Interactive still aren't distributed widely enough in the organizations I visit. Similarly, project teams moving into the world of the Internet and Web-based applications probably need a whole new set of testing and debugging tools.
- Project management (estimating, scheduling, PERT/GANTT, etc.) - There's a tendency to think of these as the "manager's toolkit"; perhaps it's only the project manager who needs to recompute the project's "critical path" on a daily basis (a minimal sketch of that calculation appears after this list). But in this same category I would include estimating tools like ESTIMACS, CHECKPOINT, and SLIM; these are essential tools, in my opinion, because they support the dynamic reevaluation of schedules and deadlines throughout the project.
- Toolkit of reusable components - If the project team is familiar with the concept of software reuse, and if they regard it as a strategic weapon for achieving high levels of productivity, then a toolkit of reusable components needs to be on the list of "must-have" tools. This might be a collection of VBX components for Visual Basic, or the ParcPlace-Digitalk Smalltalk class library, or Microsoft's MFC class library for C++; obviously, it could also include in-house components developed by other project teams within the organization. The choice is usually language-dependent, and it's another area where the same components must be used consistently by everyone within the project team.
- CASE tools for analysis and design - Some project teams regard CASE tools as a "crutch" for novice developers, while others consider them as essential as word processors. My preference is for a CASE tool that's simple, inexpensive, and flexible; beyond that, I won't recommend any particular tool or vendor, because the real answer to the question of which CASE tool to use is, "it depends...."
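To illustrate the critical-path recomputation mentioned in the project-management item above, here is a minimal sketch in C++. The task names, durations, and dependencies are hypothetical, and a real scheduling tool also computes slack, resource loading, and so forth; this shows only the core calculation of the earliest possible completion date for a network of dependent tasks.

```cpp
// critpath.cpp - a minimal sketch of the "critical path" calculation.
// Task names, durations, and dependencies below are hypothetical examples.
#include <algorithm>
#include <iostream>
#include <map>
#include <string>
#include <vector>

struct Task {
    int duration_days;                    // estimated working days
    std::vector<std::string> depends_on;  // tasks that must finish first
};

// Earliest finish of a task = latest earliest-finish among its
// predecessors, plus its own duration (memoized recursion over the DAG).
int earliestFinish(const std::string& name,
                   const std::map<std::string, Task>& plan,
                   std::map<std::string, int>& memo) {
    auto hit = memo.find(name);
    if (hit != memo.end()) return hit->second;
    const Task& t = plan.at(name);
    int start = 0;
    for (const auto& dep : t.depends_on)
        start = std::max(start, earliestFinish(dep, plan, memo));
    return memo[name] = start + t.duration_days;
}

int main() {
    std::map<std::string, Task> plan = {
        {"design",      {10, {}}},
        {"build gui",   {15, {"design"}}},
        {"build db",    {12, {"design"}}},
        {"integrate",   { 8, {"build gui", "build db"}}},
        {"system test", { 6, {"integrate"}}},
    };

    std::map<std::string, int> memo;
    int projectLength = 0;
    for (const auto& entry : plan)
        projectLength = std::max(projectLength,
                                 earliestFinish(entry.first, plan, memo));

    // With the numbers above: 10 + 15 + 8 + 6 = 39 working days.
    std::cout << "Earliest possible completion: day " << projectLength << "\n";
    return 0;
}
```

Rerunning a calculation like this whenever a task estimate changes is exactly the kind of daily reevaluation a death-march project manager needs, whatever tool actually performs it.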
Tools and Process
The issue of CASE tools, mentioned above, is probably the most obvious example of a truism: Tools and processes are inextricably linked together. There's no point using a structured-analysis CASE tool if you've never heard of data flow diagrams (DFDs) or entity-relationship diagrams (ERDs). Such a CASE tool is not only useless, but an incredible burden, if the project team sincerely believes that ERDs and DFDs are meaningless forms of bureaucratic documentation produced solely to get the methodology police off its back.
The situation is not so black-and-white in many cases: The project team might feel that data flow diagrams are useful, but only as an "informal" modeling tool. Thus, a "flexible" CASE tool might be considered a benefit, while a "hard-line" CASE tool would be rejected. Consider the obvious analogy with a word processor: We all appreciate the benefits of the spell-checker, but we wouldn't want to be forced to use it; and it's quite likely that we never use the grammar-checker because it's too slow and clumsy. We would be even more annoyed if the word processor steadfastly refused to allow the word "ain't" within a document. A few more "features" like that would be enough to make us all go back to paper and pencil.
What this means is that the death-march project team must first agree on the processes and methodologies it intends to follow, and it must decide which of those processes will be followed religiously - and which ones will be honored in spirit, but perhaps not to the letter. Once this has been decided, the tools and technology can be chosen - or rejected! - accordingly. By the same token, the project manager may decide to adopt a particular tool in order to enforce a process that everyone agrees on intellectually, but is likely to practice in a sloppy fashion; a good example is version control and configuration management.
One of the biggest myths about software tools in any software project - and a particular danger in a death-march project - is that the tool will be a "silver bullet" that will somehow accomplish miracles. Miracles, of course, are what senior management is looking for; and even the project manager may be tempted by the vendors' advertising claims that programming, testing, or various other activities will be improved by a factor of 10 through their tools.
Aside from the problem that such tools are usually brand-new and that nobody knows how to use them, there's a more fundamental point to consider: the only way such a tool could be a silver bullet is if it allows or forces the developers to change their process. Hardly anyone objects to the prospect of using improved technology that permits the elimination of processes that were considered boring and tedious. But it's more difficult to introduce new technology that requires us to add processes, or modify processes that we were comfortable with. A good example is the process of reuse, and the associated technology of reuse libraries, browsers, and related tools. The project teams that use this technology can effectively raise their level of reuse from roughly 20% to 60% or more; indeed, if the technology is matched with a corporate-wide reuse process, then the level of reuse can reach 80%-90% or more.
This is all very exciting - indeed, worthy of being called a "silver bullet" - but it's utterly irrelevant if the project team (and ultimately the entire organization) is unable or unwilling to change its software processes with regard to reuse. The irony is that most organizations will blame their failures on the technology itself: They'll buy an expensive class library, or they'll replace their old software development methodology with object-oriented techniques on the theory that objects are synonymous with reuse. And when they eventually find they've achieved no measurable increase in reuse, they'll blame the problem on objects, or on the vendor of the class library, or on whatever other technology they've depended on. Meanwhile, the process is exactly the same as it was before, and the culture of the organization is summed up by the phrase: "Only wimps reuse other people's code; real programmers write their own damn code!"
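For readers who wonder what those reuse percentages actually measure, the arithmetic is simple: the fraction of delivered code drawn from libraries or existing components rather than written from scratch. The line counts in the sketch below are hypothetical.

```cpp
// reuse_level.cpp - what a "level of reuse" figure means, in arithmetic.
// The line counts are hypothetical examples.
#include <iostream>

int main() {
    long reused_loc = 42000;  // delivered lines drawn from libraries/components
    long new_loc    = 28000;  // delivered lines written from scratch
    double reuse_level = 100.0 * reused_loc / (reused_loc + new_loc);
    std::cout << "Reuse level: " << reuse_level << "%\n";  // 60% in this example
    return 0;
}
```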
From the perspective of a death-march project, there's a simple moral here: If the introduction of new tools requires the team's "standard" process to be changed dramatically, then it will add significantly to the project risk and probably contribute to the failure of the project. This sometimes gets muddled with the issues of training and learning the mechanics of how to operate the tools. But the more fundamental problem is usually that of changing behavior. It's hard enough to do under normal circumstances, where we feel that we have a lot of time and a supportive environment to slowly become comfortable with the new process. But it's usually a disaster in a death-march project, when we don't have enough time and we don't have a supportive environment.
Conclusion and Summary
Does all this mean that we should use no tools at all? Should we assume that technology can never save us? Obviously, common sense should prevail in all such discussions. In the best of all worlds, the software developers will have had a chance to learn, experiment, and practice with high-powered tools in a less-risky environment. Indeed, in the best case, advanced tools and processes have already been deployed throughout the organization, and are part of the culture and infrastructure of the organization.
And in that case, we wouldn't need any discussion about tools and processes at all; we would simply pick up our tools and go to work on the death-march project. The reason for the cautionary perspective in this issue of ADS is that, in the typical case, the organization is using mediocre tools, or someone believes that a completely new form of technology, just announced breathlessly by a start-up vendor last week, will somehow save the day. The former scenario is depressing, but all too common; the latter is also common, for the simple reason that technology advances quickly and relentlessly in our field.
If new technology could be introduced without any impact on our software processes, and if it didn't require training and practice on the part of the developers, then we would be faced with a simple cost-benefit decision. And since the natural instinct of many higher-level managers is to assume that a problem can be eliminated by simply throwing money at it, I find that there is far more brand-new technology used on death-march projects than on normal projects. The irony is that the new tool can be the straw that breaks the camel's back; and the project failure is then blamed on the tool.
So, use whatever tools make sense for your death-march project, regardless of whether the rest of the world thinks they are advanced or old-fashioned. And remember that if you do adopt new tools, they are going to have an impact on the people and the processes within the death-march project. As Thoreau put it so eloquently 150 years ago:
But lo! men have become the tools of their tools.
-- Henry David Thoreau, Walden, "Economy" (1854).