While Agile is pretty mainstream by now in Web and app development, it is still a major challenge in system design, where software is only one part of the game, albeit a steadily growing one. Whether we're talking about manufacturers of cars, chips, or medical devices, they all need to respond to the increasing pace of the market. Only one or two decades ago, these industries were content with product cycles of three to five years. Today, some chip manufacturers are capable of delivering a new version of their product every second month, to the delight of their customers and the despair of their competitors.
Obviously, Agile in these industries means something different from what it means in pure software organizations. There are two major reasons for this. First, these products require extremely large teams, usually several hundred to several thousand engineers. Although it is always an interesting exercise to challenge the size of these endeavors (do we really need more managers than engineers?), the sheer number of professions needed to build these products exceeds the maximum size of a cross-functional team by an order of magnitude.
Second, the value chain of these products usually involves steps with very high transaction costs. Putting a new chip design into production costs about US $5-$10 million. Retooling a factory for a new car model is one or more orders of magnitude more expensive. It is worthwhile to challenge these steps too, but in many cases you run into very tricky problems that require fundamental research. I expect these costs to decrease dramatically in the next decade as techniques such as rapid prototyping mature to mass-production scale, but that doesn't help solve today's problems.
So what is the difference between a company that plans to deliver a new product after one year and finally delivers after two or three, and a competitor that delivers a new version every two months? I think two closely related ideas differentiate them: thinking from the end and managing flow.
The traditional way to design highly complex products is to collect requirements, bundle them into products, have a team of system engineers break them down into bits and pieces for each component, and then start execution. When a certain milestone is reached, all these components are integrated, and then the product is ready. Or, to be more precise, then all hell breaks loose: months of weekend work and overtime in which all the problems and bugs must be chased down and fixed, rendering every delivery plan void.
If this reminds you of the old waterfall approach in software development, you're probably right. However, if you face a three-month production cycle and a $10 million transaction cost for a single iteration, you will think differently about it than you would if you could exploit a full continuous-deployment environment.
On the other hand, this approach brings specific problems: overengineering, communication problems, integration problems, schedule overruns; in short, all the problems we know from traditional waterfall approaches in software.
Thinking from the end in these environments means setting a stable cadence for the most expensive step in your value chain. Every eight weeks there will be a new version of this particular chip. Every six months the car factory closes down for one week to accommodate the new model. Whatever verification has accepted for release at that moment constitutes the new product. Verification verifies whatever integration and development have prepared for it. And system engineering designs whatever is needed for the next feature, just in time. The focus of management in this model is on ensuring a steady flow of delivery, rather than on executing unrealistic project plans and devoting 50% of their bandwidth to blame transfer.
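To make this concrete, here is a minimal sketch of the release-train rule in Python. The eight-week cadence and the ship-what-is-verified rule come from the description above; the data model and all the names are illustrative assumptions, not a prescription.

```python
from dataclasses import dataclass

CADENCE_WEEKS = 8  # the stable cadence of the most expensive step

@dataclass
class Feature:
    name: str
    verified: bool = False  # has verification accepted it for release?

def cut_release(backlog: list[Feature]) -> list[Feature]:
    """At the cadence boundary, ship exactly what verification has accepted.

    Unfinished features are not chased with overtime; they simply catch
    the next train.
    """
    shipped = [f for f in backlog if f.verified]
    for f in shipped:
        backlog.remove(f)
    return shipped

backlog = [
    Feature("power management", verified=True),
    Feature("new codec"),                      # misses this train
    Feature("fix for erratum 4711", verified=True),
]
release = cut_release(backlog)
print(f"Week {CADENCE_WEEKS}: shipping {[f.name for f in release]}")
print(f"Carried over to the next train: {[f.name for f in backlog]}")
```

The point of the sketch is the rule, not the code: the date is fixed and the scope floats, which is the exact inverse of the traditional plan.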
This approach requires some serious investment. It depends heavily on the organization's ability to reliably test new functionality in a simulated environment. But a serious simulator that costs $20-$30 million to develop has paid for itself once it has prevented two or three buggy chip versions. In addition, I often observe the misconception that "real tests" need to be done at the full-system level and that the simulators have to be as close to the final product as possible. On the contrary, you get the most effective testing if you devote 90% of your test effort to component testing, because it is far easier there to create weird constellations and to fix the bugs you find. The fact that there are some bugs you cannot catch at this level is no excuse for not catching all the others there.
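For the skeptics, the back-of-the-envelope arithmetic, using the midpoints of the cost ranges quoted above. This is a deliberate simplification: it counts only the hard cash and ignores the schedule savings, which are usually worth even more.

```python
# Break-even estimate for the verification simulator, using midpoints
# of the cost ranges quoted in the text.
simulator_cost = 25e6   # midpoint of the $20-30M development cost
respin_cost = 7.5e6     # midpoint of $5-10M per buggy version put into production
break_even = simulator_cost / respin_cost
print(f"The simulator pays off after ~{break_even:.1f} avoided respins")
# prints ~3.3, matching the "two or three buggy chip versions" above
```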
Thinking from the end and managing flow also require the ability to cut the functionality into small pieces that can actually flow. Many engineers object to this approach because you may have to touch some components several times and you need to do far more regression testing. I agree, but it comes down to a preference: would you rather be efficient and not deliver, or less efficient and deliver?
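What "pieces that can flow" means mechanically can be sketched in a few lines: stages pull work only when they have capacity, so oversized or stalled work items immediately show up as a clogged stage. The column names and limits below are assumptions for illustration only.

```python
# Minimal sketch of pull-based flow with explicit work-in-progress (WIP)
# limits per stage of the value chain.
board = {
    "system engineering": {"limit": 3, "items": []},
    "development":        {"limit": 5, "items": []},
    "integration":        {"limit": 2, "items": []},
    "verification":       {"limit": 2, "items": []},
}

def pull(column: str, item: str) -> bool:
    """Pull an item into a column only if its WIP limit has headroom."""
    col = board[column]
    if len(col["items"]) >= col["limit"]:
        return False  # limit hit: the bottleneck is now visible; fix the flow
    col["items"].append(item)
    return True

assert pull("integration", "chip rev A")
assert pull("integration", "chip rev B")
assert not pull("integration", "chip rev C")  # oversized work clogs the stage
```

A piece of functionality that is too big to fit through a WIP limit is, by definition, a piece that cannot flow.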
If these ideas remind you of Kanban, you're on target. The more sophisticated Kanban concepts, such as Service Delivery Kanban, Enterprise Services Planning, and Discovery Kanban, are of tremendous help here. However, the most important step must be taken in the heads of senior management.