Sunday, October 9, 2011

The Failure of Reductionist Thinking in the Creation of Software Intensive Systems

My career was dedicated to helping large organizations build software-intensive systems that would solve their business needs. This was an honorable calling and one that had many moments of intellectual joy. With each level I achieved, I perceived that projects were often doomed to sub-optimal outcomes by decisions made earlier in the process, and I was drawn inexorably toward those higher levels like a moth to a light. As my capabilities allowed me to ascend into the management ranks, I was able to see how these decisions were arrived at and to make my own attempt to avoid the pitfalls I had witnessed in earlier projects. The culmination for me was serving as project manager for several large, complex, high-risk projects, work that I eventually found soul-crushing and that caused an existential crisis from which I am only now recovering. But this experience has left me with a clarity of vision about how management and the engineering staffs they employ come from such different cultures that the fact they achieve anything at all is a testament to the underlying unity of human society.

Projects are funded to solve business problems. The goal-directed nature of these efforts gives them a vitality that is lost if the goals cannot be succinctly expressed or are not shared by the stakeholders. This deep understanding of the goals is the sole province of the leaders of the organization. Many projects fail at this level due to the pronouncement of commandments, or the articulation of goals so laudable but ultimately vacuous as to be useless in guiding the project to a satisfactory conclusion. At this level, at a minimum, a project must be able to articulate how a successful conclusion will be recognized. There is no great harm in allowing untestable or abstract statements at the outset. However, management must recognize that their job is not complete until the project can articulate a set of concrete business objectives that will justify the capital and expense of the project. Management must also accept that this is a contract with the project team and the project stakeholders, one that constrains them to accept this as the stated end position and to change it only with the full cooperation of the project team. After all, things change and knowledge is gained during the articulation of the goals. But too many projects lose focus and support when the vague goals they started with are never reified into something tangible enough to drive decisions, and the most energetic members of management move on to other, more alluring goals, leaving the project with the heavy lifting of ensuring that at least some reasonable goal is achieved with the resources expended.

Traditional waterfall methodologies, inherited from the success of very large, well-disciplined organizations in managing large, complex projects, may never have worked very well; but where they were employed with thought and consistency, by well-trained and stable staffs, they worked better than anything else that had previously been tried. The dominance of this model of project process led most managements to view the creation of software systems as more akin to the assembly of an automobile on a production line than to the creation of a work of art, with its inscrutable processes and unpredictable outcomes. Yet the experience of the past 40 years suggests that the latter may be the closer model of systems creation.

The most recent understandings of project failures involve the consequences of the earliest design decisions for some of the most valuable properties of the system to be created: the emergent qualities of the complete ensemble. This view is supported by the thesis of the SEI, which posits that most, if not all, "non-functional" qualities of a software system derive from the way the system is decomposed and constituted during the design process. It is oft stated that you can't tack (take your pick here: usability, security, performance, etc.) onto the end of an already existing product. It must be a design goal from the beginning because these are cross-cutting concerns. That is to say, qualities are orthogonal to the functional needs. If you think about it, it cannot be otherwise, since we know we can refactor code in any number of ways without making any change in its dynamic behavior; yet one or two inappropriate decompositions can make the achievement of some desirable qualities difficult or impossible. And these earliest design decisions are made long before there is any significant understanding of even the problem domain, let alone the solution domain, for the project. This reality puts architecture-centric techniques in tension with Agile techniques, which are embraced by managements for their fast delivery of tangible results. To balance these two competing forces requires a management that can weigh short-term gain against long-term investment; that can participate in the technological vision of an architect and create a safe space in which difficult concepts can be given enough time to mature, without engaging in a blind trust that time and money will necessarily result in a better product.

Another danger faced by management at these early stages of a project is the lack of maturity, both within management and within the engineering community, in analyzing the problem space. From an engineering perspective, the requirements engineering phase is at its best when it results in a set of models that express views of the problem space in a way that enables the designer to project a new and different version that can be understood by both the business and the development organization(s). Yet no matter how powerful, these models are often the source of a great deal of damage. At this point I always have Magritte's famous "pipe" painting in mind, which proclaims "this is not a pipe." The point is that the painting is a depiction of a pipe, not an actual pipe. In software even more than in art, the error of reification is so easy to make that it is almost impossible to avoid. No model of human or organizational behavior can ever capture the true richness of the behavior it is capable of. But rather than embrace, or even acknowledge, this truth, management will attempt to use a model to constrain behavior in pursuit of uniformity: uniformity that may be driven by a desire to reduce the skill set needed by the worker, to ensure that different organizations create an identical product, or merely by the belief that some never-expressed business goal is being achieved. This reduction of human work to a standardized process has been studied since the dawn of the assembly line and doesn't need to be rehashed here. What is important is how this mentality can cause the project to mistakenly believe that the model of the organization into which a system will be crafted really expresses the behavior of the people and the jobs they do, rather than allowing for the extra-system work that in most cases lets the system adapt and remain resilient to unexpected input. The model is not the process.

The biggest dangers in systems development appear when the team believes it understands the problem in sufficient detail to fully articulate the solution specification. This is the realm of design. Design exists on a continuum, from the selection of a previous design that satisfies the requirements, as if the problem were a burnt-out bulb and the solution its replacement with a functionally equivalent one, to the creation of a novel and innovative solution to a problem never tackled before, such as the creation of a spacecraft to support colonization of Mars. Too often a project is funded as if the problem were a burnt-out light bulb while the stakeholders want to go on a mission to Mars. Project management, with support from senior management, must ensure that everyone on the project is guided to the same place on this continuum and that this scope is normalized to the budget the project is given.

Another pitfall during this phase of the project is the abdication of management decision making. Design is all about tradeoffs. I have yet to meet a business manager who, when asked whether they want it fast or cheap, will not quip "yes." It is not unreasonable for business managers to seek a quality product that is instantly available, does everything they could ever think of, and is free. But we all know that is impossible. Decisions must be made, and if the business will not make them, the designers will. For the business not to engage in this decision making (assuming the designers do not usurp that responsibility) is a clear abdication of their responsibility. While a talented designer may actually make more astute decisions than the business, the business is left poorer for not owning those decisions, and it leaves a flaw in the crystalline connection of forces that links the problem to the solution via a line of empowered managers. What is sought is a product that, once delivered, is known to represent the best decisions possible from this organization, and one that cannot be disowned by those managers.

This brings me to what I believe is perhaps the most intractable danger in systems development: the achievement of emergent properties in the system. The drumbeat for "quality software" has been growing over the years and will surely increase as we attempt ever more complex systems in ever more demanding areas like health and transportation. Yet what we are grappling with is an engineering approach to achieving properties that have more in common with chaos and stochastic processes than with Newtonian mechanics. The Alexandrian pattern languages will enable us to evolve solutions through trial and error, yet to an engineer this is an unsatisfactory place to be. By what principles, by what mathematics, can we envision some emergent property and then work back to the simple rules that will allow this property to be expressed? Kurzweil suggests we will see the singularity before 2050, but even if we do, I don't think it will do us much good. Reaching the tipping point where our manufactured logic machines have comparable statistics to the wetware in our heads does not guarantee that they will become HAL-like. Unless and until we know more about the brain than the number of neurons, dendrites, and axons, or even their wiring pattern, I don't think the hardware achievement will mean much beyond the new economics of computation. There is still a new mathematics waiting to be created out there, one that does not fall prey to reductionist thinking.

Tuesday, September 27, 2011

Are They Really Requirements?

The paradigm we inherit from the waterfall model is that a client states their requirements and these lead unidirectionally to the specification for the system. We already know from the popularity of the Agile methods that waterfall is not the correct model; for many projects it is more productive to approach the system as an artistic product and make many iterations that can be shared with the client. The benefits of this approach have been well documented, but it still leaves intact the idea that there is some set of requirements that just needs to be uncovered and refined. I now believe this is a fatal flaw in this part of even the Agile methods.

Iterative methods gain their power from the inclusion of the client in the design process. In the older waterfall approaches, after the requirements were documented and signed off, the client would often see little progress (except bills and excuses for why the project was behind schedule) until an advanced level of testing allowed the client to see the complete system as it was intended to work. The pain caused when it was realized that what was built was not an acceptable solution to the client's problem was always significant, since so much had already been sunk into the wrong solution. Agile at least boxes the risk into a single iteration, and if the iterations are small enough, the damage can be limited. One good trait of all the Agile methods is that they support faster failure of dysfunctional projects.

Clients have always had misgivings about requirements signoffs. While they could not articulate it, they felt they were being set up for failure. Their business expertise did not prepare them for the task of directly specifying the product they needed, nor is the average business analyst prepared to completely understand the business function that is within the scope of development. Neither is generally prepared for the design tradeoffs that are often made without any direct client involvement, even if the team is articulate enough to explain how a particular design decision impacts the various emergent properties of the system under construction. The current methodologies simply don't work during the requirements elicitation and analysis phases, even in the most Agile methods. Instead they substitute what I like to call the waving-of-the-hands form of requirements, where everyone simply talks about what is needed, very often in "I'll recognize it when I see it" kinds of terms. The design team does its best to interpret these statements, goes away, and comes back with something that can be critiqued. The progressive elaboration of models and prototypes will allow an experienced team to eventually drive to a solution.

So what is wrong with this approach? After all, it does produce a solution. Is it optimal? Who knows? Is there any traceability to the design decisions made? Probably not. Does this sound like engineering? Emphatically no. We can do better.

What this approach lacks is the top-sight and planning that enable an architecture-centric approach. Many solutions do not depend upon an architecture-centric approach to achieve business success. When the solution is self-contained, highly derivative of a prior effort, or lacking exceptional quality requirements, the emergent properties of the system will probably not be difficult to achieve even with little or no "big analysis up front." But such solutions are not the forefront of software engineering today, and the solution to the harder challenges has not yet been found.

Emergent properties of a software system do not derive from one or even a small number of design decisions made while designing the product. Rather, they emerge when many, most, or sometimes all of the elements that comprise the system have their cross-cutting concerns addressed in a manner that allows the ensemble to achieve the collective goal. When the top-level decomposition of responsibilities has been properly done, these emergent properties can be achieved by independent developers working on sub-systems within the larger ensemble. But too often, poor choices made in the first few design decisions can block the achievement of an emergent behavior even when all purely functional properties exist in the finished product. If the emergent properties that do exist fail to satisfy the business need, the product is rejected. But if the tradeoffs provide a product that "satisfices," the product will be accepted, and since knowledge of what might have been is at best speculative (at worst, vindictive), the missed opportunity will never be known.

This can be improved by recognizing the key decision-making role the client plays throughout the design process. The key challenge in trying to include the client, though, can be seen in a near-universal scenario. The client has established a timeline and budget for the project. Then the requirements gathering begins. At some point, an astute engineer will observe that the needed functions and qualities a, b, and c can only be had at levels x, y, and z, and this combination is judged unacceptable. The engineer turns to the client and asks which they are willing to sacrifice. The client says "none, I want them all." What is really being said here is that the client is abrogating their responsibility to be a decision maker in the project. If the engineer is correct that the qualities cannot all be achieved simultaneously (time and budget being two of those qualities), then the decision is foisted upon a team member to make for the client. Just as the client felt cornered when asked to sign off on the requirements document, so too does the designer feel abandoned just when he needs the client the most. This is, after all, a business decision, one that can have real business consequences. What is needed is for the client to engage in a form of negotiation: to explore the possibilities, to ensure that it really is impossible to achieve all goals simultaneously, and then to take responsibility for the ultimate decision with the help of the business analyst, architect, or whoever is in the lead design role.

So you can see that requirements are not completely elaborated at the time the project team ordinarily asks the business to sign off on them. That signoff is an important first step and one that cannot be taken lightly, but neither can it be seen as the final word on business input. For an architecture-centric approach, it must include sufficient elaboration of the qualities that can only be achieved through cross-cutting concerns spanning a large portion of the code base to be developed. For any development, it must at least be accurate even if it is not complete. But the impact of the aspirational goals cannot be known until the design progresses to a lower level of detail; only then can the implications of that set of goals be understood. As soon as possible thereafter, the client must be brought back into the design process to negotiate the tradeoffs that should be made.

This process is much closer to advocacy than to specification. As leverage points appear, someone is there to argue for the tradeoffs most advantageous to the client's goals. For the client representative(s), the job is to collaborate with the chief designer. For the project team, it is to do the most professional job possible in predicting the properties that will be present in the final product extended from this design.

Monday, September 12, 2011

From Tim Bender to me, Sep 10:
In discussing a large software project with an artist, I conjured the analogy that a software product is like a painting. Each engineer is an artist with their own style and they must paint a small portion of a large masterpiece, often without ever being able to see the whole thing. Recently, I watched a movie which made me recall this analogy and I pondered it further as a way of explaining in quite a simple way some of the rather complex interactions that occur in software engineering.

The idea centers around giving groups the common task of creating the simple image of a house with a lawn, a small family, the sun, a bird, and a flower. The image would need to be simple enough to be easily reproduced by an individual, but complex enough to offer varying entry points for concept learning opportunities.

Some of these concepts are weak and need some fleshing out.
1. Requirement solicitation:
Scenario: Give a team a sheet of paper and some coloring pencils. Express to them the importance of this drawing and that it must look exactly like what is being requested. Tell them to draw "a house with a lawn, a small family, the sun, and perhaps a bird and flower or something." Be purposefully vague, leaving them either to work from the initial description alone or to come ask follow-up questions.
Challenge: Requesting help and/or more information.

2. Specifying interfaces:
Scenario: Give a team the exact image they are to produce and a collection of transparencies attached to construction paper (making them non-transparent). Tell the team that each person must draw something on their transparency. Inform that the transparencies will be stacked to create the complete image.
Challenge: Specifying interfaces clearly to minimize integration failures.

3. Organizing for a task:
Scenario: Give a team the exact image they are to produce and a blank canvas with some pens. Tell them that each of them must contribute something and that they will be asked to state their contribution. From there, let them self-organize.
Challenge: Skill auditing. Some members will be excellent at sketching/drawing. Some may be terrible. Those that are not gifted in this area could take on a small portion of the image or provide administrative support. A variation might include purposefully leaving out supplies that a team member must requisition.

I was curious if you would have any thoughts on this.

***********************
Me? Have opinions? Perish the thought!

This is an interesting metaphor but I probably don't see it the same way you do.

On my visit to Vienna I had a curated tour of the Kunsthistorisches (Art History) Museum. In the pre-20th-century tradition of large art studios, it was common for a studio to accept commissions in very much the way you describe, and in a way that does have some interesting parallels to software engineering. Each studio was headed by a master, the creative genius who gave the studio its fame and provided the marketing effort to keep it employed. But given the output of the studio, it was not possible for the master to personally paint every canvas. Raphael ran such a studio, and we spent time talking about how to tell a Raphael painted by his hand from one he may never have touched.

In the studio system the various artists had their individual talents. The less talented might be relegated to painting landscape backgrounds, others buildings, and so on. Only the most talented would paint the main subject, which was always a human being. The ability to achieve life-like form and tone was prized, and only the very best could capture the "essence" of a human subject. For historical paintings with many figures, the master might paint a few of the figures but would allow the studio to fill in the remainder. This form of organization seems very similar to how a software project develops: there is someone with experience who is responsible for the early launch and concept. But what is different is that unless it is an architecture-driven development, there is no chief designer who oversees the design from start to finish. With each phase transition it is the project manager, not an architect, who oversees progress toward the goal, and a project manager is usually focused on the more mundane aspects of the project, such as time and budget, rather than on the less externally visible properties of the system under development.

Art and software share one trait: the effect of the completed product is what is valued, not the quality of the individual pieces, no matter how good. So it is with a software system. A property like security is not solely dependent upon any single component, although that cross-cutting concern will be seen in many of the modules; the lack of attention to even one point in the design can doom the security of the entire system. To one extent or another, this is true of all the quality (non-functional) requirements placed upon the system. Performance is famously lost through a weak link in the processing, as is availability; modifiability through an inappropriate decomposition of the modules; usability through the lack of an undo facility in the transactions. But here lies the fundamental difference between art and software: these qualities can be reduced to an engineering model that provides quantitative, or at least testable, properties. This is not true of art, which is often said to lie in the eye of the beholder; not a good quality in an engineering project.

Ironically, I have seen many projects proceed as if they were art projects. The project management, the client, the business analysts, and the coders all proceed with the functional requirements only and wing it with the crap that is often offered as "non-functional requirements." This forces the team to decide for the client which qualities to pursue and what relative priorities to give them. The client then does not perceive the lack of some quality until they see an early prototype of the product, or worse, the finished product. Then, and only then, does performance/availability/fault tolerance/usability/security get the attention it deserved, often with disastrous consequences for the design concepts that had guided the many hands that crafted the code.
The number 2 scenario sounds exactly like the way cel animation is done. I think the key difference here is that the transparencies are opaque and therefore block what is behind them. This makes the "interface" irrelevant, since there is no need for coordination between Bambi and the forest trees behind her; as long as there is no interleaving between two transparencies, there is no need for the careful coordination that interfaces suggest. Software, of course, is very different, since almost any decomposition of a function into smaller modules will create some form of interface that must be spec'd.
For any product that will be created by many hands, I cannot believe that self-organization, as it is often understood, can work. Most people take self-organization to be some form of egalitarian "let's all get along." I do not believe this is what the Agile manifesto suggests. Rather, I believe it suggests that experienced and skilled professionals are more effective at organizing themselves for the task than is an explicit plan from some "manager" who is not as experienced with the technology. But all the same roles will be filled. Group dynamics ensures that a leader will be selected and that the group will go through the steps of forming, storming, norming, and performing. Self-organization will also do nothing to prevent the group dysfunctions that afflict traditional forms of management.

Going back to your example, if a group of artists is given a commission and left to self-organize, the leader who emerges is not guaranteed to be the best artist. The same is often true in engineering circles. Arrogance, self-confidence, and domineering personalities are the most likely to produce group leaders. When the leader is also the most talented individual and has the leadership skills to bring the team along, the results can be striking. But it can just as easily lead to disaster or a mediocre product.

Monday, August 15, 2011

Availability versus Fault Tolerance

From Quora...

What is the difference between a highly fault tolerant and a highly available system?


Edmond Lau, Quora Engineer
While availability and fault tolerance are sometimes conflated to mean the same concept, the two terms actually refer to different requirements. Designing for high availability is a stricter requirement than designing for high fault tolerance.

Availability is a measure of a system's uptime -- the percentage of time that a system is actually operational and providing its intended service. Service companies, when offering service level agreements (SLAs) to their customers, usually quantify their availability in nines of availability. Carrier-grade telecommunication networks claim "five nines" of availability [1, 2], meaning that the network should be up 99.999% of time and experience no more than 5.26 minutes of downtime per year. Amazon's S3 covers three nines of availability (99.9% uptime) in its SLA [3] and offers a service credit if it is down for more than 43.2 minutes per month.

Fault tolerance refers to a system's ability to continue operating, perhaps gracefully degrading in performance when components of the system fail. RAID 1, for example, by mirroring data across multiple disks, provides fault tolerance from disk failures [4]. Running a hot MySQL slave that can be promoted to a master if the master fails, or eliminating Hadoop's NameNode as a single point of failure [5] are other examples of making a system more fault tolerant.

Making individual components more reliable and more fault tolerant are steps toward making an overall system more highly available; however, a system can be fault tolerant and not be highly available. An analytics system based on Cassandra, for example, where there are no single points of failure might be considered fault tolerant, but if application-level data migrations, software upgrades, or configuration changes take an hour or more of downtime to complete, then the system is not highly available.

--------
[1] http://en.wikipedia.org/wiki/Car...
[2] http://www.windriver.com/announc...
[3] http://aws.amazon.com/s3-sla/
[4] http://en.wikipedia.org/wiki/RAID
[5] http://www.cloudera.com/blog/201...
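The nines quoted in the answer translate directly into downtime budgets. As a quick sketch of the arithmetic (the helper function is mine, not from the answer):

```python
def downtime_budget_minutes(availability: float, period_seconds: float) -> float:
    """Maximum allowed downtime, in minutes, for a given availability
    fraction over a given period (e.g. a year or a billing month)."""
    return period_seconds * (1.0 - availability) / 60.0

YEAR = 365.25 * 24 * 3600   # seconds in an average year
MONTH = 30 * 24 * 3600      # seconds in a 30-day billing month

# "Five nines" allows roughly 5.26 minutes of downtime per year.
print(round(downtime_budget_minutes(0.99999, YEAR), 2))  # → 5.26
# Three nines allows 43.2 minutes per 30-day month, the S3 SLA figure above.
print(round(downtime_budget_minutes(0.999, MONTH), 1))   # → 43.2
```

Note that the budget says nothing about how the downtime is distributed: a fault-tolerant system can still blow its availability budget in a single long maintenance window, which is exactly the Cassandra example's point.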

Friday, August 12, 2011

What is a Software Product?

I have just completed another seminar at SEI and I want to return to the subject of the prior post. However I am going to recast it into an answer to the question "What is a software product?"

I think most people are likely to take the question to refer to a consumer product like Word or Windows 7. In this case, the product primarily consists of an executable module but no source code. In addition, to make it fit for its intended use, there are several other assumed artifacts associated with or embedded in the product. One is either an installer application or instructions on how the product is to be placed into the intended environment. There is also an implied deliverable that instructs the end-user how to perform, using the product, the tasks for which it is intended. These days that is more often implied than explicit, as many products are designed to be self-evident, meaning that the intended end-user has sufficient experience with similar user interfaces that use can be learned with some exploration and no explicit training. Of course this is not always the case, and products may either embed the user manual into the product or provide it as a separate artifact.

There is another sense of software product, though. Consider the automobile factory that turns out cars: the cars are products to be sold to consumers, but the factory itself is an asset that can be sold to another automaker and retooled for its use. In a similar way, source code is the major component of an asset that allows a software vendor to create software products. This "software factory" is a tangible asset and is recognized as such through intellectual property law and common sense. (To avoid some awkward language, from here on when I refer to a software product I am using this second sense of the term, and will ignore products that only include executables.) But is the source code the entire asset? If we were to sell our intellectual property to another, what can reasonably be considered part of the asset?

In the rush to market, many software vendors are willing to sacrifice the quality of the allied artifacts of the complete system. While this helps achieve the end goal of creating a working system, it creates technical debt in the form of deferred maintenance on those artifacts, with a measurable increase in the total cost of ownership. If the product will never be modified, the complete lack of supporting documentation is reasonable; even if some was produced during the initial construction of the product, there is no reason to keep it, since it will never be read. But how many software products are created that are never intended to be modified? The correction of latent defects, changes in the environment, and new requirements are just a few of the many reasons why the maintenance phase of a software product's life cycle is typically the costliest. To ignore the needs of this phase of the life cycle is foolish and self-defeating. Any technical debt incurred during the initial development will eventually need to be paid if the consequences of that debt are to be avoided.

The Platonic ideal that is sought is some form of self-documenting code. While local comments, when done well, or even well-written code without comments, can often be understood without additional commentary, large software systems cannot be self-documenting since many issues transcend individual modules. A software engineer may be able to reproduce the information needed from the source code alone, or at least find a way to insert the modification without it, but this is most likely a more expensive fix than it would have been had the recreation work not been needed. The lack of adequate support for the code base increases maintenance cost. Further, until a code base can automatically explain design choices that transcend individual modules, this kind of documentation will continue to exist outside of the source code itself.
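As a sketch of this limit (the module names and the invariant are invented for illustration, not from any real system), each of these two fragments is locally readable, yet the design decision that ties them together appears in neither:

```python
# order_ids.py -- locally self-documenting: the code reads clearly on its own.
def next_order_id(last_id: int) -> int:
    """Return the next order id in sequence."""
    return last_id + 1


# archiver.py -- also locally clear, but it silently relies on a design
# decision made elsewhere: ids are dense and monotonically increasing,
# so a range scan is safe. That cross-module invariant is visible in
# neither file; only external design documentation can record it.
def archive_range(first_id: int, last_id: int) -> list:
    """Archive every order id in the inclusive range."""
    return list(range(first_id, last_id + 1))


print(archive_range(3, 5))
```

A reader of `archiver.py` alone has no way to know the range scan is only correct because of how `order_ids.py` allocates ids; that is the kind of knowledge that lives outside the source.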

This highlights a weakness in the current practice of software engineering management. The true value of this artifact is discounted by management. Since the increased support cost is never quantified, there is no self-correction of management attitudes, and the ratio of maintenance cost to total life-cycle cost will remain the same. Worse, there are two compounding effects. First, the best information about the design of the system often exists only in the minds of the developers. When they leave the project, it is usually lost for good. Even if they remain in the same organization, the fidelity of the information decays over time. Second, without an adequate understanding of the design principles used in constructing the product, a maintainer is likely, out of ignorance, to undermine design clarity that required restraint on the part of the original coder. It is as if the architect leaves no plans for a building behind and the maintenance engineer tries to knock out a wall only to discover an unexpected support post or beam. Even worse would be to not recognize the structural nature of the element during the remodeling and remove it, thus weakening the structure and increasing the risk of structural failure.

As Agile methods gain wider use in industry, some inexperienced developers are likely to believe that producing good design documentation is just a bother. A lack of professional development, the desire to just get on with their career, an ignorance of how to best document the design, or some combination of all three likely contributes to the poor quality of this product artifact. Often professionals do not even consider the design artifact to be part of the product itself but view it as some form of construction artifact that serves no purpose after delivery. Certainly many management teams do not appreciate, nor know how to evaluate, the quality of these allied artifacts.

The lack of attention given to the maintenance phase infrastructure and staff needs is a significant blind spot. It has always been a function that has suffered from lower status than the initial construction. As the life cycles of these software products have gotten longer, at least some far-sighted management teams will eventually realize that long-term profits can be improved through a reduced maintenance phase in their software assets. Once that begins to happen, the types of design documents that most efficiently support maintenance needs will become the object of study. Until then, these support staffs will continue to be expected to do their jobs with less than ideal knowledge transfer and the need to continually read the minds of developers who are no longer around to ask.

Tuesday, August 2, 2011

Program Documentation and Software Architecture

A friend wrote me about the frustration of performing software maintenance in an environment largely devoid of good program documentation. He finds that he must spend a great deal of time just trying to understand how the various classes relate to each other before he can begin to focus on finding the source of a bug or trying to decide how to add a piece of functionality. I feel for him. I have been there and I suspect most programmers at some point in their careers have as well. It is an interesting way to peel back some truths about the software development life cycle.

The first truth is that modern languages do a poor job of capturing the larger abstractions of their design in a way that is self-maintaining. It is possible to create a code base that has good supporting documentation of the design underlying the code. However this documentation is separate from the code. Without proper management discipline, the supporting documentation will deviate from the as-built system, if it ever matched in the first place. At the higher levels of abstraction in the design, this is exactly what is described as the system architecture.

What a maintenance programmer wants in the software system is the quality of maintainability. This is greatly enhanced by the presence of good supporting design documentation. One of the key attributes desired in this documentation is the traceability of a requirement to the specification to the code which implements that quality. When the only artifact available is the source code, this is rarely possible since the relationship is never one-to-one and usually not even many-to-one. Instead it is a many-to-many relationship between requirements and code. The consequence is unintended side effects, which have cost many programmers sleepless nights as they searched for ways to undo the unintended consequences of a fix or enhancement. With current, accurate and complete documentation, the maintenance programmer has a far greater chance of quickly understanding how the code implements the function (or non-functional quality) and making informed decisions about how best to make the change or fix the bug consistent with the original design and without introducing a new bug.
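A sketch of why the many-to-many relationship bites, with invented requirement and module names: once the mapping is recorded, a maintainer can see every requirement a change to one module may disturb.

```python
# Hypothetical traceability matrix: requirement -> modules implementing it.
# All names here are illustrative, not from any real system.
trace = {
    "REQ-001 fast lookup":   ["cache.py", "index.py"],
    "REQ-002 audit logging": ["index.py", "logger.py"],
    "REQ-003 encryption":    ["logger.py", "transport.py"],
}


def requirements_touching(module: str) -> list:
    """List every requirement that a change to `module` may affect."""
    return sorted(req for req, mods in trace.items() if module in mods)


# A fix in index.py can ripple into two distinct requirements:
print(requirements_touching("index.py"))
```

Without some record like this, the maintainer discovers the second requirement only when it breaks in production.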

The sad fact is that the majority of shops do not have design documentation, even inaccurate documentation, that reflects their production systems. There are a few reasons for this but in the end it is always indicative of poor management. First, there must be management discipline to enforce the delivery of design documentation with a system. Second, the professional staff must have the skills and discipline to create usable documentation. Third, the tradition of looking upon maintenance programmers as less than developers is short-sighted when viewed from the perspective of the entire system life-cycle.

If the development staff and maintenance staffs are part of the same organization, management has an excellent opportunity to evolve a site standard for the design documentation that will be most helpful to the maintenance organization. When they are within the same larger organization this is a straightforward management task. Peer reviews by the maintenance staff of the documentation to be turned over and the empowerment of the maintenance organization to resist turnovers that are incomplete or inaccurate will allow the maintenance organization to provide the kind of efficient and quality service that is expected of them.

If the development organization is not within the same organization, many other problems occur. Often development is out-sourced to a professional organization. While they will offer a high-value service, they will be constrained by the terms of their contract. Sadly, the needs of the maintenance and operational organization are often not given proper thought if they are even considered at all. Yet the statistics show that many systems have extended production and maintenance periods and that the money spent in those periods far exceeds what is spent in the initial development. Providing artifacts that reduce the cost and increase the efficiency of these processes is enlightened self-interest. Assuming the maintenance organization has a standard way that they document their system design, the development organization should be contracted to provide an acceptable product at delivery.

Since the beginning of system development, the emphasis has always been on the finished, working product with little regard for the artifacts associated with it. Few shops even have a filing system in place to keep the products of the system development in a way that allows their review. More often than not, those products are boxed and remain in the project manager's office until he leaves the organization and are then discarded with no review.

This emphasis on the end-product alone extends to management decisions regarding what is important when a project inevitably is pressed to deliver and the schedule has slipped. Rarely will the staff be driven to deliver documentation contemporary with the product. This separation of the product and the documentation subverts the review process (if it exists) and diminishes the quality of that product. Often errors in the documentation are only caught much later when the maintenance staff must use it to perform their work.

Since the design artifacts are never directly delivered, they often depart from the as-built system. Without professional and management commitment to create and preserve high-quality design documentation, it will not happen. This is due as much to a lack of professionalism within the management of the maintenance staff as anywhere else. A seasoned maintenance staff will push for and receive good documentation that supports its job function.

Saturday, July 30, 2011

Overspecification

One of the key concepts in software engineering is the need to avoid overspecification. It is natural that at some point in the decomposition of the specification of a system to be created, the specifier resorts to a description of how to do it instead of a statement of what must be done. When you drill down into this, the problem is the difference between denotational and axiomatic semantics. In the first, the problem to be decomposed is given in what is hopefully the most abstract algorithmic way possible by the specifier. The problem is that this assumes there is only one algorithm possible for the solution, and it locks the implementer into that algorithm.

The alternative is the axiomatic semantics of stating the pre- and post-conditions as well as any invariants of the required solution. This at once gives the implementer the full choice of algorithms and implementations within the solution set. But at the same time it gives the implementer no direction as to how the solution can be achieved. Traditionally in commercial work the axiomatic method of specification has not been used, largely because of the difficulty of making these statements about the required implementation. It is seen in formal methods, but the difficulty of applying formal methods is well known.
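A minimal sketch of the axiomatic style, with the conditions written as executable checks (a simplification of how a formal method would state them): the specification says only what must hold afterward, and the algorithm remains the implementer's choice.

```python
from collections import Counter


def sort_ascending(xs: list) -> list:
    """Axiomatic-style specification, stated as checks.

    Precondition:  xs is a list of comparable elements (possibly empty).
    Postcondition: result is a permutation of xs in non-decreasing order.
    """
    result = sorted(xs)  # the implementer's free choice of algorithm

    # Postconditions -- the axiomatic part of the specification:
    assert all(result[i] <= result[i + 1] for i in range(len(result) - 1))
    assert Counter(result) == Counter(xs)  # same multiset as the input
    return result


print(sort_ascending([3, 1, 2]))
```

A denotational-style specification would instead have prescribed, say, a particular merge procedure, locking the implementer into that one algorithm.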


Friday, July 29, 2011

A worked case study

For yucks, I am going to work through a trivial design alternative in two directions to see how this progresses. The problem to be solved is a simple lookup where the user enters a term and the system responds with a definition or some other static text associated with the term. There are two alternative architectures to be explored. The first is a standard PC application where everything would be loaded onto a personal computer. The second is a web app using a client-server pattern. I am interested in seeing how the progressive elaboration of the client-server pattern evolves as I drill down to the lower levels of design. I will be taking this much further than a typical architecture design since, after specifying the client-server pattern, an architecture design would probably only go one level further down unless there were a quality sought that could not be assured at that level.

So the first cut is to recognize the client-server pattern. There is a client module, a server module and a relationship which is the internet. The next cut will be to provide specification to each of those pieces.
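A minimal sketch of that first cut, with invented class names: the lookup contract is the same in both architectures; what the client-server pattern adds is a second module and a relationship between them, reduced here to a direct call so the decomposition can be seen without any networking code.

```python
class ServerModule:
    """Server side: owns the definitions and answers lookup requests."""

    def __init__(self, definitions: dict):
        self.definitions = definitions

    def lookup(self, term: str):
        return self.definitions.get(term)  # None if the term is unknown


class ClientModule:
    """Client side: gathers the term and forwards it across the
    relationship (the internet). In a real elaboration this call
    would become an HTTP request; here it is a direct invocation."""

    def __init__(self, server: ServerModule):
        self.server = server

    def lookup(self, term: str):
        return self.server.lookup(term)


server = ServerModule({"coupling": "the degree of interdependence between modules"})
client = ClientModule(server)
print(client.lookup("coupling"))
```

The PC-app alternative collapses the two classes into one process; the next cut would specify the interface between them precisely enough that the direct call could be replaced by a network protocol.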

Thursday, July 28, 2011

flows

[Bass2003] calls out that views for data flows and flows of control are projections of the views already discussed. I think this is an important point to make in lecture. Many informal architectural diagrams are data flow or control flow diagrams.

a linguistic approach to software architecture design

Discussions of software architecture depend upon the meaning of the words and symbols used by the people in the discussion. Like a human language, the tokens can be densely packed patterns or styles or they can be low-level primitives to show how a design can be refined. It seems to me that the same cognitive mechanisms are at work in both.

When I talk about a client-server system I conjure up an abstract concept just as if I had said chair. We recognize both by a set of characteristics that define the concept. Yet to be helpful both must be modified to assist with design. I may speak of a dining room chair or lounge chair. I may talk about a web server and client or one on a proprietary network that uses a protocol other than HTTP/HTML. The concept lends itself to restriction and extension. In language we usually do this with adjectives. As a class concept, this would be done with sub-classes. I saw some references to work like this that I must check out.

This is also highly consistent with the Lakoff/Johnson way of looking at these patterns as metaphors. We invoke the metaphor and then call out the ways in which they are extended into concrete forms or in which we define a new form by substituting one thing for another.
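The restriction-by-subclassing idea can be sketched directly, with invented class names: the abstract concept is the base class, and the adjectives become specializations, just as "dining room" restricts "chair".

```python
class Server:
    """The bare concept: recognized by a set of defining characteristics."""
    protocol = "unspecified"


class WebServer(Server):
    """'Web' restricts the concept: a server speaking HTTP to a browser."""
    protocol = "HTTP"


class ProprietaryServer(Server):
    """A server on a proprietary network using its own protocol."""
    protocol = "custom binary"


# Both specializations are still recognizably the base concept:
print(issubclass(WebServer, Server), WebServer.protocol)
```

Extension works the same way: adding characteristics to a subclass is the linguistic act of compounding modifiers onto the noun.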

Problem Space and Design Space

In the design literature there are discussions of the problem space and the design space. I assume that a grad student at CSUS in software engineering or computer science will be new to these terms. Since the concepts bring forward several important aspects of the process of design they are worth discussion.

The problem space is that area of problem statement and analysis. After all, if this is to be a solution there must have been a problem (or opportunity) that justified its creation. At least for me, I like to imagine this problem space as a line with inflections that represent the various needs, constraints and desires of the problem to be solved. During analysis, the line will not be distinct, but rather fuzzy and broad with a great deal of imprecision regarding what is really needed. Ideally this line (or surface, if you want to think of it as two-dimensional) is completely drawn to a specific level of detail before design begins, but that is often not so.

What the designer attempts to do is create the ideal product, one whose capabilities are complementary to the needs of the problem space. The fitness for purpose will be reflected in the gaps between the problem space and the solution space.

Architecture Tools

The design of a viable software architecture for a system depends upon its fitness for use. Unless and until we have a model which can perform a quantitative evaluation of a given architecture against its stated use, the evaluation of a proposed software architecture must depend upon human processes that are not algorithmic and are inherently non-deterministic. What I want to talk about is the schism between those tools used in the design and evaluation of a software architecture which have some hope of machine implementation and those which are far less likely, or impossible, to ever fully capture in any program.

Software engineering academics have been exploring software metrics for several decades and have come up with various measures that seem to show promise of building a predictive theory for software. Measures of cohesion and coupling, as well as cyclomatic complexity, are some examples. While these were originally envisioned for code-level analysis, they have been used at higher levels of abstraction with some success. This is a well-recognized area of research and one which will continue to develop. It must be within the scope of the education of a well-educated software architect.
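Cyclomatic complexity, for instance, is mechanical enough to compute from source. Here is a simplified sketch using Python's `ast` module: one plus the number of decision points. (Real tools refine the counting rules; this version counts only branches, loops, boolean operators and exception handlers.)

```python
import ast


def cyclomatic_complexity(source: str) -> int:
    """Rough cyclomatic complexity: 1 plus the number of decision points."""
    tree = ast.parse(source)
    decisions = 0
    for node in ast.walk(tree):
        if isinstance(node, (ast.If, ast.For, ast.While, ast.ExceptHandler)):
            decisions += 1
        elif isinstance(node, ast.BoolOp):
            # each extra operand of an `and`/`or` adds a decision
            decisions += len(node.values) - 1
    return decisions + 1


snippet = """
def classify(x):
    if x < 0:
        return "negative"
    elif x == 0:
        return "zero"
    return "positive"
"""

print(cyclomatic_complexity(snippet))  # 3: the straight path plus two branches
```

This is exactly the kind of measure that can be automated, in contrast to the evaluative judgments discussed next.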

But in contrast to these quantitative metrics, there is a great deal of attention paid to the methodologies that are to be used in the system's development life cycle as they relate to the creation of a software architecture. While there is a good case to be made that the study of these methodologies is more properly in the scope of management information science, I believe it is a mistake to completely separate the methodologies from the more quantitative aspects of software engineering. A multi-disciplinary approach is required since there must not be a bright line between software architecture and the project roles through which a software architect may rise. The development of a talented software architect must balance the "harder" engineering knowledge with the "softer" management knowledge to achieve the synergism that will result in the most capable worker.

Most software engineering education focuses on the lower-level aspects of design, specifically at the level of the object or module. This is necessary since without an understanding of the very basic concepts of data structures, control structures, formal syntax and object-oriented methods, a true understanding of a software-intensive system is not possible. An understanding of the inherent limits of the tools of software engineering is needed just as strength of materials is needed for a civil engineer or basic chemistry is needed for a chemical engineer. These are the hard stops that can be encountered, and the engineer must understand them if successful designs are to be created.

However as the size and complexity of software systems grow, the layers of abstraction must also grow if the resulting system is to remain comprehensible. The literature is rife with case studies of systems that were created by individuals over an extended period of time whose design is really known only by the creator and exists in no communicable artifact anywhere. With luck, some underling is prepared to fill the role of the designer if that person leaves the organization. This illustrates several different ways in which software architecture begins to separate from the lower levels of design.

First, note that if you envision a system small enough to be the result of a single person, that suggests the person single-handedly designed and implemented the system. There was no separation of roles between the designer and the builder. It should be self-evident that this organizational structure places limits on the size and complexity of the system to be built and the time frames in which it can be built. Some exceptional people have built sophisticated systems in short periods of time, but for the purpose of an engineering discipline we ignore these outliers since such feats tend to be difficult to repeat with any dependability. This is outside the realm of what we aspire to in the day-to-day world of engineering. We all hope that we will rise to these ranks, but it should never be the expectation that a sole person create something so extraordinary.

Once the problem becomes so large that it is not reasonable for a single individual to create the system that is required, it is natural that separation of concerns and focus on skill sets begin to create distinct roles and that these roles are brought together into a team environment. Given the complexity of modern programming languages and the tools needed to use them, the role of the coder was established long ago. This often provides entry-level positions for the software engineer, since the specification for a module can be very highly structured so as to leave relatively little latitude for the coder, yet allowing him to learn the business architecture of his client and the existing architecture of the system under construction (or maintenance).

Defining the role of the coder immediately creates a new role; that of the person who writes the specification for the module to be coded. The practice once was that this responsibility fell to a business analyst. It became their responsibility to gather requirements and specify the modules that needed to be created.

Since there would often be many coders, and often different levels of coders depending upon their capabilities, the project team would be sufficiently large that it required the talent of a team lead who would report to the client on management matters such as schedule and budget.

In many ways this naive team structure hasn't existed in exactly this form for a generation or more. It has been found wanting in the same way that the waterfall methodologies created in the formative stages of software engineering were found inadequate: helpful intellectual models of a supposed ideal rather than descriptions of what was actually done. But before we explore the ways in which this was left behind, let's continue by looking at the legacy that these early methodologies left.

Much systems thought probably comes from the work in the military-industrial complex. The large, mission-critical projects taken on by these organizations inherited from the military a culture of how to take a very large effort and deconstruct it into a set of smaller tasks with the needed oversight and control. Whatever else is said about the waterfall model, it has been the reference for how a large work team should be organized. I'll assume for now that you understand that model and plow on.

As befits its origin, systems development methodologies reflected a mechanistic attitude toward the creation of a software system with all the inputs, processes and outputs neatly laid out in a graph reflecting the predecessor tasks and artifacts of each process and specified their output. If the inputs and processes are correct, the outputs will be sufficient for the successor tasks. The entire process would flow as smoothly as a well oiled machine.

The most basic assumption of a strictly enforced waterfall project plan is that everything that is needed to make important decisions is known at the conclusion of the requirements phase of the project. Details may need to be worked out but the ability to see the structure to be built is sufficient to allow for prediction of the cost and effort.

This assumption has been more wrong than right in practice. Successful projects seem to be the product of shrewd negotiators with enough experience to argue for sufficient resources in the absence of hard data to support them, to secure that funding from business managers, and to manage the project by limiting the scope to the money and time available, not to some theoretical document that adequately articulates all of the requirements for the product to be built.

This cynical discussion is tangential to the main point I will make but important to provide the context in which software engineering takes place in the real world. To ignore the real world and embrace some model of perfectly logical business managers is about as realistic as an architect designing a building in which the wind will never exceed 10 mph; it is fantasy or art, not engineering. In engineering you substitute reality for desire and accept the limits, whether they come from logic or from the results of social science.

The second big fallacy in the well-developed waterfall methodologies of the past is that they assume the future is like the past, that the system to be created is sufficiently like other systems that all the tasks and artifacts can be predicted. Since many of the failures are attributable to the requirements gathering phase, those errors are expensive to fix, if they can be fixed within the time and budget constraints at all. A sufficiently experienced team may know the needs of the client better than the client. In those cases, the project can be steered toward success even when the requirements gathering phase has technically been incomplete, inconsistent or incorrect. In many captive development shops, this has been the state for many years. The success may be attributed to the methodology but in reality the success is because of the staff, not the tool.

This suggests another reason why the waterfall methodology is flawed. Business managers who must make hire and fire decisions must provide the workers to staff the project. Yet their ability to assess the capabilities of the untested workers is limited. Hence, just because someone fills a particular role on a project team is no guarantee they will do their job well. Some of this can be addressed by a good quality assurance program but often the organizations with many new workers, business managers with little experience staffing projects and a project with tight time and money constraints are the same organizations that have poor quality assurance programs. Again, the historical roots in the military-industrial complex are not carried over into a commercial environment since the imperative of the mission critical project means something wholly different in a military context than it does in most commerce.

There have been two very different reactions to the failure of these waterfall methodologies. One reaction is to impose a quality assurance program. By its nature, a quality assurance program requires artifacts against which verification and validation can be performed. For very large projects, these artifacts are complex and expensive to produce. Yet without them no QA can be performed. The somewhat logical response of management to project failures was to improve the quality processes creating greater emphasis on the artifacts or adding to the artifacts that must be created in an attempt to detect project problems earlier and mount corrective action. It must be clear that this can become a self-reinforcing feedback loop. After a few cycles the systems development life-cycle becomes a bureaucratic morass of paperwork to be filled out and documents to be created which go to committees for review and approval before work can officially progress.

The frustration software engineers felt when caught in this kind of environment led to the Agile Manifesto which was a cri de coeur from the software engineers that saw the folly of this progression. Here is one version:

  • Individuals and interactions over processes and tools
  • Working software over comprehensive documentation
  • Customer collaboration over contract negotiation
  • Responding to change over following a plan

Here is how I interpret this declaration and what it opposes. The opposition to processes (i.e. methodologies) is explicit. A methodology is something that exists to promote the proper interactions of the individuals on the team. But it had become a straitjacket, forcing developers to ignore their instincts and shutting down debate rather than supporting it. This was particularly attractive to the Gen X crowd and suited the changing realities of software engineering. The culture of top-down management and unidirectional communications no longer made sense in such a complex development environment. People needed to work more cooperatively, apply greater thought, and depend less on "tools" (whether programming languages, which can lead to academic discussions among well-educated software engineers that are beside the point; rubrics, which were probably best adapted to the technology of a prior generation at best; or project management systems that sought to measure and categorize every hour expended on the development). There was an intuitive understanding that people needed to talk to each other and to develop a sense of shared commitment for success.

The second point of the manifesto was a reaction to the exceedingly long lag times between project initiation and the delivery of a working product. Even in the best of circumstances, the business reality can fundamentally change in that period of time. Consumers, commerce and government wanted to be more nimble and to be able to respond to changes more quickly. While human processes can be changed relatively quickly, automated processes were proving very difficult to change. Once created, software was effectively unmodifiable. Even while the project was ongoing, responding to a change request was often contentious and difficult to assimilate without impact to the budget and schedule.

What the Agile Manifesto implicitly required was a form of iterative development where the delivery of working software was accelerated even when the product was not necessarily something that completely solved the problem. One of the stated reasons is that since requirements gathering and documentation were so fallible, why depend on them at all? Why not create a prototype that would demonstrate what the developers believed was needed from conversations with the client and then demonstrate it? Clients respond more favorably, and more constructively, to a working prototype than they do to an abstract document that they are never sure they completely understand. The cycle of development can become much shorter, ideally measured in weeks, and the ultimate product developed by continually iterating, getting closer to the final product with each iteration.

The call for customer collaboration was a reaction to the inherent animosity that the unbridled waterfall methodology created between the client and the development organization. The model envisioned that the customer could collaborate on the creation of a requirements document that would act as a contract for development. Either explicitly or implicitly the client was asked to sign off on the requirements document. Inevitably developers would attempt to design the system against this document. Misunderstandings or errors of incompleteness, inaccuracy or inconsistency would eventually be found and the impact to the budget and schedule would lead to acrimonious discussions between the development organization and the client regarding the interpretation of the requirements document.

Agile wanted to sidestep this unhelpful dynamic by stressing the continuing role of the client throughout the development process. If the client could not, or would not, commit the proper resources to answer questions as they arose, instead of depending upon a requirements document that was never complete enough, that failure became a major indicator that the project was already in trouble long before code was created. A sense of shared commitment and responsibility has always been needed for successful projects. The Manifesto reminded everyone of it.

Systems developed under the waterfall methodologies were often brittle and unmodifiable. Attempts to create systems that were modifiable often led to very complex designs as the developers attempted to make as much as possible easier to change. Inevitably their attempts failed as the cost of this complex design was difficult to deliver at an acceptable price and there always seemed to be one more inflection point in the design that had not been handled.

What the Agile Manifesto stressed was the inevitability of change and the need of everyone involved in the development effort to remain flexible, expect that change will happen and respond to it with a client-centered acceptance instead of a reactionary and defensive posture.

At this point the clash between at least what the SEI espouses for a development methodology and the Agile Manifesto is brought into the classroom. Students now are well indoctrinated into the Agile Manifesto and the need for iterative design. However the creation of an architecture for a large software product requires a fair amount of Big Analysis Up Front (BAUF) in order to make the kinds of decisions that are needed for the first few decompositions. How can this be resolved?

So methodologies are an important part of the toolbox for large-system creation. It is unlikely that there are right and wrong methodologies in any absolute sense; rather, there are drivers of the specific methodology that should be adopted by a specific project within a specific organization for the creation of a specific product. The choice must be driven by the risk factors of the effort, the organization and relationships between the developing organization and the client, and the novelty of the product to be created.

Besides these human-process tools, there are more engineering-oriented tools that assist with the technical aspects of design. They include tools for architecture reconstruction, architecture presentation and documentation, and architecture design.

Procedural Tools for Architectural Analysis
ATAM and CBAM [Bass2003]; SAAM, an earlier version of ATAM [Kazman94]; the quantified design space [Jum90][Hou91]


Tools for architecture reconstruction/recovery
Dali [Bass2003]
Sneed's reengineering workbench [Sneed98]
The software renovation factories of Verhoef and associates [Band97]
The rearchitecting tool suite by Philips Research [Krikhaar99]
Rigi Standard Form [Mueller93][Wong94]
[Bowman99] outlines a method similar to Dali for extracting architectural documentation from the code of an implemented system
Harris and associates outline a framework for architecture reconstruction using a combined bottom-up and top-down approach [Harris95]
[Guo99] outlines the semi-automatic architecture recovery method called ARM for systems that are designed and developed using patterns

Tools for Architectural Design


Architecture Language Tools
Module interconnection languages (MIL)
Interface definition languages (IDL)

Tuesday, July 26, 2011

How do we capture qualities in use?

The method promoted by SEI for capturing qualities in use is what they call a quality scenario. Before you can talk about the quality in use, you must first describe the use. This is traditionally done with a use case scenario. In this UML diagramming technique, the system is represented as a single box with the user interacting with it. The user interacts with the software system within the context of some task that includes a real-world interaction, such as the need to look up a phone number so a phone call can be made. The user searches for the phone number by entering criteria such as name and location information, which may be sufficient for the machine to respond with the phone number. This highly abstract description of the interaction serves as the basis for further analysis as the functionality is broken down. What happens if there is more than one phone number? What if none is found? Last name first? How is location entered? And so on. More importantly, many aspects of the interaction are left unstated, allowing each reader to make their own determination about those unstated requirements. What is the response time? Do you enter the query and then walk away, knowing it may take a day or more for the answer to come back? (This is not unreasonable if the query must be handled by a human at the other end, as in some countries where good records are not kept or where security may limit who can know the phone number requested.) Or, as is more common, is the assumption that the response will be instantaneous? Even this is subject to different interpretations: if you are doing call center work with a client on the line who is already irate at some perceived fault, even a two-second response may be considered far from instantaneous. The business analyst must consider whether response time is something that warrants further investigation and documentation.
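A quality scenario of this kind can be made concrete as a small data structure. The sketch below is only illustrative: the field names follow the six-part scenario form associated with SEI's quality attribute work, and the specific values for the phone-lookup example are invented here, not taken from any real project.

```python
from dataclasses import dataclass

@dataclass
class QualityScenario:
    """Six-part quality attribute scenario (SEI-style formulation)."""
    source: str            # who or what generates the stimulus
    stimulus: str          # the event arriving at the system
    artifact: str          # the part of the system stimulated
    environment: str       # the conditions under which it occurs
    response: str          # what the system should do
    response_measure: str  # how success is quantified (testable)

# Hypothetical instance for the phone-lookup use case scenario.
lookup_latency = QualityScenario(
    source="call-center agent",
    stimulus="submits name and location to find a phone number",
    artifact="directory lookup service",
    environment="normal operation, client on the line",
    response="matching phone number is returned",
    response_measure="99% of responses within 2 s; none over 3 s",
)
```

The point of the structure is the last field: a scenario without a response measure leaves the quality open to each reader's interpretation, exactly the problem described above.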

Let's say that in the case of the phone lookup, what is needed is not a human lookup but an automated box connected to a predictive dialer on a call center system. In this case, the business model requires some predictability in the response to achieve ideal overall performance. In fact, the predictability may be more valuable than the speed: better results may be possible if the response is always 2 sec than if it is 0.2 sec for most lookups but 2 sec for some. That detail may make a significant difference in the decisions the designer will make, so the business analyst must note it in the documentation of that use case scenario.
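The distinction between fast and predictable can be seen by comparing the spread, rather than the average, of the response times. A minimal sketch, with two made-up distributions mirroring the numbers above:

```python
import statistics

# Mostly fast but occasionally slow, versus uniformly slower but steady.
fast_but_jittery = [0.2] * 90 + [2.0] * 10
steady = [2.0] * 100

# The mean favors the jittery service (0.38 s versus 2.0 s)...
print(statistics.mean(fast_but_jittery))
print(statistics.mean(steady))

# ...but a predictive dialer schedules against the spread, and there
# the steady service wins outright.
print(statistics.pstdev(fast_but_jittery))
print(statistics.pstdev(steady))  # 0.0: perfectly predictable
```

Which of the two a designer should optimize for is exactly the detail the use case scenario must record.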

Note that in this last example the specific quality sought, predictable response time, was explicitly stated. Even more important, a measure for that quality was eventually stated. It may have been stated in very simple terms, such as "no response will be greater than 2 seconds," or in a more complex way, such as "99% of responses will be 2 sec +/- 0.003 sec and fewer than 1 in 100,000 will be greater than 3 seconds." In either case, the measure clearly suggests how a tester can construct a Boolean test to ensure that the quality is present. In all cases, the stimulus and response of the quality measure match those of the use case scenario on which it is based.
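Such a measure translates directly into an automated check. A sketch of the Boolean test a tester might write, using the thresholds from the more complex wording above (the sample of measured response times is invented for illustration):

```python
def meets_response_measure(times_sec):
    """True iff 99% of responses are 2 s +/- 0.003 s and fewer
    than 1 in 100,000 exceed 3 seconds."""
    n = len(times_sec)
    within_band = sum(1 for t in times_sec if abs(t - 2.0) <= 0.003)
    over_3s = sum(1 for t in times_sec if t > 3.0)
    return within_band / n >= 0.99 and over_3s / n < 1 / 100_000

# Illustrative measurements: almost all near 2 s, one slower outlier.
sample = [2.0] * 990 + [2.002] * 9 + [2.5]
print(meets_response_measure(sample))  # True
```

Note that the check is only a direct transcription of the stated measure; the hard analytical work is getting the stakeholders to state the measure in the first place.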

Earlier, we mentioned that the taxonomy of qualities in use is more academic than pragmatic. The reason is that in the human process of requirements elicitation it can easily lead to unproductive discussions of the taxonomy and the semantics of the words used to name these qualities. In the end, what is needed is a quality scenario that specifies the quality measure, whatever the quality is called. To avoid these unproductive segues, it is advisable not to become too concerned with resolving the semantic discrepancies that will come up in discussions of software qualities, and instead to drive toward the quantitative specification of the quality given some use case scenario.

Those qualities of the system that are important to the owner but not observable to the user lend themselves to the same treatment. The only difference is that many of these use case scenarios are rarely documented, since the testing effort and the extended product life cycle processes are rarely included in the project's scope. Therefore, these specific use cases must be named and identified before the quality attribute can be specified.

The cost and other global attributes of the product do not lend themselves to this treatment. Since they depend on the collection of all the decisions together, they must be handled differently. Later, in architecture analysis, we will see at least one technique that includes cost as a factor in architectural decision making.