Wednesday, December 7, 2011

Advanced Software Estimating with Near Zero Variance

Near-zero variance is defined as the ability to achieve schedule and budget predictability with an absolutely minimal variance between project estimates and actuals. It is achieved through an organic software estimating approach that is contextually based. Organic software estimating is a component of the Organic Software Engineering Approach, which consists of new software engineering perceptions, paradigms, behaviors, and practices. The source of the estimating problem is the software incarnation of Frederick Taylor's scientific management. We call it the SDLC; its goal is efficiency, and that is achieved through budget and schedule attainment. As a result, estimators rely on empirical cost models that use empirical data from "best in class" projects. It is the software equivalent of Taylor's time and motion studies. The only problem is that you cannot perform time and motion studies on the human brain.

Any proposed solution to the software problem must deal with estimating. The ability to meet schedules and budgets is a core problem that must be tackled. Achieving both system and project predictability is contingent upon the accuracy of the estimating system. Current estimating models grossly underestimate resource requirements, so most projects are doomed from the start. The carnage caused by poor estimating practices is devastating. In economic terms, it can be calculated in the billions and even trillions of dollars in wasted software costs and lost economic opportunity. Socially, it can be calculated in terms of voluntary and involuntary terminations, demotions, and careers put on hold. The same mechanistic paradigm that gave the world sweatshops gives software developers death marches.

Estimating has always been problematic. The idea that some cost model based on empirical data could accurately estimate all projects, past, present, and future, discounts the uniqueness of software development organizations and their projects. In order to improve estimating accuracy, estimators have been researching and creating various quantitative software development models to describe the software development process. However, the models are too generic and lack the granularity needed to estimate specific software projects. The default parameters cannot accurately represent the wide variability of software project contexts. All of this is understandable: without organizational, technical, and project specific data, the accuracy of a cost model is limited. These organizational, technical, and project specific variables include but are not limited to:

1) Software organizations use different organizational factors, management policies, technologies, and management styles to create development cultures with different behaviors, attitudes, norms, and values. The culture determines attitudes towards organizational objectives, the amount of enthusiasm and energy applied to tasks, how work is performed or not performed, how internal groups support or hamper project teams, and ultimately how things get done or do not get done.

2) There are significant productivity differences among software engineering approaches like the OSEA and the SDLC, their methodologies, and their artifacts. This includes the waterfall, incremental, evolutionary, RAD, Agile, and other life cycle models. The life cycle approach represents the difference in the project's course of development and hence the project's phases, events, activities, work products, and tasks. It significantly impacts effort, schedules, and costs.

3) Project categories, source, and type have a major estimating impact. Whether a team is developing software, purchasing a package, replicating a system, or converting a system has a significant impact on productivity. So does the project source: internal development, projects developed under contract, projects produced for marketing, and many others. Project technical types include embedded systems, time-critical real-time, scientific or engineering, systems programming, distributed and network, data processing and database, expert and artificial intelligence, image and pattern recognition, and large-scale simulation and modeling systems.

4) Language groups and generations, including 5GL, 4GL, 3GL, and assembler languages, have a major impact on productivity.

5) System size and complexity have a major impact on productivity. As the size of the software product increases linearly, project resource requirements for development, communications, integration, coordination, and documentation increase exponentially.

6) The alignment between the development system actually used and the development system the project needs is important. The greater the contradictions or redundancies between the development system and project reality in terms of needed technology, processes, and artifacts, the lower the productivity.

7) Staff caliber, including skills, knowledge, experience, and motivation, varies widely and affects productivity. Development team context includes development team experience, project support, project management experience, management support, technical support, and project team quality.

8) User context includes user knowledge, experience, attitudes, support, and organizational turbulence. User context is a key productivity element. User context and the complexity of the problem will help determine the stability of the project and the subsequent volatility that causes rework.

9) Problem context includes problem size, problem complexity, precedent, problem domain maturity, and partial functionality. In the problem context, the major cost driver is precedent. Precedent refers to domains that are new to the organization, including technology, computer science, and application domains. The uncertainty that comes with a lack of precedent is a major reason for estimating errors. In many situations it takes twice the effort to accomplish unfamiliar tasks as it takes to accomplish familiar tasks.

10) Business context includes purpose instability, business importance, and business integration. Purpose instability leads to product instability including changes in scope and requirements. High business importance requires extensive and often unproductive business management scrutiny. High integration requires extensive coordination, communication, collaboration, and cooperation.

11) Platforms have the potential for major differences in productivity. Platforms and their languages have different characteristics that affect productivity. Mainframe, PC, server, Internet/Web, embedded real-time, robotics, mobile, network, palmtop, and game systems all have the potential for significant variations in productivity.

12) The quality philosophy has a significant impact on productivity. In the SDLC, the primary quality issue is defects: does the tool work or not? Organic software estimating is customer oriented, so quality includes purpose achievement, customer value, satisfaction, business productivity, and performance. The OSEA performance measures include functionality, system support, human factors, security, software quality factors, system operational performance, and defects.
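The diseconomy of scale in point 5 can be sketched numerically. This is a minimal illustration that assumes a power-law effort model, effort = a * size^b with b > 1, borrowed from common parametric cost models rather than from this article; the coefficients and sizes are invented.

```python
# Illustrative sketch only: a power-law effort model with b > 1,
# an assumption borrowed from parametric cost models in general.

def estimated_effort(size_kloc, a=3.0, b=1.2):
    """Effort in person-months for a product of `size_kloc` KLOC."""
    return a * size_kloc ** b

small = estimated_effort(50)    # effort for a 50 KLOC product
large = estimated_effort(100)   # double the size...
ratio = large / small           # ...more than double the effort when b > 1
```

Doubling the size multiplies effort by 2^1.2, roughly 2.3x, which is the "linear size, exponential resources" effect the point describes.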

Information Architecture

A key concept of organic software estimating is that organizations share common characteristics with living systems, including plants and animals. People make up organizations, so thinking of an organization as a living system instead of a machine enables managers to harness the natural, organic tendencies of organizations. For example, like any other living system, organizations send outputs to their environment and receive energy inputs from the environment. High-order living systems also need training, feedback from experience, and a memory in order to survive, grow, and proliferate. Because organizations are living systems, it is intuitive that in order to survive, grow, and flourish, they need feedback-control mechanisms and a corporate memory in the form of databases.

The information/decision architecture contains the project management assets, including the databases, tools, methodologies, processes, and methods for estimating, planning, scheduling, monitoring, and managing software projects. History databases provide the data needed to create the profiles, relationships, and models. Estimates are more accurate because the estimator does not have to make the assumptions inherent in cost model development. Estimators can align both the technical and project variables so that the software organization can statistically listen to its own organizational data, identify what was said, and apply the knowledge towards better management of the software development project. Databases allow a manager to find similar sets or clusters of projects that most closely match the new project and use statistical analysis routines to identify the projects with the highest degree of similarity and relevance to the new project. The results are a set of role model projects that become the basis of the quantitative models that guide the project team towards project success.
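The database search for role model projects might be sketched as a simple attribute-matching routine. This is a hypothetical illustration: the attribute names, the sample projects, and the match-count metric are all invented, and a real system of the kind described would apply richer statistical routines.

```python
# Hypothetical sketch: rank history projects by how many contextual
# attributes they share with the new project. Attributes and projects
# below are invented for illustration.

def similarity(project, candidate):
    """Count attributes with identical values in both profiles."""
    shared = set(project) & set(candidate)
    return sum(1 for k in shared if project[k] == candidate[k])

def attrs(p):
    """Strip the identifier so only attributes are compared."""
    return {k: v for k, v in p.items() if k != "id"}

new_project = {"platform": "web", "language": "4GL", "type": "database"}
history = [
    {"id": "P1", "platform": "web", "language": "4GL", "type": "database"},
    {"id": "P2", "platform": "mainframe", "language": "3GL", "type": "database"},
]

# The best-matching history projects become candidate role models.
ranked = sorted(history, key=lambda p: similarity(new_project, attrs(p)),
                reverse=True)
```

Here `ranked[0]` is the closest role-model project; in practice the top few matches would feed the statistical models rather than a single winner.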

The databases contain history data from previous projects and are used to create environmental profiles, relationships, and models based upon variable projects or clusters. Variable projects are selected by the estimator at estimating time, and clusters involve a semi-fixed set of statistically similar projects in the project database. An environmental profile consists of characteristic and distribution profiles that typify a normal project in a cluster. Characteristic values identify standards in terms of process, performance, product, and quality. Distribution profiles describe standards in the form of the allocation of effort, duration, staff, and costs. Relationships are equations derived from statistical analysis of previous similar history projects in the cluster. Relationships predict the values of unknown factors based upon known factors. For example, from software size, an independent variable, we can create equations to estimate effort, duration, and staff counts. Models are developed by averaging the behavior of similar sets of history projects in the local database. Models describe the typical behavior of a project over time.
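As an illustration of deriving a relationship from a cluster, the sketch below fits a least-squares line predicting effort from size over an invented set of similar history projects. The data points and units are assumptions for demonstration only, not data from the article.

```python
# Sketch: derive a size-to-effort relationship from a cluster of
# similar history projects via ordinary least squares.
# The history data below is invented for illustration.

sizes   = [10.0, 20.0, 40.0, 80.0]    # KLOC of history projects
efforts = [25.0, 48.0, 98.0, 205.0]   # person-months actually spent

n = len(sizes)
mean_x = sum(sizes) / n
mean_y = sum(efforts) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, efforts))
         / sum((x - mean_x) ** 2 for x in sizes))
intercept = mean_y - slope * mean_x

# Apply the locally derived relationship to a new 30 KLOC project.
predicted = intercept + slope * 30.0
```

The point of the approach is that `slope` and `intercept` come from the organization's own history, not from someone else's "best in class" data.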

Software Functional Elements

Software functional elements are the basic estimating elements used in organic software estimating. They are also used in organic software requirements engineering and organic software development. Software functional elements consist of systems, subsystems, software functions, capabilities, and features. They are hierarchic systems that consist of relational components that are themselves hierarchic systems. Each system, subsystem, software function, capability, and feature consists of eight objects: purpose, environment, boundary, inputs, components, outputs, restrictions, and feedback control. Features, as the prime software estimating element, can either complement or replace function points and other related size measures. Features can probably be categorized by the number of object elements, such as the number of inputs, process services, and outputs.
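One way to picture the hierarchy and its eight objects is as a recursive data structure. The sketch below is an assumed design, not part of the article; it models only a few of the eight objects explicitly and uses a feature count as a crude size surrogate.

```python
# Hypothetical sketch of the functional-element hierarchy. Field names
# follow the article's eight objects; the class design is an assumption.
from dataclasses import dataclass, field

@dataclass
class FunctionalElement:
    name: str
    level: str                     # "system", "subsystem", "function",
                                   # "capability", or "feature"
    inputs: list = field(default_factory=list)
    outputs: list = field(default_factory=list)
    components: list = field(default_factory=list)  # child elements
    # purpose, environment, boundary, restrictions, and feedback control
    # would round out the eight objects; omitted here for brevity.

def feature_count(element):
    """Count features beneath an element, a crude size surrogate."""
    if element.level == "feature":
        return 1
    return sum(feature_count(c) for c in element.components)

login = FunctionalElement("login", "feature")
report = FunctionalElement("monthly report", "feature")
billing = FunctionalElement("billing", "capability",
                            components=[login, report])
```

With this shape, `feature_count(billing)` rolls the two features up to the capability level, mirroring the idea that features are the prime estimating element.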

The software functional element is the basis for organic software development because of its user orientation. Customers and users don't know anything about software requirements, but they do understand the capabilities and features that they need to do their work. In addition, each software functional element has emergent quality needs that often cannot be decomposed to a lower-level software functional element. Finally, software functional elements are self-organized systems that are designed to support a specific external function, such as a business function or system function.

Organic design is based upon the concept that software functional elements are more like biological functions than machine functions, although machine functions do exist in living organisms; muscles and bones are good examples. Generally, biological systems contain vertical, multi-level relationships in a system hierarchy. Think in terms of the cardiovascular, endocrine, digestive, lymphatic, and respiratory systems. Biological systems are self-organized systems where each system supports a body function. For example, the respiratory system is the system of organs in the body responsible for the intake of oxygen and the expiration of carbon dioxide. In mammals, it consists of the lungs, bronchi, bronchioles, trachea, diaphragm, and nerve supply. Organic design structures projects and systems like biological organisms. Just as the respiratory system supports the intake of oxygen and the expiration of carbon dioxide, software functional elements support specific business elements, including processes, sub-processes, functions, activities, and tasks.

Software functional elements can be estimated top-down, where the requirements are determined for the systems, subsystems, software functions, capabilities, and features. Software functional elements can also be estimated bottom-up, as the lower-level software functional elements are used to create higher-level elements. The lowest-level software functional element, and the core estimating element, is the feature, which provides the services needed by the business community. The new quality engineering approach is based upon the organic service model. The objective is the creation of adaptable, flexible, customer-specific, and self-organized systems that are designed to provide the services that meet the specific needs of the business. Services need a high level of interface with their customers, so software projects must be designed for a high degree of user interaction. Software services tend to grow, so there is a need for design plasticity and flexibility. Instead of rigid software products, developers should build self-organized systems that are designed for self-maintenance, self-renewal, and self-evolution.
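The bottom-up path can be sketched as a simple rollup: estimate each feature from a locally derived rate, then sum to the subsystem level. The productivity rate and the feature sizes below are invented assumptions for illustration.

```python
# Minimal bottom-up sketch: per-feature estimates from an assumed
# history-derived rate, rolled up to the subsystem. All numbers are
# illustrative, not from the article.

hours_per_feature_point = 14.0   # assumed local productivity rate

features = {                     # feature -> size in "feature points"
    "enter order": 3,
    "price order": 5,
    "print invoice": 2,
}

feature_effort = {name: pts * hours_per_feature_point
                  for name, pts in features.items()}
subsystem_effort = sum(feature_effort.values())
```

The same rollup would repeat at each level, features into capabilities, capabilities into functions, and so on up the functional-element hierarchy.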

Project Specific Modeling

Project specific modeling applies a business analytics (BA) approach to estimating. Wikipedia defines BA as follows: "Business analytics (BA) refers to the skills, technologies, applications and practices for continuous iterative exploration and investigation of past business performance to gain insight and drive business planning. Business analytics focuses on developing new insights and understanding of business performance based on data and statistical methods. In contrast, business intelligence traditionally focuses on using a consistent set of metrics to both measure past performance and guide business planning, which is also based on data and statistical methods."

Project specific modeling uses a human approach to estimating through the comparison of dissimilarities and similarities. Project specific modeling provides profiles, relationships, and models that are based upon the needs of a particular project team in a particular organization, staffed with a particular set of people who have particular skills, knowledge, and character attributes. The project team uses particular technology, artifacts, and processes to develop a particular system for particular users who execute particular business processes in a particular business environment. Similarity is based upon the concept that in the same organization, similar types of projects will evolve in similar ways. Similarity is a key concept in the development of models, profiles, and relationships. The greater the similarity, the more similar the experiences and the more comparable the evolution of the project. The more similar the experiences and comparable the evolution of the project, the more consistent the productivity and other relationships. The more consistent the productivity and other relationships, the more accurate the derived quantitative models. The more accurate the quantitative models, the greater the predictability of future project outcomes. The greater the predictability of future project outcomes, the greater the likelihood of project success.

Similarity is a common concept. Appraisers use similarity in assessing the market value of homes. Consider the data of recent home sales as local data, your neighborhood as the development organization, the size of the house as the size of the software product, and the features as special software cost drivers. The appraiser looks at recent sales of homes with similar square footage and numbers of bedrooms and baths in the neighborhood. Then special features and other amenities are added to the primary estimate. The estimated market value is validated by comparable homes sold in the particular zip code or neighborhood. This is in contrast to empirical estimating methods, which would estimate the price of a home on the basis of country-wide or worldwide sales.

In project specific modeling, the estimator selects similar reference projects by searching the database for subsystems that share the same attributes as the new subsystem. An estimator will select multiple sets of projects with similar independent variables from the history database and use the set of data points with the highest correlation coefficient to estimate the size and productivity of the new project. The selected similarity sets are passed to a statistical model that displays the regression equation, R-squared value, and correlation coefficient for each similarity group. The group of similar projects with the highest R-squared has the strongest statistical relationship. The derived relationships are used to estimate other size factors, productivity, duration, staff counts, dollars, and schedules. The results are a set of role model projects that become the basis of the quantitative models that guide the project team towards project success.
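Selecting the similarity group with the strongest statistical relationship might look like the following sketch, which computes R-squared for two invented candidate groups and keeps the best. The group names and all data points are assumptions for illustration.

```python
# Sketch: pick the similarity set whose size/effort relationship fits
# best. Candidate groups and their data are invented for illustration.

def r_squared(xs, ys):
    """R-squared of a simple linear fit of ys on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    syy = sum((y - my) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)

groups = {                                  # sizes (KLOC) -> efforts
    "same platform + language": ([10, 20, 30], [21, 39, 61]),  # tight fit
    "same platform only":       ([10, 20, 30], [15, 70, 40]),  # noisy
}

best = max(groups, key=lambda g: r_squared(*groups[g]))
```

The tighter group wins, so its regression equation, not the noisier one's, would drive the estimates of effort, duration, and staff.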

Then the project team must make adjustments. For example, to estimate a high-risk project, an estimator develops an attribute profile of the new project and searches through the database for similar projects or a specific cluster. In this case, the estimator looks for technical attributes, contextual attributes, and project specific attributes. There are no exact matches, but the more similar the characteristics the better. For this example, the project team consisted of junior developers and was contextually characterized as low development team experience. Let's say that Insight, the advanced project management prototype that is the basis of this article, identified these reference projects as the most statistically significant. So the estimating equation would be based upon the reference history projects where teams had very low development experience.

The context is called "Low Development Team Experience". If the project manager can replace the juniors with more senior staff, then this particular human resource variable is controllable. Otherwise, if he is stuck with the juniors, then the variable is uncontrollable and mitigation is required. Mitigation is the process by which systems adjust the relational elements in order to compensate for a risk or deficiency in another element. In this case, the project's four subsystems (software engineering, project management, human resource, and project design) are adjusted. For example, the software engineering subsystem options call for plans that are more detailed, with an increased sampling rate of inspections, code readings, and walkthroughs. The project management subsystem options call for a longer development cycle and an increase in schedule and duration. The human resource subsystem options call for the availability of technical leads with good technical mentoring skills to help the juniors. The project design subsystem calls for smaller teams so that the juniors can get the attention that they need from the technical leads.
