Master Thesis in Software Engineering
Thesis no: MSE-2001-XX
August 2001
Size Estimations in Software Projects
Henrik Löwendahl-Nyrén
Department of Software Engineering and Computer Science
Blekinge Institute of Technology
Box 520
S - 372 25 Ronneby
Sweden
This thesis is submitted to the Department of Software Engineering and Computer Science at Blekinge Institute of Technology in partial fulfilment of the requirements for the degree of Master of Science in Software Engineering.
Henrik Löwendahl-Nyrén
Villa Viola
372 36 Ronneby
E-mail: pt97hlo@student.bth.se
Homepage: http://www.student.bth.se/~pt97hlo
Cellular phone: +46 70 45 45 762
Professor Claes Wohlin
E-mail: claes.wohlin@bth.se
Office phone: +46 457 38 58 20
Department of Software Engineering and Computer Science
Blekinge Institute of Technology
Box 520
S - 372 25 Ronneby
Sweden
Homepage: http://www.ipd.bth.se
Phone: +46 457 38 50 00
Fax: +46 457 38 50 57
Size estimation in software projects is an interesting topic as
well as blablabla...
I would like to thank my supervisor and the different groups that have helped me and provided me with information.
Contact information
Author
University Advisor
Abstract
Acknowledgments
Table of Contents
Introduction
Background
Hypothesis
Purpose and goals
Limitations
State of the art
The process of estimating
Effort estimations
Size estimations
State of practice
Summary
Conclusion
Future work
References
Terms and abbreviations
When planning a software development project, it is essential to be able to make accurate cost and effort estimations. ...
Estimating the activities in a software project is one of the hardest parts of planning the project.
In the course Software Project Management (dpt403) we performed 18 interviews with project managers in industry. To the question "What are the most common causes for deviations?", 13 of 18 answered "underestimation" or "time-optimism". On the basis of these answers I plan to collect more detailed information about the estimation process in some of the companies already interviewed.
Within the area of estimation there are many models to help with the work (e.g. COCOMO), but they are usually meant for estimating cost or effort and therefore take size as an input parameter. The result will not be good unless the size estimate is good. COCOMO uses lines of code (LOC) as input to estimate effort and cost, but alternatives are Function Points and other functional size metrics. There are two ways of using a functional size metric to estimate effort and cost. The first is to estimate directly from the metric itself; the second is to use the functional size to estimate the number of LOC and then use that to estimate effort and cost.
My work will focus on how the size of the different parts of a project is estimated.
Hypothesis here...(?)
Purpose and goals here...
Limitations here...
In this section I give an overview of what the literature says about current research in software effort and size estimation. First comes a short description of the overall process of estimating the size and effort of a software product, followed by more specific parts on effort estimation and size estimation respectively.
Before you execute a software project you (in most cases) have to know how long it will take and how much it will cost. The most common way to find out is probably to collect the requirements from the customers (internal or external), make some kind of work breakdown structure and then make a "guesstimation" of how large the project will be. This is done in order to set a price and a deadline for the project.
According to C. R. Symons in [Symons 1991], there are three main components that together give the size of a software project. The first is the size of the required functionality to be delivered. As Symons advocates Function Points, he defines this size as some kind of product of transactions, inputs, outputs or, as Symons writes, "whatever". The second component is the size of the technical requirements, such as high performance, ease of use or ease of maintenance. The third component is the performance drivers. These include the choice of project management approach, the risks in the project, the skills and experience of the project members and the programming language to be used.
To estimate the effort and "size" of the entire development task, all three components are used, but to estimate the system size, only the first two are considered.
Several efforts have been made to improve this process, and one of the best known is Barry Boehm's COCOMO [Boehm 1981]. Boehm has since updated the model to COCOMO II [Boehm et. al. 1998].
Effort estimations are made to get an idea of how long the project will last and how many resources are needed to complete it in time. There are several ways of estimating the effort. One way is to make a "guesstimation", that is, a qualified personal estimate. Another is to use existing models to help calculate it. COCOMO, an algorithmic model, is probably the best known.
According to Boehm [Boehm 1981] there are seven different methods to estimate the effort of a software project.
Algorithmic Models use one or more algorithms to calculate the effort needed to develop the software product, with the major cost drivers as variables. COCOMO is an example of this.
Expert Judgment is when you ask one or more experts to give a qualified "guesstimation" of the effort needed. The Delphi Technique is an example of this type of estimation method.
Analogy can be used if you have plenty of data collected from earlier projects. The idea is to relate the actual cost of a completed project to an estimate of the cost of a similar new project.
Parkinson's principle says that work expands to fill the available resources, so the effort is estimated to be whatever resources are available.
Price to Win is when you estimate to the price you believe will win the contract, or to the deadline you believe is needed to be first on the market.
Top-Down is when you estimate the effort of the entire system and then split it between the different modules or components.
Bottom-Up is when you estimate the effort of every module or component individually and then sum these up to get the total effort.
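The difference between the Top-Down and Bottom-Up methods can be sketched in a few lines of Python. This is only an illustration: the module names, the effort figures and the relative shares are invented and do not come from [Boehm 1981].

```python
def bottom_up(module_efforts):
    """Sum individually estimated module efforts into a system total."""
    return sum(module_efforts.values())


def top_down(total_effort, shares):
    """Split a system-level estimate between modules by relative share."""
    weight_sum = sum(shares.values())
    return {name: total_effort * share / weight_sum
            for name, share in shares.items()}


# Hypothetical figures in man months, for illustration only.
module_estimates = {"ui": 4.0, "database": 6.5, "reports": 2.5}
print("Bottom-up total:", bottom_up(module_estimates))

# Hypothetical system-level estimate, split 1:2:1 between the modules.
allocation = top_down(12.0, {"ui": 1, "database": 2, "reports": 1})
print("Top-down allocation:", allocation)
```

The two methods can give different totals for the same system, which is one reason to compare them against each other.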
As COCOMO is one of the best known effort estimation models, I give a brief overview of how it works here. COCOMO is divided into three levels of detail.
The first is the basic COCOMO model, which estimates the number of man months (MM) it will take to develop the most common type of software product from its size in thousands of delivered source instructions (KDSI):
MM = 2.4(KDSI)^1.05.
A man month is 152 hours, and one delivered source instruction is basically defined as a line of source code, including declarations but excluding comments.
There are also equations for estimating productivity, schedule and average staffing.
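As a minimal sketch, the basic equation can be evaluated like this. The coefficient 2.4, the exponent 1.05 and the 152 hours per man month are taken from the text; the function name and the example size of 32 KDSI are my own assumptions for illustration.

```python
HOURS_PER_MAN_MONTH = 152  # as stated in the text


def basic_cocomo_effort(kdsi, coefficient=2.4, exponent=1.05):
    """Return the estimated effort in man months for a size given in KDSI."""
    if kdsi <= 0:
        raise ValueError("size must be positive")
    return coefficient * kdsi ** exponent


size_kdsi = 32  # hypothetical product size, for illustration only
effort_mm = basic_cocomo_effort(size_kdsi)
print(f"{size_kdsi} KDSI -> {effort_mm:.1f} MM "
      f"({effort_mm * HOURS_PER_MAN_MONTH:.0f} hours)")
```

Because the exponent is larger than 1, doubling the size estimate more than doubles the effort estimate, so an error in the size estimate carries straight through to the effort estimate.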
The intermediate COCOMO model adds 15 cost driver attributes to make the estimation more accurate than in the basic COCOMO model. The 15 factors are divided into four categories: software product attributes, computer attributes, personnel attributes, and project attributes. Each factor is rated on a scale from "Very Low" to "Extra High" (6 levels) and then gets a value according to a table in the book [Boehm 1981]. There are also detailed descriptions of how the different factors should be rated. To make the intermediate COCOMO model even more accurate, you can estimate per component instead of for the entire system. This is useful when, for example, different components in the system differ significantly in complexity, or the personnel have different levels of experience with different components.
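The idea of adjusting a nominal estimate with cost driver values, per component, can be sketched as follows. This is an assumption-laden illustration, not Boehm's tables: the driver names and multiplier values are placeholders I made up, and the coefficient and exponent are simply reused from the basic equation as stand-ins for the nominal equation given in [Boehm 1981].

```python
from math import prod


def effort_adjustment_factor(multipliers):
    """Multiply the values of all rated cost drivers together."""
    return prod(multipliers.values())


def intermediate_effort(kdsi, multipliers, coefficient, exponent):
    """Nominal effort for the size, scaled by the cost driver values."""
    nominal = coefficient * kdsi ** exponent
    return nominal * effort_adjustment_factor(multipliers)


# Placeholder components and multiplier values, NOT taken from Boehm's table.
# Drivers that are not listed are treated as nominal (1.0).
components = {
    "kernel": (8, {"product_complexity": 1.15, "programmer_capability": 0.90}),
    "reports": (5, {"product_complexity": 0.85}),
}

total = 0.0
for name, (kdsi, drivers) in components.items():
    effort = intermediate_effort(kdsi, drivers, coefficient=2.4, exponent=1.05)
    print(f"{name}: {effort:.1f} MM")
    total += effort
print(f"total: {total:.1f} MM")
```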
In the detailed COCOMO model, a few more aspects are introduced to make the estimations more accurate. The software whose cost is to be estimated is decomposed into a three-level hierarchy: the module level, the subsystem level and the system level. Different cost drivers are applied at different levels. For example, module complexity and programmer ability are applied at the module level, while storage constraints and the use of tools are applied at the subsystem level. The top level (the system level) "is used to apply major overall project relations such as the nominal effort and schedule equations and to apply the nominal project effort and schedule breakdown by phase."
Another big difference from the basic and intermediate COCOMO models is the use of phase-sensitive effort multipliers. For example, high demands on computer response time lead to increased cost during the coding and test phases, while a low level of application experience leads to increased effort in the pre-study and analysis phases but matters less towards the end of the development, when the personnel have become familiar with the application.
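A phase-sensitive multiplier can be pictured as one value per phase instead of one value for the whole project. The sketch below only illustrates that idea: the phase names, the effort shares and the multiplier values are hypothetical and are not the figures from [Boehm 1981].

```python
def phase_effort(nominal_effort, phase_shares, phase_multipliers):
    """Distribute nominal effort over phases and apply per-phase multipliers.

    Phases without an explicit multiplier are treated as nominal (1.0).
    """
    return {phase: nominal_effort * share * phase_multipliers.get(phase, 1.0)
            for phase, share in phase_shares.items()}


# Hypothetical breakdown of the effort by phase (shares sum to 1).
shares = {"analysis": 0.2, "design": 0.3, "coding": 0.3, "test": 0.2}

# Hypothetical multipliers for low application experience:
# the penalty is largest early and fades towards the end.
low_application_experience = {"analysis": 1.3, "design": 1.2, "coding": 1.1}

per_phase = phase_effort(100.0, shares, low_application_experience)
for phase, effort in per_phase.items():
    print(f"{phase}: {effort:.1f} MM")
print(f"total: {sum(per_phase.values()):.1f} MM")
```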
Since 1981, when COCOMO was published, Boehm et al. have been working on a new version, COCOMO II, and the book "Software Cost Estimation with COCOMO II" was published in 2000.
There are several major differences between the original COCOMO and COCOMO II, which I summarize briefly below.
As in the original COCOMO, COCOMO II takes an estimate of the software size as one of its inputs, but instead of having only KDSI (now called Lines of Code (LOC)) as the size measurement, you can also use Function Points (FP). One advantage of the FP measurement is that it is based on information that is available early in the project. The FP used in COCOMO II are the Unadjusted Function Points. The usual FP procedure goes on to assess the degree of influence (DI) of fourteen application characteristics on the software project, but this is inconsistent with COCOMO experience, so COCOMO II uses Unadjusted Function Points for sizing [Boehm et. al. 1998].
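The difference between unadjusted and adjusted Function Points can be sketched as follows. The complexity weights and the adjustment formula are the commonly published Function Point Analysis values as I understand them, and should be checked against the counting manual before use; the function names and the example counts are hypothetical.

```python
# Commonly published complexity weights per function type (assumed here).
WEIGHTS = {
    "external_input":          {"low": 3, "average": 4, "high": 6},
    "external_output":         {"low": 4, "average": 5, "high": 7},
    "external_inquiry":        {"low": 3, "average": 4, "high": 6},
    "internal_logical_file":   {"low": 7, "average": 10, "high": 15},
    "external_interface_file": {"low": 5, "average": 7, "high": 10},
}


def unadjusted_function_points(counts):
    """Sum the weighted counts; keys are (function type, complexity) pairs."""
    return sum(WEIGHTS[ftype][complexity] * n
               for (ftype, complexity), n in counts.items())


def adjusted_function_points(ufp, degrees_of_influence):
    """Apply the usual adjustment based on fourteen DI ratings (0 to 5 each)."""
    if len(degrees_of_influence) != 14:
        raise ValueError("expected fourteen degree of influence ratings")
    if any(not 0 <= di <= 5 for di in degrees_of_influence):
        raise ValueError("each degree of influence must be between 0 and 5")
    return ufp * (0.65 + 0.01 * sum(degrees_of_influence))


# Hypothetical counts for a small system, for illustration only.
counts = {
    ("external_input", "average"): 6,
    ("external_output", "low"): 4,
    ("internal_logical_file", "average"): 2,
}
ufp = unadjusted_function_points(counts)
print("Unadjusted FP (the sizing input used by COCOMO II):", ufp)
print("Adjusted FP (usual FP procedure):", adjusted_function_points(ufp, [3] * 14))
```

Only the first function is needed for COCOMO II sizing; the second is included to show the adjustment step that COCOMO II leaves out.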
"The software under-sizing problem is our most critical road block to accurate software cost estimation."
Three reasons for underestimating size:
People are basically optimistic and desire to please.
People tend to have incomplete recall of previous experience.
People are generally not familiar with the entire software job.
Boehm's experiences from looking at sizing data:
To make a good estimation you have to be very precise in the specification, and by the time you are precise enough, the software is almost finished. (Roughly?)
Software products that solve the same problem(s) can vary in size significantly.
Blablabla
Summary here...
Conclusions here...
Future work here?
[Boehm 1981] Boehm B., Software Engineering Economics, Prentice Hall, 1981.
[Boehm et. al. 1996] Boehm B. et. al., The COCOMO 2.0 Software Cost Estimation Model - A Status Report, American Programmer, July 1996, pp. 2-17.
[Boehm et. al. 1998] Boehm B. et. al., COCOMO II Model Definition Manual, http://sunset.usc.edu/research/COCOMOII/, 1998.
[Symons 1991] Symons C., Software Sizing and Estimation MkII FPA, Wiley, 1991.
| LOC | Lines of Code |
| SLOC | Source Lines of Code |
| COCOMO | The Constructive Cost Model |
| KDSI | Thousands of Delivered Source Instructions |
| FPA | Function Point Analysis |
| FP | Function Point |
| DI | Degree of Influence |