Let’s Go Agile! Data-Driven Agile Software Cost and Schedule Models
First International Boehm Forum on COCOMO®
and Systems and Software Cost Modeling
November 9-10, 2022
Disclaimer: The contents of this paper reflect the views of the authors and are not necessarily endorsed by the Department of Homeland Security
Wilson Rosa
Sara Jardine
Agenda
Introduction
Agile Project Dataset
Effort and Schedule Models
Results
What is the Problem?
DoD
DHS
“The Department [DHS] needs a credible and accurate method for estimating the cost of software development programs that can be tracked over time and provide insight into whether a program is behind schedule or is forecasted to exceed initial cost projections.”
- Stacy Marcott, Acting Chief Financial Officer, May 30, 2019
Policy mandates the application of agile software development best practices
What is the Solution?
Benefits to Program Management
FOC = Full Operational Capability
At any stage of the agile program acquisition lifecycle, the PM can choose a sizing measure to estimate the software development cost and schedule
Notes: MNS = Mission Needs Statement
CONOPS = Concept of Operations
RTM = Requirements Traceability Matrix
IOC = Initial Operational Capability
Evolution of Agile at DHS
2010
OMB issued a 25-point plan to reform IT projects and called on federal agencies to implement shorter delivery timeframes
2016
DHS Agile Development and Delivery for IT Instruction Manual 102-01-004-01
2017
DHS USM directed DHS CAD to find ways to improve agile software development programs [1]
2018
DHS Agile Methodology for Software Development and Delivery for IT, Policy Instruction 102-01-004
2019
DHS CAD launched the cross-agency Joint Agile Software Innovation (JASI) Cost IPT
2021
DHS Agile Guidebook
Collecting Agile data requires an agency data collection policy, close collaboration with PMs, and frequent iterative reassessment
Agile is an iterative approach to deliver solutions incrementally
Study Breakthroughs
1. Functional Story
2. Unadjusted Function Point
3. Simple Function Point
4. Story
5. Story Point
6. Issues
Agile Project Dataset
Data Collection
Data Sources
Effort
Schedule
Size
Context
Variable Selection (Common Sense)
Dependent variables: Effort, Schedule
Independent (size) variables: Functional Story, Issue, Story, Story Point, Unadj. Function Point, Simple Function Point
Categorical variable: Scope
Data Normalization: How did we measure effort?
| ID | DHS IT WBS Element |
|---|---|
| 1.i.1 | Program Management |
| 1.i.2 | Systems Engineering |
| 1.i.4.2 | Software Development |
| 1.i.4.3 | Data Development & Transition |
| 1.i.4.5 | Training Development |
| 1.i.4.6.1 | Development Test & Evaluation |
| 1.i.4.6.1 | Cybersecurity Test & Evaluation |
| 1.i.4.7 | Logistics Support Development |
| 1.i.7 | System Level Integration & Test |
| 1.i.8.6.1 | Help Desk/Service Desk (Tier 3) |
| 1.i.8.6.4 | Software Maintenance |
Why use total labor?
Reporting labor at the total level (rather than for software development alone) is recommended because most DHS agile development contracts are FFP or T&M and generally do not break out effort by major cost element, as traditional cost-plus contracts do
Data Normalization: Counting Functional Story, SiFP, UFP
Step 1: Extract Product Backlog (JIRA)
Step 2: Find Story
Step 3: Find and Count Functional Story
Step 4: Calculate Function Point
Issue Status: Deferred, Done, In-Progress
Issue Type: Other, Task, Story, Bug
Category: Functional Story, Non-Functional Story
Calculation: SiFP, UFP
Notes:
*Performed by a Certified Function Point Specialist
**Functional Stories (from product backlog) = Functional Requirements (from RTM or FRD)
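The four counting steps above can be sketched in code. This is a minimal Python sketch, assuming a JIRA backlog exported as a list of dicts with hypothetical `type`, `status`, and `functional` fields (adapt to your export format). The SiFP weights of 4.6 per elementary process and 7.0 per data group are the published Simple Function Point coefficients; the counts passed in are illustrative.

```python
def count_functional_stories(backlog):
    """Steps 1-3: keep completed Story issues, then keep only functional ones."""
    stories = [i for i in backlog
               if i["type"] == "Story" and i["status"] == "Done"]
    return [s for s in stories if s.get("functional", False)]

def simple_function_points(n_processes, n_data_groups):
    """Step 4: SiFP = 4.6 * elementary processes + 7.0 * data groups."""
    return 4.6 * n_processes + 7.0 * n_data_groups

# Sample backlog (field names are assumptions, not JIRA's native schema)
backlog = [
    {"type": "Story", "status": "Done",     "functional": True},
    {"type": "Story", "status": "Done",     "functional": False},
    {"type": "Bug",   "status": "Done",     "functional": False},
    {"type": "Story", "status": "Deferred", "functional": True},
]
functional = count_functional_stories(backlog)
print(len(functional))                              # → 1
print(round(simple_function_points(12, 3), 1))      # → 76.2
```

In practice the functional/non-functional split (Step 3) is a manual judgment, which is why the notes above call for a Certified Function Point Specialist.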
Dataset Demographics
Sample Size: 18 Projects
Automated Information System
Majority (13) used cloud-hosted Amazon Web Services
Majority (15) used FFP or T&M Contracts
2 to 4-week Iterations
Dataset Demographics
Descriptive Statistics
[Chart: descriptive statistics, reported in months]
Descriptive Statistics
Relevant Range
When selecting a regression model, consider the relevant range of each independent variable
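The relevant-range guidance above can be made mechanical: before applying a regression model, check that the new project's size falls inside the range spanned by the calibration dataset. A minimal sketch, where the bounds are hypothetical placeholders, not the study's actual minima and maxima:

```python
# Hypothetical calibration bounds per size measure -- placeholders only,
# not the study's actual min/max values.
RELEVANT_RANGE = {
    "UFP": (50, 2000),
    "SiFP": (60, 2200),
    "functional_stories": (10, 400),
}

def in_relevant_range(measure, value):
    """True if the new project's size lies inside the calibration range."""
    lo, hi = RELEVANT_RANGE[measure]
    return lo <= value <= hi

print(in_relevant_range("UFP", 800))    # inside the range: interpolation
print(in_relevant_range("UFP", 5000))   # outside the range: extrapolation, use caution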
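```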
Effort and Schedule Models
Effort Benchmarks
| Category | Benchmark | 25th Quartile | Median | 75th Quartile | StdDev | CV |
|---|---|---|---|---|---|---|
| Effort | Hours/Functional Story | 410 | 494 | 653 | 261 | 47% |
| Effort | Hours/UFP | 61 | 81 | 107 | 40 | 46% |
| Effort | Hours/SiFP | 57 | 71 | 100 | 39 | 47% |
Practical Application:
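As a practical application, the quartile and median hours-per-unit values in the table above give a quick effort range for a sized project. The 250-SiFP project size below is illustrative:

```python
# Quartile/median hours-per-unit values copied from the effort benchmark
# table; the 250-SiFP project size is illustrative.
BENCHMARKS = {
    "functional_story": (410, 494, 653),   # (25th quartile, median, 75th quartile)
    "UFP": (61, 81, 107),
    "SiFP": (57, 71, 100),
}

def effort_range(measure, size):
    """Return (low, likely, high) development hours for a given size."""
    q1, med, q3 = BENCHMARKS[measure]
    return size * q1, size * med, size * q3

low, likely, high = effort_range("SiFP", 250)
print(low, likely, high)   # → 14250 17750 25000 hours
```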
Schedule Benchmarks
| Category | Benchmark | 25th Quartile | Median | 75th Quartile | StdDev | CV |
|---|---|---|---|---|---|---|
| Schedule | Functional Story/FTE/Month | 0.19 | 0.28 | 0.32 | 0.14 | 47% |
| Schedule | UFP/FTE/Month | 1.2 | 1.8 | 2.1 | 0.9 | 50% |
| Schedule | SiFP/FTE/Month | 1.3 | 2.1 | 2.3 | 1.1 | 52% |
Practical Application:
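As a practical application, the median productivity rates in the table above convert directly into a schedule estimate via months = size / (rate per FTE-month × team FTEs). The 600-UFP size and 6-FTE team below are illustrative:

```python
# Median productivity rates (units delivered per FTE per month) copied
# from the schedule benchmark table; size and team are illustrative.
RATES = {
    "functional_story": 0.28,
    "UFP": 1.8,
    "SiFP": 2.1,
}

def schedule_months(measure, size, ftes):
    """Months = size / (rate per FTE-month * number of FTEs)."""
    return size / (RATES[measure] * ftes)

print(round(schedule_months("UFP", 600, 6), 1))   # → 55.6 months
```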
Effort Estimation Models
| Model | CER | N | SE | R² | R²adj | R²pred | MAD |
|---|---|---|---|---|---|---|---|
| 1 | | 15 | 0.39 | 89.6% | 88.8% | 85.9% | 31.3% |

Effort (E) = Total final development hours
REQ = Functional stories from backlog, RTM, or FRD

| Model | CER | N | SE | R² | R²adj | R²pred | MAD |
|---|---|---|---|---|---|---|---|
| 2 | | 15 | 0.66 | 70.2% | 67.9% | 59.0% | 54.1% |

Effort (E) = Total final development hours
STORY = Total stories obtained from JIRA backlog
Effort Estimation Models
| Model | CER | N | SE | R² | R²adj | R²pred | MAD |
|---|---|---|---|---|---|---|---|
| 4 | | 14 | 0.39 | 84.4% | 83.1% | 78.2% | 32.7% |

Effort (E) = Total final development hours
STY_PTS = Story points obtained from JIRA backlog

| Model | CER | N | SE | R² | R²adj | R²pred | MAD |
|---|---|---|---|---|---|---|---|
| 3 | | 15 | 0.64 | 71.5% | 69.3% | 59.4% | 51.6% |

Effort (E) = Total final development hours
ISSUES = Sum of stories, bugs, tasks, epics, or any other fixes
Effort Estimation Models
| Model | CER | N | SE | R² | R²adj | R²pred | MAD |
|---|---|---|---|---|---|---|---|
| 6 | | 15 | 0.35 | 92.0% | 90.7% | 86.5% | 25.9% |

Effort (E) = Total final development hours
SiFP = Simple Function Point
D1 = Dummy variable (scope), where full development = 1 and enhancement = 0

| Model | CER | N | SE | R² | R²adj | R²pred | MAD |
|---|---|---|---|---|---|---|---|
| 5 | | 15 | 0.38 | 90.0% | 89.3% | 85.6% | 31.6% |

Effort (E) = Total final development hours
UFP = Total unadjusted function points
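CERs of this kind are conventionally fit as log-log OLS regressions of the power-law form E = a · Size^b · exp(c · D1), which matches the variable definitions above. The sketch below fits that form to synthetic data (not the study's dataset) and reports the same fit statistics the tables use, where MAD is taken as the mean absolute relative deviation of the back-transformed predictions:

```python
import numpy as np

# Synthetic calibration data -- 15 notional projects, NOT the study's data.
rng = np.random.default_rng(0)
n = 15
size = rng.uniform(100, 2000, n)                 # e.g. SiFP counts
d1 = rng.integers(0, 2, n)                       # 1 = full development, 0 = enhancement
effort = 60 * size**0.95 * np.exp(0.4 * d1) * rng.lognormal(0, 0.2, n)

# Fit ln(E) = ln(a) + b*ln(Size) + c*D1 by ordinary least squares.
X = np.column_stack([np.ones(n), np.log(size), d1])
y = np.log(effort)
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# Back-transform and compute fit statistics.
pred = np.exp(X @ beta)
r2 = 1 - ((y - X @ beta) ** 2).sum() / ((y - y.mean()) ** 2).sum()
mad = np.mean(np.abs(effort - pred) / effort)    # mean absolute relative deviation
print(f"a = {np.exp(beta[0]):.1f}, b = {beta[1]:.2f}, c = {beta[2]:.2f}")
print(f"R2 = {r2:.1%}, MAD = {mad:.1%}")
```

Because the regression is done in log space, a fitted SE of 0.35–0.66 (as in the tables) is the standard error of ln(E), i.e. roughly a 35–66% multiplicative error band.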
Schedule Estimation Models
| Model | SER | N | SE | R² | R²adj | R²pred | MAD |
|---|---|---|---|---|---|---|---|
| 1 | | 15 | 0.26 | 88.4% | 86.5% | 81.8% | 18.8% |

Schedule (S) = Total final development months
REQ = Functional stories from backlog, RTM, or FRD
D1 = Dummy variable (scope), where full development = 1 and enhancement = 0

| Model | SER | N | SE | R² | R²adj | R²pred | MAD |
|---|---|---|---|---|---|---|---|
| 2 | | 15 | 0.27 | 88.1% | 86.2% | 80.8% | 18.6% |

Schedule (S) = Total final development months
UFP = Total unadjusted function points
D1 = Dummy variable (scope), where full development = 1 and enhancement = 0

| Model | SER | N | SE | R² | R²adj | R²pred | MAD |
|---|---|---|---|---|---|---|---|
| 3 | | 15 | 0.27 | 88.0% | 86.0% | 80.4% | 18.4% |

Schedule (S) = Total final development months
SiFP = Simple Function Point
D1 = Dummy variable (scope), where full development = 1 and enhancement = 0
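Applying a SER of this form is a one-liner once the coefficients are known. The coefficients below are purely hypothetical placeholders (the slide's fitted equations are not reproduced here); only the functional form S = a · Size^b · exp(c · D1) is taken from the variable definitions above:

```python
import math

# a, b, c are HYPOTHETICAL placeholder coefficients, not the study's
# fitted values; only the model form matches the SER definitions.
def schedule_months(size, full_development, a=2.0, b=0.35, c=0.25):
    d1 = 1 if full_development else 0
    return a * size**b * math.exp(c * d1)

print(round(schedule_months(500, True), 1))    # full development, 500 UFP
print(round(schedule_months(500, False), 1))   # enhancement, 500 UFP
```

Note that the scope dummy acts multiplicatively: a full development effort is exp(c) times longer than an enhancement of the same size.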
How do the Regression Models Rank?
| Rank | Independent Variable | R²adj | R²pred | MMRE |
|---|---|---|---|---|
| Effort Models | | | | |
| 1 | SiFP | 90.7% | 86.5% | 25.9% |
| 2 | UFP | 89.3% | 85.6% | 31.6% |
| 3 | Functional Story (REQ) | 88.8% | 85.9% | 31.3% |
| 4 | Story Points | 83.1% | 78.1% | 32.7% |
| 5 | Issues | 69.3% | 59.4% | 51.6% |
| 6 | Stories | 67.9% | 59.0% | 54.1% |
| Schedule Models | | | | |
| 1 | Functional Story (REQ) | 86.5% | 81.8% | 18.8% |
| 2 | UFP | 86.2% | 80.8% | 18.6% |
| 3 | SiFP | 86.0% | 80.4% | 18.4% |
Simple Function Point (SiFP), Unadjusted Function Points (UFP), and Functional Stories (REQ) are stronger predictors of both effort and schedule for agile projects
Results
Model Limitations
Internal Threats
External Threats
Construct Threats
Main Takeaways
Wider range of sizing measures (6) to estimate future agile software programs and evaluate contractor proposals
SiFP and UFP proved to be the most accurate predictors of agile software development effort and schedule
Analysis reveals that “Functional Story” is an effective predictor of effort and schedule, and is easy to obtain
Popular agile measures such as story points, stories, and issues are not as effective predictors of effort and schedule
SiFP and UFP can be calculated early in the program allowing estimation from contract proposal through IOC (when popular agile measures are difficult to obtain)