SAT-based Analysis of
Large Real-world Feature Models is Easy
July 23, 2015
SPLC 2015, Nashville, TN
Jia Liang, Vijay Ganesh
Krzysztof Czarnecki, Venkatesh Raman
Feature Models
2
SAT-based Analysis
3
Feature model to CNF and Back
4
Feature Model
(6000+ features)
Propositional formula
CNF
(62000+ variables)
(270000+ clauses)
Variable Assignment
Product
Tseitin transformation
CDCL
SAT Solver
a or ¬b or c
b or ¬c or d
Problem Statement
Q: Why is CDCL SAT-based analysis so effective in analyzing large real-world feature models?
5
Modern CDCL Features?
Q: Is the SAT-based effectiveness on analyzing real-world feature models attributable to modern CDCL features?
“Empirical study of the anatomy of modern SAT solvers”,
Katebi et al., SAT 2011
6
Modern SAT Solvers
Empirical Study of the Anatomy of Modern SAT
Solvers (2011) identifies 4 main features:
7
Without Modern Features
8
Few Backtracks
Q: Why do feature models cause very backtracks in CDCL solvers?
9
Variable Branching
10
a
You are here
false
true
exit
exit
Unrestricted Variable
Either extension (true or false) leads to a satisfying assignment.
11
b
false
true
exit
exit
Restricted Variable
One extension (true xor false) leads to a satisfying assignment.
12
exit
exit
a
false
true
Unsatisfying Assignment
Partial assignment is unsatisfiable.
13
c
false
true
exit
exit
Linux Model Variables
14
Lots of exits
15
Hard Feature Models?
Q: Are SAT problems derived from realistic feature models always tractable?
“Automated generation of computationally hard feature models using evolutionary algorithms”,
Segura et al., Expert Systems with Applications 2013
“SAT-based analysis of feature models is easy”,
Mendonca et al., SPLC 2009
16
Always easy?
… No, if CTC allowed for disjunctions of length 3
17
HARD CNF
a
b
c
d
…
HARD CTC
How much is easy/hard?
Q: How much of feature models can be simplified in polynomial time?
18
How much is hard?
19
Feature Model
(62000+
Boolean variables)
Worst-case
Exp-time
Worst-case
Poly-time
Simp Model
(~50
Boolean variables)
Simplification
20
Equivalent Variable Substitution
Replace with x and y with a fresh variable z
21
Boolean Constraint Propagation
Set x=true.
22
Simplifications
23
Post Simplification
24
Horn
25
Treewidth Complexity
Q: Does treewidth characterize the complexity of real-world feature models?
“Measuring the Structural Complexity of Feature Models”,
Pohl et al., ASE 2013
26
Tree-width
27
Spearman’s rank correlation coefficient between SAT solving time and lower/upper bound of treewidth: 0.257
Summary
QUESTIONS?
28