1 of 16

Fail4Lib

Failing with grace and style... or not.

Andreas Orphanides and Jason Casden

NCSU Libraries

(akorphan|jmcasden)@ncsu.edu

2 of 16

Part 1: Welcome to Fail4Lib

Low-stress introductions

  • Name and other identifiers
  • Position and institution
  • Describe your role if non-obvious (optional)
  • What’s something you hope to gain from this session? (optional)

3 of 16

Outline

  1. Case studies
  2. Lightning talks
  3. Group therapy
    1. How do we recognize/understand failure? How do we grow from it?
    2. How do we make our own organization/work more tolerant of failure and risk? How do we communicate the value of failure?

4 of 16

Outcomes

  1. Gain experience with failure analysis.
  2. Reflect on our own experiences.
    1. Acknowledge that it happens. Get comfortable discussing failure.
    2. Get comfortable with failing. Think about how to fail gracefully.
  3. Accept failure/risk in our work practices.
    • How can we talk about it in a productive way?
    • How can we improve the ways we handle, seek, or discuss failure?
    • How can we maximize the value of our failures?
  4. Commiserate in a safe environment.

5 of 16

Some flavors of failure

  • Technical failure
  • Failure to effectively address a real user need
  • Overinvestment
  • Outreach/Promotion failure
  • Design/UX failure
  • Project team communication failure
  • Missed opportunities (risk-averse failure) (!)
  • Failure to launch

6 of 16

Readings

  1. Sowing Failure, Reaping Success (New York Times) http://learning.blogs.nytimes.com/2012/05/07/sowing-failure-reaping-success-what-failure-can-teach/
  2. On Being Wrong (Kathryn Schulz via TED) http://www.ted.com/talks/kathryn_schulz_on_being_wrong.html
  3. You Are Solving The Wrong Problem (UX Magazine) http://uxmag.com/articles/you-are-solving-the-wrong-problem
  4. The Curious Case of Polywater (Slate) -- how wrong can science get? http://www.slate.com/articles/health_and_science/science/2013/11/polywater_history_and_science_mistakes_the_u_s_and_ussr_raced_to_create.single.html

7 of 16

Case studies: Failure anxiety!

(If you think you’ve got it bad...)

8 of 16

Case Study 1: healthcare.gov

MORE PARTS, MORE PROBLEMS

“The larger the system, the greater the probability of unexpected failure.”

Le Chatelier’s Principle:

“Complex systems tend to oppose their own proper function.”

Quotations from John Gall, Systemantics. New York: Quadrangle, 1977

9 of 16

healthcare.gov

  • When is a system implementation “complete”? If you’re a contractor? If you’re a client? If you’re a user?
    • A major failing of the healthcare.gov rollout was that the government’s definition of “complete” didn’t match the contractors’ definitions. How do we ensure that all participants are working with the same set of expectations?
  • One challenge of the healthcare.gov implementation was that design criteria were often not fully provided to contractors, for unrelated political reasons. How do you find success -- or minimize the chance of failure -- in an environment where information is incomplete, unstable, being actively withheld, etc.?
  • Establishing a rigid go-live date before a problem is fully analyzed is a recipe for disaster -- but sometimes it’s inevitable. How can we mitigate the risks associated with milestones that are outside our control?
  • How can you minimize or mitigate the liabilities associated with inflexible contracting policies, such as lowest-bidder requirements?
  • What are the optimal evaluation strategies for a system that’s too big to fail -- and possibly too big to test?
  • When a product rollout is massively disappointing or disastrous, what are the best strategies for rapid recovery -- both technically and culturally?

10 of 16

Case Study 2: Space Shuttle Challenger

ENGINEERING FAILURES and CULTURAL BLINDNESS

“When a fail-safe system fails, it fails by failing to fail safe.”

Fundamental Law of Administrative Workings:

“The real world is whatever is reported to the system.”

Quotations from John Gall, Systemantics. New York: Quadrangle, 1977

11 of 16

Space Shuttle Challenger

  • How do you balance risk tolerance and design cost/complexity against the severity of that risk? In what ways did NASA fail to achieve this balance?
  • Some limitations of the design process incur risk by their nature. What are the thresholds where those risks are acceptable? How do we manage risk in situations that include:
    • Systems experts without cultural authority
    • Complex technology and intrinsically risky tasks
    • The expectation to balance costs against safety
  • In the days preceding the Challenger launch, there was immense political pressure on NASA to launch in a timely manner. Further delays, it was feared, could have threatened the continued existence of the space program. In this sense, was setting aside engineering concerns in some part justified? Is it possible for existential risk to an organization to outweigh human risk? Was this such a situation?
  • How do we foster a culture where the first response to a notification of real risk is frank evaluation and judgment, rather than denial? How do we balance engineering demands against cultural ones?
  • Almost exactly 17 years after Challenger, the Space Shuttle Columbia was lost in a disaster with strong echoes of the Challenger disaster. In what ways did NASA fail to learn the cultural lessons of Challenger? How can we ensure that we absorb and retain hard lessons after a major failure, even years later?

12 of 16

General discussion

  • How could these problems have been avoided, or their damage mitigated?
  • How can we manage the need for assigning blame? How do we focus on moving forward after a failure? Are there cases where finding a responsible party is warranted?
  • What liabilities are associated with too great a focus on blame/responsibility? What liabilities are associated with setting aside the assignment of responsibility?
  • What are the worst case scenarios for your own work? How does this affect your risk management choices?

13 of 16

Lightning talks!

14 of 16

Group Therapy: Understanding and dealing with failure in your own practice

  • What are the symptoms of failure?
  • How do you identify an incipient failure and try to recover/adjust?
  • What do you do after you encounter a failure in a project? How do you make failure valuable? (Post-mortems, recovery, etc.)
  • How do you plan for the unknown when beginning a project?
  • How do you manage risk to mitigate potential damage when undertaking work in new areas?

15 of 16

Group Therapy: Surviving failure, risk, and the unexpected with grace.

  • How do you prepare colleagues for unexpected outcomes?
  • What is your organization’s approach to risk and failure?
    • Is risk well-tolerated/well-managed?
    • What are the consequences of a failed project?
    • Is failure seen as an endpoint -- a negative outcome to an endeavor -- or merely a step in the development process?
    • When is failure unavoidable? Useful? Desirable? How do you tolerate failure in high-risk/cutting-edge contexts?
  • How do you talk about “failure” with your colleagues? Supervisors? Stakeholders? Patrons? Reports? What kind of language do you use?

16 of 16