Postmortem example

Priority for 10/2 discussion | Title | Explanation | Main stakeholders | Eldad's comments | Susan's comments | JoAnna | Renato / Melody | Suggested decisions / next steps (fill-in during 10/2 discussion)
1 | Release date inconsistency | Although we announced 9/18 as the release date, we actually released features on 9/17. That causes issues for the business team as they try to plan training and set customer expectations accurately. | Renato & Melody, Nir, JoAnna | Not a trivial solution - it isn't due to lack of planning. The release action actually takes time. If we want everything to be available the morning of the release date, we have to do it a day earlier. | Need to decide exactly when to "flip the switch" on a release. I don't think it is a big challenge. | [Nir to Eldad:] Understood, but then we need to have the materials and information required to train internal and external teams earlier - it can't be that features are released before teams or clients are trained/aware. | [ren] We need consistency. It's OK if the release "switch" is tripped on Tuesday at 6pm PST, as long as that is consistent for every release. | Define a precise release window of an hour or two, as needed, during the actual published release date
Between 5-8pm on Tuesday, prior to day of release
1 | UAT cutoff not enforced | We defined a 2-week UAT cutoff and haven't been enforcing it. Some projects are allowed to be released despite not passing UAT on time, or even not being completed until the very last minute or beyond. As a result we risk quality issues and make it impossible to conduct training or communicate plans accurately. | All | We want the UAT cutoff to be taken seriously, although we also want to be able to use the buffer period to close some loose ends. We should agree on a red line we do not cross. We can also achieve a good compromise if we exclude projects from the planned release if they miss the deadline but still keep an option to release them off-cycle if necessary and approved by all. | Eldad: makes sense, I like the redline idea. | [ren] I too like the redline idea and, imo, it has been previously agreed-to by many of us. The challenge has been enforcing this practice.

5 business days prior to the release, training needs to have a list of features that are included in the release. That list must not change. So, if releases are on Tuesdays (per item row above), we need to know EOD on the previous Tuesday what's in the release. Note that in the past we've discussed 10 business days prior; we're ok with 5 days prior.

The method of our discovery, too, must be consistent. Our agreed-to tool has been one document that is the source of truth that is published at the time of the "redline". We do not need the ever-green version—it's over communication for us. And we prefer to not use a Jira filter for this purpose as its changes aren't controlled.

Regarding workflow, the following is (would be) fine:
• PM trains technical team and training team 8-10 business days prior to release. Training hours should be 9am-3pm PST.
• Final list of features in the release ("the redline") is produced 5 business days prior to release.
• Training to all (by training team) occurs 3 business days prior to release.
• CSM communicates features, as needed, to select customers 1-2 business days prior to release.
• Product release announcement (from Product Marketing) is sent to customers 1 day prior to release.
2 weeks before release: highlight projects at risk and escalate to Nir & Eldad to determine if we should postpone or try to make it.
1 week before release: red-line. Projects that have not passed UAT are either postponed to the next release or put forward for off-cycle consideration.
Red line cutoff should be end of day Tuesday before the release
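For illustration only, the milestone offsets in the workflow above can be turned into calendar dates with simple business-day arithmetic. This is a minimal sketch, not an agreed tool: the function names and the milestone labels are hypothetical, and weekends are skipped while company holidays are ignored.

```python
from datetime import date, timedelta


def business_days_before(day: date, n: int) -> date:
    """Step back n business days (Mon-Fri) from `day`; holidays are ignored."""
    current = day
    while n > 0:
        current -= timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday-Friday
            n -= 1
    return current


# Offsets in business days before the release, taken from the workflow above.
# The labels are illustrative; ranges are split into earliest/latest entries.
MILESTONES = {
    "PM training of technical and training teams (earliest)": 10,
    "PM training of technical and training teams (latest)": 8,
    "Redline: final feature list published": 5,
    "Training to all": 3,
    "CSM communication to select customers (earliest)": 2,
    "Product release announcement": 1,
}


def release_schedule(release_day: date) -> dict[str, date]:
    """Map each milestone label to its calendar date for a given release day."""
    return {
        name: business_days_before(release_day, offset)
        for name, offset in MILESTONES.items()
    }
```

With a Tuesday release, five business days back lands on the previous Tuesday, which matches the "EOD on the previous Tuesday" redline described above.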
1 | Difficulty in tracking true status of projects | Inconsistency between records concerning release dates and status (Jira, planning sheet, etc.). Training team unable to determine what's actually happening. CS counting on release and not aware of changes (e.g. Chris committing to AOL to conduct OCR audit) | Melody, Eldad, JoAnna | We clearly need to maintain one source of truth that is always accurate, reflecting real status for all projects. There are too many people involved so it's very hard to determine what's going on. I suggest we make squad coordinators responsible to work with the central project manager to keep the status updated and to flag projects for escalation on time. | [ren] I'd like Training and Education to be divorced from tracking. As described above, our focus is on what is released, not what might be released. I.e., we're satisfied as long as we have the source of truth about what's released at the redline date. | 2 separate needs:
1) always accurate status reporting for product and engineering leads to monitor;
2) solid release commitments for CS and training, updated with likely releases 2 weeks before the release and with final releases 1 week before the release.
Short-term solution: use the current weekly report to communicate this
Long-term solution: new project manager to suggest better tool
Squad coordinators to be responsible for accurate reporting
Schedule a session with them
1 | Status of "at risk" projects not determined until last minute | In our bi-weekly status reviews we mark some projects as being at risk to miss the release date. We often want to use the stabilization period to try to get them done even though they miss the UAT deadline. We should decide one way or another no later than 1 week before the release. We haven't done that, and until the last minute and even beyond it wasn't clear which projects were actually being released and which were going to be delayed. | All | We should enforce a real cutoff and therefore have no changes to release content for at least one week prior to release. Instead of the usual "we will try to be on time without making noise...", we should be more open about the real status of projects, and managers need to get comfortable putting it out. | [ren] Same as above | Avoid last minute changes by enforcing UAT cutoff (see above)
Eng team leads to monitor pacing more tightly and report status more accurately
1 | Unintended negative effect on quality | Prior to the new 8-week cycle we allowed more time for QA and bug-fixing to take place before doing the UAT and release. Now that we have a deadline for UAT, the QA and bug-fixing time gets squeezed out of the process. | All | We certainly want to keep the strict deadlines. The right solution is to plan properly for QA to take place. Eng team leads should include it in their plans. We may want to discuss a different cycle length if we feel that the 8-week one doesn't allow for the development and QA of valuable features. That is not necessarily the case though.
Enforcement should be done by PMs in the UAT. If PMs hear from the QA manager that quality isn't sufficient they should fail UAT, unless there are some unique circumstances. PMs should consult with tribe leaders.
We need to look at it from two aspects: regressions and new-feature bugs.
Regression of existing functionality is by far the worse problem and must be caught quickly, fixed quickly and become part of the immune system.
New-feature bugs should become part of the immune system too, and also must be looked at holistically, per release, and compared to previous releases, to understand how we can improve product development.
I'm not sure about this, but it seems to me that there isn't enough communication between teams about the impact of releasing one feature on existing features -- one team does something new that breaks something else no one anticipated. I realize that sounds rudimentary, but I hear about that kind of thing a lot. | 1) Follow up on more sophisticated effort estimation that takes into account QA time
2) Align quarterly objectives between Product, Engineering and QA (has been discussed with Amir A, new system coming, etc.)
3) Follow up on better way to expose dependencies and downstream implications - in PRD and initial design stage.
4) Follow up on tying compensation to release date and quality
5) Follow up on criteria for passing UAT to be applied by all PMs
1 | Project delays due to external dependency | 9 projects were late due to dependency on an external 3rd party such as Nielsen, MediaMind, etc. | All | We should make sure we do everything possible to push for a timely solution. We may need to avoid making specific commitments in these cases. | A few comments:
1. PM and Eng need to move forward and use whatever is possible to continue developing. No waiting. If that is simply impossible, we should stop the project and move on to something else.
2. Communication of such delays (not stoppage of development, but of the actual release) needs to go out as soon as we know.
3. Agree with JoAnna on accountability.
We should also make sure we have someone internally held accountable for getting answers from those third parties (Sasha, PM, etc.) | 1) Assign a clear single owner for 3rd party relationship management who is committed to the project timeline.
2) PM and eng team lead to escalate dependency quickly to Management.
3) Push to either resolve the issue or swap projects if it can't be solved.
4) Make assumptions and improvise to close the gaps
5) Business owner should commit the partner to the timeline
1 | Project delays due to dev work | 9 projects were late due to dev work taking longer than expected or resource availability changes (engineer leaving). | Engineering | Should get more sophisticated in estimating effort in general.
Should have reevaluated timing after Surendra left and provided tighter supervision to find error in technical design ahead of time.
Three separate issues:
1. Engineers leaving (or becoming unexpectedly unavailable): not in our control. Currently the decision is not to have "spares" for this kind of thing. We could change that decision. [All]
2. Wrong estimates: identify and train managers. [Nir]
3. IC performance: identify and deal with. [Nir]
1) Unexpected departures: we have already put in a 2-week buffer, which should be able to cover some of these situations. Not changing the plan.
2) Effort estimation: should develop stronger shared methodology and learn from past experience
3) Suggesting closer supervision of eng team leads on day to day work to catch detours (and ensure good design, progress, etc.)
4) Nir to review individual performance
1 | Project was released off-cycle without approval | Language targeting: flawed execution: we released the change to the ad-server in continuous deployment and, because there wasn't a UI change, we impacted customer experience without warning or validation | Beckman, PMs | We should have created a ghost UI change and used it to have a product flag so we can control the release. | This presents as much of a problem with perception as anything -- we have committed to AOL and Y! that we now don't release off-cycle, so when it happens once, they assume it's still happening rampantly | [ren] 1) Off-cycle release needs approval by a senior business person in Operations and/or Client Services.

2) As of now, Training and Education have neither resources nor process to handle off-cycle training for As such, it becomes the responsibility of the product team to ensure a custom release process (information distribution to CSM, training, articles) is established for the off-cycle feature. That process should intimately involve CSMs associated with customers getting the feature. Training and Edu will do our best to help while not owning this process.
This was supposed to be an off-cycle release that doesn't impact customer experience (generally agreed by all to be legit)
We missed the impact on user experience. To avoid this we need better planning and PRD review by PMs.
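As a rough illustration of the product-flag idea raised for this item, a backend-only change can be gated per account so that it does not reach customers simply by being deployed. This is a hypothetical sketch, not the actual ad-server code: `ProductFlag`, `choose_behavior` and the account IDs are invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ProductFlag:
    """Hypothetical per-account product flag (not an existing internal API)."""

    name: str
    enabled_accounts: set[str] = field(default_factory=set)
    released_globally: bool = False

    def is_on(self, account_id: str) -> bool:
        # On for everyone after the global release, otherwise only for
        # accounts explicitly added to the flag.
        return self.released_globally or account_id in self.enabled_accounts


def choose_behavior(flag: ProductFlag, account_id: str, old, new):
    """Return the new behavior only where the flag is on, else the old one."""
    return new if flag.is_on(account_id) else old


# Example: the change is deployed, but only one flagged account is exposed to it.
flag = ProductFlag("language_targeting", enabled_accounts={"acct-1"})
exposed = choose_behavior(flag, "acct-1", old="previous logic", new="new logic")
unexposed = choose_behavior(flag, "acct-2", old="previous logic", new="new logic")
```

Keeping the enabled accounts in one place also gives a single list of which accounts currently have the flagged behavior.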
1 | Broken communication between PM and CS re: client sensitivities, feature release timing and delays | In the case of OCR audit for pubs, Matt wasn't aware of Chris's sensitivity to the release date | JoAnna, PMs | In order to avoid such situations we should all keep each other informed of sensitive expectations. | I don't think the explanation here is fair, nor is the categorization of the "Title" of this -- Chris was told the OCR feature was released, it was included in the training materials, and it was only days later he found out that it wasn't -- should we always assume clients can't use features for several days in case they don't release? Also, the delay wasn't a matter of two days, it's two weeks. I'm sorry, but Chris's communication wasn't the problem here. | [melody] The feature was communicated to all during regular pre-release training on 9/17 as 'in beta / available for promotion to a select group of interested publishers.' Mid-day 9/17 the release date was rolled back in Jira, but Training + CSMs were not aware.

Features should meet a minimum standard in order to be treated as 'ready for release' by proposed 'redline date.' The decision of what to communicate to internal employees and customers should be clearly outlined by the Product team during technical training (no Training Team "judgement calls"); this should fall within a clear go-to-market process for introducing new features; and the list of features to be released by 'redline date' should be considered absolutely trustworthy (e.g. 'no take-backs.')
Avoid last minute changes by enforcing UAT cutoff (see above)
Discuss way to improve awareness of PMs to market sensitivities (commit dates, etc.)
Communicate to all CS that Melody is the only source of solid expectations for release
2 | Project delays due to late requirements | 4 projects were late due to late requirements in BAS and CR squads. In general we haven't been providing PRDs early enough to engineering. | PMs | We're on a mission to open up a gap of 2 to 3 months ahead of engineering with scoped PRDs. We're still far from it due to several reasons. I will follow up on this separately.
Eldad and product team leads to push harder for PRD work ahead of cycle.
2 | Project delays due to internal dependency | 2 projects were late due to dependency on an internal 3rd party (Ops, Dan Klein) | All | We should be able to avoid these situations by better managing the process. Need clear ownership and escalation. | 1) Assign a clear single owner for internal dependency who is committed to the project timeline.
2) PM and eng team lead to escalate dependency quickly to Management.
3) Push to either resolve the issue or swap projects if it can't be solved.
2 | Project delays due to changed priority or design | 3 projects were postponed due to a decision by PMs to deprioritize or redesign them | PMs | This should be curbed by better planning but may still happen from time to time and is legitimate when not done at the last minute. Change in priority due to new information (e.g. executive directive) is legitimate. Change in design should be avoided by investing more time in PRD planning and validation. Need to open PRD gap to allow for all that to happen.
2 | Stabilization period not really used for testing | We originally intended the stabilization period to include some user testing in order to increase confidence in new features. That hasn't been happening. | Eldad, Alex | This expectation may not be realistic. It takes a lot of effort and time to put features in front of users and get them to use them. Instead we may want to plan full blown beta periods for sensitive features post-release. | Give up on this particular expectation. Put in place beta rollout for projects when necessary.
2 | Release instructions submitted late | PMs have been slow to provide precise release instructions to Laura. As a result she didn't have time to review them and verify correctness. | PMs | This one should be simple. PMs should know well ahead of time what the release instructions should be. It's only a matter of delivering them on time. | To be enforced by new project manager
2 | Unclear use of the term "beta" | Beta designation used for at risk projects released globally. Clarify difference between a beta release and a beta indication. | Eldad | Beta can be used to describe a gradual rollout process. It may also be used to tag a feature that is released globally in which we're still not 100% confident. These are different things and we should have clear guidelines for using them. | [ren] It is important to develop a thoughtful Beta program. Training and Education is happy to lend our resources as possible, but our primary focus remains on training and documenting globally released features.

[melody] PMs should track / maintain a periodically updated list of which Accounts receive product flags or restricted features for features in 'beta' roll-out
Elaborate beta plan being developed for NextGenUX. Need to start implementing also on routine projects that qualify. Eldad to follow up.
2 | Project rolled back due to missed tasks | PRDM-1595 was rolled back after being released in v3.5i because we didn't complete some of the dev tasks associated with the project | Prem, Nir | We should consider ways to use the tools to help ensure complete coverage for dev needs | #3 of: Three separate issues:
1. Engineers leaving (or becoming unexpectedly unavailable): not in our control. Currently the decision is not to have "spares" for this kind of thing. We could change that decision. [All]
2. Wrong estimates: identify and train managers. [Nir]
3. IC performance: identify and deal with. [Nir]
[ren] Once a feature is globally communicated as released, it's left the station from a messaging and training perspective and we have no opportunity to retreat in an effective way. | Follow up on this issue. Need tighter controls over dev tasks. Should be enforced by squad coordinator? Eng team lead?
2 | Project rolled back due to flawed requirements | Language targeting: flawed design: we chose to use the browser setting to determine language although most users don't change the default English setting | Beckman, PMs | PMs should go deep into the design of new features and consider all implications. We should validate with people internally and externally to make sure we don't miss something important. To be discussed in more detail in PM training | Same as above | Emphasize in product offsite.
Why wasn't it caught in PRD review?
2 | Passive approach to training by PMs | PMs not signing up to training despite having projects that affect customer experience slotted in the release | Melody, PMs | I am not convinced the current process makes sense. Once there's clarity on the projects that are included in a release it should be immediately clear which ones require training and who should be included. I suggest we have a separate dedicated discussion about training. | [ren] Training and Edu has been involved in driving this process, e.g. reaching out to PM to request their participation and tracking / finding which features haven't been trained. We'd welcome just becoming consumers of this process and assuming PM has trained us about all relevant features. | Eldad to follow up with a separate discussion about training.
2 | Unclear training requirements | All projects that impact customer experience should be included in training. Some projects fall between the cracks due to wrong documentation / other reasons | Melody, PMs | I suggest we have a separate dedicated discussion about training. | [ren] Good training, good documentation (PRDMs) and good application of the redline should address this concern. | Eldad to follow up with a separate discussion about training.
2 | Applying similar process for custom releases | When we do custom features for clients like VivaKi and others, we don't often adhere to the same process around training, advance notice, etc. | JoAnna, Brian Burns | [ren] (What is a custom release? Is this an off-cycle release? And/or is this something we're no longer doing now with our better defined process?)
VivaKi's custom asks and hurried development were a result of lacking process and over-selling by senior management. My expectation is that features needed in a hurry will now fall into the major and minor release process. Training supports the standard release process of any minor and major release, whether the feature is behind a product flag or not.

When exceptions occur, features released outside the standard release date should primarily rely on (1) the business team needing / justifying the off-cycle release and (2) the product team member executing the release to ensure the right people know about the feature. Of course, Training and Education will do our best to accommodate extra support, via documentation and training, for these cases when our resources allow.

Afterwards, the feature should go through the standard release process, whereby everyone is trained, the KB is updated and the feature is released to all.
Who should follow up?
2 | Over-investment in project | Customer request "Blackout Dates" became a large and risky project. That could have been avoided had we considered simpler design options. | Eldad, Nir | PMs should define the needs carefully without dictating design choices [verify]
We should discuss design reviews by senior eng team leads before going into development.
PMs should define the needs carefully without dictating design choices [verify]
We should discuss design reviews by senior eng team leads before going into development.
Sheet1 <- I'm not convinced everyone understands