Welcome & Opening Remarks
Matt Lowrie, Google
Evolution of Business and Engineering Productivity
Manasi Joshi, Sr. Staff Engineer
Multiple Axes of Growth
Scale
Complexity
Connectivity
...
1.0 The Small Beginnings
Google Test in 2005
Culture
Infrastructure
Talent
The Test Pyramid
Unit
Component
Regression
Integration
Log
Privacy
Canary
Achieve fine balance
VELOCITY
DURABILITY
QUALITY
UTILITY
© Disney
Test Strategy 1.0
The Model
Cons
Challenges in Scaling!
Release Strategy 1.0
The Model
Cons
Challenges in Scaling!
2.0 Transition to Mid-size … Google Test in 2012
- shift in mindset
Talent
Infrastructure
Brand
Metrics
Culture
Display Ads: Evolution of Organization
Infrastructure
Platform: Google & acquisitions
Features: 20+ ad formats
Devices: Desktop, mobile phone, tablet, Xbox, ...
User Data: 1st party vs 3rd party
Evolution of Test Strategy
The Model
Pros
Tool Proliferation and Redundancy: Con … or a Pro?
Survival of the fittest!
Metrics and Measurements
Key Indicators
Call to Action
You cannot improve what you cannot measure.
Test and Release Strategy 2.0
Achievements
Metrics
Canary Testing
Monitoring
Frequent
Releases
Continuous Builds/Tests
Developer Testing
Feedback
3.0 Road to the Future … 2017 and beyond
Next Gen!
Talent
Infrastructure
Brand
Privacy
Test Strategy: New Challenges
The Model
Pros
Android Device Diversity (2015)
~25K distinct Android devices
~680K devices surveyed
~1200 device brands
More on Mobile
Key Needs
… and many more
Call to Action
THANK YOU
Automating Telepresence Robot Driving
By Tanya Jenkins, Principal Automation Engineer
Cantilever Consulting, LLC
Event: Google Test Automation Conference, November 2016. © Cantilever Consulting, LLC, 2016
What does it mean to be ‘Present’ in the 21st Century?
Conferencing - You can hear, but it is hard to visualize the remote environment.
Teleconferencing - You can hear and see, but you cannot always react to the environment in a meaningful way because you are stationary.
Is this guy really present?
Telepresence is ‘Action-at-a-distance’
With telepresence, a ‘pilot’ drives a robot in a remote environment. Seeing, hearing, and now actions, are completely within the pilot’s control. The pilot is present in that remote environment.
A person can now be ‘present’ anywhere, but how do you test that?
What’s different in testing Telepresence?
Validation Goes Beyond Testing the UI
Manual vs Automated Approach
Automated Test Control Challenges
Dedicated Test Automation Environment
Tracking Position and Orientation of Beam Device
The Beam UI gives visual confirmation of position to the pilot, but Automation can’t use the same mechanism.
Telepresence testing is more than UI testing
Beam Telepresence System
Manual vs Automated Approach
Validation of Telepresence Driving
Manual Validation of about 100 tests...
2 Desktop Platforms (i.e. Windows, MacOS)
X 1 Beam Device
X 1 Driving Input device (keyboard)
...Grew to 1000s of tests
12 Desktop Platforms
X 3 Beam Devices
X 6 Driving Input Device permutations
Definitely time to automate!
Automated Test Control Challenges
What framework to use for Client automation?
How to handle various Beam Driving Controls (Keyboard, Mouse, Gamepad) on the Client?
How to track the Beam's movements/position in a remote location (the lab)?
How to retrieve this tracking information after it's recorded?
Options for Tracking the Beam’s position
Option 1: Dream Big – Beacons… And Drones!
Option 2: A Little Too Simple – Text on Wall
Option 3: Just Right – Lidar
Image courtesy of Lino Schmid
Automated Test Control Challenges
(answered in a nutshell)
What framework to use for Client automation? FrogLogic Squish; Python was the most flexible for our application.
How to handle various Beam Driving Controls (Keyboard, Mouse, Gamepad)? Abstract the controls and control commands by developing a Driving API (sketched below).
How to track the Beam's movements/position in a remote location (the lab)? Install Lidar in the lab and take readings at specific spots during the Automated Driving Course.
How to retrieve this tracking information after it's recorded? Remotely log into the NUC to start the Lidar and retrieve data; also retrieve the same info from the Beam's motor board via the client.
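Purely as an illustration of that Driving API idea, here is a minimal Python sketch. The class names and the squish_ui/hold_key helpers are hypothetical, not the actual Cantilever code; the point is that every input device implements the same interface, so one driving-course script covers all permutations.

from abc import ABC, abstractmethod

class DrivingInput(ABC):
    """One concrete subclass per driving input device (keyboard, mouse, gamepad)."""

    @abstractmethod
    def forward(self, seconds): ...

    @abstractmethod
    def reverse(self, seconds): ...

    @abstractmethod
    def turn_left(self, seconds): ...

    @abstractmethod
    def turn_right(self, seconds): ...

class KeyboardInput(DrivingInput):
    def __init__(self, squish_ui):
        self.ui = squish_ui  # handle into the Squish-driven client UI (hypothetical)

    def forward(self, seconds):
        self.ui.hold_key("Up", seconds)

    def reverse(self, seconds):
        self.ui.hold_key("Down", seconds)

    def turn_left(self, seconds):
        self.ui.hold_key("Left", seconds)

    def turn_right(self, seconds):
        self.ui.hold_key("Right", seconds)

def drive_course(controls):
    # The same course script runs unchanged for every input-device permutation.
    controls.forward(3.0)     # A -> B
    controls.turn_left(1.5)   # toward C
    controls.turn_right(3.0)  # then the opposite way, to the right
    controls.reverse(2.0)     # back toward the dock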
Okay, so what is Lidar?
Lidar, LiDAR or LIDAR (from Wikipedia), “is a surveying method that measures distance to a target by illuminating that target with a laser light.”
Animation of Lidar
Lidar Scanning Sample Visualization
Dedicated Test Automation Equipment
Beam (with modifications for Lidar scanning)
Beam Charging Dock
Hokuyo Lidar
NUC small form computer
Dedicated Test Automation Equipment
Beam with modifications for Lidar scanning (Details)
Non-Traditional Testing Lab - not typical hardware or software
Safety Concerns - Beam weighs about 100 lbs and is expensive
Room Considerations - isolation (from noise & interference), lighting (natural & man-made light), flooring (density)
Dedicated Test Automation Environment
Automated Driving Course
Automated Driving Course Animation
The automation test drives the Beam; position info is taken at each labeled point.
Starts on Dock at A, drives forward to B.
Turns left toward C, then opposite to the right.
Turns around toward Dock.
Drives in reverse when near Dock.
Initiates Auto-Docking until Beam is parked successfully at A.
Tracking Position and Orientation of Beam Device
Data Collection - Position and orientation information corresponding to each labeled point of the Automated Driving Course is collected from the Lidar output. Expected distance traveled is calculated using data collected from the Motor Board of the Beam.
Data Verification
Position information from the Lidar and from the Beam's Motor Board is compared; the test passes if the comparison is within the margin of error (see the sketch below).
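A minimal sketch of that comparison, assuming a 5 cm margin of error and (x, y) positions in meters; the names and tolerance are illustrative, not the production values.

import math

TOLERANCE_METERS = 0.05  # assumed margin of error

def verify_position(lidar_xy, motor_xy, tolerance=TOLERANCE_METERS):
    # True if the Lidar and motor-board position estimates agree within tolerance.
    dx = lidar_xy[0] - motor_xy[0]
    dy = lidar_xy[1] - motor_xy[1]
    return math.hypot(dx, dy) <= tolerance

# Example: point B of the driving course.
assert verify_position((2.00, 0.01), (1.97, 0.00))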
Automated Driving Course Demo
Thank you!
Special Thanks
Models: Karl Boucher, Oswaldo Torres, Brad Sandman
Tech Consultants: Brad Sandman, Harsha Kikkeri, Francois Marie-Lefevere
Other Credits
Animation:�Tanya Jenkins
Beam device:�Suitable Technologies, Inc
Also:
Q & A
FAQ
Q. Are there other Beams at Google?
Break
what’s in your wallet?
About Me
Hima Mandali
What is Capital One?
Capital One at a Glance
A leading diversified bank with $339.1 billion in assets, $235.8 billion in loans and $221.1 billion in deposits [1]
4th largest credit card issuer in the U.S. [4]
The 3rd largest issuer of small business credit cards in the U.S. [5]
Largest U.S. direct bank [7]
8th largest bank based on U.S. deposits [2]
More than 65 million customer accounts and 45,000 associates
A FORTUNE 500 Company - #112
1) Source: Company reported data as of Q2'16
2) Source: FDIC, domestic deposits ranking as of Q2'16
3) Source: FDIC, June 2015, deposits capped at $1B per branch
4) Source: Company-reported domestic credit card outstandings, Q2'16
5) Source: The Nilson Report, Issue 1089, June 2016
6) Note: Financial institutions include banks & specialty finance lenders; Source: AutoCount, FY 2015
7) Source: FDIC, company reports as of Q2'16
Digital customer touch points
What's really in your wallet?
Mobile First
Continuous automated test runs in the DevOps pipeline
Opportunities
Why is the solution needed?
Observed minor discrepancies between real devices and simulator/emulator tests
Need for increased automated test coverage
Immediate feedback
Quality delivery to production on an hourly basis
Faster test executions and faster integrated automated deployment pipeline
Appium, Cucumber, Galen, Maven, JVM, Java/Selenium
Layout Testing
Functional Testing
Layout Test
HTML Test Reports
Functional Test
Functional and Layout Testing
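The pipeline described here is Java-based (Cucumber, Galen, Selenium); purely to illustrate the Appium session flow, the same shape in the Appium Python client looks like the sketch below. Every capability value and element ID is hypothetical.

from appium import webdriver

caps = {
    "platformName": "iOS",
    "deviceName": "iPhone Simulator",
    "app": "/path/to/wallet-app.app",  # hypothetical app under test
}
driver = webdriver.Remote("http://localhost:4723/wd/hub", caps)
try:
    # Functional check: logging in lands on the accounts screen.
    driver.find_element_by_accessibility_id("username").send_keys("demo")
    driver.find_element_by_accessibility_id("password").send_keys("secret")
    driver.find_element_by_accessibility_id("signIn").click()
    assert driver.find_element_by_accessibility_id("accounts").is_displayed()
finally:
    driver.quit()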
Demo
Test results in Pipeline
"Product" or "Program" Level Dashboard View
Our DevOps dashboard is open sourced
QUESTIONS?
Smart Test Execution:
Using Automated Test Run Statistics to Optimize Their Execution
What is Unity?
Unity Editor
Asset Store
Unity Services
INDUSTRY-LEADING MULTIPLATFORM SUPPORT
5.5M Users
Registered with Unity
238K Unique Games
Downloaded in Q2 2016
770M people
Playing Unity-made games
1.7B Mobile Devices
Running Unity-made games
Unity Code Base
Unity Test Automation
31,261 Tests
Test runs per
Build farm
Bottleneck
Why?
Branches
Getting to the main branch ‘trunk’
Pull Requests have to go through:
ABV is slow :(
3-6 hours
Can we speed it up?
A zoo of ways to execute tests
Unified Test Runner (UTR) & Hoarder
UTR: a unified way to run tests
Hoarder: all data on every test run
Evergreen tests
Smart Test Execution (Hoarder++)
Let’s not run tests:
[Diagram: an Integration Tests suite run on the Windows 64 Editor, with known-evergreen tests skipped; a sketch of the skipping rule follows.]
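A sketch of that skipping rule in Python, assuming a Hoarder-like store of per-test statistics; the field names and the 60-day evergreen window are assumptions, not Unity's actual policy.

from datetime import datetime, timedelta

EVERGREEN_WINDOW = timedelta(days=60)  # assumed threshold

def should_run(test_name, stats, now):
    # stats maps test name -> {'runs': int, 'last_failure': datetime or None}.
    record = stats.get(test_name)
    if record is None or record["runs"] == 0:
        return True                      # no history: always run
    if record["last_failure"] is None:
        return False                     # never failed: evergreen, skip
    return now - record["last_failure"] < EVERGREEN_WINDOW

suite = ["TestA", "TestB", "TestC"]
stats = {
    "TestA": {"runs": 500, "last_failure": None},                   # skipped
    "TestB": {"runs": 120, "last_failure": datetime(2016, 11, 1)},  # runs
}
now = datetime(2016, 11, 15)
to_run = [t for t in suite if should_run(t, stats, now)]  # ['TestB', 'TestC']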
Saved Time on Average :)
Integration Tests Mac 64 Editor
Next Steps
Results
Questions? Ideas? Comments...
Thank you
boris@unity3d.com
Selenium-based test automation for
Windows and Windows Phone
Who am I?
Nikolai Abalov
Software Development Engineer at 2GIS
NickAb
@nickab
Selenium
“Free and open protocol for testing that has become a de facto standard”
Appium
[Diagram: tests speak the WebDriver protocol over HTTP to a WebDriver implementation, which does the "magic" of driving the system under test.]
Get element text (JSON Wire Example)
GET http://127.0.0.1:9999/session/AwesomeSession/element/123/text HTTP/1.1
...
Content-Type: application/json;charset=UTF-8
Accept: application/json
HTTP/1.1 200 OK
Content-Type: application/json;charset=UTF-8
...
{"sessionId":"AwesomeSession", "status":0, "value":"GTAC"}
Selenium + Appium
|                             | Native   | Web       | Hybrid    |
| Selenium: Desktops          |          | ✓         |           |
| Appium: Android             | ✓        | ✓         | ✓         |
| Appium: iOS                 | ✓        | ✓         | ✓         |
| Appium: Windows Desktop NEW | ✓ (BETA) | Limited * | Limited * |
* using UISpy I was able to locate elements in Internet Explorer, but not in Chrome
Winium
Windows Phone Automation
Winium.Mobile
| Selenium based   | ✓ |
| Emulators        | ✓ |
| Devices          | ✗ * |
| Native apps      | ✓ (StoreApps, Silverlight) |
| Hybrid/Web apps  | ✓ (WIP) |
| AUT modification | Required |
| Additional       | Selenium Grid support, app/files deployment, Inspector, extended commands |
[Diagram: tests speak the WebDriver protocol over HTTP to Winium.Mobile.Driver, which controls the emulator through the XDEVirtualMachine API and talks over an internal HTTP API to the Automation Server embedded in the app under test.]
How to start
Prepare the app
#if DEBUG
AutomationServer.Instance.InitializeAndStart();
#endif // DEBUG
Write tests
import unittest
from selenium.webdriver import Remote

class TestMainPage(unittest.TestCase):
    desired_capabilities = {"app": "C:\\YourAppUnderTest.appx"}  # files, ...

    def setUp(self):
        self.driver = Remote(command_executor="http://localhost:9999",
                             desired_capabilities=self.desired_capabilities)

    def test_button_tap_should_set_textbox_content(self):
        self.driver.find_element_by_id('SetButton').click()
        assert 'CARAMBA' == self.driver.find_element_by_id('MyTextBox').text

    def tearDown(self):
        self.driver.quit()
Write tests
(WIP) You can switch between native and web contexts
import unittest
from appium.webdriver import Remote
# switch to WebView and act on web elements to login, etc.
self.driver.switch_to.context('WEBVIEW_1')
self.driver.find_element_by_tag_name('h1').click()
# switch back to Native app and act on native elements
self.driver.switch_to.context('NATIVE_APP')
Extended commands
# direct use of Windows Phone automation APIs
app_bar_button = driver.find_element_by_id('GoAppBarButton')
driver.execute_script('automation: InvokePattern.Invoke', app_bar_button)
list_box = driver.find_element_by_id('MyListBox')
si = {'v': 'smallIncrement', 'count': 10}
driver.execute_script('automation: ScrollPattern.Scroll', list_box, si)
# setting value of public property
text_box = driver.find_element_by_id('MyTextBox')
driver.execute_script('attribute: set', text_box, 'Width', 10)
driver.execute_script('attribute: set', text_box, 'Background.Opacity', 0.3)
Run
Demo
Winium.StoreApps.CodedUi
| Selenium based   | ✓ |
| Emulators        | ✓ |
| Devices          | ✓ |
| Native apps      | ✓ (StoreApps, Silverlight) |
| Hybrid/Web apps  | ✓ (Limited) |
| AUT modification | Not Required (Appium Rule 1) |
| Additional       | Requires a premium Visual Studio license; proof of concept |
[Diagram: tests speak the WebDriver protocol over HTTP to Winium.StoreApps.Driver, which drives the emulator through the XDEVirtualMachine API and vs.test.console, using CodedUI and an internal API to reach the app under test.]
Windows Desktop Automation
Winium.Desktop
| Selenium based    | ✓ |
| Native apps       | ✓ (WPF, WinForms, any accessible app) |
| Hybrid/Web apps   | Limited * |
| AUT modification  | Not Required (Appium Rule 1) |
| Additional        | Full desktop access, extended commands |
| Limitations/Notes | Uses real mouse and keyboard events, i.e. you can't run more than one session on the same machine or use the mouse while tests are running ** |
* using UISpy I was able to locate elements in Internet Explorer, but not Chrome
** solvable by using a Windows emulator or an RDP child session directly
[Diagram: tests speak the WebDriver protocol over HTTP to Winium.Desktop.Driver, which uses Winium.Cruciatus and the desktop UI Automation framework to drive the app under test.]
Demo
Automating calc.exe (python bindings)
# setup
from selenium.webdriver import Remote

dc = {'app': 'C:/windows/system32/calc.exe'}
driver = Remote(command_executor='http://localhost:9999', desired_capabilities=dc)
# test
window = driver.find_element_by_class_name('CalcFrame')
menu = window.find_element_by_id('MenuBar')
view_menu_item = menu.find_element_by_name('View')
view_menu_item.click()
view_menu_item.find_element_by_name('Scientific').click()
window.find_element_by_id('132').click() # 2
window.find_element_by_id('97').click() # ^
window.find_element_by_id('138').click() # 8
window.find_element_by_id('121').click() # =
result = window.find_element_by_id('150').get_attribute('Name')
assert '256' == result
# teardown
driver.close()
Automating calc.exe (C# bindings)
// setup
var dc = new DesiredCapabilities();
dc.SetCapability("app", @"C:/windows/system32/calc.exe");
var driver = new RemoteWebDriver(new Uri("http://localhost:9999"), dc);
// test
var window = driver.FindElementByClassName("CalcFrame");
var menu = driver.FindElement(By.Id("MenuBar"));
var viewMenuItem = menu.FindElement(By.Name("View"));
viewMenuItem.Click();
viewMenuItem.FindElement(By.Name("Scientific")).Click();
window.FindElement(By.Id("132")).Click(); // 2
window.FindElement(By.Id("97")).Click(); // ^
window.FindElement(By.Id("138")).Click(); // 8
window.FindElement(By.Id("121")).Click(); // =
var result = window.FindElement(By.Id("150")).GetAttribute("Name");
Assert.AreEqual("256", result);
// teardown
driver.Close();
More open source automation tools
VMMaster (Browsers Cloud) https://github.com/2gis/vmmaster
STF Utils https://github.com/2gis/stf-utils
Thank you!
Nikolai Abalov
https://github.com/2gis/Winium
https://github.com/2gis/Winium.Mobile
NickAb
@nickab
Lunch!
The Quirkier Side of Testing
Brian Vanpee, Senior Test Engineer Extraordinaire
November 15 2016
An Odd Trait or Behaviour
Programming Languages Have Quirks
A few examples...
Objective-C
HMCharacteristicValueLockMechanismLastKnownActionUnsecuredUsingPhysicalMovementExterior
HMCharacteristics.h
JavaScript
> null == undefined;
< true
> isFinite(null);
< true
> isFinite(undefined);
< false
> null === undefined;
< false
Try it in your Chrome browser right now! View -> Developer -> JavaScript Console
JavaScript
> Math.max(3, 0);
< 3
> Math.max(3, true);
< 3
> Math.max(-1, [1]);
< 1
JavaScript
> Math.max(-1, []);
< 0
> Math.max(-1, undefined);
< NaN
> Math.max(-1, null);
< 0
> var a = null + 1
< 1
Some quirks are found in many languages...
> String me = new String("\u00DFrian Vanpee"); // ßrian Vanpee
> System.out.println(me + " is now " + me.toUpperCase());
< ßrian Vanpee is now SSRIAN VANPEE
For a Fascinating Read go to https://goo.gl/G7QAI
Some quirks are specific to a language...
> int a = 0; System.out.println(a++ + " is the new " + a++);
< 0 is the new 1
> int a = 0; cout << a++ << " is the new " << a++ << endl;
< 1 is the new 0
// In C:
> int a = 0; printf ("%d is the new %d", a++,a++);
< 1 is the new 0
Sequence Points
A sequence point is a point in time at which side effects which have been seen so far are guaranteed to be complete.
Wikipedia
Source: https://en.wikipedia.org/wiki/Sequence_point
Sequence Points
Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression.
Section 6.5 - “Expressions” of the C99 Standard
Source: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf
Sequence Points
> int a = 0; cout << a++ << " is the new " << a++ << endl;
< 1 is the new 0
> int b = 0; b = b++; b = b++; cout << "b is " << b << endl;
< b is 0    (b = b++; behaves like b = b; — the increment is lost)
> int c = 0; c = ++c; c = ++c; cout << "c is " << c << endl;
< c is 2    (c = ++c; behaves like c = c + 1;)
Finding the Quirk is the real fun!
What does this do?
> return 1, 2, 3;
The comma operator (in C/C++, JS, Perl)
LHS , RHS Always evaluate LHS, Always evaluate RHS
LHS && RHS Always evaluate LHS, If LHS is true evaluate RHS
LHS || RHS Always evaluate LHS, If LHS is false evaluate RHS
Note: not every comma is an operator!
E.g. for (i = 0, j = 10; i <= 10; i++, j--) { ... }
Comma Operator in C/C++, JS, Perl
> return 1, 2, 3;
< 3
> return(1), 2, 3;
< 3
> return a=1, b=2, c=3;
< 3
> return 1, return 2, return 3;
< Syntax Error!
What value for i prints ‘Well Done’?
if (i == -i) {
System.out.print("Well ");
}
System.out.println("Done!");
C/C++ : int i = 0;
OBJ-C : int i = 0;
JAVA : int i = 0;
JS : var i = 0;
PHP : $i = 0;
What value for i prints ‘Well Done’?
if (i == -i && i != 0) {
System.out.print("Well ");
}
System.out.println("Done!");
C/C++ : int i = INT_MIN;
OBJ-C : int i = INT_MIN;
JAVA : int i = Integer.MIN_VALUE;
JS :
PHP :
Why this works
Recall that the range of a 32-bit integer is -2^31 to 2^31 - 1
-2^31 = 0x80000000 -> 1 sign bit, all 0’s
Negating Integer.MIN_VALUE gives:
Integer.MAX_VALUE + 1 = 0x7FFFFFFF + 1 = 0x80000000
In other words, Integer.MIN_VALUE is its own negation!
What value for i prints ‘Well Done’?
if ($i == $i + 1) {
echo ("Well ");
}
echo ("Done!");
C/C++ : float i = INFINITY;
OBJ-C : double i = INFINITY;
JAVA : double i = Double.POSITIVE_INFINITY;
JS : var i = Number.POSITIVE_INFINITY;
PHP : $i = INF;
What value for i prints ‘Well Done’?
if (i != i) {
System.out.print("Well ");
}
System.out.println("Done!");
C/C++ : double i = 0.0 / 0.0; OR double i = NAN; OR #define i (a++)
OBJ-C : double i = NAN;
JAVA : double i = Double.NaN;
JS : var i = Number.NaN;
PHP : $i = NAN;
What does this do?
int i = 5 - - - - - - - - 6;
while (i --> 1) { System.out.println(i + "..."); }
> 10...9...8...7...6...5...4...3...2...1...
Equivalent to:
int i = 5 - (- (- (- (- (- (- (- 6)))))));
while (i-- > 1) { System.out.println(i + "..."); }
Next Presentation!
(Thank you!)
ML Algorithm for Setting up a Mobile Test Environment
Rajkumar J. Bhojan
Principal Consultant,
Wipro Technologies.
GTAC - 2016
Agenda
Rajkumar J Bhojan (Raj)
Mobile Test Automation Environment
OS Platform
Network Environment
Testing Types
Mobile Devices?
Mobile Testing Tools
Mobile Device Management – Challenges
Devices Released Every Year
Refresh Every 12-24 months
New Capabilities (Camera, GPS, Orientation, Voice, etc.)
Multiple Carriers, Network Switching, Disconnected Use, Throughput
Expensive to Buy
Mobile Device Management – Challenges (contd.)
Regional Networks – 400 Network Operators
Processing Speed, Memory, Communication Protocols
Non-Availability of Devices
Tight Deadline – Faster time to Market
Selecting Real Devices
Selecting Optimal Devices
Selecting devices from a pool of hundreds or thousands of devices
How to classify these devices?
24,000 Android Devices
Decision Tree Algorithm
Sample Data
Table 1 – Training Data
| DEVICE_NAME | OS | MEMORY | MANUFACTURER | SCREEN_SIZE | SCR_RESOLUTION | PPI | RELEASE_DATE | TREG1 | SELECTION |
| Google Pixel XL | Android | 128 | Google | 5.5 | 1440x2560 | 534 | 16-Oct | 0 | 1 |
| iPhone 7 | iOS | 32 | Apple | 4.7 | 1334x750 | 326 | 16-Sep | 1 | 1 |
| Samsung Galaxy S6 | Android | 64 | Samsung | 5.1 | 1440x2560 | 515 | 15-Apr | 0 | 1 |
| iPhone 6 Plus | iOS | 128 | Apple | 5.5 | 1080x1920 | 401 | 14-Sep | 1 | 0 |
| iPad 2 | iOS | 256 | Apple | 9.7 | 1024x768 | 132 | 11-Mar | 1 | 1 |
| iPhone 5S | iOS | 256 | Apple | 4 | 640x1136 | 326 | 13-Sep | 0 | 0 |
| Samsung Galaxy S7 | Android | 128 | Samsung | 5.5 | 1440x2560 | 534 | 16-Mar | 0 | 1 |
| Samsung Galaxy Note4 | Android | 256 | Samsung | 10.1 | 800x1280 | 149 | 14-Feb | 1 | 1 |
| iPhone 5S | iOS | 64 | Apple | 4 | 640x1136 | 326 | 13-Sep | 0 | 0 |
| Google Pixel XL | Android | 128 | Google | 5.5 | 1440x2560 | 534 | 16-Oct | 0 | 1 |
Classification Graphs
Demo Screen Shot
#Import all required Library
from pandas import Series, DataFrame
import pandas as pd
import numpy as np
import os
import matplotlib.pylab as plt
from sklearn.cross_validation import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import classification_report
import sklearn.metrics
from sklearn import tree
import warnings
warnings.filterwarnings('ignore')
#Load the data from CSV files
df = pd.read_csv("D:/DataSet/Device_demo.csv")
# Clean the Data
data_clean = df.dropna()
data_clean.dtypes
data_clean.describe()
Demo Screen Shot
features = list(data_clean.columns[:7])
X = data_clean[features]
y = data_clean["SELECTION"]
clf = tree.DecisionTreeClassifier()
clf = clf.fit(X, y)
Demo Screen Shot
from sklearn.ensemble import RandomForestClassifier
clf = RandomForestClassifier(n_estimators=10)
clf = clf.fit(X, y)
# Existing data, previously selected
print(clf.predict([[1, 64, 3, 5.1, 1, 515, 4]]))   # -> 1
# Existing data, previously not selected
print(clf.predict([[2, 128, 2, 5.5, 3, 401, 5]]))  # -> 0
# New data with a different memory size
print(clf.predict([[1, 128, 3, 5.1, 1, 515, 4]]))  # -> 1
Takeaways:
1. Predicted value = a set of recommended devices for regression testing
2. Saves time & cost
3. Reduces device-based issues
Thank You
Rajkumar J. Bhojan
Wipro Technologies
Q & A
“Can you hear me now?” :
Surviving Audio Quality Testing
Alexander Brauckmann
Dan Hislop
Surviving Audio Quality Testing
Presenters
Introduction
Scale of the Challenge
Audio Test Architecture Vision
AQA as a Service
AQA Demo
Q&A
Dan Hislop
Alexander Brauckmann
Scale of the Challenge
Citrix SaaS Family
Scale of Citrix GoTo Audio
1.4 Billion
audio minutes per month
scale
14 Million
audio minutes per customer escalation
quality
Reduced by factor of 10
the # of servers hosting audio,
while audio minutes increased by 125%
efficiency
Testing Challenges to Solve
No audio is a dealbreaker for online meetings
Audio Test Architecture Vision
Audio Test Architecture Vision
Improve overall audio quality for Citrix customers
with unified internal teams
using best-in-class audio testing
Audio Quality Testing: Categories
Automating the top of the pyramid is the hardest to solve, requiring audio experts and product teams to work together.
Audio Quality Testing: Design Goals
Create a universal test solution that ALL teams can use to validate audio quality before release
Use audio expertise to create common components
Automated audio quality measurement pass/fail in CI after every audio-related check-in.
Provide tools & libraries for client teams to extend their existing automation.
Audio Quality Testing: Key Components
Standardized input files:
4 unique voices
Mock Mic: to inject audio file into client as if a person using a microphone
Mock Speaker: capture audio from client as if a person listening
Scoring Server: to compare the input to output and determine quality
[Diagram: (1) clean audio reference files are transferred to Client 1, which injects the audio; (2) the media flows to Client 2, which records it; the recording is transferred to the AQA Service, which analyzes the audio.]
Audio Quality Testing: Overview
Client specific: Win, Mac, Web, iOS, Android, Windows Mobile, PSTN
Common: standard input files
Common: AQA Server (POLQA license, frequency analysis)
[Diagram: clean audio reference files → Client 1 injects audio → media flow → Client 2 records audio → file transfer → AQA Service analyzes audio → final score and pass/fail.]
AQA as a Service
AQA Architecture and Implementation
Client
1: Upload files
2: Create job
3: Start job
4: Fetch result
AQA Service
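A hypothetical client for that four-step flow (upload files, create job, start job, fetch result); the endpoint paths and JSON fields below are assumptions for illustration, not the actual Citrix API.

import time
import requests

BASE = "http://aqa.example.com/api"  # hypothetical service address

def score_audio(reference_path, recorded_path):
    # 1: Upload the reference and recorded files.
    with open(reference_path, "rb") as ref, open(recorded_path, "rb") as rec:
        files = requests.post(BASE + "/files",
                              files={"reference": ref, "degraded": rec}).json()
    # 2: Create a scoring job for the uploaded pair.
    job = requests.post(BASE + "/jobs",
                        json={"files": files["ids"], "measure": "POLQA"}).json()
    # 3: Start the job.
    requests.post("{}/jobs/{}/start".format(BASE, job["id"]))
    # 4: Poll until the result (a MOS score) is available.
    while True:
        result = requests.get("{}/jobs/{}".format(BASE, job["id"])).json()
        if result["status"] == "done":
            return result["mos"]
        time.sleep(5)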
Audio Quality Testing
POLQA (“Perceptual Objective Listening Quality Assessment”)
Reference/Played-out Signal
Degraded/Recorded Signal
MOS Score:
Audio Presence Testing
Frequency Analysis
Speech Presence
Amplitude Analysis
[Chart: amplitude analysis distinguishing speech from misc. tones at 0.5 kHz and 1 kHz; relative amplitudes around 0.9, 0.5, and 0.1.]
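A minimal sketch of the frequency-analysis idea: check that a recording actually contains energy near an expected tone (say 1 kHz) as a cheap audio-presence gate before full POLQA scoring. The band width and energy-ratio threshold are assumptions.

import numpy as np

def tone_present(samples, sample_rate, tone_hz=1000.0, band_hz=50.0, min_ratio=0.5):
    # True if the band around tone_hz holds at least min_ratio of the signal energy.
    spectrum = np.abs(np.fft.rfft(samples)) ** 2
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
    band = (freqs > tone_hz - band_hz) & (freqs < tone_hz + band_hz)
    total = spectrum.sum()
    return total > 0 and spectrum[band].sum() / total >= min_ratio

# Example: a clean 1 kHz sine passes, silence fails.
sr = 16000
t = np.arange(sr) / sr
assert tone_present(np.sin(2 * np.pi * 1000 * t), sr)
assert not tone_present(np.zeros(sr), sr)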
Demo
Q & A
IATF: An Automated Cross-platform and Multi-device API Test Framework
Yanbin Zhang
Intel Software and Services Group
What is IATF
IATF - Interactive API Test Framework
An Automated Cross-platform and Multi-device API Test Framework
Why we need IATF
Project background
Intel® Collaboration Suite for WebRTC community at http://webrtc.intel.com
Project background: http://webrtc.intel.com
Telecom
A Growing Ecosystem of Intel® Collaboration Suite for WebRTC
Medical
Industry
Cloud
Many other Usages
Social Media
Online
Broadcasting
Education
Wearable
Why we need IATF
[Diagram: two peers exchange stream 1 directly, with signaling relayed through the P2P servers.]
Peer-to-Peer Video Communication with Peer Server
Project background
Why we need IATF
[Diagram: clients publish streams 1–3 to the RTC servers; subscribers receive individual streams or mixed streams 1+2+3 in SD or HD.]
Project background
MCU-based multi-party video conference communication modes
Why we need IATF
How to automatically test interoperability across these various SDKs on different platforms became a big problem
JavaScript
C++
Challenges of our WebRTC SDK Testing
Test steps in a single test case run on different devices and in different programming languages
Test steps are tied to and depend on other steps
Interaction and Dependence
Challenges of our WebRTC SDK Testing
Cannot depend on a UI automation framework.
No existing test automation framework supports designing and running this kind of interactive API test case
No UI application
No existing API test framework
Not only for WebRTC
Cross-platform and multi-device gaming SDK testing
Cross-platform and multi-device messaging SDK testing
……
Any SDK supporting multi-user communication!
IATF Design Philosophy
Deploy test cases automatically on Android, iOS, Windows, Linux, etc.
- Real-time status sharing and control.
- Easy to integrate with third-party API test framework
Cross-platform support
Cross-device communication
Extensible
Test case workflow
Peer Server: the Intel reference signaling server implementation
[Diagram: Alice (Intel CS for WebRTC JS SDK) and Bob (Intel CS for WebRTC Android SDK) signal through the Peer Server over Socket.io; media flows over RTP.]
1. Alice connects. 2. Bob connects. 3. Alice invites Bob. 4. Bob receives the invitation. 5. Bob accepts the invitation. 6. Alice receives the accepted message. 7. Alice publishes her local stream to Bob. 8. Bob publishes his local stream to Alice.
Typical WebRTC P2P test scenario
Demo – WebRTC P2P test scenario
Application using Intel WebRTC JS SDK <-> Intel WebRTC Android SDK
Demo
WebRTC P2P test scenario with IATF
IATF Logical and Physical Structures
IATF General Execution Steps
Test Case Building and Deployment
Run Test
Get test result and generate test reports
Clean test environment
Read test environment configuration
Check the test devices status
Prepare test devices
Read, build and deploy test cases
Test steps sequence controller
Devices status monitor
Test Case Building and Deployment
[Diagram: the IATF Controller Server deploys test cases to Alice and Bob; the Peer Server runs the Intel reference implementation.]
1. Start the lock server.
2. Check the test devices status and prepare test devices (karma start …, adb install).
3. Read the configuration and clean the test environment:
port=10086
adbPath=/usr/bin/adb
antPath=/usr/bin/ant
shellPath=/bin/sh
karmaPath=/usr/lib/node_modules/karma/bin/karma
………
Sequence control message
Test Sequence Control with IATF
[Diagram: IATF coordinates test clients written in JavaScript, C++, Objective-C, and Java, running on Chrome, Firefox, Windows, iOS, and Android.]
Key problem we should solve
Communication
Test Sequence Control with IATF
Platform-dependent APIs are provided below; a lockEvent can be defined per your test scenario's steps (usage sketched below).
Connect("IATF Server Address")
WaitLock("lockEvent")
NotifyLock("lockEvent")
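A self-contained sketch of the WaitLock/NotifyLock semantics, using threads in place of two real devices. The real IATF primitives go over HTTP to the lock-server module; this only illustrates the ordering guarantee they provide, and the event names are invented.

import threading

class LockServer:
    def __init__(self):
        self._events = {}
        self._mutex = threading.Lock()

    def _event(self, name):
        with self._mutex:
            return self._events.setdefault(name, threading.Event())

    def wait_lock(self, name, timeout=30):
        # Block this test client until another client notifies the event.
        assert self._event(name).wait(timeout), "lock %r timed out" % name

    def notify_lock(self, name):
        self._event(name).set()

server = LockServer()
log = []

def alice():
    log.append("Alice: invite Bob")       # SDK API call under test
    server.notify_lock("invited")
    server.wait_lock("accepted")
    log.append("Alice: publish stream")   # runs only after Bob accepted

def bob():
    server.wait_lock("invited")
    log.append("Bob: accept invitation")  # SDK API call under test
    server.notify_lock("accepted")

threads = [threading.Thread(target=f) for f in (alice, bob)]
for t in threads: t.start()
for t in threads: t.join()
assert log == ["Alice: invite Bob", "Bob: accept invitation",
               "Alice: publish stream"]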
Test Sequence Control with IATF
[Diagram: the IATF lock server module maintains a waiting list (join, remove, search). TestClient1 calls WaitLock("Invite to TestClient1") and stays Locked (via WaitRemote()) until TestClient2 calls NotifyLock("Invite to TestClient1"), whose OnNotify() sets it back to Unlocked.]
Test Steps Sequence Control Example
[Sequence diagram: Alice and Bob run against the IATF Test Server and the Peer Server.
1. Both clients connect (Status: Connect), call WaitLock(startTest) and sit Locked until the server sends NotifyLock(startTest).
2. Alice calls the Invite(Bob) API, then WaitLock(Accept, From: Bob) (Status: Locked).
3. Bob, Locked on WaitLock(invite, From: Alice), is released by NotifyLock(invite, From: Alice): his onInvited event is triggered and his status becomes Unlocked.]
Demo – Intel WebRTC P2P test scenario with IATF
Android SDK <-> Android SDK
Demo - Intel CS for WebRTC Conference test scenario with IATF
[Diagram: five clients each publish a stream (1–5) to the MCU servers, and each subscribes to the mixed stream 1+2+3+4+5.]
Questions?
Thank You!
Legal Notices and Disclaimers
Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Learn more at intel.com, or from the OEM or retailer. No computer system can be absolutely secure. Tests document performance of components on a particular test, in specific systems. Differences in hardware, software, or configuration will affect actual performance. Consult other sources of information to evaluate performance as you consider your purchase. For more complete information about performance …
This document contains information on products, services and/or processes in development. All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest forecast, schedule, specifications and roadmaps. Statements in this document that refer to Intel’s plans and expectations for the quarter, the year, and the future, are forward-looking statements that involve a number of risks and uncertainties. A detailed discussion of the factors that could affect Intel’s results and plans is included in Intel’s SEC filings, including the annual report on Form 10-K.
The products described may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request. No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document. Intel does not control or audit third-party benchmark data or the web sites referenced in this document. You should visit the referenced web site and confirm whether referenced data are accurate.
Intel and the Intel logo are trademarks of Intel Corporation in the United States and other countries. *Other names and brands may be claimed as the property of others. © 2016 Intel Corporation.
Using Formal Concept Analysis in software testing
Fedor Strok
Model-Based
[Diagram: requirements feed a model, which yields abstract tests; the implementation yields the system; tests run against the system; test output: fail or pass.]
Models
Model Driven Engineering
Alternatives
Online: uses the SUT
Offline: generates test cases that can be executed later
Possible implementations
State Machine
Theorem proving
Constraint programming
Markov chain
Input parameter modelling
Formal Concept Analysis
G – set of objects
M – set of attributes
I – binary relation between G and M; interpretation: the object has the attribute
Formal Concepts and Implications
An implication A → B holds if every object that has all attributes from A also has all attributes from B.
Armstrong rules: reflexivity (if B ⊆ A, then A → B), augmentation (A → B implies A ∪ C → B ∪ C), and transitivity (A → B and B → C imply A → C).
Attribute Exploration
Interactive procedure of exploring the domain
Can be started with an arbitrary set of data
Works with 2 types of data:
set of implications
set of counter-examples
demo available: https://github.com/orivej/fca
Example
Simple program
Input: one number
Output: features of the given number (even, factorial, divided_by_three, odd, prime) — see the sketch below
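The running example as code, so each row of the tables below is just attributes(n). One assumption: "factorial" is read as "n equals k! for some k", which matches the counter-examples 1, 2 and 6 used below.

def attributes(n):
    def is_prime(m):
        return m > 1 and all(m % d for d in range(2, int(m ** 0.5) + 1))

    def is_factorial(m):
        f, k = 1, 1
        while f < m:
            k += 1
            f *= k
        return f == m

    return {
        "even": n % 2 == 0,
        "factorial": is_factorial(n),
        "divided_by_three": n % 3 == 0,
        "odd": n % 2 == 1,
        "prime": is_prime(n),
    }

def implication_holds(objects, premise, conclusion):
    # A -> B holds on the examples seen so far if every object having all
    # attributes of the premise also has all attributes of the conclusion.
    for n in objects:
        a = attributes(n)
        if all(a[p] for p in premise) and not all(a[c] for c in conclusion):
            return False  # n is a counter-example
    return True

examples = [2, 5, 6, 1, 8, 9, 3, 12]
assert implication_holds(examples, ["factorial", "prime"], ["even"])
assert not implication_holds(examples, ["even"], ["factorial"])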
Example
Does the implication hold?
→ even, factorial, divided by three, odd, prime
G\M | even | factorial | Divided by three | odd | prime |
2 | X | X | | | X |
Example
Does the implication hold?
→ even, factorial, prime
G\M | even | factorial | Divided by three | odd | prime |
2 | X | X | | | X |
5 | | | | X | X |
Example
Does the implication hold?
→ prime
G\M | even | factorial | Divided by three | odd | prime |
2 | X | X | | | X |
5 | | | | X | X |
6 | X | X | X | | |
Example
Does the implication hold?
factorial → even
G\M | even | factorial | Divided by three | odd | prime |
2 | X | X | | | X |
5 | | | | X | X |
6 | X | X | X | | |
1 | | X | | X | |
Example
Does the implication hold?
even → factorial
G\M | even | factorial | Divided by three | odd | prime |
2 | X | X | | | X |
5 | | | | X | X |
6 | X | X | X | | |
1 | | X | | X | |
8 | X | | | | |
Example
Does the implication hold?
divided_by_three → even, factorial
G\M | even | factorial | Divided by three | odd | prime |
2 | X | X | | | X |
5 | | | | X | X |
6 | X | X | X | | |
1 | | X | | X | |
8 | X | | | | |
9 | | | X | X | |
Example
Does the implication hold?
factorial, divided by three → even
Yes: the first factorial divisible by 3 (namely 3! = 6) is also divisible by 2
G\M | even | factorial | Divided by three | odd | prime |
2 | X | X | | | X |
5 | | | | X | X |
6 | X | X | X | | |
1 | | X | | X | |
8 | X | | | | |
9 | | | X | X | |
Example
Does the implication hold?
prime, divided by three → even, factorial, odd
G\M | even | factorial | Divided by three | odd | prime |
2 | X | X | | | X |
5 | | | | X | X |
6 | X | X | X | | |
1 | | X | | X | |
8 | X | | | | |
9 | | | X | X | |
3 | | | X | X | X |
Example
Does the implication hold?
prime, divided by three → odd
Yes: 3 is the only such object
G\M | even | factorial | Divided by three | odd | prime |
2 | X | X | | | X |
5 | | | | X | X |
6 | X | X | X | | |
1 | | X | | X | |
8 | X | | | | |
9 | | | X | X | |
3 | | | X | X | X |
Example
Does the implication hold?
even, odd → factorial, prime, divided by three
The left-hand side is always false
G\M | even | factorial | Divided by three | odd | prime |
2 | X | X | | | X |
5 | | | | X | X |
6 | X | X | X | | |
1 | | X | | X | |
8 | X | | | | |
9 | | | X | X | |
3 | | | X | X | X |
Example
Does the implication hold?
even, divided by three → factorial
G\M | even | factorial | Divided by three | odd | prime |
2 | X | X | | | X |
5 | | | | X | X |
6 | X | X | X | | |
1 | | X | | X | |
8 | X | | | | |
9 | | | X | X | |
3 | | | X | X | X |
12 | X | | X | | |
Example
Does the implication hold?
even, prime → factorial
Yes: 2
G\M | even | factorial | Divided by three | odd | prime |
2 | X | X | | | X |
5 | | | | X | X |
6 | X | X | X | | |
1 | | X | | X | |
8 | X | | | | |
9 | | | X | X | |
3 | | | X | X | X |
12 | X | | X | | |
Example
Does the implication hold?
factorial, prime → even
Yes: 2
End
G\M | even | factorial | Divided by three | odd | prime |
2 | X | X | | | X |
5 | | | | X | X |
6 | X | X | X | | |
1 | | X | | X | |
8 | X | | | | |
9 | | | X | X | |
3 | | | X | X | X |
12 | X | | X | | |
Example
factorial, prime → even
factorial, divided by three → even
prime, divided by three → odd
even, odd → factorial, prime, divided by three
even, prime → factorial
G\M | even | factorial | Divided by three | odd | prime |
2 | X | X | | | X |
5 | | | | X | X |
6 | X | X | X | | |
1 | | X | | X | |
8 | X | | | | |
9 | | | X | X | |
3 | | | X | X | X |
12 | X | | X | | |
MBT approach
Even: 1,0
Factorial: 1, 0
Divided_by_three: 1, 0
Odd: 1, 0
Prime: 1, 0
MBT: no conditions
even | factorial | Divided_by_three | odd | prime |
1 | 1 | 0 | 0 | 1 |
0 | 0 | 1 | 1 | 0 |
1 | 0 | 1 | 1 | 1 |
0 | 1 | 0 | 1 | 0 |
1 | 0 | 1 | 0 | 0 |
0 | 1 | 1 | 0 | 1 |
0 | 0 | 0 | 0 | 0 |
MBT: same set
Even: 1,0
Factorial: 1, 0
Divided_by_three: 1, 0
Odd: 1, 0
Prime: 1, 0
Result: 12, 9, 6, 3, 2, 1
Implications
IF [Even] = 1 THEN [Odd] = 0 ELSE [Odd] = 1;
IF [Odd] = 1 AND [Factorial] = 1 THEN [Result] = 1;
IF [Even] = 1 AND [Prime] = 1 THEN [Result] = 2;
IF [Divs3] = 1 AND [Prime] = 1 THEN [Result] = 3;
IF [Divs3] = 1 AND [Even] = 1 THEN [Result] IN 6, 12;
IF [Divs3] = 1 AND [Odd] = 1 AND [Prime] = 0 THEN [Result] = 9;
IF [Even] = 1 AND [Factorial] = 1 AND [Divs3] = 0 THEN [Result] = 2;
Data driven implications
IF [Result] = 2 THEN [Even] = 1 AND [Factorial] = 1 AND [Divs3] = 0 AND [Odd] = 0 AND [Prime] = 1;
IF [Result] = 3 THEN [Even] = 0 AND [Factorial] = 0 AND [Divs3] = 1 AND [Odd] = 1 AND [Prime] = 1;
IF [Result] = 6 THEN [Even] = 1 AND [Factorial] = 1 AND [Divs3] = 1 AND [Odd] = 0 AND [Prime] = 0;
IF [Result] = 9 THEN [Even] = 0 AND [Factorial] = 0 AND [Divs3] = 1 AND [Odd] = 1 AND [Prime] = 0;
IF [Result] = 12 THEN [Even] = 1 AND [Factorial] = 0 AND [Divs3] = 1 AND [Odd] = 0 AND [Prime] = 0;
MBT result
even | factorial | Divided by three | odd | prime | Result |
0 | 0 | 1 | 1 | 1 | 3 |
1 | 1 | 1 | 0 | 0 | 6 |
0 | 0 | 1 | 1 | 0 | 9 |
1 | 0 | 1 | 0 | 0 | 12 |
1 | 1 | 0 | 0 | 1 | 2 |
0 | 1 | 0 | 1 | 0 | 1 |
Pros
Generated output has 2 parts
The set of attributes can be extended:
White-box
We can use attribute exploration for covering ifs
Attribute negation
Lattice usage
Formal concepts could be partially ordered by subsumption on extents/intents
All concepts form a lattice
Can be used for analysis of test reports
Lattice usage example
G\M | failed | https | login | messages |
1 | X | X | X | |
2 | | X | | X |
3 | X | | X | |
4 | | | | X |
Lattice usage
Iceberg analysis is equivalent to finding the most common descriptions of failed tests
In big systems, the lattice is a good representation for finding similar functionality
Future work
Using non-binary attributes
Test failure hypothesis generation
+7 (915) 103-79-28
Thank you
Fedor Strok
Group head
skype@
bitbucket
github
fdrstrok@
Contacts
Test Flakiness @Google
Predicting & Preempting Flakes
By: John Micco (Google Inc., jmicco@google.com) and Atif Memon (University of Maryland, College Park, atif@cs.umd.edu)
Flaky Tests
Test Flakiness is a huge problem
A flaky test is one observed to both pass and fail with the same code
We observe that 84% of transitions from Pass -> Fail are flakes!
Almost 16% of our 3.5M tests have some level of flakiness
Flaky failures frequently block and delay releases
We spend between 2 and 16% of our CI compute resources re-running flaky tests
Percentage of CI resources spent re-running flakes
% of testing compute hours spent on retrying flaky tests
Factors that cause flakes
Test case factors
Waits for resource
sleep()
Webdriver test
UI test
Code being tested
Multi-threaded
Execution environment/flags
Chrome
Android
...
[Venn diagram: flakes arise where the test case, the code being tested, and the execution environment overlap — e.g. UI tests, multi-threaded code, Android.]
Flakes are Inevitable
Continual rate of 1.5% of test executions reporting a "flaky" result
Despite large effort to identify and remove flakiness
Targeted "fixits"
Continual pressure on flakes
Observed insertion rate is about the same as fix rate
Conclusion: CI systems must be able to deal with a certain level of flakiness, preferably minimizing the cost to developers
Google's Continuous Integration System
Continuously runs 3.5M tests as changes are submitted
Only "triggers" (RTS) a test if the test depends (transitively) on the change
Each test runs in 2 distinct flag combinations (on average)
Records the pass / fail result for each test in a database
Each run is uniquely identified by the test + flags + change
We have 2 years of results for 3.5M tests
Just starting to do real analysis on this data
Life of a Test Execution
[Diagram: Developer Submission → Regression Test Selection → selected tests → Build Enqueuer → Build Queue → Scheduler → batches of tests to run → Massively Parallel Test Backend → Test Results; a Build Failure Retrier re-runs failures to flag flakes.]
Goal is to minimize the time between submission and test results provided to the developer, using minimum compute resources.
Flaky Test Infrastructure
We re-run test failure transitions (10x) to verify flakiness
If we observe a pass the test was flaky
Keep a database and web UI for "known" flaky tests
Speaker changes to Atif
I AM 97% CONFIDENT THAT THIS TEST RESULT IS FLAKY!
BECAUSE REAL TESTS DON’T “BEHAVE” LIKE THAT!
HOW CAN YOU TELL WITHOUT RE-RUNNING IT?
Flaky Test Infrastructure (continued)
Identifying Flaky tests without re-running them
Simple signal of P -> F -> P patterns to indicate flakiness (sketched after this list)
First models show promise - classifying 90% of the flakes correctly
Deviations highly likely to be flakes
Formally model flakes and their behavior
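A sketch of that simple signal: given a test's outcome history at the changelists where it ran, count pass/fail transitions (edges); two or more edges means the test both failed and recovered under the same code. The threshold of 2 is an assumption for illustration.

def edges(history):
    # history: 'P'/'F' outcomes in changelist order; '-' means affected but not run.
    runs = [r for r in history if r in "PF"]
    return sum(1 for a, b in zip(runs, runs[1:]) if a != b)

def looks_flaky(history, min_edges=2):
    return edges(history) >= min_edges

assert looks_flaky("P--F---P")    # pass -> fail -> pass: likely a flake
assert not looks_flaky("PPPFFF")  # a single transition: maybe a real breakage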
Modeling Test Target Behavior (via Edges)
//top/project/some_service_test
P – – F – – – F – – – P – – – – F –   (x-axis: CLs; a Pass→Fail transition is a negative edge, a Fail→Pass transition a positive edge)
Edge modeled as StartCL || EndCL || Length || POS/NEG
|          | All Edges | Confidently due to Flakes | Most likely not including Flakes |
| Positive | 574,282   | 485,435 (84.5%)           | 88,847 (15.5%)                   |
| Negative | 563,993   | 474,654 (84.2%)           | 89,339 (15.8%)                   |
Take away message: a small percentage (1.5–2%) of tests flake (TAP Spanner database / total targets in the Feb 11 – Mar 11 period), BUT they produce the majority of edges (edges are a better indicator of the overall impact of flakes)
(Legend: P = passed, F = failed, – = affected but not run)
I HYPOTHESIZE: FLAKES HAVE LARGER NUMBER OF EDGES PER TIME PERIOD.
[Figure: outcome histories of TEST 1 and TEST 2 over a 5-hour period]
[Chart: number of edges per target, by percentage flakes vs. not-flakes]
Take away message: Test targets with more edges in their history are more likely to be flakes.
(Number of edges = signal for flake detection)
Most likely Flakes not detected by TAP
Quantifying Flakiness
Compute Flakes Score
Extract vectors F+.*P+.*F+
Flakes scoring formula
Φ(number_of_vectors, [lengths_of_vectors])
Order tests by flakiness score
Top 404 (TAP agreed 100%)
991 of my top 1000 (99.1%)
4,948 of my top 5,000 (98.96%)
9,868 of my top 10,000 (98.68%)
19,372 of my top 20,000 (96.86%)
23,930 of my top 25000 (95.72%)
TAP agreement decreases only gradually down the ranked list — a good sign; a sketch of this scoring step follows.
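A guess at the shape of that scoring step: pull the F+.*P+.*F+ vectors out of each target's history and feed their count and lengths to some Φ. The Φ used here (more and longer fail-pass-fail vectors mean a higher score) is an assumption; the real formula is not shown in the talk.

import re

def flake_vectors(history):
    # Maximal stretches matching F+.*P+.*F+ : a fail, a later pass, a later fail.
    return [m.group() for m in re.finditer(r"F+.*P+.*F+", history)]

def phi(num_vectors, lengths):
    return num_vectors + sum(lengths) / 100.0  # assumed weighting

def flakiness_score(history):
    vectors = flake_vectors(history)
    return phi(len(vectors), [len(v) for v in vectors])

targets = {"t1": "PPFPFPPFP", "t2": "PPPPFFFFF", "t3": "PFPPPPFPF"}
ranked = sorted(targets, key=lambda t: flakiness_score(targets[t]), reverse=True)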
[Figure: outcome histories of TEST 1–TEST 4 over a 5-hour period; some tests share identical histories]
I HYPOTHESIZE: FLAKES ARE UNLIKELY TO SHARE THEIR HISTORIES WITH OTHERS.
Modeling Histories of Tests
[Figure: P/F/– outcome histories of tests t1–t8 over a window of changelists; several targets share identical histories]
“Length of Edge History” vs. Shared Outcomes
"Target History" = concatenation of all of a target's edges over the time period; multiple targets can share the same history.
[Chart: length of edge history vs. degree of sharing. Histories shared by only 2 targets contain ~20 edges, while 5000+ targets share a 2-edge history; there is no sharing along the y-axis, very little sharing in the sharing=2 column, and lots of sharing toward the x-axis.]
Take away message: Test targets that share history with other targets are very unlikely to be flakes.
(“degree of sharing” = signal for flake detection)
All flakes lie in the "no sharing" or "very little sharing" area.
I AM 90% CONFIDENT THAT THIS IS A REAL FAILURE, NOT FLAKY!
HOW CAN YOU TELL?
[Figure: signals around a failure — the test case code, the code under test, the code's author, the programming language, and the file modification frequency.]
BECAUSE THE CODE & TEST ARTIFACTS ARE CONSISTENT WITH A REAL FAILURE!
Statistical Models of Features Correlated with Real Edges
Relationship between code and test
Minrank
Code modification frequency
Source file type
Changelist authors
File modification by multiple authors
Relationship Between Code and Test
[Chart: Minrank population distribution — x: minranks, y: frequency]
[Chart: minranks for edge targets distribution — x: minranks, y: probability]
Take away message: Test target edges farther than Minrank = 10 are highly likely to be flaky.
Code Modification Frequency
[Chart: P(file in edge CL) vs. number of times the file appears in a CL]
Take away message: Failures associated with frequently modified code are highly unlikely to be flaky.
Source File Type
[Chart: frequency (in 1000s) of file types, by extension, among edge CLs]
Take away message: E.g., failures associated with config files are highly likely to be flaky.
Changelist Authors
Take away message: E.g., failures associated with the author product1-release are highly unlikely to be flaky.
File Modification by Multiple Authors
[Chart: fraction of breakages vs. number of unique users modifying the file]
Take away message: E.g., failures associated with code modified by 15 developers simultaneously are highly unlikely to be flaky.
THAT TEST!
IT’S A FLAKE!!
BUT YOU DID NOT RUN IT EVEN ONCE!
[Figure: the same signals — test case code, code under test, author, programming language, file modification frequency.]
TRUE! BUT THIS TEST HAS ALL THE ELEMENTS OF A FLAKE!
Future work: Predicting that a test will be a flake before running it
Need a holistic look at flakes
Identify factors that cause flakes
Test case factors
Waits for resource
sleep()
Webdriver test
UI test
Code being tested
Multi-threaded
Execution environment/flags
Chrome
Android
...
[Venn diagram: test case, code being tested, and execution environment overlap — e.g. UI tests, multi-threaded code, Android.]
Speaker changes to John
Research Collaboration
We are looking to collaborate with researchers / other companies
Doing pure research looking for correlations in our data set
Applying that research to improve our automation for scheduling and detecting flakiness
Largest test result data set / pool of tests on the planet
Join our journal review club
Monthly open review of academic papers / articles in the area of software testing
Q&A
Illustrations by
Conference Badge
Wear your conference badge at all times!
Save your conference badge for use on Day 2.
Dinner
See you tomorrow!
Thank you for viewing the GTAC 2016 live stream!
All sessions have been recorded and will be posted to the GoogleTechTalks YouTube channel after the conference is over.
See you tomorrow!