Data Analyst Associate �(DA-100)
Prepared by
MOHAN T
LECTURER, DEPT OF COMPUTER SCIENCE
GVPP GOVT FIRST GRADE COLLEGE
HAGARIBOMMANAHALLI
Microsoft Data Analytics
Data analysis
Data analysis is the process of identifying, cleaning, transforming, and modeling data to discover meaningful and useful information.
To analyse data, core components of analytics are divided into the following categories:
Descriptive analytics
Descriptive analytics help answer questions about what has happened based on historical data.
An example of descriptive analytics is generating reports to provide a view of an organization's sales and financial data.
Diagnostic analytics
Diagnostic analytics help answer questions about why events happened.
Predictive analytics
Predictive analytics help answer questions about what will happen in the future.
Cognitive analytics
Cognitive analytics attempt to draw inferences from existing data and patterns, derive conclusions based on existing knowledge bases, and then add these findings back into the knowledge base for future inferences, a self-learning feedback loop.
Prescriptive analytics
Prescriptive analytics help answer questions about which actions should be taken to achieve a goal or target.
Roles in data
Business analyst
A business analyst is closer to the business and is a specialist in interpreting the data that comes from the visualization.
Data analyst
Data engineer
Data scientist
Data scientists perform advanced analytics to extract value from data. Their work can vary from descriptive analytics to predictive analytics.
Database Administrator
A database administrator is responsible for the overall availability and consistent performance and optimizations of the database solutions.
Tasks of a Data Analyst
Prepare
Data preparation is the process of taking raw data and turning it into information that is trusted and understandable.
Model
Visualize
The visualization task is where you get to bring your data. The ultimate goal of the visualize task is to solve business problems.
Analyze
The analyze task is the important step of understanding and interpreting the information that is displayed on the report.
Manage
Power BI consists of many components, including reports, dashboards, workspaces, semantic models, and more. As a data analyst, you are responsible for the management of these Power BI assets, overseeing the sharing and distribution of items, such as reports and dashboards, and ensuring the security of Power BI assets.
____________is the process of identifying, cleaning, transforming, and modeling data to discover meaningful and useful information.
Data Analysis
_____________help answer questions about what has happened based on historical data.
Descriptive analytics
___________help answer questions about why events happened.
Diagnostic analytics
_____________help answer questions about what will happen in the future.
Predictive analytics
_____________help answer questions about which actions should be taken to achieve a goal or target.
Prescriptive analytics
A _______________is closer to the business and is a specialist in interpreting the data that comes from the visualization.
Business Analyst
Which data role enables advanced analytics capabilities specifically through reports and visualizations?
Data Analyst
___________perform advanced analytics to extract value from data.
Data Scientists
Which data analyst task has a critical performance impact on reporting and data analysis?
Model
The ________task is the important step of understanding and interpreting the information that is displayed on the report.
Analyze
Microsoft Power BI
Microsoft Power BI is a complete reporting solution that offers data preparation, data visualization, distribution, and management through development tools and an online platform.
Use Power BI
In order to create reports with Power BI, you must first understand the tools necessary. There are three primary components to Power BI:
Power BI Desktop is the development tool available to data analysts and other report creators.
Power BI service allows you to organize, manage, and distribute your reports and other Power BI items.
Power BI Mobile allows consumers to view reports in a mobile-optimized format.
Building blocks of Power BI
The building blocks of Power BI are semantic models and visualizations. Create a semantic model and then use visuals to build a report.
A semantic model consists of all connected data, transformations, relationships, and calculations. To follow the flow of Power BI, you first connect to data, transform data, and create relationships and calculations to create a semantic model.
In Power BI Desktop, when you create a visualization (also called visual), you add it to the canvas for a report page. Choose your visualizations to build pages in your report.
____________is a complete reporting solution that offers data preparation, data visualization, distribution, and management through development tools and an online platform.
Microsoft Power BI
What is the common flow of activity in Power BI?
Create a report in Power BI Desktop, share it to the Power BI service, and interact with reports in the service and Power BI Mobile.
Which of the following are building blocks of Power BI?
Semantic models and visualizations.
Prepare data for analysis with
Power BI
Power BI Desktop
Get data from files
Common types of flat files, which are
Get data from files
Power BI Desktop allows you to get data from many types of files.
You can find a list of the available options when you use the Get data feature in Power BI Desktop.
The first step is to determine which file location you want to use to export and store your data.
Flat file location
Get data from relational data sources
If your organization uses a relational database for sales, you can use Power BI Desktop to connect directly to the database.
Connecting to a SQL Server database to get data, Transact-Structured Query Language (T-SQL) language should you use to extract data
Connect to data in a relational database:
Get data from a NoSQL database
Some organizations don't use a relational database but instead use a NoSQL database. A NoSQL database (also referred to as non-SQL, not only SQL or non-relational) is a flexible type of database that doesn't use tables to store data.
Connect to a NoSQL database (Azure Cosmos DB)
Get data from Azure Analysis Services
Azure Analysis Services is a fully managed platform as a service (PaaS) that provides enterprise-grade semantic models in the cloud.
Connect to data in Azure Analysis Services
Get data from Azure Analysis Services
Connect live is an option for Azure Analysis Services. Azure Analysis Services uses the tabular model and DAX to build calculations, similar to Power BI.
If you want to directly query the Azure Analysis Services model, you can use multi-dimensional expressions (MDX) or data analysis expressions (DAX).
Common types of flat files, which are
All of the above
When connecting to a SQL Server database to get data, what language should you use to extract data?
T-SQL
You're creating a Power BI report with data from an Azure Analysis Services. How should you connect?
Live connection
Steps for Connect to data in a relational database
Home tab🡪 Get data feature 🡪 SQL Server
___________is a fully managed platform as a service (PaaS) that provides enterprise-grade semantic models in the cloud.
Azure Analysis Services
___________is a fully managed platform as a service (PaaS) that provides enterprise-grade semantic models in the cloud.
Azure Analysis Services
Expand T-SQL?
Transact-Structured Query Language
�Clean, transform, and load data in Power BI�
When examining the data, you discover several issues, including:
�Clean, transform, and load data in Power BI�
Power BI and Power Query offer you a powerful environment to clean and prepare the data. Clean data has the following advantages:
Shape the initial data
Power Query Editor in Power BI Desktop allows you to shape (transform) your imported data.
Get started with Power Query Editor
To start shaping your data, open Power Query Editor by selecting the Transform data option on the Home tab of Power BI Desktop.
In Power Query Editor, the data in your selected query displays in the middle of the screen and, on the left side, the Queries pane lists the available queries (tables).
1. Identify column headers and names: The first step in shaping your initial data is to identify the column headers and names within the data and then evaluate where they are located to ensure that they are in the right place.
2. Rename columns: The next step in shaping your data is to examine the column headers. You might discover that one or more columns have the wrong headers, a header has a spelling error, or the header naming convention is not consistent or user-friendly.
3. Remove top rows: When shaping your data, you might need to remove some of the top rows, for example, if they are blank or if they contain data that you do not need in your reports.
To remove these excess rows, select Remove Rows > Remove Top Rows on the Home tab.
4. Remove columns
A key step in the data shaping process is to remove unnecessary columns. It is much better to remove columns as early as possible
The first method is to select the columns that you want to remove and then, on the Home tab, select Remove Columns.
5. Replace null values
Occasionally, you might find that your data sources contain null values. For example, a freight amount on a sales order might have a null value if it's synonymous with zero. If the value stays null, the averages will not calculate correctly.
Combine multiple tables into a single table:
The ability to combine queries is powerful because it allows you to append or merge different tables or queries together.
Append queries
When you append queries, you'll be adding rows of data to another table or query.
Merge queries
When you merge queries, you're combining the data from multiple tables into one based on a column that is common between the tables.
__________in Power BI Desktop allows you to shape (transform) your imported data.
Power Query Editor
How to remove top rows in power Query Editor?
Home tab > Remove Rows > Remove Top Rows.
What is a risk of having null values in a numeric column?
That function AVERAGE of data will be incorrect.
If you have two queries that have different data but the same column headers, and you want to combine both tables into one query with all the combined rows, which operation should you perform?
Append
you're combining the data from multiple tables into one based on a column that is common between the tables, which operation should you perform?
Merge
Model data with Power BI
Introduction
When developing the model, you will complete the following tasks:
Understanding the structure of semantic models can help you design the right model to support your reports and dashboards.
Star schema design
Star schema refers to a design approach that's commonly used by relational data warehouse designers because it presents a user-friendly structure and it supports high-performance analytic queries.
Fact tables
The role of a fact table is to store an accumulation of rows that represent observations or events that record a specific business activity.
Dimension tables
Dimension tables describe your business entities, which commonly represent people, places, products, or concepts.
In a diagram, a fact table forms the center of a star, while dimension tables, when placed around a fact table, represent the points of the star.
Notice the relationships between the dimension and fact tables and that each relationship filter direction is pointing toward the fact table.
Analytic queries
An analytic query is a query that produces a result from a semantic model. Each Power BI visual, in the background, submits an analytic query to Power BI to query the model.
An analytic query has three phases that are implemented in the following order:
Analytic queries
Filtering, or slicing, targets the data of relevance. In Power BI reports, filters can be applied to three different scopes: the entire report, a specific page, or a specific visual.
Grouping, or dicing, divides query results into groups.
Summarizing produces a single value result. Typically, numeric columns are summarized by using summarization methods
In a Power BI Desktop model design, which type of object do you create to connect multiple tables?
Relationship
�____________describe your business entities, which commonly represent people, places, products, or concepts.
Dimension tables
The role of a __________ is to store an accumulation of rows that represent observations or events that record a specific business activity.
Fact table
Power BI model framework
Power BI offers you a choice when designing your model. You can use Power BI Desktop to develop your model, and you can develop it by using different frameworks. These frameworks help to deliver fast performance, near real-time results, or both.
Describe Power BI model fundamentals
This unit introduces Power BI model terms. It’s important that you understand these terms in order to choose the appropriate model framework for your project.
This unit describes the following terms:
Data model
A Power BI data model is a query-able data resource that’s optimized for analytics. Reports can query data models by using one of two analytic languages: Data Analysis Expressions (DAX) or Multidimensional Expressions (MDX).
Power BI dataset
You develop a Power BI model in Power BI Desktop, and once published to a workspace in the Power BI service, it’s then known as a dataset. A dataset is a Power BI artifact that’s a source of data for visualizations in Power BI reports and dashboards.
Analytic query
When Power BI visualizes dataset data, it prepares and sends an analytic query. An analytic query produces a query result from a model that’s easy for a person to understand, especially when visualized.
Tabular model
A Power BI model is a tabular model. A tabular model comprises one or more tables of columns. It can also include relationships, hierarchies, and calculations.
Star schema design
To produce an optimized and easy-to-use tabular model, we recommend you produce a star schema design. “Star schema design is a mature modeling approach widely adopted by relational data warehouses. It requires you to classify model tables as either dimension or fact.”
Table storage mode
“The storage mode property can be either Import, DirectQuery, or Dual, and it determines whether table data is stored in the model.”
Model framework
“Table storage mode settings determine the model framework, which can be either import, DirectQuery, or composite.”
A Power BI _____________is a query-able data resource that’s optimized for analytics.
Data Model
A __________is a Power BI artifact that’s a source of data for visualizations in Power BI reports and dashboards.
Dataset
An __________ produces a query result from a model that’s easy for a person to understand, especially when visualized.
Analytic Query
A _______________comprises one or more tables of columns. It can also include relationships, hierarchies, and calculations.
Tabular Model
The ___________ property can be either Import, DirectQuery, or Dual, and it determines whether table data is stored in the model
Table storage mode
Table storage mode settings determine the_______, which can be either import, DirectQuery, or composite.
Model framework
Build Power BI
Visuals and Reports
As with any project, a good starting point when designing reports is to define clear goals. In your reporting project, those goals should strive to help you determine your reporting requirements for:
Identify the audience
Identifying the audience is one of the most important steps in the report design process. It enables the report author to create a final result that can be efficiently used and will meet the needs of the report consumer.
The three broad report consumer audiences are:
“An executive is a person who is charged with making plans and decisions that often involve a medium or long-term focus.”
Executives are responsible for making the business run smoothly.
“An analyst is a person who provides guidance to the organization."
Analysts can be responsible for a range of tasks, often with goals of determining the effectiveness of business strategies, developing or improving processes, or implementing change.
An information worker is someone who uses data to help make decisions or take actions.
Determine Report types
Generally, report design can be classified by report type. Often, a direct mapping between the report audience and the report type occurs.
Audience needs can be met by one, or possibly a combination, of four report types:
Commonly, executives work with dashboards, analysts work with analytical reports, and information workers work with operational reports.
Dashboard
The primary goal of a dashboard is to interpret the story as quickly as possible. User interactions are limited by insights that are highly curated toward the audience. Report visuals are focused, self-explanatory, and clearly labeled.
Analytical reports
The primary goal of an analytical report is to help report consumers discover answers to a broad array of questions by interacting with the report and its visuals.
Operational reports
Operational reports are designed to give the report consumer the ability to monitor current or real-time data, make decisions, and act on those decisions.
Educational reports
Educational reports assume that the report consumer is unfamiliar with the data or context. So the reports must provide clear narrative detail and guidance to help with understanding.
Define user interface requirements
User Interface requirements relate to how reports are consumed and to the appearance and behavior of reports.
Aspects to consider include
Form factor
In the context of report design, form factor describes the size of the hardware that is used to open reports, and to page orientation (portrait or landscape).
Input method
While a computer has a keyboard and pointing device (mouse), mobile devices rely on common gestures, such as tap, double-tap, drag, pinch, spread, or press. “Report consumers who are using mobile devices can also use on-screen keyboards, voice control, or barcode and QR code readers.” Augmented or mixed reality devices rely heavily on hand gestures or body movement.
Style and theme
UI requirements should also consider style and theme. Strive to design reports with a consistent and distinctive appearance that is determined by a deliberate theme.
At a minimum, the theme should include the following elements:
Accessibility
UI requirements should also factor in accessibility. Reports need to communicate to the broadest audience possible. So you should consider how report consumers with no-to-low vision or other physical disability can fully experience the reports.
To support people with low or no vision, consider using:
A good starting point when designing reports is to define clear goals. In your reporting project, those goals should strive to help you determine your reporting requirements for:
All of the above
___________enables the report author to create a final result that can be efficiently used and will meet the needs of the report consumer.
Audiences
Which of the following is/are broad report consumer audiences?
All of the above
An __________is a person who is charged with making plans and decisions that often involve a medium or long-term focus.
Executive
An ___________ is a person who provides guidance to the organization.
Analyst
An _____________is someone who uses data to help make decisions or take actions.
Information worker
Report visuals are focused, self-explanatory, and clearly labelled in which type of reports?
Dashboard
The primary goal of an __________ report is to help report consumers discover answers to a broad array of questions by interacting with the report and its visuals.
Analytical
___________reports are designed to give the report consumer the ability to monitor current or real-time data, make decisions, and act on those decisions.
Operational
____________ relate to how reports are consumed and to the appearance and behavior of reports.
User interface requirements
Which user interface requirement is an example of an accessibility requirement?
Contrasting colors
Which user interface requirement is an example of an Input method requirement?
Augmented reality
Which user interface requirement is an example of an Form factor� requirement?
Portrait orientation
__________ describes the size of the hardware that is used to open reports, and to page orientation (portrait or landscape).
Form factor
Manage workspaces and datasets in Power BI
Workspace
A workspace is a centralized location or repository that allows you to collaborate with colleagues and teams to create collections of reports, dashboards, and so on.
By creating a workspace in Power BI, you can house your reports in one location, make them shareable, collaborate with other teams, and update reports.
Assign workspace roles
Workspace roles allow you to designate who can do what within a workspace.
There are four roles for workspaces, and it's advised that you grant the minimum access necessary to collaborators.
Admin
Member
Assign workspace roles
Contributor
Viewer
Troubleshoot data by viewing its lineage
The Lineage view feature in Power BI allows you to quickly refresh semantic models and see the relationships between the artifacts in a workspace and their external dependencies.
Data lineage
Data lineage refers to the path that data takes from the data source to the destination.
The Lineage view feature in Power BI is crucial because it:
How is the Admin workspace role different from other types of workspace roles?
Admin is the only role that can remove any users.
Which one of the following options is the best description of a workspace?
A workspace is a centralized location or repository that allows you to collaborate with colleagues and teams to create collections of reports, dashboards, and so on.
What feature in Power BI service can you use to troubleshoot the flow of data from its source to its destination?
Lineage view
Configure a semantic model scheduled refresh
The Scheduled refresh feature in Power BI service allows you to define the frequency and time slots to refresh a particular dataset.
Scheduling the refresh of your data will save you time because you don't have to manually refresh the data. It also ensures that users can access the most up-to-date data.
Configure incremental refresh settings
The Incremental refresh feature in Power BI is a popular feature because it allows you to refresh large semantic models quickly and as often as you need, without having to reload historical data each time.
Manage and promote semantic models
Business intelligence involves collaboration, and sharing semantic models across workspaces is a powerful way to collaborate within your organization.
Power BI provides two ways to endorse your semantic models:
Where are semantic model-scheduled refreshes configured?
Power BI Service
What is the difference between Promotion and Certification when you are endorsing a semantic model?
Promotion is for broad usage while Certification needs permission granted on the Admin Tenant settings.
Introduction to dashboards
Microsoft Power BI dashboards are different than Power BI reports.
“Dashboards are made up of visuals from different reports and allow report consumers to create a single page of targeted data insights, with the option to explore more in a report.”
Dashboards vs. reports
Dashboards | Reports |
Create only in service | Create in Power BI Desktop (limited in service) |
Single page | Multiple pages |
Static tiles | Interactive visuals |
Read-only tiles | Filters pane |
Configure data alerts
Configuring data alerts is a simple process to complete for a dashboard in Power BI. Data alerts can notify you or a user that a specific data point is above, below, or at a specific threshold that you can set.
These alerts are features that are only available on Power BI service and they're available on such report elements such as KPI visuals, gauges, and cards.
What is a dashboard?
A canvas of report elements that can be built in Power BI service.
What is one way that reports and dashboards differ?
In reports, you can have multiple pages; in dashboards, you can have only one page.
Where can you configure and set data alerts?
Data alerts can be set only in Power BI service on specific visuals such as KPI cards, gauges, and cards.