1 of 22

2009 - 2013

CHINOOK MUSIC STORE SALES REPORT

2 of 22

Overview

The database for this analysis is the 'Chinook' database, a comprehensive music store database designed to represent a digital media store, including:

albums’,

artists’,

customers’,

employees’,

genres’,

invoice_items’,

invoices’,

media_types’,

playlist_track’,

playlists’,

tracks

This database is utilized to perform various analytical tasks to derive insights related to Top-Selling Artists, Customer Purchase Patterns, Genre Popularity, Sales Over Time, Customer Lifetime Value (CLV) and Market Basket Analysis. The following sections provide a detailed overview of the database structure and highlight the important relationships between the tables, which are crucial for understanding the flow of data and performing accurate analysis.

3 of 22

DB Structure

The 'Chinook' database comprises several tables, each representing different aspects of the music store's operations. Below is a summary of the tables and their relevant columns:

albums

[AlbumId] (INTEGER): Unique identifier for each album.

[Title] (NVARCHAR(160)): Title of the album.

[ArtistId] (INTEGER): Foreign key linking to the ‘artists’ table

artists

[ArtistId] (INTEGER): Unique identifier for each artist.

[Name] (NVARCHAR(120)): Name of the artist.

customers

[CustomerId] (INTEGER): Unique identifier for each customer.

[FirstName] (NVARCHAR(40)): First name of the customer.

[LastName] (NVARCHAR(20)): Last name of the customer.

[Company] (NVARCHAR(80)): Company name of the customer.

[Address] (NVARCHAR(70)): Address of the customer.

[City] (NVARCHAR(40)): City of the customer.

[State] (NVARCHAR(40)): State of the customer.

[Country] (NVARCHAR(40)): Country of the customer.

[PostalCode] (NVARCHAR(10)): Postal code of the customer.

[Phone] (NVARCHAR(24)): Phone number of the customer.

[Fax] (NVARCHAR(24)): Fax number of the customer.

[Email] (NVARCHAR(60)): Email address of the customer.

[SupportRepId] (INTEGER): Foreign key linking to the ‘employees’ table.

4 of 22

DB Structure cont.

employees

[EmployeeId] (INTEGER): Unique identifier for each employee.

[LastName] (NVARCHAR(20)): Last name of the employee.

[FirstName] (NVARCHAR(20)): First name of the employee.

[Title] (NVARCHAR(30)): Job title of the employee.

[ReportsTo] (INTEGER): Foreign key linking to the ‘employees’ table (self-referencing).

[BirthDate] (DATETIME): Birth date of the employee.

[HireDate] (DATETIME): Hire date of the employee.

[Address] (NVARCHAR(70)): Address of the employee.

[City] (NVARCHAR(40)): City of the employee.

[State] (NVARCHAR(40)): State of the employee.

[Country] (NVARCHAR(40)): Country of the employee.

[PostalCode] (NVARCHAR(10)): Postal code of the employee.

[Phone] (NVARCHAR(24)): Phone number of the employee.

[Fax] (NVARCHAR(24)): Fax number of the employee.

[Email] (NVARCHAR(60)): Email address of the employee.

genres

[GenreId] (INTEGER): Unique identifier for each genre.

[Name] (NVARCHAR(120)): Name of the genre.

invoice_items

[InvoiceLineId] (INTEGER): Unique identifier for each invoice line.

[InvoiceId] (INTEGER): Foreign key linking to the ‘invoices’ table.

[TrackId] (INTEGER): Foreign key linking to the ‘tracks’ table.

[UnitPrice] (NUMERIC(10,2)): Price per unit of the track.

[Quantity] (INTEGER): Quantity of the track sold.

5 of 22

DB Structure cont.

invoices

[InvoiceId] (INTEGER): Unique identifier for each invoice.

[CustomerId] (INTEGER): Foreign key linking to the ‘customers’ table.

[InvoiceDate] (DATETIME): Date of the invoice.

[BillingAddress] (NVARCHAR(70)): Billing address.

[BillingCity] (NVARCHAR(40)): Billing city.

[BillingState] (NVARCHAR(40)): Billing state.

[BillingCountry] (NVARCHAR(40)): Billing country.

[BillingPostalCode] (NVARCHAR(10)): Billing postal code.

[Total] (NUMERIC(10,2)): Total amount of the invoice.

media_types

[MediaTypeId] (INTEGER): Unique identifier for each media type.

[Name] (NVARCHAR(120)): Name of the media type.

playlist_track

[PlaylistId] (INTEGER): Foreign key linking to the ‘playlists’ table.

[TrackId] (INTEGER): Foreign key linking to the ‘tracks’ table.

playlists

[PlaylistId] (INTEGER): Unique identifier for each playlist.

[Name] (NVARCHAR(120)): Name of the playlist.

6 of 22

DB Structure cont.

tracks

[TrackId] (INTEGER): Unique identifier for each track.

[Name] (NVARCHAR(200)): Name of the track.

[AlbumId] (INTEGER): Foreign key linking to the ‘albums’ table.

[MediaTypeId] (INTEGER): Foreign key linking to the ‘media_types’ table

[GenreId] (INTEGER): Foreign key linking to the ‘genres’ table.

[Composer] (NVARCHAR(220)): Composer of the track.

[Milliseconds] (INTEGER): Length of the track in milliseconds.

[Bytes] (INTEGER): Size of the track in bytes.

[UnitPrice] (NUMERIC(10,2)): Price per unit of the track.

7 of 22

Table Relationships

Understanding the relationships between these tables is essential for performing accurate data analysis. Key relationships in the ‘Chinook’ database include:

albumsartists’ [ArtistId]

This relationship allows us to identify which artist created each album.

tracksalbums’ [AlbumId]

This relationship enables us to determine which album each track belongs to.

tracksgenres’ [GenreId]

This relationship allows us to categorize tracks by their genre.

invoice_itemsinvoices’ [InvoiceId]

This relationship is crucial for understanding which items were purchased in each transaction.

invoice_itemstracks’ [TrackId]

This relationship allows us to identify which tracks were sold in each invoice item.

invoicescustomers’ [CustomerId]

This relationship enables us to analyse purchase behaviour and patterns for each customer.

customersemployees’ [SupportRepId]

This relationship helps in identifying which employee assisted which customer.

8 of 22

Data Model

9 of 22

Calculated Tables

In addition to the core tables and relationships, a calculated table has been created for specific analytical tasks.

AlbumsReoccurrence

This is a calculated table designed to support analysis, such as determining album purchase frequencies by filtering the ‘invoice_items’ table to include only rows where albums appear more than once in each invoice.

AlbumPairsInOneInvoice

This is a calculated table designed to support analysis, such as identifying and summarizing instances where multiple albums are purchased within a single invoice. It is created by summarizing the ‘invoice_items’ table to list each album, its name, and the count of times it appears more than once per invoice.

CustomerMetrics

This is a calculated table designed to support analysis, such as customer segmentation and behaviour analysis, by summarizing and enhancing customer-related data with calculated measures.

10 of 22

Data Preparation

Data Collection:

The data was collected from the 'Chinook' database, these are they tables used in the analysis ‘albums’, ‘artists’, ‘customers’, ‘genres’, ‘invoice_items’, ‘invoices’, and ‘tracks’. These other tables allowed on the model for relationships ‘media_types’, ‘playlist_track’, ‘playlists’, and ‘employees’.

Data Integration:

All relevant tables were imported into a single Power BI data model to facilitate seamless analysis. All measures were stored in a single calculated table. The relationships between these tables were established based on key columns, as outlined in the database schema.

Missing Values:

Checked across all tables to ensure data integrity. The only table used in the analysis that had missing is ‘customers’ table but these missing were irrelevant to the analysis as those columns were not use in any part of the analysis.

Data Types:

Checked and ensured that all columns had appropriate data types. For example, numerical columns were cast to numeric types, primary keys were converted to categorical data type, dates to date types, and textual information to string types.

Data Validation:

Performed validation checks to ensure data consistency. This included verifying that invoice dates were within expected ranges and that sales figures were non-negative.

11 of 22

Data Preparation cont.

Data Collection:

The data was collected from the 'Chinook' database, these are they tables used in the analysis ‘albums’, ‘artists’, ‘customers’, ‘genres’, ‘invoice_items’, ‘invoices’, and ‘tracks’. These other tables allowed on the model for relationships ‘media_types’, ‘playlist_track’, ‘playlists’, and ‘employees’.

Data Integration:

All relevant tables were imported into a single Power BI data model to facilitate seamless analysis. All measures were stored in a single calculated table. The relationships between these tables were established based on key columns, as outlined in the database schema.

Data Cleaning

  1. Missing Values:

Checked across all tables to ensure data integrity. The only table used in the analysis that had missing is ‘customers’ table but these missing were I relevant to the analysis as those columns were not use in any part of the analysis.

2. Data Types:

Checked and ensured that all columns had appropriate data types. For example, numerical columns were cast to numeric types, primary keys were converted to categorical data type, dates to date types, and textual information to string types.

  1. Data Validation:

Performed validation checks to ensure data consistency. This included verifying that invoice dates were within expected ranges and that sales

figures were non-negative.

12 of 22

Data Preparation cont.

Data Exploration

  1. Descriptive Statistics:

A descriptive statistic was performed for key columns to understand the distribution, central tendency, and dispersion of the data. This included measures such as mean, median, standard deviation, and range.

2. Data Visualization:

Various visualizations were created to explore the data. Bar charts, line graphs, and cluster charts were used to uncover trends, outliers, and patterns within the data.

  1. Correlation Analysis:

Correlation analysis was conducted to identify relationships between different variables. For example, the relationship between total sales and customer demographics and popular genres was explored.

13 of 22

Data Analysis

Top-Selling Artists:

To identify artists with the highest sales and analyze their sales trends over time, these steps were followed:

  1. Identify the artists with the highest total sales from a database of music sales. Calculate total sales for an artist by multiplying the [UnitPrice] of each [TrackId] by the [Quantity] sold in ‘invoice_items’ table and summing these values across all tracks for each artist.

  • Analyze and the yearly sales trends for each artist to understand their sales performance over time. To derive yearly sales, a [Year] column was added to ‘invoices’ table and then a measure for filtering the data to include only the single sales per year was created.

Customer Purchase Patterns:

To segment customers based on purchase behaviour and identify key characteristics of high-value customers, a Recency, Frequency, and Monetary (RFM) analysis was performed to segment customers based on their purchase behaviour. This involved calculating:

Recency: Days since the last purchase.

Frequency: Total number of purchases.

Monetary: Total amount spent.

Customers were then segmented into different groups based on their RFM scores to identify high-value customers.

14 of 22

Data Analysis cont.

Genre Popularity:

The most popular music genres were calculated by summing total sales for each genre and analysing changes in genre popularity over different time periods.

‘invoices’[Year], genres, and [TotalSales/Spent] were compared to analyzed to determine the change in genre popularity over different time periods.

Sales Over Time:

To analyzed sales trends over time, a measure calculating month-over-month (MoM), quarter-over-quarter (QoQ), and year-over-year (YoY) percentage changes was created. This helped identify seasonal effects and significant sales events.

Customer Lifetime Value (CLV):

To determine a customer’s lifetime value, calculations were done based on their purchase history using these matrix, ‘Average Purchase Value’, ‘Purchase Frequence’ and ‘Customer’s Lifespan’.

Market Basket Analysis:

Here, market basket analysis was conducted to find common combinations of albums purchased together. This involved creating a calculated tables to summarize album pairs and their purchase frequencies.

Click here for further insights.

15 of 22

Top Selling Artists

Top-5 Best Selling Artists/Yearly Sale Trend

Iron Maiden, U2, Metallica, Led Zeppelin and Lost are the top-5 best selling artist from a list of 275 artist from Chinook Music Store.

Iron Maiden also maintained all year best selling except for 2011. Other top-4 best selling artists maintained a positive correlation in the sells trends.

U2 and Lost did not have any sells in 2009.

2011 was the worst year with $56 in total sales for top-5 best selling artist.��2012 was the best selling year with $132 in total sales for top-5 best selling artist.

Only Iron Maiden, Led Zeppelin and Metallica has sales in 2009 from the top-5 best selling artists.

16 of 22

Customer Purchase Patterns

Customers Segment Based on Purchase Behaviour

Customers Segments

High-Value: 3 months purchase

At Risk: 6 months purchase

Lost: Others

Only 32% of purchase frequencies are active customers, 20% are at the risk of not purchasing and 47% are already lost.

USA is the highest contributor to all customers segments, UK comes second and Portugal, Spain & Sweden are part of the top-5 countries.

High-Value customers are purchasing Rock and Alternative & Punk genres more.

17 of 22

Rock is the most popular music genre. It sold $827 and stayed popular all through the reporting period.

Rock contributes 36% to total genres sold.

Latin is the second popular music genre. It sold $382 and stayed popular all through the reporting period but dropped to one place in 2012.

Latin contributes 16% to total genres sold.

Metal is the third popular music genre. It sold $261 and did not have a stable popularity trend. It rosed to the second place in 2012 and dropped again in 2013.

Metal contributes 11% to total genres sold.

Alternative & Punk and TV Shows ranks fourth & fifth as popular music genre. Both sold $336 and had a stable popularity trend. Both rosed one place higher in 2011 and dropped in 2012.

They contributes 14% to total genres sold.

Genre Popularity

Popular Music Genres and the Change in their Popularity

18 of 22

Sales Over Time

Monthly, Quarterly and Yearly Sales Trends

2009 was the year with highest sales.

2013 was the year with the lowest sales.

Quarter One has a cumulative best quarter sales throughout the reporting period.

Quarter Four has a cumulative lowest sales throughout the reporting period.

January and June are the months with the cumulative highest selling months throughout the reporting period.

YoY Sales: 24.0%

QoQ Sales: 5.7%

MoM Sales: 2.0%

19 of 22

Customer Lifetime Value (CLV)

Customers Lifetime Value Based on their Purchase History

USA, Canada, France, Brazil and Germany are they top-5 countries with this highest customer's purchase. The mean order days for these countries is 58 days.

USA, Canada, France, Brazil and Germany are also the top-5 countries with the highest CLV points.

Canada, USA, Belgium, France, Ireland and Norway are the top-5 countries with the longest customer life spans ranging from 59 – 57 months. Average customer life span is 47 months.

Only USA, Germany and France from the top-5 countries with highest CLV points have individual customers listed among top-5 Customers with highest CLV.

Customers from Ireland, Czech Republic and Norway joined this list.

20 of 22

Market Basket Analysis

Albums combinations during purchased together

Minha Historia is album with the highest purchase with 25 times order over different invoices.

Although Greatest Kiss ranks among top-5 all time album purchase, it was not ranked among best combined songs for purchase.

Highest purchase album was not rank among the first 2 best combined songs for purchase.

Song combination is relative spread among different album during purchase.

21 of 22

Recommendations

When promoting an artist, Rock, Latin, Metal, Alternative & Punk and TV Shows genres should receive the highest funding. These are they most preferred genres from customers.

More sales promotions should be directed on the USA market since there are more High-Value customers in that region.

More Rock, Alternative & Punk, and Latin genres should be promoted to Lost customers since these are their preferred genres – this might lead to new purchases.

Iron Maiden, U2, Metallica, Led Zeppelin and Lost song should be promoted more as the rank highest among all artist with all time best sales.

Activities and programmes that were done in 2011 should not continued; that year was saw artist with records crashes in total sales.

Tracks in these albums Greatest Hits, Unplugged, Minha Historia, Acústico MTV, The Cream Of Clapton, Chronicle, Vol. 1, Up An' Atom, The Office, Season 3, Battlestar Galactica (Classic), Season 1, Acústico, Arquivo Os Paralamas Do Sucesso have higher chance of being purchased with other songs and should be made more visible to customers.

More resources should be made available for promotion toward the end of every quarter four. Cumulative sales trend shows that customers are more willing to purchase songs in January.

Artist also should be encouraged to release new album during the same period or around June when sales peaks again.

22 of 22

Thank you