Thursday, December 07, 2023

Lesson 7 - Power BI Datasets to build great visuals

Power BI is a powerful tool for data visualization and analysis, and it provides several options for creating different charts. In this blog post, we will explore how to create different charts in Power BI and how to customize its appearance.

Dataset Used:

I used raw dataset “120 years of Olympic history: athletes and results” from Kaggle and transformed for further analysis. I added additional data along with this dataset to cover all visuals in Power BI.

Context:

This is the historical dataset on Olympic games from Athens 1896 to Rio 2016.It has 2 Files. NOC Region file with distinct 230 rows with 4 rows and Olympic Events file have 268,768 rows with 16 columns. Winter and Summer Games were held in the same year until 1992. After that, they changed like Winter Games and Summer Games occur on a 4 year cycle starting with Winter in 1994, then Summer in 1996, then Winter in 1998, and so on.
It also has information on events, sports they participated, NOC (National Olympic committee) regions. From this diverse information on data set we can analyze the data based on various aspects like country’s performance, athlete performance, Evolution of women participation in Olympics every year. 

Field Description:

Olympic Events-File 1
  • ID- Unique Identification number for each Athletes (Integer).
  • Name- Name of the Athlete (Text).
  • Gender- Gender of the Athlete(M/F) (Text).
  • Age- Integer representing the Age of the Athlete (Integer)
  • Height- Height of the Athlete in cms (Integer)
  • Weight-Weight of the Athlete in kgs (Decimal)
  • Team- Name of the Team under the Athletes Participating.
  • NOC – National Olympic Committee (3 letter code) representing specific region (Text)
  • Games- Year and season (Winter/Summer) (Text)
  • Year- Year Olympic held (Integer)
  • Season -Summer/Winter (Text)
  • City – City where Olympics Hosted (Text)
  • Sport- Sports Information (Text)
  • Events – Events Information (Text)
  • Medal -Medals Secured (Gold or Silver or Bronze or NA) (Text)
NOC Region – File 2
  • NOC- National Olympic Committee (3 letter code) representing specific region (Text)
  • Region – Name of the region
  • Notes- Additional Info
  • Continent – Name of the continent 

Shape Map – File 3

 It has Australia's population data as of 2023
  • Change over previous year ('000) - Change in population compared to previous year (in numbers)
  • Change over previous year (%) - Change in population compared to previous year (in percentage)
  • Population at 31 March 2023 ('000) - Population data as of 31st march 2023 (in numbers)

Medal Target - File 4

I have fixed some random targets for medal counts
  • Medals Won - Medal count by region
  • Region - Name of the region
  • Target - Target by region

Importing Dataset into Power BI

Step 1: Launch Power BI desktop. Import the Data files into Power BI

















Step 2: Import the data Source to the Report















Step 3: Once we import the data into power BI, the below dialog box will get opened.


















Load : Here at the very first step, Load option is where you bring the data directly into Power BI to create a data model without any transformations and start building visualizations and reports

Transform Data: Transform Data is the step where we can shape, clean, and manipulate the data before loading it into Power BI. When we connect to a data source, Power BI gives us with a Power Query Editor window. In this window, you can perform various data transformation tasks, such as:
  • To rename columns and Tables
  • Remove Unwanted columns
  • Filtering Rows 
  • Splitting, Merging rows
  • Data type Conversions
  • Creating Custom Calculations and so on.

These transformations help to make sure the data is in the desired format and making it more suitable for analysis in Power BI. Once we have finished transforming the data, we can load the data further.

Note: Once the transformation is done make sure all the filters applied are cleared. Only filtered rows will get loaded if any filters are applied.

Data Preparation & Cleaning

In the above dataset we made some basic transformations using Power Query Editor.

As Age, height, weight are numerical columns. Replaced those null values by zero.
In medal column, empty cells are replaced as “No Medal”.
Removed row duplicates


To replace Null values as NA or 0

























To remove duplicates


























To change Datatypes

























To Extract values from the column
























Once the data comes into desired form as per the requirement, we can load the data and start working on visuals.
Once we load the data into power BI, data model will get created.












We can view the data model by clicking on “Model view” on the left side pane.


The relationship created was “Many to one relationship” as per our dataset
NOC is the common field in both the table.

















We can further edit the relationship or to make any data model related changes we can click on the arrow button above






















Under the option Cross filter direction, we have two options

·       Single directional filter

·       Bi-directional filter

Single -In a single-directional filter, the filtering flows in only one direction. when you apply a filter in the one table, it will affect the related data in the other table, but not vice versa.

Both - In a bi-directional filter, the filtering can flow in both directions. When you apply a filter in either the one or other table it will affect the related data in both directions.

Lets create visuals based on the above dataset in further blogs. Get ready !

Download Raw files - https://github.com/shantha05/PowerBILessons/tree/main/RawFiles

Download .pbix files - https://github.com/shantha05/PowerBILessons/tree/main/pbixfiles

No comments:

Post a Comment