Week 02 #WorkoutWednesday Power BI challenge

We continue with the document that has been made last week.

Read the challenge and solution for this week

How did I build my solution?

  • In the Power Query Editor added a Custom column, using the option in the menu. I like to put commands in the properties to know later what I have been doing and I changed the name of the step.
  • The format of the column needs to change into a Decimal Number.
  • Took the code from the Advance Editor into https://www.powerqueryformatter.com/ to format it into a nice humanly readable code
  • For the dashboard I went for the Frontier Theme of Power BI to soften the colours.
  • I like to work with a background for the header and visualizations, so I started with building that in Power Point, from there saving it as a JPG and put that as a page background. Need to make sure the background colour here is the same as in my Frontier Theme of Power PI, so copied the Hex code. The lettertype also need to fit, the Verdana type
  • Added 3 card visualizations for “total revenue”, “total expanses” and “total profits” and changed the Data Label into the Million units. I gave the Cards the title and removed the Category option.
  • Added 3 line charts visualizations to show the 3 trendlines. Here the titles of the axis have been removed, because the card title can do those jobs.
  • Added 3 bar charts to show the results by conference. I changed the x-axis title into “Total Revenues by conference”, “Total Expenses by conference” and “Total Profits by conference” and made sure the top titles of the cards and the bottom titles are aligned.

The dashboard is published online.

The M-Code

Week 01 #WorkoutWednesday Power BI challenge

Scrolling in Twitter I came across the posting of Spencer Baucke about a new WorkoutWednesday challenge. They have been making them for Tableau since January 2018. In 2021 they start with Power BI and I’m going to join them.

In week 1 it was a small touch to the wonders in Power Query. Making a connection to the database and doing some filtering.

I made the connection with the dataset in Data.World without downloading it to my computer.

Connection to DataWorld

Thanks to Barrett Studdard for sharing how to do that.

And we needed to set a filter on the fact table. Making sure the conference is not null and not FBS Total and Conference Median is not in the subdivision.

The complete code became as followed.

I have published the file online, the coming weeks we will build onto this dashboard.

You can follow my progress.

Dashboard walking challenge

This morning I got involved with #100daysofwalking in twitter because of a tweet from my friend Máirín Murray.

The challenge by Dr Ciara Kelly

And suddenly I was using Power BI to analyse my walking data that I have been collecting since August 2015 with my apple watch.

I used the app Health Export to get the data form Apple Health into a CSV.

After importing it in Power BI it needed some transformations:

  • Power BI needed to have the right local settings to make sure the numbers and delimiters where read right.
  • I only want to keep walks longer than 30 minutes and more than 1 km, so I used a filter
  • When I was visualising, I noticed some doubles in the beginning, that was because at that time I would use two apps to track 1 walk to find out which app I wanted to use. So, I extracted the date and the hour out of the starting moments to use Power BI to remove the doubles
  • When I wanted to show the number of steps, it was not giving me the thousands separator so I went back to Power Query Editor to change it from a whole number to a decimal number and I got the right visualisation.

I kept the dashboard basic. Just showing a slicer from my first recorded walk (September 11, 2015) up to the end of 2020. And 3 cards that are showing the number of walks, distance walked and a step count. In the bottom there is a line chart that shows by Date the walked distance.

I published it on my Power Bi side:

Now the next challenge: how to build this in Tableau.

While importing the data I need to change the text file properties. Field separator from the automatic generated space option into a comma and the locale needs to be changed into English. And I need to tell Tableau the field names are in the first row.

The columns that I do not need, need to be removed. I can’t find delete, I do find an option called hide, will that be the same. I notice that in Power BI I removed a column more, also one that the not have values at all, this on is not loaded in Tableau.

The type needs to be filtered on Outdoor Walks,

Filter walks longer than 30 minutes. The column Duration is Date & Time all with the date 30-12-1899, so this needs to be in the filter. The Duration needs to have a minimum that is equal or bigger than 30-12-1899 00:30:00.

Filter walks longer than 1 km

Remove the double entries, noop I have no idea yet how to do this, so I have 36 entries too much and those will effect my visualisation.

Step count is already a number.

Question to me being used to Power BI, how can I recall the steps I took to prepare my data, in Power BI I can read that step by step back in the query editor.

Get 3 cards for number of walks, distance walked and step count

The filter being used is the start date and needs to be used for all using this data source

After I had the sheets, I made a dashboard, was not easy to give me the 3 cards and the filter on top, needed to choose for float.

Overtime I will learn more of Tableau dashboarding and it will become better looking

Go to my public Tableau to have a view of this dashboard.

My journey with the #30DQUERY challenge

The encounter

At May 2, 2020 I bumped into a LinkedIn post from Alex Powers about a #30DQUERY challenge.

On my road to learn more about Power BI and searching for challenges to learn how to use the tool I was like “lets join this challenge”.

My interest in Power BI and my past track

In the first quarter of 2019 I followed the Microsoft Data Science Track and one of the 11 courses was analysing data with Power BI. That is when I fell in love with the program.

I went looking for more information. I got myself into a weekly online challenge to visualise a given dataset, with www.makeovermonday.co.uk. They were fun and I learned a lot. I went into the direction of how to visualise the data in a dashboard in a nice and need way. #StoryTelling with data, I wanted to know more about.

During visualisation I realised I sometimes needed to change the given dataset. So, I learned how to add columns, measurement and I started to learn DAX from www.sqlbi.com.

When I saw this #30DQUERY challenge on LinkedIn I was like, I know what queries are and I want to do a Power BI based challenge. Get to know people who use the tool, learn from them. So lets jump in.

Since March 2020 I have a new job. This employer is using Business Objects and BI Publisher, two new tools for me. So the interest in Power BI is personal.

The first day’s

Import a dataset via the SQL-server, not all columns are needed, new columns need to be made and filters applied. Use the Power Query Editor to shape the dataset in a way that you want it to load.

Not to difficult with the knowledge I have built up with DAX.

The first 14 days I wrote blog posts about my experience, see www.kiwimaori.nl. After that the challenge became more time consuming, more difficult and I needed to check into other submissions to find out how to use the GUI or to write code.

Questions during the first day’s

  • Alex is talking about query folding; I know query and I know what folding is. But what is it when these two words are used a 1 expression?
  • Why is the Power Query Editor giving me code’s I don’t know when I start typing DAX-code?

Time to do a google search to find some answers.

Started with the playlist form Alex called An Introduction to Power Query M

Found a You Tube movie from Pragmatic Work about the Power Query Editor and on May 18 there was a live webinar about Query Folding by Nicky van Vroenhoven and dataMinds. This all helped to find my answers.

  • I found out that query folding means that the selection of the data happens in the DataBase and not in the engine of Power BI. So, it speeds up the loading of data.
  • The Query Editor is not responding to DAX-code, because that language is not used in Power Query. In the back up part of Power BI you need the language M.
    Oh why has that not been told in the Analysing with Power BI training. Because in that training, we also have been told how to import and change date with the GUI of the Power Query Editor.

Question about how to load more entities into 1 Query

I kept seeing in the answers that others loaded more than 1 entity into the same Query, while I had not a clue how to do that. So, halfway the challenge, I asked the question in twitter at the rest of the team. Steven Bitaxi gave my explanations and now I know how to do it. I even went back in time to correct some of the solutions, where I had been using merge and the folding had been broken because of that.

Wrap-up meeting

May 31, 2020, we had an online team Wrap up meeting organised by Alex. A group of us where online and we shared our experience for almost 2 hours. It was great to see each other and to hear the story’s and backgrounds.

Take away

I learned a new language. Power Query #M.

I learned that the more you can do outside Power BI, the faster the engine inside the application stays.

I can use the produced Native Query to build up my SQL skills, that I need during my job.

I met new people who I can talk to whenever I have Power BI questions.

Some extra’s

Because I saw solutions in GitHub, I started to become more interested in GitHub and followed a lab GitHub training to see what is possible for me how to start using GitHub

I decide to blog not only on my own website, but post this as an article in LinkedIn.

Tip for trainers who introduce new people to Power BI

Introduce them to visualisation, DAX and to Power Query M. With visualisation people will fall in love with the tool. But with M they really can turn the magic on.

My next step.

I have not mastered all the challenges yet, so I will go back in time and look at each of them again together with the given answers of the #30DQUERY team mates. Thanks guys it was fun working and learning from you. Stay in contact and stay save.

Live long and prosper.

Day 14 #30DQUERY

15-05-2020 The Challenge

DAY 14 
Database: AdventureWorksDW2017 
Entity: DimCustomer 
Select [CustomerKey], [EmailAddress], [EnglishOccupation] and [Yearlylncome] for the customers 
whose [BirthDate] was not on a weekend (Saturday, Sunday).

Birthday not in the weekend

My route

  • Build the filter on the BirthDate column, I used Date.DayOfWeekName first, but that was not folding, so I fell back to Date.DayOfWeek
  • Needed to find out what number Saturday (1) and Sunday (7) was.
  • Removed other columns
  • Put the columns in the requested order.

My result

Day 13 #30DQUERY

14-05-2020 The Challenge

DAY 13 
Database: AdventureWorksDW2017 
Entity: DimEmpIoyee 
Select [EmailAddress], [Gender], [BirthDate] for all employees with a [BirthDate] between 
19900101 and 19901231.

Getting in between

My route

This were 3 steps in the Power Query Editor and I was done, so I wondered where the catch was. Looking at earlier solutions I learned the catch was in the notation of the dates. Alex had hope people would start change the datatype of the birthday

  • Load the table
  • Remove the not needed columns
  • Put a filter on the birthdays

My result

Day 12 #30DQUERY

13-05-2020 The Challenge

DAY 12 
Database: WideWorIdImportersDW 
Entity: Dimension.EmpIoyee , Dimension.Customer 
Union select a [Full Name] column for each unique name in ascending order. 
The [Full Name] field is an alias for the [Employee] and [Primary Contact] fields.

A union to be careful with

My route

I did not fully understand what Alex was asking us to do, so I used the posting from Steven Bitaxi to read his blog and in his first line I found out what needed to be selected from what Entity. I also read, that Power Query does not have a UNION function, so I was wondering if I needed to skip this Day, like I did with the earlier days, where I did not find a folding query solution. Still I started using the toolbar in Power Query.

  • For both Tables I removed all the columns that I did not need
  • I made sure that both columns in both Tables have the same name, by typing it into one and copying it into the other one.
  • I used the option Append Queries and got one column.
  • Removed the Duplicates and ended up with the same number of rows as Steven Bitaxi
  • Sorted the rows
  • I saw there is one row with the ‘name’ Unknown and decided to remove that row

The Native Query is still folding, so I seceded this challenge by using all the build in Power Query options. The difference is, that I have two query’s and that I need to be aware not to load them both into Power BI when I want to make visualisations.

My result

Day 08 #30DQUERY

09-05-2020 The Challenge

DAY 8 
Database: WideWorldlmportersDW 
Entity: Fact.Sale 
Select [Sale Key], [Customer Key], [Invoice Date Key], [Total Excluding Tax], [Tax Amount] and 
[Profit] where the [Invoice Date Key] is equal to the last date of each month.

A leap

My route

  • Remove the not needed columns
  • List the formula for last date of each month
    Date.EndOfMonth(dateTime as any) as any
    Noop that is not right, that is just giving the last date of each month in each row
  • Let’s go for [Invoice Date key] == Date.EndOfMonth
    Yes, I got a filter
    Noop the folding broke
  • I leave it with this, I did 3 challenges this day and it has been enough thinking for 1 day

My result

Day 07 #30DQUERY

08-05-2020 The Challenge

DAY 7 
Database: AdventureWorksDW2017 
Entity: DimEmployee 
Dynamically select all the columns that do not contain the text 'Key'.

Future proof

My route

From Owen Auger

From Barrett Studdard

// Find valid columns 
column_list • each not Text.Contains(_, "Key")),
  • So lets start looking in the documentation for Tekst.contains and List.select
    From what I find I cannot bring that together with the used code by the two people.
    Looking at List.select I now see why those two are in combination with each other; List.Select gives a list and Tekst.Contains gives True or False

My result

Day 06 #30DQUERY

07-05-2020 The Challenge

DAY 6 
Database: WideWorldlmportersDW 
Entity: Dimension.Stock Item 
Select [Stock Item Key], [Stock Item] and [Color] where the [Stock Item Key] is an even number and 
the [Color] is not N/A

Getting even

My route

  • Remove not needed columns
  • Filter out the N/A from Colour
  • I wanted to use the function Number.IsEven for the column Stock Item Key, but it kept giving an error, so I took a look in already given answers and show people where using the function Number.mod, so I used that as well

My result

Day 04 #30DQUERY

05-05-2020 The Challenge

Basic Math

My Route

  • Remove not needed columns
  • Make new column with the Text Length of [Description] (found the formula for this in Microsoft Docs)

I found out that there are 2 different types of formula’s DAX and M and they should be used at different locations in the program. In de Query Editor it is M.

  • Make new column with the Modulo
  • Mmmm, maybe it is nicer to put the formula into one so it is just going to be one column. I put the code for text.length into the formula for modulo
  • Filter the rows so only Modulo 1 is left

My result

Day 03 #30DQUERY

04-05-2020 The Challenge

DAY 3 
Database: AdventureWorksDW2017 
Entity: DimProduct , FactReseIIerSaIes 
Select the [ProductKey], [EnglishProductName] and an aggregated [TotalRevenue] column which is 
the sum of a new computed column [TotalUnitPrice] made up of [OrderQuantity] multiplied by 
[UnitPrice] where the [TotalRevenue] is above one million.

Watch the order of applied steps!

My Route

  • Import the two Entities
  • Remove not needed columns from Both tables
  • Create [TotalUnitPrice]
  • Remove [OrderQuantity] and [UnitPrice]
  • Give it the Fixed decimal number format => fold broken, so removed again
  • Merge DimProduct into FactResellerSales, Inner Join at ProductKey
  • Took a peek at the solution from Barrett Studdard, because I do not know how to do the aggregation, learned I need first to group at ProducKey. Now the question, where is this option in my Editor. Ah found it, it is in the Transform Tab
    • First try I get the new column, but the name columns disappear
    • So that is why it needs to be the advance option of Group By
  • Last but not least, filter TotalRevenue at one million
  • I would like to see it as a Fixed decimal number, but when I do this the folding breaks.

My result

Day 02 #30DQUERY

03-05-2020 The Challenge

Filters. Filters. Filters. We’re in for it now.

My Route

  • Import the two Entities
  • Filter the [ResellerName] and the [CountryRegionCode]
  • Join this two Entities using the merge queries
    • DimReseller [GeographyKey] Left Outer join DimGeography [GeographyKey]
  • Remove the not needed columns from DimReseller
  • Show only the needed columns from [DimGeography]
  • No Transformations

My result

Yes today I got a result in the Native Query

It became a table with 241 rows.

Day 01 #30DQUERY in repeat

This morning when I checked again all the information and some of the other outcomes I realised I had missed the instruction that everything should be done in the Power Query Editor and not to make visualisations.

So, lets go and repeat Day 1

My Route

  • Open Power Query Editor
  • Import via the option SQL Server the Entity <DimEmployee> op my computer
  • Follow the recommended practice
    • Filter the BirthDate at 1974
    • Merge the 3 columns that have name parts into 1 new column [FullName] and here it goes wrong. I use the option Merge Colums in the Power Query Editor and then the option View Native Query greys out.
      I have been looking around on the internet to find a solution, but I cannot find it. So, I stop the challenge of today and wait for an explantion on Day 10 and then will see if I can redo Day 1
  • Remove all the not selected columns so only the 4 needed are left.

My Result

Day 01 #30DQUERY

The Challenge

A little bit of GUI, a little bit of keyboard. With today’s challenge find the employees with a birthdate in 1974.

My Route

  1. Open the Entity <DimEmployee>
  2. Make the new column by selecting the 3 columns and choose option merge, give the name <FullName>
  3. Close and Apply
  4. Go to Report
  5. Select the visual Table
  6. Select the 4 columns
  7. Don’t summarize the EmployeeKey
  8. Filter the BirthDate-Year at 1974
  9. Sort ascending on BirthDate

My result

Report in progress

The Report is published as it is right now. In the coming day’s it will get more shape. Right now it is a working process. So, what you see is the latest update.