This article answers some of the most common set analysis questions and struggles.
Set analysis issues with dates
One of the most common issues Qlik beginners (and experts) have is using set analysis to filter dates down. Usually, the issues stem from formatting differences between the target date and the date entered in the set analysis.
Consider this data model:
DATES:
Load
Date(Date1,'DD/MM/YYYY') as Date1,
Date(Date2,'YYYY-MM-DD') as Date2,
Count
inline [
Date1, Date2, Count
01/01/2023,01/01/2023,1
02/01/2023,02/01/2023,1
03/01/2023,03/01/2023,1
11/02/2023,11/02/2023,1
24/03/2023,24/03/2023,1
05/04/2023,05/04/2023,1
];
Two date fields are being loaded, with each row having identical dates. The difference is that Date1 is formatted as ‘DD/MM/YYYY’, and Date2 is formatted as ‘YYYY-MM-DD’.
If we wanted to refer to a specific date using set analysis, consider these three examples:
Column 1 sums up the Count field by filtering Date1 to a single day in the correct format. The result is successfully computed.
Column 2 sums up the Count field by filtering Date2 to a single day, but the date format does not match. The result is 0 because the expression can’t find dates matching the specified one in field Date2.
Column 3 does the same as column 2 but specifies the correct format. The result is successfully computed.
When having issues with dates, the solution is to match the date format of the modifier field. Here is my list of tips for conquering date issues with set analysis.
Solution 1: formatting the dates at import
Always format dates at the point of import; do not rely on Qlik to figure it out. To do this successfully:
- Load a small sample set of your data into the app. We restrict to a small sample at this stage to speed up the reload.
- Create a table in a sheet and add the various dates to the table as dimensions so you can see their default format.
- Utilise the Date#() interpretation function and the Date() function to reformat the dates to be exactly how you want them.
Solution 2: formatting the date value within the set analysis
You can also format the value of the date or date range within the set expression. Using our example data above, if we structure our set expression in column 2 in the following way, we get the correct result:
Sum({} Count)
Note how we put the =Date() expression into a dollar expansion in double quotes: “$()”. This tells Qlik to calculate the result of the function in brackets and use that as the value within the set expression.
Set analysis issues with variables
Variables can be used anywhere within a set analysis function. For example, they can be used to change the aggregating function that is being called depending on the user’s requirements, or they can be used to change the modifier field and modifier values. Issues with variables tend to stem from the format of the variable and the subsequent way it is called within the set analysis function. Let’s take a look at an example where a variable is being used within the set modifier values.
Consider the following data model:
Load
*
inline [
Date1,Count
01/01/2023,11
02/01/2023,22
03/01/2023,33
11/02/2023,44
24/03/2023,55
05/04/2023,66
];
Now consider the following set expression, which sums up the Count field where the field Date1 is equivalent to the max value in that field:
Sum({} Count)
Let’s say we want to replace the function within the curly brackets with a variable. Essentially, we will store the value of max Date1 inside a variable and call that variable within the set expression. There are two ways we can do this. Firstly, we can calculate the value of the max Date1 and store the result within the variable. Secondly, we can store the full function within the variable, which will be evaluated when the variable is called. The difference between the two is the addition of the equals sign in front of the function:
The top screenshot calculates the value and stores the result. Note how the ‘Value’ is the actual date. The bottom screenshot defines the Max function and stores that. Note that the ‘Value’ is just the full function as a string.
If the variable is defined using the top method, the set expression will need to simply expand the variable to get the date value as follows:
Sum({} Count)
If the variable is stored using the bottom method, the set expression will first need to expand the variable to bring the Max function as a string and then expand this string to calculate the actual function result as follows:
Sum({} Count)
The result will not be returned if the variable store and subsequent call misalign. So, to resolve any issues with variables in set analysis, follow these steps:
- Check how the variable is calculated: are you storing the result or the function?
- Check how the set expression is expanded: does the store and call align?
- Check the format of the variable value. For example, if you are working with dates, ensure that the variable result is formatted like your date field.
Set analysis issues with NULL values
Issues with NULL values within set analysis usually take two forms: when you try to filter by the NULL value and when you try to filter by all values that are not NULL.
Looking for NULL values using set analysis
The first thing we need to understand when working with NULL values is that, by definition, a NULL is not a value at all. A NULL means no value, i.e. the value does not exist. With this definition, it becomes obvious that you can’t actually specify a NULL value criteria within a set expression as no such value exists. However, you can use a couple of methods to achieve the result regardless.
Solution 1: convert NULLs into values in the frontend
At the time of writing, you can use functions in your modifier to change NULLs into a value on the left side. Then, on the right side of the modifier, you would look for this new value.
Let’s take a look at an example. Assume we have the following table in our app where the field Pet contains 6 string values and the field Exists contains 4 numeric values and 2 NULL values.
Assume we want to count the number of Pet values that contain a NULL in the Exists field. This is how we’d do it:
Count({<"=If(IsNull(Exists),0,Exists)" = {0}>} Pet)
In the above set expression, we combined If() and IsNull() functions to convert the NULL Exists values into zeros. We then filter the result by zero. Note how the function has to be surrounded by double quotes and have an equal sign in front. This tells Qlik to calculate the result before performing the set analysis filterdown.
Solution 2: convert NULLs into values in the backend
The most reliable way to filter by NULLs is to add an additional field to your table in the backend that contains a marker for whether the value in the desired field is NULL or not. The same combination of functions, If() and IsNull(), can be used to achieve this. For example, adding the following field below the other data will result in values of 1, where the field Exists is NULL and 0 where it has a value.
If(IsNull(Exists),1,0) as "Exists is NULL"
The set analysis for NULLs in Exists can then become:
Count({<"Exists is NULL" = {1}>} Pet)
Note that you could also change the NULL value within the Exists field from NULL to something else instead of creating a new field if that’s appropriate, given your desired data model.
Excluding NULL values using set analysis
The above methods can also work when excluding NULLs within a set expression. For example, in the frontend, you can use the functions to convert NULLs into a value such as zero, but then instead of looking for zero, exclude it as follows (see Solution 1: convert NULLs into values in the frontend above):
Count({<"=If(IsNull(Exists),0,Exists)" = -{0}>} Pet)
All we did here was add a minus in front of the modifier value of {0}.
The backend solution is also applicable, but inversely, you would set “Exists is NULL” to zero within the set expression instead of 1 (see Solution 2: convert NULLs into values in the backend above).
In addition to the above two methods, there is a third solution that can be used to include only modifier values that are not NULL. This is to use a wildcard character to select all values. As NULL isn’t a value, those rows are excluded from the result.
Count({} Pet)