How to Quickly Sift Through 3,050 Crosstabs and Find the Important Ones in Q
Want to save a heap of time narrowing down thousands of crosstabs to just the ones you need? This post describes which buttons to push in Q to implement the three approaches described in “3 Ways to Quickly Sift Through 3,050 Crosstabs and Find the Magic One”. The post starts by describing how to create lots of crosstabs, then looks at creating the heatmap, automatically deleting tables, and smart tables.
Creating lots of crosstabs
The first step is to create lots of crosstabs. To do this:
- Add a data set. In this post I am using a file about mobile phones called phone.sav. This file is a bit messy, and should be tidied, but I’m not going to go down that rabbit hole in this post.
- Click Create > Tables > Lots of Crosstabs
- Select all the variables that you wish to have in the rows of the crosstabs and press OK. In the examples in this post, I’ve selected all of them.
- Select all the variables you wish to have as columns and press OK. For this example, I’ve selected various five-point agreement scales: Allows to keep in touch, Technology fascinating, …, Would like to do mobile banking with phone.
- Choose your report type, selecting the option with just one table per page, as shown to the right, and press Create Report.
You will now have many folders, each containing pages cross-tabbing all the variable sets in your project by the key variable sets that you selected. If you are using the phone.sav data set that I am using, you will have almost 2,000 crosstabs!
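The number of tables grows quickly because every row selection is crossed with every column selection. Here is a minimal sketch of that arithmetic in Python (not Q). The row names "Age" and "Gender" are hypothetical placeholders, and Q may organize or skip some combinations differently, so treat the count as an upper bound.

```python
from itertools import product

# Hypothetical selections; substitute the variable sets in your own project.
row_sets = ["Age", "Gender", "Allows to keep in touch", "Technology fascinating"]
column_sets = ["Allows to keep in touch", "Technology fascinating"]

# One crosstab for every (row, column) pair.
crosstabs = list(product(row_sets, column_sets))

print(f"{len(row_sets)} row sets x {len(column_sets)} column sets "
      f"= {len(crosstabs)} crosstabs")
```

With a few hundred row selections and a handful of column selections, the same multiplication reaches the thousands.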
To see the p-values or z-statistics on any of the tables, right-click on them and select Statistics – Cells; to add multiple statistics at the same time, hold down the Ctrl key on your keyboard.
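If it helps to see how the two statistics relate, the sketch below converts a z-statistic into a two-sided p-value under a standard normal distribution. This is the textbook relationship only; it is an assumption that it approximates what Q reports, since Q's tests may apply other distributions or corrections.

```python
import math


def two_sided_p_value(z: float) -> float:
    """Two-sided p-value for a z-statistic under a standard normal."""
    # For a standard normal Z, P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


for z in (1.0, 1.96, 3.0, 5.0):
    print(f"z = {z:4.2f}  p = {two_sided_p_value(z):.2e}")
```

Larger absolute z-statistics correspond to smaller p-values, so the two statistics rank cells in the same order.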
Creating a heatmap summarizing all the tables
The heatmap below (it may take a while to load) shows the z-statistics for all 3,050 tables, with darker blue for higher z-scores, and the z-scores capped at 5 (i.e., any value greater than 5 is changed to 5, as beyond 5 the differences are immaterial). The heatmap was created by:
- Selecting all the folders containing all of the tables
- Automate > Browse Online Library > Significance Testing > Identify Interesting Tables. This creates a table called most.significant.results.
- Create > Charts > Visualization > Heatmap
- Set Inputs > DATA SOURCE > Output in ‘Pages’ to most.significant.results (it will be at the very bottom)
- Press CALCULATE. (Note that the table of interesting numbers does not automatically update if the input tables are changed; this is the exception that proves the rule that everything in Q automatically updates.)
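The capping of z-scores at 5 mentioned earlier can be illustrated outside Q with a few lines of Python. The numbers are made up, and this is only a sketch of the idea, not the code that Q or the heatmap runs.

```python
# Made-up z-statistics: rows and columns stand in for the crossed variable sets.
z_scores = [
    [0.4, 2.1, 7.8],
    [5.0, 12.3, 1.6],
]

CAP = 5.0

# Any value greater than the cap is changed to the cap; other values are unchanged.
capped = [[min(value, CAP) for value in row] for row in z_scores]

for row in capped:
    print(row)
```

Capping stops one or two enormous values from stretching the color scale so far that everything else looks the same shade.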
If you follow these instructions you will get an output that looks a bit different to the one above. The key differences are that:
- I’ve cleaned and tidied the data prior to running the analysis.
- I modified the code in most.significant.results to exclude the column of SUMMARY tables, as shown below:
Automatically deleting crosstabs
An alternative to using a heatmap is simply to delete all the tables that are not significant. We can do this in Q by:
- Selecting all the folders containing tables.
- Home > Utilities > Delete tables and plots and choosing one of the options. The smaller the p-value, the fewer tables will be left.
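To see why a smaller p-value leaves fewer tables, here is a minimal Python sketch of the filtering idea. The table names and p-values are invented, and the rule used (keep a table when its smallest p-value is at or below the threshold) is an assumption for illustration rather than a description of Q's exact logic.

```python
# Hypothetical tables mapped to the smallest p-value found in each (made up).
smallest_p_by_table = {
    "Age by Technology fascinating": 0.0004,
    "Gender by Allows to keep in touch": 0.03,
    "Region by Technology fascinating": 0.20,
    "Income by Allows to keep in touch": 0.008,
}


def tables_to_keep(p_values: dict[str, float], threshold: float) -> list[str]:
    """Return the tables whose smallest p-value is at or below the threshold."""
    return [name for name, p in p_values.items() if p <= threshold]


for threshold in (0.05, 0.01, 0.001):
    kept = tables_to_keep(smallest_p_by_table, threshold)
    print(f"threshold {threshold}: {len(kept)} table(s) kept")
```

Each tighter threshold keeps a subset of the tables kept by the looser one.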
Smart tables
Smart Tables is run in Q as follows:
- Create > Tables > Smart Tables
- Select a single question of interest as the Dependent question. For example, if you want to profile a segmentation, select the variable that indicates which segment each person belongs to.
- Select any other questions that may be of interest as the Independent questions and press OK.
Author: Tim Bock
Tim Bock is the founder of Displayr. Tim is a data scientist, who has consulted, published academic papers, and won awards, for problems/techniques as diverse as neural networks, mixture models, data fusion, market segmentation, IPO pricing, small sample research, and data visualization. He has conducted data science projects for numerous companies, including Pfizer, Coca Cola, ACNielsen, KFC, Weight Watchers, Unilever, and Nestle. He is also the founder of Q www.qresearchsoftware.com, a data science product designed for survey research, which is used by all the world’s seven largest market research consultancies. He studied econometrics, maths, and marketing, and has a University Medal and PhD from the University of New South Wales (Australia’s leading research university), where he was an adjunct member of staff for 15 years.