Lab-00: Science, Human Experience, Experiments, and Data
Some Data Games
Let us start with a couple of small rumpus/games. [1]
Game-1: Making Sushi
I will call out a few random characteristics, such as “People who wake up before 0800 hours in the morning”, or “People who love Sushi”. We will see how our classgroup spontaneously reorganises itself based on these characteristics.
Questions to Ponder:
- Did you stay in the one group you chose?
- If you moved, why did you move?
- How did you know “where to stand”? (like Archimedes seems to have known)
- Did you feel that some groups were “cooler” than, say, the groups you were in?
- If you were to look down at our classroom arrangement from the ceiling, how would you know which group was which?
Game-2: Thinking like Kandinsky
Look around the room, at the people, furniture, walls, fittings… and write down as many abstract nouns that pertain to concrete things as you can.
A concrete noun is a noun that can be identified through one of the five senses (taste, touch, sight, hearing, or smell).
An abstract noun names a quality or an idea that cannot be perceived or measured with the senses. Instead, it symbolises an abstract concept, such as a feeling, a quality, or an idea. In other words, abstract nouns are intangible concepts.
Questions to Ponder:
- Did any of the Abstract Nouns “show up” in the way you formed Sushi groups?
- How did you know “where to stand”? (like Archimedes seems to have known)
- Did you feel that some groups were “cooler” than, say, the groups you were in?
- If you were to look down at our classroom arrangement from the ceiling, how would you know which group was which?
- How could you possibly use some of the Abstract Nouns in the Sushi-group-making?
The Nature of Data
Why Visualize?
So now that we know where data comes from, why do we want to visualize it?
- We can digest information more easily when it is pictorial
- Our Working Memories are both short-term and limited in capacity. A picture abstracts the details and presents us with an overall summary, an insight, or a story that is easy both to retain and to recall.
- Data Viz includes shapes that carry strong cultural memories and impressions for us. These cultural memories help us to use data viz in a universal way to appeal to a wide variety of audiences. (Do humans have a gene for geometry?)
- It helps sift facts from mere statements.
Why Code? Why not use no-Code?
There are good arguments in favour of using code to produce charts. There are of course also situations and needs where you may decide to not use code.
Let us paraphrase the arguments from Data Viz expert Claus Wilke:
Ideally, (charts) should come out of the pipeline ready to be sent to the printer, no manual post-processing needed.
- First, the moment you manually edit a figure, your final figure becomes irreproducible. A third party cannot generate the exact same figure you did. This matters in scientific and research disciplines, and also when you are part of a larger team of collaborators and have to swap roles and work products.
- If you use, say, Adobe Illustrator to spruce up a chart, how does another person know why you made the changes? Code can show what decisions you made.
- No chart is ever done in one go. If you add a lot of manual post-processing to your figure-preparation pipeline, you will be more reluctant to make changes or redo your work. Code makes it easier to iterate, especially since you may not be in a position to ignore reasonable requests for change made by collaborators or colleagues.
- You may yourself forget what exactly you did to prepare a given figure, or you may not be able to generate a future figure on new data that visually matches your earlier one. What do you do, for example, if the underlying data changes, the figure must change with it, and you can’t remember what you did?
So, we will play it safe and do both: Code and No-Code.
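To make the reproducibility argument concrete, here is a minimal sketch in Python, using only the standard library, of a chart that comes out of code with no manual touch-ups. The function name `bar_chart_svg`, its parameters, and the survey counts are all made up for illustration; they do not come from Wilke or from any charting library.

```python
from xml.sax.saxutils import escape


def bar_chart_svg(counts, bar_color="steelblue", bar_width=40, gap=20, height=200):
    """Return an SVG bar chart (as a string) for a dict of {category: count}.

    Every styling decision is an explicit parameter, so a collaborator can
    read what was chosen and regenerate the identical figure.
    """
    if not counts:
        raise ValueError("counts must not be empty")
    top = max(counts.values())
    if top <= 0:
        raise ValueError("at least one count must be positive")

    label_space = 20                      # room under the bars for labels
    plot_height = height - label_space
    width = gap + len(counts) * (bar_width + gap)

    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">'
    ]
    for i, (label, value) in enumerate(counts.items()):
        bar_height = round(plot_height * value / top)  # tallest bar fills the plot
        x = gap + i * (bar_width + gap)
        y = plot_height - bar_height                   # SVG's y axis points down
        parts.append(
            f'<rect x="{x}" y="{y}" width="{bar_width}" '
            f'height="{bar_height}" fill="{bar_color}"/>'
        )
        parts.append(
            f'<text x="{x + bar_width // 2}" y="{height - 5}" '
            f'text-anchor="middle" font-size="12">{escape(str(label))}</text>'
        )
    parts.append("</svg>")
    return "\n".join(parts)


# Hypothetical class survey: how many people stood in each group.
survey = {"Loves sushi": 12, "Indifferent": 5, "Dislikes sushi": 3}
svg = bar_chart_svg(survey)
```

Saving `svg` to a file ending in `.svg` should give a picture that a browser can open. If the counts change, or a colleague asks for a different colour, we rerun the same function with new arguments instead of trying to remember what we clicked.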
How do we Spot Data Variable Types?
By asking questions!
Pronoun | Answer | Variable / Scale | Example | What Operations? |
---|---|---|---|---|
What, Who, Where, Whom, Which | Name, Place, Animal, Thing | Qualitative / Nominal | Name | |
How, What Kind, What Sort | A Manner / Method, Type or Attribute from a list, with list items in some “order” (e.g. good, better, improved, best) | Qualitative / Ordinal | | |
How Many / Much / Heavy? Few? Seldom? Often? When? | Quantities with Scale. Differences are meaningful, but not products or ratios | Quantitative / Interval | | |
How Many / Much / Heavy? Few? Seldom? Often? When? | Quantities with Scale and a Zero Value. Differences and Ratios / Products are meaningful (e.g. Weight) | Quantitative / Ratio | | |
As you go from Qualitative to Quantitative data types in the table, I hope you can detect a movement from fuzzy groups/categories to more and more crystallized numbers. Each variable/scale can also be subjected to the operations of the scales before it. In the words of S.S. Stevens,
the basic operations needed to create each type of scale is cumulative: to an operation listed opposite a particular scale must be added all those operations preceding it.
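The cumulative idea in that quote can be sketched in a few lines of Python. The statistics listed for each scale are a partial selection recalled from Stevens' 1946 table of permissible statistics, so treat the exact entries as an assumption; the names `SCALES` and `permitted_operations` are invented for this illustration.

```python
# Scales in Stevens' order, each paired with the operations that first
# become meaningful at that level (partial list, assumed for illustration).
SCALES = [
    ("nominal", ["count of cases", "mode"]),
    ("ordinal", ["median", "percentiles"]),
    ("interval", ["mean", "standard deviation"]),
    ("ratio", ["ratios", "coefficient of variation"]),
]


def permitted_operations(scale):
    """Return every operation allowed for `scale`, including inherited ones."""
    allowed = []
    for name, operations in SCALES:
        allowed.extend(operations)   # accumulate everything seen so far
        if name == scale:
            return allowed
    raise ValueError(f"unknown scale: {scale!r}")


print(permitted_operations("ordinal"))
# → ['count of cases', 'mode', 'median', 'percentiles']
```

An ordinal variable inherits counting and the mode from the nominal level, but the mean only becomes available once we reach the interval level.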
What Are the Parts of a Data Viz?
How to pick a Data Viz?
Most Data Visualizations use one or more of the following geometric attributes or aesthetics. These geometric aesthetics are used to represent qualitative or quantitative variables from your data.
What does that mean? We can think of simple visualizations as combinations of these aesthetics. Some examples:
Aesthetic #1 | Aesthetic #2 | Shape | Chart Picture |
---|---|---|---|
Position X = Quant Variable | Position Y = Quant Variable | Points/Circles with Fixed Size | |
Position X = Qual Variable | Position Y = Count of Qual Variable | Columns | |
Position X = Qual Variable | Position Y = Qual Variable | Rectangles, with area proportional to joint(X,Y) count | |
Position X = Qualitative Variable | Position Y = Rank Ordered Quant Variable | Box + Whisker, Box length proportional to Inter-Quartile Range, whisker-length proportional to upper and lower quartile resp. | |
Position X = Quant Variable | Position Y = Quant Variable + Qual Variable | | |
Quant Variable | Shape = Line with Quant Variable | | |
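One way to read the first four rows of the table is as a lookup from the kinds of variables placed on X and Y to a shape. The sketch below encodes just those rows in Python. The short keys (`"qual"`, `"quant"`, `"count"`), the function name `suggest_chart`, and the common chart names in brackets are my own labels, added as assumptions, and are not part of the table.

```python
# Keys are (kind of variable on X, kind of variable on Y).
# "count" stands for a count derived from a qualitative variable.
CHART_FOR = {
    ("quant", "quant"): "points of fixed size (scatter plot)",
    ("qual", "count"): "columns (bar chart)",
    ("qual", "qual"): "rectangles with area proportional to joint count (mosaic plot)",
    ("qual", "quant"): "box and whisker (box plot)",
}


def suggest_chart(x_kind, y_kind):
    """Look up a shape for the pair of variable kinds mapped to X and Y."""
    try:
        return CHART_FOR[(x_kind, y_kind)]
    except KeyError:
        raise ValueError(f"no suggestion for X={x_kind!r}, Y={y_kind!r}") from None


print(suggest_chart("qual", "quant"))
# → box and whisker (box plot)
```

A real choice of chart depends on more than two variable kinds, for example on how many categories there are and on what question we are asking, so this lookup is only a starting point.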
[1] The plural of rumpus is unlikely to be “rumpii”.