class: center, middle, inverse, title-slide

# nullabor
## tools for testing whether what you see in plots is really there
### Di Cook, Monash University
### NYC R meetup - June 6, 2019

Slides:
https://dicook.org/files/NYCR/slides.html
---
class: center, middle

# Why?

---
background-image: url("http://dicook.org/files/NYCR/images/polls1.png")
background-position: 50% 50%
background-size: 75%

---
background-image: url("http://dicook.org/files/NYCR/images/polls2.png")
background-position: 50% 50%
background-size: 75%

---
background-image: url("http://dicook.org/files/NYCR/images/wasps.png")
background-position: 50% 50%
background-size: 75%

---
background-image: url("http://dicook.org/files/NYCR/images/biomass.png")
background-position: 50% 50%
background-size: 75%

---
class: center, middle

# A lot of decisions that we make are based on plots, but there is no rigorous way to say that what we see is really there

---
# Outline

- why
- lineup, rorschach functions
- null generating mechanisms
- p-values
- power and plot design
- where to get more information
- acknowledgements

---
# Why inference?

- Plots of data allow us to uncover the unexpected, but what we see needs to be calibrated against what might be seen by chance, if there really is no underlying pattern
- Classical statistical inference allows us to compute how likely the observed value of a statistic is, if there really is no structure

---
# Post-hoc inference

- Inference is usually set up before collecting data
- Once you see it, it's too late
- You cannot legitimately test for significance of structure ...

but you can't always plan the future

---
# nullabor

- Lineup protocol: Plots your data among a field of "null" plots
    - Puts it in the context of what it might look like if there is really no structure
    - Encrypts the location of the data plot
- Rorschach protocol: Plots only nulls, and gives a sense of what random structure might be seen

---
background-image: url("http://dicook.org/files/NYCR/images/wasps1.png")
background-position: 50% 50%
background-size: 75%

---
background-image: url("http://dicook.org/files/NYCR/images/wasps2.png")
background-position: 50% 50%
background-size: 75%

---
background-image: url("http://dicook.org/files/NYCR/images/biomass1.png")
background-position: 50% 50%
background-size: 75%

---
background-image: url("http://dicook.org/files/NYCR/images/biomass2.png")
background-position: 50% 50%
background-size: 75%

---
background-image: url("http://dicook.org/files/NYCR/images/polls3.png")
background-position: 50% 50%
background-size: 75%

---
background-image: url("http://dicook.org/files/NYCR/images/polls4.png")
background-position: 50% 50%
background-size: 75%

---
# lineup functions

- `lineup`: Generates a lineup using one of the given null generating mechanisms
    - `null_permute`
    - `null_dist`
    - `null_lm`
    - `null_ts`
- `pvisual`: Computes `\(p\)`-values, after showing the lineup to impartial jurors
- `visual_power`: Computes the *power*, after showing the lineup to impartial jurors
- `distmet`: Empirical distribution of the distance between the data plot and the null plots

---
class: center, middle

# Let's do it!

Motivated by this article [How Data Made Me A Believer In New York City's Restaurant Grades](https://fivethirtyeight.com/features/how-data-made-me-a-believer-in-new-york-citys-restaurant-grades/), using data on NYC restaurant inspections tidied by [tidytuesday](https://github.com/rfordatascience/tidytuesday/tree/master/data/2018/2018-12-11)

How do restaurants rate?

---
class: center, middle

# Pick the plot that is most different from the others

---
class: center, middle

<img src="slides_files/figure-html/unnamed-chunk-3-1.png" width="100%" />

---

```
ggplot(lineup(null_permute("grade"), nyc_cuisine),
       aes(x=cuisine_description, fill=grade)) +
  geom_bar(position="fill") +
  xlab("") + ylab("") +
  scale_fill_ochre(palette="parliament") +
  facet_wrap(~.sample) +
  ggtitle("A grade % and cuisine") +
  coord_flip() +
  theme_bw() +
  theme(legend.position="none",
        axis.text.y = element_blank())
```

`\(H_o:\)` There is no difference in proportion A, B, C grade among cuisines

`\(H_a:\)` There is

Null generating mechanism: Permuting the values of grade

True data plot is in position 6.

---
class: center, middle

# Pick the plot that is most different from the others

---
class: center, middle

<img src="slides_files/figure-html/unnamed-chunk-4-1.png" width="100%" />

---

```
nyc_yrmth_l <- lineup(null_ts("pct", auto.arima),
                      filter(nyc_yrmth, grade=="A"))
ggplot(nyc_yrmth_l, aes(x=yrmth, y=pct)) +
  geom_line() +
  xlab("") + ylab("") +
  facet_wrap(~.sample) +
  ggtitle("Monthly A grade %")
```

`\(H_o:\)` There is NO temporal pattern in percentage of restaurants awarded an A grade

`\(H_a:\)` There is

Null generating mechanism: Simulating from an ARIMA model with same parameters

True data plot is in position 7.

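---
# Sketch: a permutation lineup by hand

To make the idea behind a permutation null concrete, here is a minimal base R sketch of embedding the data among shuffled copies. It is an illustration only, not nullabor's implementation: the `toy` data frame, the `make_panel` helper and the seed are all hypothetical.

```r
# Hand-rolled permutation lineup in base R (illustration only).
set.seed(2019)  # arbitrary seed so the sketch is reproducible

# Hypothetical toy data: two cuisines with different grade mixes.
toy <- data.frame(
  cuisine = rep(c("x", "y"), each = 50),
  grade   = c(sample(c("A", "B"), 50, replace = TRUE, prob = c(0.9, 0.1)),
              sample(c("A", "B"), 50, replace = TRUE, prob = c(0.6, 0.4)))
)

n_panels <- 20               # one data plot among 19 null plots
pos <- sample(n_panels, 1)   # secret position of the data plot

# Panel `pos` keeps the real data; every other panel shuffles grade,
# which breaks any association between cuisine and grade.
make_panel <- function(i) {
  panel <- toy
  if (i != pos) panel$grade <- sample(panel$grade)
  panel$.sample <- i
  panel
}
lineup_df <- do.call(rbind, lapply(seq_len(n_panels), make_panel))
```

`lineup_df` can then be plotted with `facet_wrap(~.sample)`, as in the lineups shown earlier. Shuffling keeps the overall grade counts fixed in every panel, so only the cuisine and grade association differs between the data plot and the nulls.
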
---
class: center, middle

# Pick the plot that is most different from the others

---

<img src="slides_files/figure-html/unnamed-chunk-7-1.png" width="100%" />

---

```
nyc_boro_grade_l <- lineup(null_dist("grade_A", "unif",
                                     params=list(min=0, max=1)),
                           nyc_boro_grade, n=6)
nyc_boro_grade_l_p <- nyc_boro_grade_l %>%
  group_by(.sample, boro) %>%
  summarise(prop_A = length(grade_A[grade_A>0.174])/length(grade_A))
```

`\(H_o:\)` The percentage of restaurants awarded an A grade in each borough is the same (82.5%)

`\(H_a:\)` It is not

Null generating mechanism: Simulating from a binomial model with same parameters

True data plot is in position 3

---
# Computing p-value

Probability of `\(x\)` or more independent observers picking the data plot, assuming that there is no difference between the data plot and the null plots.

```r
pvisual(5, 50)
```

```
##      x simulated     binom
## [1,] 5    0.1711 0.1036168
```

---
# Computing power

```r
data(turk_results)
visual_power(turk_results)
```

```
## # A tibble: 6 x 3
##   pic_id power     n
##    <int> <dbl> <int>
## 1     36 0        18
## 2    105 0.746    17
## 3    116 0.125    16
## 4    131 0.842    14
## 5    159 0.656    15
## 6    225 0.130    15
```

Useful for objectively determining best plot design, see [Hofmann et al (2012)](https://ieeexplore.ieee.org/document/6327249)

---
# What did we learn about NYC restaurant quality?

- There is a difference in percentage A, B, C between cuisines
- There is no temporal trend
- Some boroughs are more A class than others

---
# Summary

- Really useful package.
- Various null plot generators
- Embeds the data plot, and encrypts
- Computes significance and power
- Helps to adjust our expectations, dampen surprise, support surprise
- Calibrate your eyes on what randomness looks like

Original version of the package written by Hadley Wickham, `\(p\)`-value and power functions by Heike Hofmann, metrics and model nulls by Niladri Roy Chowdhury, and time series nulls and code updates by myself and Stuart Lee.
Website for package is [http://dicook.github.io/nullabor/articles/nullabor.html](http://dicook.github.io/nullabor/articles/nullabor.html).

---
# Acknowledgements

# 👩💻 Made by a human with a computer

Slides at [https://dicook.org/](https://dicook.org/files/NYCR/slides.html).

Code and data at [https://github.com/dicook/NYCR](https://github.com/dicook/NYCR).

<br>
Created using [R Markdown](https://rmarkdown.rstudio.com) with flair by [**xaringan**](https://github.com/yihui/xaringan), and Australianised **shinobu** style.

<br>
<a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-sa/4.0/88x31.png" /></a><br />This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/">Creative Commons Attribution-ShareAlike 4.0 International License</a>.
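
---
# Appendix: checking the binomial p-value

The `binom` value printed for `pvisual(5, 50)` can be checked by hand, assuming it is the plain binomial tail probability for a lineup of 20 panels in which each observer picks the data plot with chance 1/20 under the null. The variable names below are illustrative, and this is a sketch of the arithmetic rather than the package's own code.

```r
m <- 20   # panels in the lineup (assumed, matching lineup()'s default n)
K <- 50   # independent observers
x <- 5    # observers who picked the data plot

# P(X >= x) for X ~ Binomial(K, 1/m): the chance that x or more
# observers pick the data plot purely by guessing.
p_binom <- pbinom(x - 1, size = K, prob = 1 / m, lower.tail = FALSE)
p_binom
```

Under these assumptions the result agrees with the `binom` column shown on the p-value slide (about 0.1036). The `simulated` column is computed differently and is not reproduced by this calculation.
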