https://dicook.org/files/vISEC2020/slides_tourr.html
Image credit: Gentoo Penguins, Wikimedia Commons
tourr
install.packages("tourr")
help(package="tourr")
library("tourr")
Implements geodesic interpolation and basis generation functions that allow you to create new tour methods from R.
spinifex
install.packages("spinifex")
help(package="spinifex")
library("spinifex")
Implements manual control, where the contribution of a selected variable can be adjusted between -1 and 1 to examine how sensitive the structure in the data is to that variable. The result is an animation in which the variable is toured completely into and out of the projection.
geozoo
install.packages("geozoo")
help(package="geozoo")
library("geozoo")
Geometric objects defined in 'geozoo' can be simulated or displayed in the R package 'tourr'.
## R version 4.0.1 (2020-06-06)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Mojave 10.14.6
## 
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_AU.UTF-8/en_AU.UTF-8/en_AU.UTF-8/C/en_AU.UTF-8/en_AU.UTF-8
## 
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
## 
## other attached packages:
## [1] geozoo_0.5.1 spinifex_0.2.0 tourr_0.5.6
## [4] xaringanthemer_0.3.0
## 
## loaded via a namespace (and not attached):
## [1] sysfonts_0.8.1 digest_0.6.25 showtextdb_3.0 bitops_1.0-6
## [5] magrittr_1.5 evaluate_0.14 xaringan_0.16 rlang_0.4.6
## [9] stringi_1.4.6 rmarkdown_2.3 tools_4.0.1 stringr_1.4.0
## [13] showtext_0.8-1 xfun_0.14 yaml_2.2.1 compiler_4.0.1
## [17] htmltools_0.5.0 knitr_1.28
Grab the runthis.R file from https://github.com/dicook/vISEC2020 in the skills_showcase folder. (Or the slides_tour.Rmd for everything!)
remotes::install_github("allisonhorst/palmerpenguins")
library(tidyverse)
library(palmerpenguins)
penguins <- penguins %>%
  filter(!is.na(bill_length_mm))
| | species | island | bill_length_mm | bill_depth_mm | flipper_length_mm | body_mass_g | sex |
|---|---|---|---|---|---|---|---|
| 1 | Adelie | Torgersen | 39.1 | 18.7 | 181 | 3750 | male |
| 2 | Adelie | Torgersen | 39.5 | 17.4 | 186 | 3800 | female |
| 3 | Adelie | Torgersen | 40.3 | 18 | 195 | 3250 | female |
| 4 | Adelie | Torgersen | 36.7 | 19.3 | 193 | 3450 | female |
| 5 | Adelie | Torgersen | 39.3 | 20.6 | 190 | 3650 | male |
See https://allisonhorst.github.io/palmerpenguins/ for more details.
Images: Adélie, Gentoo, and Chinstrap penguins (Wikimedia Commons)
library(ochRe)
ggplot(penguins,
       aes(x=flipper_length_mm, y=body_mass_g,
           colour=species, shape=species)) +
  geom_point(alpha=0.7, size=2) +
  scale_colour_ochre(palette="nolan_ned") +
  theme(aspect.ratio=1, legend.position="bottom")
clrs <- ochre_pal(palette="nolan_ned")(3)
col <- clrs[as.numeric(penguins$species)]
animate_xy(penguins[,3:6], col=col, axes="off", fps=15)
A grand tour is by definition a movie of low-dimensional projections constructed in such a way that it comes arbitrarily close to showing all possible low-dimensional projections; in other words, a grand tour is a space-filling curve in the manifold of low-dimensional projections of high-dimensional data spaces.
$x_i \in \mathbb{R}^p$ is the $i$th data vector.
$F$ is a $p \times d$ orthonormal basis, $F'F = I_d$, where $d$ is the projection dimension.
The projection of $x_i$ onto $F$ is $y_i = F'x_i$.
The tour is indexed by time, $F(t)$, where $t \in [a, z]$. The starting and target frames are denoted $F_a = F(a)$ and $F_z = F(z)$.
The animation of the projected data is given by the path $y_i(t) = F'(t)x_i$.
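The projection step can be tried out in a few lines of base R. This is a minimal sketch rather than how tourr generates bases: the data are simulated stand-ins, and `random_basis()` is a hypothetical helper that orthonormalises a random matrix with a QR decomposition.

```r
# Sketch: project p-dimensional data onto a random d-dimensional orthonormal basis.
set.seed(2020)
n <- 100  # number of observations
p <- 4    # data dimension
d <- 2    # projection dimension

# Simulated stand-in data; row i of X holds the data vector x_i
X <- matrix(rnorm(n * p), nrow = n, ncol = p)

# Hypothetical helper (not a tourr function): QR gives a p x d matrix
# with orthonormal columns
random_basis <- function(p, d) {
  qr.Q(qr(matrix(rnorm(p * d), nrow = p, ncol = d)))
}

basis <- random_basis(p, d)

# Orthonormality check: F'F should equal I_d up to rounding error
ortho_error <- max(abs(crossprod(basis) - diag(d)))

# y_i = F'x_i for every i at once; Y is n x d with row i equal to y_i
Y <- X %*% basis
```

Plotting `Y` gives one frame of a tour; the animation comes from changing `basis` smoothly over time.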
A grand tour is like a random walk (with interpolation) through the space of all possible planes.
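The random-walk idea can be sketched in base R. Note the assumption: tourr uses geodesic interpolation between frames, whereas this sketch swaps in a simpler scheme that blends two frames linearly and re-orthonormalises the result. The helpers `orthonormalise()`, `random_basis()`, and `interpolate_frames()` are hypothetical names for illustration only.

```r
# Sketch: a random walk through 2D projections of 4D space, with a simple
# (non-geodesic) interpolation between randomly chosen target frames.
set.seed(1)
p <- 4
d <- 2

# QR-based orthonormalisation with the sign convention diag(R) > 0,
# so the result varies continuously with the input matrix
orthonormalise <- function(M) {
  decomp <- qr(M)
  qr.Q(decomp) %*% diag(sign(diag(qr.R(decomp))), ncol(M))
}

random_basis <- function(p, d) orthonormalise(matrix(rnorm(p * d), p, d))

# Blend the start and target frames, then re-orthonormalise each blend.
# Assumption: the blend keeps full column rank for every t used here.
interpolate_frames <- function(Fa, Fz, steps = 10) {
  lapply(seq(0, 1, length.out = steps), function(t) {
    orthonormalise((1 - t) * Fa + t * Fz)
  })
}

start_frame <- random_basis(p, d)
Fa <- start_frame
path <- list()
for (k in 1:3) {
  Fz <- random_basis(p, d)                  # choose a new random target
  path <- c(path, interpolate_frames(Fa, Fz))
  Fa <- Fz                                  # the target becomes the next start
}
```

Each element of `path` is a $p \times d$ orthonormal frame; projecting the data through the frames in order produces the movie.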
[Animations of tours on geometric shapes. Panels: Hollow, Solid, Hollow, Solid, Torus, Möbius]
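Read the length and direction of the axes relative to the pattern of interest:
Gentoo is separated from the others by a contrast of fl (flipper length) and bd (bill depth).
Chinstrap is separated from the others by a contrast of bl (bill length) and bm (body mass).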
There may be multiple and different combinations of variables that reveal similar structure. ☹️
The tour can help to discover these, too. 😂
New target bases are chosen using a projection pursuit index function $g$:
$\max_F \, g(F'x)$ subject to $F$ being orthonormal, $F'F = I_d$.
holes: This is an inverse Gaussian filter, which is optimised when there is not much data in the centre of the projection, i.e. a "hole" or donut shape in 2D.
central mass: The opposite of holes: high density in the centre of the projection, and often "outliers" on the edges.
LDA/PDA: An index based on the linear discriminant dimension reduction (and its penalised version), optimised by projections where the named classes are most separated.
Grand
Might accidentally see best separation
Guided, using LDA index
Moves to the best separation
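To make the idea of an index function concrete, here is a base R sketch of a holes-style index and a crude search over random bases. Several things are assumptions: the formula follows the form commonly given for the holes index on standardised data, the ring-shaped data are simulated, and the random search is a stand-in for illustration, not the optimiser that a guided tour in tourr uses.

```r
# Sketch: a holes-style projection pursuit index with a crude random search.
set.seed(42)
n <- 300

# Simulated data: a ring in variables 1-2, Gaussian noise in variables 3-4
theta <- runif(n, 0, 2 * pi)
r <- 1 + rnorm(n, sd = 0.05)
X <- cbind(r * cos(theta), r * sin(theta), rnorm(n), rnorm(n))
X <- scale(X)  # column standardisation only; a stand-in for full sphering

# Holes-style index: large when few points sit near the centre of the projection
holes_index <- function(Y) {
  d <- ncol(Y)
  (1 - mean(exp(-0.5 * rowSums(Y^2)))) / (1 - exp(-d / 2))
}

# Hypothetical helper: random p x d matrix with orthonormal columns
random_basis <- function(p, d) qr.Q(qr(matrix(rnorm(p * d), p, d)))

# Crude random search: evaluate the index on many random bases, keep the best
candidates <- replicate(500, random_basis(4, 2), simplify = FALSE)
index_values <- sapply(candidates, function(B) holes_index(X %*% B))
best_basis <- candidates[[which.max(index_values)]]

# Reference values: the plane containing the ring versus the pure-noise plane
ring_index <- holes_index(X[, 1:2])
noise_index <- holes_index(X[, 3:4])
```

The plane containing the ring scores higher than the noise plane, which is the behaviour the guided tour exploits when it moves towards projections with a higher index value.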
control the coefficient of one variable, reduce it to zero, increase it to 1, maintaining orthonormality
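For a 1D projection the idea reduces to simple arithmetic, sketched below in base R. This is a simplification under stated assumptions: spinifex handles 2D projections, which is more involved, and `set_coefficient()` is a hypothetical helper, not a spinifex function.

```r
# Sketch: manual control of one variable's coefficient in a 1D projection.
# The chosen coefficient is set to `value`; the remaining coefficients are
# rescaled so the projection vector keeps unit length.
set_coefficient <- function(f, k, value) {
  # Assumes f has unit length, |value| <= 1, and |f[k]| < 1
  stopifnot(abs(value) <= 1, abs(f[k]) < 1)
  rest_scale <- sqrt((1 - value^2) / (1 - f[k]^2))
  f_new <- f * rest_scale
  f_new[k] <- value
  f_new
}

f <- c(0.5, 0.5, 0.5, 0.5)  # unit-length starting projection in 4D

# Move variable 1 from fully out of the projection (0) to fully in (1)
values <- seq(0, 1, by = 0.25)
manual_path <- lapply(values, function(v) set_coefficient(f, 1, v))
lengths_sq <- sapply(manual_path, function(v) sum(v^2))
```

Every vector in `manual_path` has unit length, so each step is a valid 1D projection; at the final step the projection shows only variable 1.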
Rocks from and to a given projection, in order to observe the neighbourhood
Method 1, using plotly (see the "reading axes" code chunk):
frame pointing to your index
ggplotly
htmltools::save_html()
or try using spinifex::play_tour_path()
Method 2, using gifski and tourr::render_gif(). See lots of code chunks!
We can learn a little more about the data if we have a tour in the toolbox. It can help us to understand
Slides created via the R package xaringan, with iris theme created from xaringanthemer.
The chakra comes from remark.js, knitr, and R Markdown.
Slides are available at https://dicook.org/files/vISEC20/slides_tourr.html and supporting files at https://github.com/dicook/vISEC2020.
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.