layout: true

---
name: xaringan-title
class: inverse, left, middle

.pull-left[

# .center[Common misconceptions in statistics]

## .center[Adrian Barnett, QUT]

### .center[23 June 2022]

[
@aidybarnett](http://twitter.com/aidybarnett) [
@agbarnett](http://github.com/agbarnett) [
Median Watch](https://medianwatch.netlify.app) [
a.barnett@qut.edu.au](mailto:a.barnett@qut.edu.au)
]

.pull-right[  ]

---
## Pie charts are terrible

<img src="figures/pie_charts.jfif" width="62%" style="display: block; margin: auto;" />

From https://twitter.com/MaxCRoser/status/857389434756505600

---
## Truly terrible example

<img src="figures/pie_terrible.jpg" width="80%" style="display: block; margin: auto;" />

---
class:inverse

## 3D charts are even worse

<img src="figures/3d_pie_1.png" width="80%" style="display: block; margin: auto;" />

---
class:inverse

## 3D charts are even worse

<img src="figures/3d_pie_2.png" width="80%" style="display: block; margin: auto;" />

---
## Bar charts should not be used for continuous data

<img src="figures/pbio.1002128.g001.png" width="80%" style="display: block; margin: auto;" />

Weissgerber et al, DOI: 10.1371/journal.pbio.1002128

---
## Outliers should not be deleted

<img src="slides_files/figure-html/unnamed-chunk-6-1.png" style="display: block; margin: auto;" />

---
## Outliers should not be deleted

.pull-left[
<img src="slides_files/figure-html/unnamed-chunk-7-1.png" style="display: block; margin: auto;" />
]

.pull-right[
<img src="figures/rothwell.jpg" width="88%" style="display: block; margin: auto;" />

(from BBC)
]

Data from [Spyrou et al](https://www.nature.com/articles/s41586-022-04800-3#MOESM1)

---
class:inverse

<!--- two columns --->
.pull-left[
## The most exciting phrase to hear in science, the one that heralds new discoveries, is not “Eureka” but “That’s funny...”

Isaac Asimov (1920 - 1992)
]

.pull-right[
<img src="https://upload.wikimedia.org/wikipedia/commons/archive/3/34/20100906191953%21Isaac.Asimov01.jpg" width="320" height="464"/>

###### Image from Wikipedia, Phillip Leonian
]

---
# Too many numbers

## "The number of deaths in the last ten years was 1,456,231."

## "The percentage working was 44.23% and the percentage retired was 12.14%."

---
# Too many numbers

## "The number of deaths in the last ten years was 1.5 million."

## "The percentage working was 44% and the percentage retired was 12%."

Guidelines here: DOI: [10.1136/archdischild-2014-307149](https://adc.bmj.com/content/100/7/608)

---
class: inverse

.pull-left[
## “Lack of mathematical education does not become more evident than by excessive precision in numerical calculation”

Gauss (1777.241 - 1855.145)
]

.pull-right[
<img src="https://upload.wikimedia.org/wikipedia/commons/e/ec/Carl_Friedrich_Gauss_1840_by_Jensen.jpg" width="320" height="464"/>

###### Image from Wikipedia
]

---
# Continuous data do not have to be normal

<img src="slides_files/figure-html/unnamed-chunk-9-1.png" style="display: block; margin: auto;" />

---
# Continuous data do not have to be normal

<img src="slides_files/figure-html/unnamed-chunk-10-1.png" style="display: block; margin: auto;" />

---
# Continuous data do not have to be normal

<img src="slides_files/figure-html/unnamed-chunk-11-1.png" style="display: block; margin: auto;" />

---
class:inverse

# A common misconception

"It is widely but incorrectly believed that the t-test and linear regression are valid only for Normally distributed outcomes. This belief leads to the use of rank tests for which confidence intervals are very hard to obtain and interpret and to cumbersome data-dependent procedures where different transformations are examined until a distributional test fails to reject Normality."
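
The quoted claim can be checked with a small simulation. The sketch below is not from the talk: it is written in Python, assumes exponential data as a stand-in for a skewed outcome, and uses a hard-coded approximate t critical value for samples of 100. If the claim holds, the share of nominal 95% intervals that contain the true mean should land close to 95%.

```python
import random
import statistics

# Assumptions for illustration only (not taken from the slides):
# skewed Exponential(rate=1) data, samples of size 100, 2000 repeats.
SAMPLE_SIZE = 100
T_CRIT_99_DF = 1.984  # approximate 97.5% quantile of a t distribution with 99 df


def t_interval_coverage(n_sims=2000, seed=1):
    """Share of nominal 95% t-intervals that contain the true mean."""
    rng = random.Random(seed)
    true_mean = 1.0  # mean of an Exponential(rate=1) distribution
    covered = 0
    for _ in range(n_sims):
        sample = [rng.expovariate(1.0) for _ in range(SAMPLE_SIZE)]
        mean = statistics.fmean(sample)
        std_error = statistics.stdev(sample) / SAMPLE_SIZE ** 0.5
        lower = mean - T_CRIT_99_DF * std_error
        upper = mean + T_CRIT_99_DF * std_error
        if lower <= true_mean <= upper:
            covered += 1
    return covered / n_sims


coverage = t_interval_coverage()
print(f"Coverage of the nominal 95% interval: {coverage:.3f}")
```

Shrinking `SAMPLE_SIZE` or swapping in a heavier-tailed distribution is one way to explore where the approximation starts to struggle; the critical value would need updating to match the new degrees of freedom.
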
We only need to worry about **extreme** non-Normality

#### DOI: [10.1146/annurev.publhealth.23.100901.140546](https://pubmed.ncbi.nlm.nih.gov/11910059/)

---
class:center, middle

<img src="figures/screw_meme.png" width="88%" style="display: block; margin: auto;" />

---
# Rarely need non-parametric tests

<img src="figures/time_difference.png" width="88%" style="display: block; margin: auto;" />

---
# Testing for a difference between groups

.pull-left[
## Parametric t-test

* Difference of 6.3 days
* 95% confidence interval: 2.5, 10.0 days
* p-value = 0.001
]

.pull-right[
## Non-parametric Wilcoxon test

* W = 20428
* p-value = 0.0003
]

---
# Do not categorise continuous data

<img src="https://raw.githubusercontent.com/agbarnett/talks/master/skepticon/figures/malnutrition.jpg" width="75%" style="display: block; margin: auto;" />

Known as "Dichotomania"

---
class: inverse

# Measuring malnutrition

<!--- https://scopeblog.stanford.edu/2015/05/29/study-finds-arm-circumference-is-accurate-measure-of-malnutrition-in-children-with-diarrheal-illnesses/ --->

###### Image from Stanford Medicine (study was not by Stanford)

---
## Lab vs real life

<img src="figures/psa_journey1.jpg" width="62%" style="display: block; margin: auto;" />

---
## Lab vs real life

<img src="figures/psa_journey2.jpg" width="62%" style="display: block; margin: auto;" />

---
## Lab vs real life

<img src="figures/psa_journey3.jpg" width="62%" style="display: block; margin: auto;" />

---
## Lab vs real life

<img src="figures/psa_journey4.jpg" width="62%" style="display: block; margin: auto;" />

---
## Do not use p-values and statistical significance

<img src="figures/pval_journey.jpg" width="50%" style="display: block; margin: auto;" />

False positive probability = 9 / (9 + 12) = 43%

---
class:inverse

## Do not use p-values and statistical significance

.pull-left[
## "Significance tests are popular with non-statisticians, who like to feel certainty where no certainty exists"

(Yates and Healy 1964)

DOI:
[10.2307/2344003](https://www.jstor.org/stable/2344003)
]

.pull-right[
<img src="figures/Yates_Fisher_Cochran.jpg" width="60%" style="display: block; margin: auto;" />

Yates, Fisher and Cochran (Barry Eagel, Wikimedia Commons)
]

---
## Statistical significance has created huge biases in the literature

<img src="figures/z_values.png" width="80%" style="display: block; margin: auto;" />

---
## Don't use matching

.pull-left[
Matching is a very strong assumption and needs a very strong match

Widely applied inappropriately to match patients and hence eliminate confounding

Can easily create selection biases

Often reduces power
]

.pull-right[
<img src="figures/cards.jpg" width="60%" style="display: block; margin: auto;" />
]

---
## Understand differences in risk

<img src="figures/risks.png" width="60%" style="display: block; margin: auto;" />

Adapted from DOI: [10.1016/S0140-6736(10)62296-9](https://pubmed.ncbi.nlm.nih.gov/21353301/)

---
# Use research checklists

<img src="figures/checklists.png" width="89%" style="display: block; margin: auto;" />

[goodreports.org](https://www.goodreports.org/)

---
class:inverse

# Same statistical mistakes ad nauseam

<img src="figures/push_back.png" width="85%" style="display: block; margin: auto;" />

[Andrew Althouse](https://discourse.datamethods.org/t/reference-collection-to-push-back-against-common-statistical-myths/1787)

---
class:center, middle, inverse

## Work slowly and carefully

## Publish fewer papers

## Take time to learn the methods you use

## Check your work, try to break your models

---
## “Do you want to be credible or incredible?”

<img src="figures/simine.png" width="55%" style="display: block; margin: auto;" />

[Association for Psychological Science](https://www.psychologicalscience.org/observer/do-we-want-to-be-credible-or-incredible)

---
.pull-left[  ]

.pull-right[
<img src="figures/david.jpg" width="73%" style="display: block; margin: auto;" />
]
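
---
## False positive probability: a worked sketch

The p-value slide gives the false positive probability as 9 / (9 + 12) = 43%. The Python sketch below reproduces that arithmetic, assuming the 9 counts false positives and the 12 counts true positives among the "significant" results (the figure itself is not reproduced here, so that reading is an assumption). The second function expresses the same quantity in terms of rates. Its example inputs (10% of hypotheses true, 60% power, 5% significance level) are hypothetical values chosen only because they give the same ratio; they are not taken from the slides.

```python
def false_positive_probability(false_positives, true_positives):
    """Share of 'significant' results that are false positives."""
    return false_positives / (false_positives + true_positives)


def false_positive_probability_from_rates(prior_true, power, alpha):
    """Same quantity computed from rates instead of counts.

    prior_true: share of tested hypotheses that are real effects
    power:      chance that a real effect is declared significant
    alpha:      chance that a null effect is declared significant
    """
    false_pos_rate = alpha * (1 - prior_true)
    true_pos_rate = power * prior_true
    return false_pos_rate / (false_pos_rate + true_pos_rate)


# Counts as read from the slide (assumed meaning: 9 false, 12 true positives).
slide_value = false_positive_probability(9, 12)
print(f"{slide_value:.0%}")  # 43%

# Hypothetical rates, chosen for illustration only.
rate_value = false_positive_probability_from_rates(
    prior_true=0.10, power=0.60, alpha=0.05
)
print(f"{rate_value:.0%}")
```

Raising the assumed power or the assumed share of true hypotheses lowers the false positive probability, which is why the same 5% threshold can mean very different things across fields.
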