But keep statistical evidence. How? A statistician shares a writing sample.

photo by Anna Nekrashevich at https://www.pexels.com/photo/pen-business-eyewear-research-6801648/

(Scroll to the end to see the Writing Sample.)

What? Why?!

It’s not so much a technical problem. It’s a language and behavior problem — a failure in scientific communication. That is, it’s bad for science.

We can all improve science together! But changing deeply ingrained bad habits, especially at an institutional and societal level, takes time. Behavior change is hard — so let’s start now…


Two common wrong phrases about statistical significance

photo of a magnifying glass laid atop a table of numbers
photo by Ian Panelo at https://www.pexels.com/photo/magnifying-glass-on-textbook-4494644/

(NOTE: On 13 October 2020, I changed “detect” to “discern” to avoid confusion in analyses related to fields like signal detection or anomaly detection.)

You might see me twitch whenever I hear a lecturer, presenter, or researcher say “significant” when they clearly mean “statistically significant” (which in fact has nothing to do with being “clinically significant” or “scientifically significant”; see Wasserstein et al., 2019, and…


My Filipino American journey as a privileged Brown Asian immigrant

(This essay has been updated as “How A ‘Secret Asian Man’ Embraced Anti-Racism”, which can be found at https://laist.com/2020/09/25/race-in_la_how_a_secret_asian_man_embraced_anti-racism.php.)

Black, brown, and yellow bees working at their colony. (Photo by David Hablützel from Pexels. https://www.pexels.com/photo/animals-apiary-beehive-beekeeping-928978/.)

I came…


Significance does not imply importance — but you need it to judge quality

See the Appendix at rpubs.com/ericjdaza/607888 for a proof sketch that significance does not imply importance.

Disclaimer: This post is not an argument for or against the recent remdesivir findings. Rather, it’s meant to help you better distinguish the importance of clinical findings from the quality of the evidence for or against those findings. These two often get conflated in the news — even by medical doctors and health experts!

Technical Disclaimer: We’ll analyze the time to recovery as a continuous variable for simplicity, though a time-to-event / survival analysis is more appropriate.

Main Lessons

  1. Ask yourself if a randomized controlled trial’s reported effect size estimate is meaningful, regardless of sample size.
  2. Train yourself to internalize that significance…
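
To make the first lesson concrete, here is a minimal sketch in Python. It assumes time to recovery is treated as continuous (as in the Technical Disclaimer above) and, for simplicity, that both arms share a known standard deviation, so a plain two-sided z-test applies. The function name `two_sided_p` and every number below are hypothetical and chosen only for illustration; none of them come from the remdesivir trial or any other real study.

```python
import math


def two_sided_p(diff, sd, n_per_arm):
    """Two-sided z-test p-value for a difference in means between two equal-size arms.

    Assumes a known, common standard deviation `sd` in both arms.
    """
    se = sd * math.sqrt(2.0 / n_per_arm)  # standard error of the difference
    z = diff / se
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical numbers, not taken from any real trial:
# a 0.1-day difference in mean recovery time, with a 5-day standard deviation.
DIFF_DAYS = 0.1
SD_DAYS = 5.0

for n in (100, 100_000):
    p = two_sided_p(DIFF_DAYS, SD_DAYS, n)
    print(f"n per arm = {n:>7,}: effect = {DIFF_DAYS} days, p = {p:.2g}")
```

The estimated effect is the same 0.1 days in both runs; only the sample size changes. With the larger sample the p-value becomes very small, yet the effect is no more meaningful than it was before.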


Causal inference tutorial in R using synthetic data (Part 2)

Photo of the author’s glassboard filled with equations.
Photo of the author’s glassboard.

To review from Part 1 of this two-part tutorial using synthetic data:

Our analysis goal will be to help public health authorities in our simulated world reduce SARS-CoV-2 (“coronavirus”) infections. We believe our digital health or telemedicine app can help prevent new infections; e.g., by promoting healthy lifestyle choices —…


Causal inference tutorial in R using synthetic data (Part 1)

Photo of the author’s glassboard.

Consider a timely hypothetical twist on a classic example of spurious correlation: Recently, ice cream sales have been dropping — along with the number of homicides. But this isn’t because eating ice cream drives people to murder. It’s because a community-wide shelter-in-place mandate was enacted to prevent the spread of a novel infectious agent. …
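
The ice cream example can be sketched as a small simulation. This is an illustrative sketch in Python, not code from the tutorial (which uses R). All means, standard deviations, and helper names (`pearson`, `simulate_days`) are assumptions made up for this example. A shelter-in-place mandate lowers both daily ice cream sales and daily homicides, and within each period the two are generated independently.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_days(n_days, mandate, rng):
    """Simulate daily sales and homicides; the mandate shifts both means down.

    Sales and homicides are drawn independently, so any pooled
    correlation comes only from the mandate (the common cause).
    """
    sales_mean = 60.0 if mandate else 100.0      # hypothetical values
    homicide_mean = 6.0 if mandate else 10.0     # hypothetical values
    sales = [rng.gauss(sales_mean, 10.0) for _ in range(n_days)]
    homicides = [rng.gauss(homicide_mean, 2.0) for _ in range(n_days)]
    return sales, homicides


rng = random.Random(0)  # fixed seed so the run is reproducible
sales_before, hom_before = simulate_days(2000, False, rng)
sales_during, hom_during = simulate_days(2000, True, rng)

r_pooled = pearson(sales_before + sales_during, hom_before + hom_during)
r_before = pearson(sales_before, hom_before)
r_during = pearson(sales_during, hom_during)

print(f"pooled correlation:         {r_pooled:.2f}")
print(f"before the mandate only:    {r_before:.2f}")
print(f"during the mandate only:    {r_during:.2f}")
```

Pooling all days gives a clearly positive correlation, while the correlation within each period stays near zero. Comparing days with the same mandate status is a simple form of adjusting for the common cause.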


Eric J. Daza at the University of North Carolina Department of Biostatistics circa 2014.
Eric J. Daza (the author) circa 2014 at the Department of Biostatistics of the Gillings School of Global Public Health at The University of North Carolina at Chapel Hill. (Photo credit: J.P. + J.K.)

Epidemiologists, infectious-disease specialists in particular, are the ultimate domain experts in guiding data science solutions that model or otherwise analyze population-level health-related aspects of SARS-CoV-2 (“coronavirus”) and its health impacts (i.e., COVID-19 characteristics and effects). As such, it is encouraging to see more and more data science hackathons and projects that correctly recognize the need to recruit epidemiologists to guide solution development. …

Eric J. Daza, DrPH, MPS

data science, digital health, biostatistics, causal inference, n-of-1 | 🇺🇸🇵🇭 ericjdaza.com statsof1.org evidation.com | #blacklivesmatter #stopasianhate
