Hilary Parker
Biography
Hilary S. Parker is an American biostatistician and data scientist. She was formerly a data scientist at the personal styling company Stitch Fix and, before that, a senior data analyst at Etsy. Parker co-hosts the data analytics podcast Not So Standard Deviations with Roger Peng. She received her PhD in biostatistics from the Johns Hopkins Bloomberg School of Public Health.
Life and education
Parker graduated from Pomona College in 2008 with a bachelor's degree in molecular biology and mathematics. After earning her MHS, she obtained her PhD in biostatistics from the Johns Hopkins Bloomberg School of Public Health in 2013. Parker resides in San Francisco.
Parker's scientific research began during her PhD, in the areas of genomics and personalized medicine. Her research examined batch effects and their impact on prediction. Working alongside Jeffrey T. Leek, Parker developed methods for applying genomic technologies in personalized medicine. Batch effects confound data produced by high-throughput genomic technologies such as microarrays. Parker's work aimed to correct predictions influenced by batch effects, mitigating the impact of confounded genomic data, which matters because such data is used for diagnosis. In her dissertation, "Practical statistical issues in translational genomics," Parker proposed frozen surrogate variable analysis (fSVA) to improve prediction accuracy in public genomic studies and simulations.
Career and research
After her PhD, Parker went on to work as a data scientist in industry. Her first job was as a data analyst (later, senior data analyst) at Etsy, where she worked for approximately three years. Parker described her position as that of an internal statistical consultant, eventually focused on developing A/B tests and other experiments run by the company and on analyzing the resulting data. Opportunity sizing, experimentation, and impact analysis all played a role in how she supported the company's development.
In 2015, Parker began work on the podcast Not So Standard Deviations with co-host Roger Peng. The pair discuss data analytics, covering statistical computation, data cleaning, and R packages. The show is among the more popular data science and statistics podcasts, with over half a million downloads. The two also co-authored the book Conversations on Data Science, based on their conversations on the podcast. They recorded their 100th episode live on stage as a keynote presentation at the RStudio-sponsored rstudio::conf 2020.
After leaving Etsy, Parker became a data scientist at the personal styling site Stitch Fix. The company employs a human-in-the-loop algorithmic process to generate a recommended box of clothing that is shipped to subscribers. Parker optimized the algorithms the site uses to recommend clothes and helped determine what data was needed from clients to match them with clothing. She also worked on new forms of data generation and helped build the datasets used to assemble outfits. Parker left Stitch Fix in August 2020 to join the Joe Biden 2020 presidential campaign.
Parker speaks at conferences, often as a keynote speaker. She coined the term "opinionated analysis development" to describe a framework for producing robust data analysis that resembles some aspects of software design.
Awards
In 2012, Parker received the Helen Abbey Award from Johns Hopkins. This award is given to a student who intends to teach biostatistics.
Selected works
Parker has contributed to several publications and projects, including the following:
- Leek, Jeffrey T.; Johnson, W. Evan; Parker, Hilary S.; Jaffe, Andrew E.; Storey, John D. (March 15, 2012). "The sva Package for Removing Batch Effects and Other Unwanted Variation in High-throughput Experiments". Bioinformatics. 28 (6): 882–883. doi:10.1093/bioinformatics/bts034. ISSN 1367-4803. PMC 3307112. PMID 22257669.
- Parker, Hilary S.; Leek, Jeffrey T. (January 16, 2012). "The practical effect of batch on genomic prediction". Statistical Applications in Genetics and Molecular Biology. 11 (3): Article-10. doi:10.1515/1544-6115.1766. ISSN 1544-6115. PMC 3760371. PMID 22611599.
- Parker, Hilary (January 30, 2013). "Hillary: The Most Poisoned Baby Name in U.S. History". The Cut.
- Parker, Hilary S.; Leek, Jeffrey T.; Favorov, Alexander V.; Considine, Michael; Xia, Xiaoxin; Chavan, Sameer; Chung, Christine H.; Fertig, Elana J. (October 2014). "Preserving Biological Heterogeneity with a Permuted Surrogate Variable Analysis for Genomics Batch Correction". Bioinformatics. 30 (19): 2757–2763. doi:10.1093/bioinformatics/btu375. ISSN 1460-2059. PMC 4173013. PMID 24907368.
- Peng, Roger D.; Parker, Hilary (2016). Conversations On Data Science. Leanpub. Retrieved August 10, 2020.
- Parker, Hilary (August 2017). "Opinionated Analysis Development" (PDF). PeerJ. doi:10.7287/peerj.preprints.3210v1. Archived from the original (PDF) on November 12, 2020. Retrieved August 10, 2020.