Research distribution: each number represents a published or working paper (see below for the references). * indicates that I am the first author. Methodologically, I have focused on the intersection of:
Data integration: using fewer assumptions to extend causal inferences from 1) a trial population to a target population, 2) external data to a trial, and 3) multi-source data to a target population.
Heterogeneity of treatment effect: developing robust and clinically applicable methods, with valid inference, to estimate the treatment effect for patients who have similar conditional treatment effects, and to identify exceptional responders and optimal treatment strategies.
Prior knowledge integration: incorporating clinicians' knowledge to construct coherent and interpretable prediction models and identify clinically meaningful predictors.
I have employed theories in non/semi-parametric statistics, optimization, and survival analysis to develop reproducible machine learning methods with valid statistical inference.
I have experience with time-dependent treatments and covariates, and with repeated and survival outcomes. I have used data from (multi-center) randomized clinical trials, electronic health records, health insurance claims, and administrative databases to address statistical questions in cardiovascular diseases, oncology, multidrug-resistant tuberculosis, HIV, and mental illness. I am proficient in R, Python, and C++.
Fellowship:
Machine learning for prediction of clinical outcome and estimation of causal effect of direct oral anticoagulants among patients with atrial fibrillation, Fonds de Recherche du Québec Santé (FRQS 272161) doctorate training grant, 70,000 CAD. Role: PI, 2019-2022.
Grant submission:
Heterogenous Effect Analyses for caRdiovascular Treatments on Survival (HEARTS), NIH K99/R00 Pathway to Independence Award, budget: 1M USD. Role: PI. Mentor team: Issa Dahabreh, James Robins, Miguel Hernán, Kosuke Imai, Lu Tian, Jon Steingrimsson, Robert Yeh, and Robert Giugliano.
Studying Phenotypes in ARDS Research Consortium (SPARC), Canadian Institutes of Health Research (CIHR) Project Grant Competition, budget: 3M CAD. Role: Co-I. PIs: Eddy Fan and Ewan Goligher.
1. G. Wang*, A. Levis, J. Steingrimsson, IJ. Dahabreh. Efficient estimation of subgroup treatment effects using multi-source data, arXiv
2. G. Wang'*, S. McGrath', Y. Lian', IJ. Dahabreh, CausalMetaR: An R package for performing causally interpretable meta-analyses, arXiv, R package
3. G. Wang, MP. Costello*, H. Pang, J. Zhu, HJ. Helms, I. Reyes-Rivera, RW. Platt, M. Pang, A. Koukounari. Evaluating hybrid controls methodology in early-phase oncology trials: a simulation study based on the MORPHEUS-UC trial, Pharmaceutical Statistics, (2024); 23(1): 31-45. doi:10.1002/pst.2336 Wiley
4. H. Bian*, M. Pang, G. Wang, Z. Lu, Non-collapsibility and Built-in Selection Bias of Hazard Ratio in Randomized Controlled Trials, in revision, BMC Medical Research Methodology, arXiv
5. G. Wang*, A. Levis, J. Steingrimsson, IJ. Dahabreh. Causal inference under transportability assumptions for conditional relative effect measures, arXiv
6. L. Ung*, G. Wang, S. Haneuse, M. Hernán, IJ. Dahabreh. Combining an experimental study with external data: study designs and identification strategies, arXiv
7. Y. Liu, ME. Schnitzer, G. Wang, E. Kennedy, P. Viiklepp, MH. Vargas, G. Sotgiu, D. Menzies, and A. Benedetti*. Modeling Treatment Effect Modification in Multidrug-Resistant Tuberculosis in an Individual Patient Data Meta-Analysis, Statistical Methods in Medical Research (2021). doi:10.1177/09622802211046383, SAGE
8. G. Wang'*, S. Liu', S. Yang. Continuous-time Structural Failure Time Models with intermittent treatment. arXiv
9. G. Wang, ME. Schnitzer, D. Menzies, P. Viiklepp, TH. Holtz, and A. Benedetti. Estimating treatment importance in multiple drug resistant tuberculosis using Targeted Learning: an observational individual patient data Network Meta-Analysis, Biometrics (2020) 76(3):1007-1016. Wiley
10. AA. Siddique, ME. Schnitzer*, A. Bahamyirou, G. Wang, A. Benedetti et al. Causal Inference for polypharmacy: Propensity score estimation with multiple concurrent medications, Statistical Methods in Medical Research, (2019) 28(12):3534-3549. SAGE
11. G. Wang*. Review 1: "Antibiotic Prescribing in Remote Versus Face-to-Face Consultations for Acute Respiratory Infections in English Primary Care: An Observational Study Using TMLE." Rapid Reviews Infectious Diseases. (2023) RR
12. A. Jaman, G. Wang, A. Ertefaie, M. Bally, R. Lévesque, RW. Platt, and ME. Schnitzer*. Penalized G-estimation for effect modifier selection in the structural nested mean models for repeated outcomes, in revision, Biometrics, arXiv, R package
13. ME. Schnitzer*, A. Ertefaie, D. Talbot, G. Wang, D. Berger, J. O'Loughlin, M.P. Sylvestre. Longitudinal outcome-adaptive and marginal fused Lasso for confounder selection and model pooling with time-varying treatments
14. G. Wang*, ME. Schnitzer, T. Chen, R. Wang, RW. Platt. A general framework for formulating structured variable selection, Transactions of Machine Learning Research, ISSN: 2835-8856 (2024); OpenReview
15. G. Wang*, ME. Schnitzer, RW. Platt, R. Wang, M. Dorais, S. Perreault, Integrating complex selection rules into the latent overlapping group Lasso for constructing coherent prediction models, in revision, Statistics in Medicine, arXiv
16. G. Wang*', Y. Lian', A. Yang, RW. Platt, R. Wang, M. Dorais, S. Perreault, M. Schnitzer. Structured learning in Cox models with time-dependent covariates, Statistics in Medicine, 2024; 1-20. doi:10.1002/sim.10116, Wiley, R package
17. A. Bouchard, F. Bourdeau, J. Roger, VT. Taillefer, N. Sheehan, ME. Schnitzer, G. Wang, IJ. Jean Batiste, and R. Therrien*. Predictive Factors of Detectable Viral Load in HIV Infected Patients, AIDS Research and Human Retroviruses, (2022) 38(7):552-560, Liebertpub
18. X. Di, Y. Chi, L. Xiang, G. Wang, B. Liao*. Association between Sitting Time and Urinary Incontinence in the US Population: data from the National Health and Nutrition Examination Survey (NHANES) 2007 to 2018, Heliyon (2024) 10(6):E27764, Elsevier
19. G. Wang, TE. Liao, D. Furfaro, LA. Celi*, KS. Ma*, Extending inference from randomized clinical trials to target populations: a scoping review of transportability methods, arXiv.
20. G. Wang, PJ. Heagerty, IJ. Dahabreh*, Effect score analyses in randomized clinical trials, Journal of the American Medical Association. 2024;331(14):1225–1226. doi:10.1001/jama.2024.3376, JAMA
21. G. Wang'*, R. Karlsson'*, J. Krijthe, IJ. Dahabreh, Robust integration of external control data in randomized trials, in revision, Biometrics, arXiv
About structured variable selection: We encourage researchers to understand the data thoroughly before applying any analytical method. Integrating that understanding into the analysis can substantially improve model performance. In variable selection, for example, incorporating specific selection rules can improve prediction accuracy, reduce the false alarm rate, and, notably, make the selected model more interpretable. Examples of simple rules include "if the interaction is chosen, then all or at least one of the main terms must also be selected", "if the subtopic is selected, then the overarching topic should also be chosen", "select at least one gene from each pathway", and "collectively select dummy variables representing a categorical variable." We have developed unified methods for systematically integrating all available information into the analysis process.
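The selection rules quoted above can be written down as simple predicates on a set of selected variables. The following Python sketch is only an illustration of what such rules mean; it is not the method from the papers listed here, which build the rules into the estimation itself rather than checking them afterwards. All function and variable names (`strong_heredity`, `violated`, `"A:B"`, `gene1`, `race_2`, and so on) are hypothetical.

```python
from typing import Callable, Dict, FrozenSet, Iterable, List

Selection = FrozenSet[str]
Rule = Callable[[Selection], bool]


def strong_heredity(child: str, parents: Iterable[str]) -> Rule:
    """If `child` is selected, ALL of `parents` must be selected.

    Covers "interaction implies all main terms" and, with a single
    parent, "subtopic implies overarching topic".
    """
    parents = list(parents)
    return lambda s: child not in s or all(p in s for p in parents)


def weak_heredity(child: str, parents: Iterable[str]) -> Rule:
    """If `child` is selected, AT LEAST ONE of `parents` must be selected."""
    parents = list(parents)
    return lambda s: child not in s or any(p in s for p in parents)


def at_least_one(group: Iterable[str]) -> Rule:
    """Select at least one variable from the group (e.g. one gene per pathway)."""
    group = list(group)
    return lambda s: any(v in s for v in group)


def all_or_none(group: Iterable[str]) -> Rule:
    """Select the group collectively (e.g. dummies of one categorical variable)."""
    group = list(group)
    return lambda s: all(v in s for v in group) or not any(v in s for v in group)


def violated(selection: Iterable[str], rules: Dict[str, Rule]) -> List[str]:
    """Return the names of the rules that the selected set does not respect."""
    s = frozenset(selection)
    return [name for name, rule in rules.items() if not rule(s)]


# Hypothetical rule set over hypothetical variable names.
rules = {
    "strong heredity for A:B": strong_heredity("A:B", ["A", "B"]),
    "one gene per pathway": at_least_one(["gene1", "gene2"]),
    "race dummies together": all_or_none(["race_2", "race_3"]),
}

ok = violated({"A", "B", "A:B", "gene1"}, rules)
bad = violated({"A", "A:B", "race_2"}, rules)
print(ok)   # no rule is violated
print(bad)  # all three rules are violated
```

A check like this is useful for auditing a model chosen by an unstructured selector; a structured selector would instead restrict the search to sets for which `violated` returns an empty list.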