Nam Sy Vo, PhD

Director, Center for Biomedical Informatics, VinBigdata

Affiliate Faculty, College of Engineering and Computer Science, VinUniversity

About

Dr. Nam Sy Vo is Director of the Center for Biomedical Informatics at Vingroup Big Data Institute and an Affiliate Faculty member at the College of Engineering and Computer Science at VinUniversity, Vietnam. Previously, Dr. Vo worked as a Senior Bioinformatics Scientist at the Center for Translational Data Science at The University of Chicago. He trained as a postdoctoral fellow at The University of Texas MD Anderson Cancer Center after obtaining a PhD in Computer Science from The University of Memphis, USA.

Dr. Vo’s current research focuses on the analysis and interpretation of large-scale multi-omics data to understand disease risk and adverse drug reactions. He leads several projects that sequence and analyze Vietnamese genomes at population scale to study complex diseases in Viet Nam. His group is also building platforms for managing, analyzing, and sharing large-scale biomedical datasets, designed for reproducibility, portability, and scalability. He is also interested in applications of Machine Learning and Data Science in bioinformatics.

Previously, Nam developed computational methods for sequence alignment and variant calling using next-generation sequencing data. Some of these methods focus on analyzing complex regions of the genome such as human leukocyte antigens and T-cell receptors. He has also developed methods for gene expression analysis that focus on predicting patterns of gene response. His work has been applied to several of the world’s largest datasets, including The Cancer Genome Atlas (TCGA) and Therapeutically Applicable Research to Generate Effective Therapies (TARGET).

Research Interests

  • Bioinformatics, with an emphasis on multi-omics data analysis.
  • Data Science/Machine Learning, with an emphasis on biomedical data analysis.
  • High Performance Computing, with an emphasis on cloud computing and workflow acceleration.

Research Experience

Director, Center for Biomedical Informatics, Vingroup Big Data Institute, Vietnam.   2019-present

  • 1000 Vietnamese Genomes Project, Adverse Drug Reactions, Antimicrobial Resistance.
  • Management, Analysis, Sharing, and Harmonization of Big Biomedical Data.
  • Analysis and interpretation of short/long-read sequencing data from WGS/WES.

Senior Bioinformatics Scientist, Center for Translational Data Science, The University of Chicago, USA.    2017-2019

  • Workflows for Somatic Variant Calling, Structural Variant Detection, Variant Annotation.
  • Analysis and interpretation of short-read sequencing data from gene panels/WES/WGS.
  • Workflow analysis for large-scale biomedical data.

Postdoctoral Fellow, MD Anderson Cancer Center, The University of Texas, USA.   2016-2017

  • Computational methods for HLA Typing, TCR Profiling, Neo-antigen Prediction.
  • Analysis and interpretation of WGS/WES and mass spectrometry (MS) data.

Research Assistant, Department of Computer Science, The University of Memphis, USA.   2009-2016

  • Computational methods for Sequence Alignment, Variant Calling, Gene Expression Analysis.
  • Analysis and interpretation of WGS and microarray data.

Teaching Experience

Affiliate Faculty, College of Engineering and Computer Science, VinUni, Vietnam.   2020-present

Teaching Assistant, Department of Computer Science, The University of Memphis, USA.   2009-2013

  • Undergraduate courses: Data Structures, Design and Analysis of Algorithms, Foundations of Computing.
  • Graduate courses: Bioinformatics Algorithms, Advanced Topics in Algorithms, Data Mining.

Lecturer, Department of Systems Engineering, National University of Civil Engineering, Vietnam.      2005-2009

  • Undergraduate courses: Introduction to Computer Science, C/C++ Programming Languages, Networks Application Programming with Java.

Awards and Honors

  • Best paper award, IEEE KSE conference, Vietnam.   2019
  • Best paper runner-up award, MCBIOS conference, USA.  2014
  • Honor for student research advising, National University of Civil Engineering, Vietnam. 2006-2008
  • Scholarship for excellent students, Hanoi University of Technology, Vietnam. 2000-2004
  • Third prize in the National Olympiad in Mathematics for K12 students, Vietnam.  1999
  • Fourth prize in the Mathematical Problem-Solving Competition for high-school students, Journal of Mathematics and Youth, Vietnam Mathematical Society, Vietnam.   1999
  • Fourth prize in the National Olympiad in Mathematics for K9 students, Vietnam.    1996
  • Top prizes in Hatinh Province Olympiad in Mathematics and Physics, Vietnam.    1996-1999

Educational Background

Ph.D., Computer Science, The University of Memphis, USA.    2016

Dissertation: Computational Methods for Gene Expression and Genomic Sequence Analysis.

M.S., Bioinformatics, The University of Memphis, USA.   2011

Thesis: Analysis of Microarray Data with Directed Graphs.

M.S., Computer Science, Hanoi University of Science and Technology, Vietnam. 2007

Thesis: XML Schema Automatic Matching.

B.S., Computer Science, Hanoi University of Science and Technology, Vietnam. 2004

Project: Normalization of Relational Database Schemas.

Software Packages

  • MASH (VinBigdata, closed source platform): Management, Analysis, Sharing, and Harmonization of Big Biomedical Data.
  • VEPSpark: VEP Acceleration using Spark.
  • ClinAnnot (UChicago, closed source tool): Clinical Annotation of Genomic Variants.
  • pMHCb: Neo-antigen prediction using MS data.
  • IVC: Variant calling using WGS data.
  • RandAL: Short-read alignment using WGS data.
  • mDAG: Gene expression pattern prediction using microarray data.

Research Grants

  • VINIF.2020.DA02. Vietnamese Genome-based Prediction of Disease Risks (VGP). Multiple Principal Investigator. (2020-2023)
  • VINIF.2019.DA109. Fighting antibiotic-resistant pathogenic bacteria using genomics sequencing and big data analytics. Senior member. (2019-2022)
  • CPRIT-IIRACB RP180248. Characterizing cancer genome instability and translational impact using new sequencing technologies. Grant writer. (2018-2021)
  • NSF-CCF 1320297. Analysis of gene expression data using transitive directed graphs. Grant writer and key researcher. (2013-2016)

Publications

Journals:

  • With SEAPharm network. Prevalence of pharmacogenomic variants in 100 pharmacogenes among Southeast Asian populations under the collaboration of the Southeast Asian Pharmacogenomics Research Network (SEAPharm). Human Genome Variation volume 8, Article number: 7 (2021).
  • Hang Tong, Nga VT Phan, Thanh T. Nguyen, Dinh Nguyen, Nam Sy Vo, Ly Le. Review on Databases and Bioinformatic Approaches on Pharmacogenomics of Adverse Drug Reactions. Pharmgenomics Pers Med. 2021; 14: 61–75.
  • With the TCGA PanCanAtlas Immune Response Working Group. The Immune Landscape of Cancer. Immunity, 48(4), 812-830 (2018).
  • With the TCGA PanCanAtlas Fusion/Splicing Working Group. Systematic Analysis of Splice-Site-Creating Mutations in Cancer. Cell Reports, 23(1), 270-281 (2018).
  • Nam S. Vo, Vinhthuy Phan. Leveraging Known Genomic Variants to Improve Detection of Variants, Especially Close-by Indels. Bioinformatics, 34(17), 2918–2926 (2018).
  • Nam S. Vo, Vinhthuy Phan. Exploiting Dependencies of Pairwise-comparison Outcomes to Predict Patterns of Gene Response. Best paper runner-up award, MCBIOS 2014. BMC Bioinformatics, 15(S-11): S2 (2015).
  • Vinhthuy Phan, Shanshan Gao, Quang Tran, Nam S. Vo. How Genome Complexity Can Explain the Hardness of Aligning Reads to Genomes. BMC Bioinformatics, 16(S-17): S3 (2015).
  • Nam S. Vo, Quang Tran, Nobal Niraula, Vinhthuy Phan. RandAL: A Randomized Approach to Aligning DNA Sequences to Reference Genomes. BMC Genomics, 15(S-5): S2 (2014).

Conferences:

  • Duc Tran, Frederick C Harris, Bang Tran, Nam Sy Vo, Hung Nguyen, Tin Nguyen. Single-cell RNA sequencing data imputation using deep neural network. In: Latifi S. (eds) ITNG 2021, Advances in Intelligent Systems and Computing, vol 1346. Springer, Cham.
  • Quang Tran, Nam S. Vo, Eric Hicks, Tin Nguyen, Vinhthuy Phan. Analysis of Short-read Aligners using Genome Sequence Complexity. IEEE International Conference on Knowledge and Systems Engineering (KSE), October 2020.
  • Bang Tran, Duc Tran, Hung Nguyen, Nam S. Vo, Tin Nguyen. RIA: a novel Regression-based Imputation Approach for single-cell RNA sequencing. IEEE International Conference on Knowledge and Systems Engineering (KSE), October 2019.
  • Quang Tran, Shanshan Gao, Nam S. Vo, Vinhthuy Phan. Repeat Complexity of Genomes as a Means to Predict the Performance of Short-read Aligners. International Conference on Bioinformatics and Computational Biology (BICoB), April 2016.
  • Vinhthuy Phan, Shanshan Gao, Quang Tran, Nam S. Vo. How Genome Complexity Can Explain the Hardness of Aligning Reads to Genomes. IEEE International Conference on Computational Advances in Bio and Medical Sciences (ICCABS), June 2014.
  • Nam S. Vo, Quang Tran, Nobal Niraula, Vinhthuy Phan. A Randomized Algorithm for Aligning DNA Sequences to Reference Genomes. ICCABS, June 2013.
  • Nam S. Vo and Vinhthuy Phan. Exploiting Dependencies of Patterns in Gene Expression Analysis using Pairwise Comparisons. International Symposium on Bioinformatics Research and Applications, May 2013.
  • Nam S. Vo, Thomas Sutter, Vinhthuy Phan. Inferring Directed-graph Patterns of Gene Responses in Gene Expression Studies with Multiple Treatments. BICoB, March 2013.
  • Nam S. Vo, Vinhthuy Phan, Thomas Sutter. Predicting Possible Directed-graph Patterns of Gene Expressions in Studies Involving Multiple Treatments. ACM Conference on Bioinformatics, Computational Biology and Biomedicine (ACM-BCB), October 2012.
  • Nam S. Vo and Vinhthuy Phan. Pattern Analysis: A Web-based Tool for Analyzing Response Patterns in Low-replication, Many-treatment Gene Expression Data. ACM-BCB, October 2012.

Others (abstracts, posters):

  • Nam S. Vo, Zhenyu Zhang, and Robert L. Grossman. Somatic Variant Detection from Tumor-only Samples. Conference on Intelligent Systems for Molecular Biology (ISMB), July
  • Quang Tran, Shanshan Gao, Nam S. Vo, Vinhthuy Phan. A linear model for predicting performance of short-read aligners using genome complexity. UT-ORNL-KBRIN 2015, BMC Bioinformatics, P17 (2015).
  • Nam S. Vo, Quang Tran, Vinhthuy Phan. An Integrated Approach for SNP Calling based on Population of Genomes. UT-ORNL-KBRIN 2014, BMC Bioinformatics 15(Suppl 10): P30 (2014).
  • Nam S. Vo, Vinhthuy Phan. Exploiting the bootstrap method to analyze patterns of gene expression. UT-ORNL-KBRIN 2014, BMC Bioinformatics 15, P19 (2014).
  • Nam S. Vo and Vinhthuy Phan. Using Partially Ordered Sets to Represent and Predict True Patterns of Gene Response to Treatments. UT-ORNL-KBRIN 2013, BMC Bioinformatics 14(Suppl 17): A20 (2013).
  • Vinhthuy Phan, Nam S. Vo, Thomas R. Sutter. mDAG: A Web-based Tool for Analyzing Microarray Data with Multiple Treatments. UT-ORNL-KBRIN 2011, BMC Bioinformatics 12(Suppl 7): A7 (2011).

Under review:

  • Thach Pham, Son Ho, Quynh Pham, Anh Nguyen et al. Artificial Intelligence powered, Melting-Spectrum PCR (AIMS-PCR) enables massive screening for SARS-CoV-2.

Professional Activities

  • Reviews:
    • Journal reviews: Nature Methods, Bioinformatics, Frontiers in Oncology, BMC Bioinformatics, BMC Medical Genomics.
    • Conference reviews: RECOMB, RECOMB-CBB, AICoB, BIBM.
    • Program committee: IEEE International Conference on Knowledge and Systems Engineering, Genomic Medicine Conference.
    • Panel reviews: Vingroup Innovation Foundation (VINIF).
  • Invited talks:
    • Genomic Big Data: What can we do? National University of Civil Engineering, Viet Nam, 11/2019.
    • Mathematical Methods for the Understanding of the Human Genome. International Graduate Summer School in Mathematics, Ha Noi, Viet Nam, 08/2019.
    • Mathematical Methods for the Understanding of the Human Genome. VN-USA Joint Mathematical Meeting, Quy Nhon, Viet Nam, 06/2019.
    • Predicting Response to Cancer Immunotherapy: Big Data Approaches. Genomic Medicine Conference, Ha Noi, Viet Nam, 06/2019.
    • Towards a Software Platform for Big Data in Biomedical Research. Vingroup Institute of Big Data, Vietnam, 12/2018.
    • NCI’s Genomic Data Commons: Research and Development, Vinmec Research Institute of Stem Cell and Gene Technology, Viet Nam, 09/2018.
    • Genomic Variant Analysis for Computational Immunogenomics, Vinmec Research Institute of Stem Cell and Gene Technology, Viet Nam, 07/2017.
  • Other talks:
    • Neoantigen Predictions from Splice-creating Mutations. TCGA PanCanAtlas Fusion/Splicing Working Group Teleconference, 10/2017.
    • From Genomic Variant Analysis to Computational Immunogenomics. The University of Chicago, USA, 09/2017.
    • Genomic Variant Analysis for Cancer Immunogenomics. The New York Genome Center, USA, 09/2017.
    • Neoantigen Predictions from InDels, TCGA PanCanAtlas Immune Response Working Group Teleconference, 04/2017.
    • Computational Methods for Genomic Variant and Gene Expression Analysis, The University of Texas MD Anderson Cancer Center, USA, 06/2016.
  • Oral/poster presentations:
    • Leveraging Known Genomic Variants to Improve Variant Detection. ISMB, poster presentation, 07/
    • Somatic Variant Detection from Tumor-only Samples. ISMB, poster presentation, 07/
    • Improving Variant Calling by Incorporating Known Genetic Variants into Read Alignment, MCBIOS, poster presentation, 03/2015.
    • Predicting True Patterns of Gene Response to Treatments in Expression Analysis using Pairwise Comparisons. MCBIOS, selected oral presentation, 03/2014.
    • Using Partially Ordered Sets to Represent and Predict True Patterns of Gene Response to Treatments, UT-ORNL-KBRIN Summit, selected oral presentation, 03/2013.
    • Predicting Possible Directed-graph Patterns of Gene Expressions in Studies Involving Multiple Treatments, ACM-BCB, poster presentation, 10/2012.
    • Pattern Analysis: A Web-based Tool for Analyzing Response Patterns in Low-replication, Many-treatment Gene Expression Data, ACM-BCB, poster presentation, 10/2012.