Current and former members of the JSB Group are in bold
author* indicates equal contribution
author indicates corresponding author(s)
Statistical rigor in omics data analysis
- Zhou, H.J., Li, L., Li, Y., Li, W., and Li, J.J. (2022). PCA outperforms popular hidden variable inference methods for QTL mapping. Genome Biology 23:210. [ SOFTWARE ] | [ PDF ]
- Li, Y.*, Ge, X.*, Peng, F., Li, W., and Li, J.J. (2022). Exaggerated false positives by popular differential expression methods when analyzing human population samples. Genome Biology 23:79. [ CODE ] | [ PDF ]
- Ge, X.*, Chen, Y.E.*, Song, D., McDermott, M., Woyshner, K., Manousopoulou, A., Wang, N., Li, W., Wang, L.D., and Li, J.J. (2021). Clipper: p-value-free FDR control on high-throughput data from two conditions. Genome Biology 22:288. [ UCLA News ] [ SOFTWARE ] [ CODE ] [ VIDEO ] | [ PDF ]
- Li, J.J. and Tong, X. (2020). Statistical hypothesis testing versus machine-learning binary classification: distinctions and guidelines. Patterns 1(7):100115. [ UCLA News ] | [ PDF ]
Single-cell RNA-seq
- Song, D., Wang, Q., Yan, G., Liu, T., and Li, J.J. (2023). scDesign3 generates realistic in silico data for multimodal single-cell and spatial omics. Nature Biotechnology. [ SOFTWARE ] | [ PDF ]
- Jiang, R., Sun, T., Song, D., and Li, J.J. (2022). Statistics or biology: the zero-inflation controversy about scRNA-seq data. Genome Biology 23:31. [ CODE ] | [ PDF ]
- Song, D., Li, K., Hemminger, Z., Wollman, R., and Li, J.J. (2021). scPNMF: sparse gene encoding of single cells to facilitate gene selection for targeted gene profiling. Bioinformatics 37(Supplement_1):i358-i366. [ ISMB/ECCB 2021 ] [ SOFTWARE ] | [ PDF ]
- Sun, T., Song, D., Li, W.V., and Li, J.J. (2021). scDesign2: a transparent simulator that generates high-fidelity single-cell gene expression count data with gene correlations captured. Genome Biology 22:163. [ RECOMB 2021 ] [ SOFTWARE ] [ CODE ] | [ PDF ]
- Song, D. and Li, J.J. (2021). PseudotimeDE: inference of differential gene expression along cell pseudotime with well-calibrated p-values from single-cell RNA sequencing data. Genome Biology 22:124. [ SOFTWARE ] [ CODE ] | [ PDF ]
- Xi, N.M. and Li, J.J. (2021). Benchmarking computational doublet-detection methods for single-cell RNA sequencing data. Cell Systems 12:1-19. [ CODE ] [ DATA ] | [ PDF ]
- Li, W.V. and Li, J.J. (2019). A statistical simulator scDesign for rational scRNA-seq experimental design. Bioinformatics 35(14):i41-i50. [ ISMB/ECCB 2019 ] [ SOFTWARE ] | [ PDF ]
- Li, W.V. and Li, J.J. (2018). An accurate and robust imputation method scImpute for single-cell RNA-seq data. Nature Communications 9:997. [ UCLA News ] [ SOFTWARE ] | [ PDF ]
Bulk RNA-seq isoform discovery and quantification
- Li, W.V.*, Li, S.*, Tong, X., Deng, L., Shi, H., and Li, J.J. (2019). AIDE: annotation-assisted isoform discovery with high precision. Genome Research 29:2056-2072. [ SOFTWARE ] [ COVER ART ] [ UCLA News ] | [ PDF ]
- Li, W.V. and Li, J.J. (2018). Modeling and analysis of RNA-seq data: a review from a statistical perspective. Quantitative Biology 6(3):195-209. [ PDF ]
- Li, W.V.*, Zhao, A., Zhang, S., and Li, J.J.* (2018). MSIQ: joint modeling of multiple RNA-seq samples for accurate isoform quantification. Annals of Applied Statistics 12(1):510-539. [ SOFTWARE ] [ COLOR PDF ] | [ PDF ]
- Ye, Y. and Li, J.J. (2016). NMFP: a non-negative matrix factorization based preselection method to increase accuracy of identifying mRNA isoforms from RNA-seq data. BMC Genomics 17(Supp 1):11. [ SOFTWARE ] | [ PDF ]
- Li, J.J., Jiang, C.-R., Brown, J.B., Huang, H., and Bickel, P.J. (2011). Sparse linear modeling of RNA-seq data for isoform discovery and abundance estimation. Proc. Natl. Acad. Sci. USA 108(50):19867-19872. [ SOFTWARE ] | [ PDF ]
Central dogma and translational control
- Li, J.J., Chew, G.-L., and Biggin, M.D. (2019). Quantitative principles of cis-translational control by general mRNA sequence features in eukaryotes. Genome Biology 20:162. [ CODE ] | [ PDF ]
- Li, J.J., Chew, G.-L., and Biggin, M.D. (2017). Quantitating translational control: mRNA abundance-dependent and independent contributions and the mRNA sequences that specify them. Nucleic Acids Research 45(20):11821-11836. [ Highlight talk at RECOMB ] | [ PDF ]
- Li, J.J. and Biggin, M.D. (2015). Statistics requantitates the central dogma. Science 347(6226):1066-1067. [ UCLA News ] [ Interview at Significance 12(3):8 ] | [ PDF ]
- Li, J.J., Bickel, P.J., and Biggin, M.D. (2014). System wide analyses have underestimated protein abundances and transcriptional importance in animals. PeerJ 2:e270. [ Press release ] [ Guest post on "Bits of DNA" blog ] [ "PeerJ Picks 2015" Collection ] [ "Top Bioinformatics Papers - June 2015" Collection ] [ Top 5 most cited PeerJ articles ] | [ PDF ]
Classification methodologies and applications
- Zhang, C., Chen, Y.E., Zhang, S., and Li, J.J. (2022). Information-theoretic classification accuracy: a criterion that guides data-driven combination of ambiguous outcome labels in multi-class classification. Journal of Machine Learning Research 23(341):1-65. [ CODE ] | [ PDF ]
- Li, J.J., Chen, Y.E., and Tong, X. (2021). A flexible model-free prediction-based framework for feature ranking. Journal of Machine Learning Research 22(124):1-54. [ SOFTWARE ] | [ PDF ]
- Lyu, J.*, Li, J.J.*, Su, J., Peng, F., Chen, Y.E., Ge, X., and Li, W. (2020). DORGE: Discovery of Oncogenes and tumor suppressoR genes using Genetic and Epigenetic features. Science Advances 6(46):eaba6784. [ VIDEO ] | [ PDF ]
- Tong, X.*, Feng, Y.*, and Li, J.J. (2018). Neyman-Pearson classification algorithms and NP receiver operating characteristics. Science Advances 4(2):eaao1659. [ SOFTWARE ] [ VIDEO ] [ Francis X. Diebold's Blog on NP Classification ] | [ PDF ]
Microbiome sequencing data imputation
- Jiang, R., Li, W.V., and Li, J.J. (2021). mbImpute: an accurate and robust imputation method for microbiome data. Genome Biology 22:192. [ SOFTWARE ] | [ PDF ]
Networks
- Wang, Y.X.R., Li, L., Li, J.J., and Huang, H. (2021). Network modeling in biology: statistical methods for gene and brain networks. Statistical Science 36(1):89-108. [ PDF ]
- Sun, Y.E., Zhou, H.J., and Li, J.J. (2020). Bipartite tight spectral clustering (BiTSC) algorithm for identifying conserved gene co-clusters in two species. Bioinformatics 37(9):1225-1233. [ SOFTWARE ] | [ PDF ]
- Razaee, Z.S., Amini, A.A., and Li, J.J. (2019). Matched bipartite block model with covariates. Journal of Machine Learning Research 20(34):1-44. [ PDF ]
High-dimensional model inference
- Liu, H., Xu, X., and Li, J.J. (2020). A bootstrap lasso + partial ridge method to construct confidence intervals for parameters in high-dimensional sparse linear models. Statistica Sinica 30:1333-1355. [ SOFTWARE ] | [ PDF ]
Comparative genomics
- Ge, X.*, Zhang, H.*, Xie, L., Li, W.V., Kwon, S.B., and Li, J.J. (2019). EpiAlign: an alignment-based bioinformatic tool for comparing chromatin state sequences. Nucleic Acids Research 47(13):e77. [ SOFTWARE ] [ WEBSITE ] | [ PDF ]
- Duong, D., Ahmad, W.U., Eskin, E., Chang, K.-W., and Li, J.J. (2019). Word and sentence embedding tools to measure semantic similarity of Gene Ontology terms by their definitions. Journal of Computational Biology 26(1):38-52. [ SOFTWARE ] | [ PDF ]
- Li, W.V., Chen, Y., and Li, J.J. (2017). TROM: a testing-based method for finding transcriptomic similarity of biological samples. Statistics in Biosciences 9(1):105-136. [ SOFTWARE ] | [ PDF ]
- Gao, R. and Li, J.J. (2017). Correspondence of D. melanogaster and C. elegans developmental stages revealed by alternative splicing characteristics of conserved exons. BMC Genomics 18:234. [ PDF ]
- Yang, Y.*, Yang, Y.T.*, Yuan, J., Lu, Z.J., and Li, J.J. (2017). Large-scale mapping of mammalian transcriptomes identifies conserved genes associated with different cell states. Nucleic Acids Research 45(4):1657-1672. [ DATA ] | [ PDF ]
- Li, W.V., Razaee, Z.S., and Li, J.J. (2016). Epigenome overlap measure (EPOM) for comparing tissue/cell types based on chromatin states. BMC Genomics 17(Supp 1):10. [ SOFTWARE ] | [ PDF ]
- Gerstein, M.B.*, Rozowsky, J.*, Yan, K.K.*, Wang, D.*, Cheng, C.*, Brown, J.B.*, Davis, C.A.*, Hillier, L.*, Sisu, C.*, Li, J.J.*, Pei, B.*, Harmanci, A.O.*, Duff, M.O.*, Djebali, S.*, and 82 other authors from the modENCODE consortium (2014). Comparative analysis of the transcriptome across distant species. Nature 512(7515):445-448. [ NIH news ] | [ PDF ]
- Li, J.J., Huang, H., Bickel, P.J., and Brenner, S.E. (2014). Comparison of D. melanogaster and C. elegans developmental stages, tissues, and cells by modENCODE RNA-seq data. Genome Research 24(7):1086-1101. [ Press release ] [ Top 10 papers selected at the 2014 RECOMB/ISCB Conference on Regulatory & Systems Genomics ] [ DATA ] [ SOFTWARE ] | [ PDF ]
Gene regulation
- MacArthur, S.*, Li, X.Y.*, Li, J.*, Brown, J.B., Chu, H.C., Zeng, L., Grondona, B.P., Hechmer, A., Simirenko, L., Keranen, S.V., Knowles, D.W., Stapleton, M., Bickel, P., Biggin, M.D., and Eisen, M.B. (2009). Developmental roles of 21 Drosophila transcription factors are determined by quantitative differences in binding to an overlapping set of thousands of genomic regions. Genome Biology 10:R80. [ Faculty of 1000 recommendation ] | [ PDF ]