85. Yan, G., Hua, S.H., and Li, J.J. (2024). Categorization of 33 computational methods to detect spatially variable genes from spatially resolved transcriptomics data. Nature Communications (accepted).
84. Sun, T., Yuan, J., Zhu, Y., Li, J., Yang, S., Zhou, J., Ge, X., Qu, S., Li, W., Li, J.J., and Li, Y. (2024). Systematic evaluation of methylation-based cell type deconvolution methods for plasma cell-free DNA. Genome Biology 25:318. | [ PDF ]
83. Fernandez, E.G., Mai, W.X., Song, K., Bayley, N.A., Kim, J., Zhu, H., Pioso, M., Young, P., Andrasz, C., Cadet, D., Liau, L.M., Li, G., Yong, W.H., Rodriguez, F., Dixon, S.J., Souers, A.J., Li, J.J., Graeber, T.G., Cloughesy, T.F., and Nathanson, D.A. (2024). Integrated molecular and functional characterization of the intrinsic apoptotic machinery identifies therapeutic vulnerabilities in malignant glioma. Nature Communications 15:10089.
Sankaran, K., Kodikara, S., Li, J.J., and Le Cao, K.A. (2024). Semisynthetic simulation for microbiome data analysis. bioRxiv.
81. Li, J.J. (2024). Leadership at the Intersection of Statistics & Genomics: A COPSS-NISS Leadership Webinar with Drs. Rafael Irizarry and Mingyao Li. Statistics in Biosciences 16:547–555.
80. Patowary, A., Zhang, P., Jops, C., Vuong, C.K., Ge, X., Hou, K., Kim, M., Gong, N., Margolis, M., Vo, D., Wang, X., Liu, C., Pasaniuc, B., Li, J.J., Gandal, M.J., and De La Torre-Ubieta, L. (2024). Developmental isoform diversity in the human neocortex informs neuropsychiatric risk mechanisms. Science 384(6698):eadh7688.
78. Chen, Y.E.*, Ge, X.*, Woyshner, K.*, McDermott, M.*, Manousopoulou, A., Ficarro, S., Marto, J., Li, K., Wang, L.D., and Li, J.J. (2024). APIR: a universal FDR-control framework for boosting peptide identification power by aggregating multiple proteomics database search algorithms. Genomics, Proteomics & Bioinformatics 22(2):qzae042. [ SOFTWARE ] [ CODE ]
77. Li, J.J., Zhou, H.J., Tong, X., and Bickel, P.J. (2024). Dissecting gene expression heterogeneity: generalized Pearson correlation squares and the K-lines clustering algorithm. Journal of the American Statistical Association 119(548):2450–2463. [ SOFTWARE ] | [ PDF ]
76. Cui, Y., Ye, W., Li, J.S., Li, J.J., Vilain, E., Sallam, T., and Li, W. (2024). A genome-wide spectrum of tandem repeat expansions in 338,963 humans. Cell 187(9):2336–2341. | [ PDF ]
Wang, Q., Zhai, Z., Lian, Q., Song, D., and Li, J.J. (2023). Categorization and analysis of 14 computational methods for estimating cell potency from single-cell RNA-seq data. arXiv.
75. Xia, L.*, Lee, C.*, and Li, J.J. (2024). Statistical method scDEED for detecting dubious 2D single-cell embeddings and optimizing t-SNE and UMAP hyperparameters. Nature Communications 15:1753. (Featured in Nature Communications Editors’ Highlights) [ Nature Methods: Seeing data as t-SNE and UMAP do ] [ SOFTWARE ] | [ PDF ]
74. Wang, L., Wang, Y.X.R., Li, J.J., and Tong, X. (2024). Hierarchical Neyman-Pearson classification for prioritizing severe disease categories in COVID-19 patient data. Journal of the American Statistical Association 119:39–51.
71. Yan, G., Song, D., and Li, J.J. (2023). scReadSim: a single-cell RNA-seq and ATAC-seq read simulator. Nature Communications 14:7482. [ SOFTWARE ] | [ PDF ]
70. Xi, N.M. and Li, J.J. (2023). Exploring the optimization of autoencoder design for imputing single-cell RNA sequencing data. Computational and Structural Biotechnology Journal 21:4079–4095.
Song, D.*, Chen, S.*, Lee, C.*, Li, K., Ge, X., and Li, J.J. (2023). Synthetic control removes spurious discoveries from double dipping in single-cell and spatial transcriptomics data analyses. bioRxiv. [ SOFTWARE ]
69. Li, J.J. (2023). How the Monty Hall problem is similar to the false discovery rate in high-throughput data analysis. Nature Biotechnology 41:754–755. | [ PDF ]
73. Song, D., Wang, Q., Yan, G., Liu, T., and Li, J.J. (2024). scDesign3 generates realistic in silico data for multimodal single-cell and spatial omics. Nature Biotechnology 42:247–252. [ SOFTWARE ] | [ PDF ]
68. Yang, L., Chen, X., Lee, C., Shi, J., Lawrence, E.B., Zhang, L., Li, Y., Gao, N., Jung, S.Y., Creighton, C.J., Li, J.J., Cui, Y., Arimura, S., Lei, Y., Li, W., Shen, L. (2023). Functional characterization of age-dependent p16 epimutation reveals biological drivers and therapeutic targets for colorectal cancer. Journal of Experimental & Clinical Cancer Research 42:113.
67. Wu, Y., Jin, M., Fernandez, M., Hart, K.L., Liao, A., Ge, X., Fernandes, S.M., McDonald, T., Chen, Z., Röth, D., Ghoda, L.Y., Marcucci, G., Kalkum, M., Pillai, R.K., Danilov, A.V., Li, J.J., Chen, J., Brown, J.R., Rosen, S.T., Siddiqi, T., Wang, L. (2023). METTL3-mediated m6A modification controls splicing factor abundance and contributes to aggressive CLL. Blood Cancer Discovery 4(3):228–245.
66. Zong, W., Rahman, T., Zhu, L., Zeng, X., Zhang, Y., Zou, J., Liu, S., Ren, Z., Li, J.J., Sibille, E., Lee, A.V., Oesterreich, S., Ma, T., Tseng, G.C. (2023). Transcriptomic congruence analysis for evaluating model organisms. Proc Natl Acad Sci. USA 120(6):e2202584120.
65. Zhang, C., Chen, Y.E., Zhang, S., and Li, J.J. (2022). Information-theoretic classification accuracy: a criterion that guides data-driven combination of ambiguous outcome labels in multi-class classification. Journal of Machine Learning Research 23(341):1–65. [ RECOMB 2023 ] [ SOFTWARE ] | [ PDF ]
64. Zhou, H.J., Li, L., Li, Y., Li, W., and Li, J.J. (2022). PCA outperforms popular hidden variable inference methods for QTL mapping. Genome Biology 23:210. [ Highlight talk at RECOMB 2023 ] [ SOFTWARE ] | [ PDF ]
63. Say, I., Chen, Y.E., Sun, M.Z., Li, J.J., and Lu, D.C. (2022). Machine learning predicts improvement of functional outcomes in traumatic brain injury patients after inpatient rehabilitation. Frontiers in Rehabilitation Sciences 3:1005168.
62. Cui, E.H.*, Song, D.*, Wong, W.K., and Li, J.J. (2022). Single-cell generalized trend model (scGTM): a flexible and interpretable model of gene expression trend along cell pseudotime. Bioinformatics 38(16):3927–3934. [ SOFTWARE ] [ CODE ]
61. Song, D.*, Xi, N.M.*, Li, J.J., and Wang, L. (2022). scSampler: fast diversity-preserving subsampling of large-scale single-cell transcriptomic data. Bioinformatics 38(11):3126–3127. [ PYTHON PACKAGE ] [ R PACKAGE ]
60. Eisen, T.J., Li, J.J., and Bartel, D.P. (2022). The interplay between translational efficiency, poly(A) tails, microRNAs, and neuronal activation. RNA 28:808–831.
59. Li, Y.*, Ge, X.*, Peng, F., Li, W., and Li, J.J. (2022). Exaggerated false positives by popular differential expression methods when analyzing human population samples. Genome Biology 23:79. [ UCLA NEWS ] [ CODE ] | [ PDF ]
58. Jiang, R., Sun, T., Song, D., and Li, J.J. (2022). Statistics or biology: the zero-inflation controversy about scRNA-seq data. Genome Biology 23:31. [ CODE ] | [ PDF ]
57. Sun, T., Song, D., Li, W.V., and Li, J.J. (2022). Simulating single-cell gene expression count data with preserved gene correlations by scDesign2. Journal of Computational Biology 29(1):23–26. (RECOMB 2021; software article; see Publication 50 for the method article) [ SOFTWARE ]
56. Ge, X.*, Chen, Y.E.*, Song, D., McDermott, M., Woyshner, K., Manousopoulou, A., Wang, N., Li, W., Wang, L.D., and Li, J.J. (2021). Clipper: p-value-free FDR control on high-throughput data from two conditions. Genome Biology 22:288. [ UCLA NEWS ] [ SOFTWARE ] [ CODE ] [ VIDEO ] | [ PDF ]
55. Shi, J., Xu, J., Chen, Y.E., Li, J.S., Cui, Y., Shen, L., Li, J.J., and Li, W. (2021). The concurrence of DNA methylation and demethylation is associated with transcription regulation. Nature Communications 12:5285.
54. Xi, N.M. and Li, J.J. (2021). Protocol for executing and benchmarking eight computational doublet-detection methods in single-cell RNA sequencing data analysis. STAR Protocols 2(3):100699. [ SOFTWARE ]
53. Song, D.*, Li, K.*, Hemminger, Z., Wollman, R., and Li, J.J. (2021). scPNMF: sparse gene encoding of single cells to facilitate gene selection for targeted gene profiling. Bioinformatics 37(Supplement_1):i358–i366. [ ISMB/ECCB 2021 ] [ SOFTWARE ]
52. Jiang, R., Li, W.V., and Li, J.J. (2021). mbImpute: an accurate and robust imputation method for microbiome data. Genome Biology 22:192. [ UCLA NEWS ] [ SOFTWARE ] | [ PDF ]
50. Sun, T., Song, D., Li, W.V., and Li, J.J. (2021). scDesign2: a transparent simulator that generates high-fidelity single-cell gene expression count data with gene correlations captured. Genome Biology 22:163. [ RECOMB 2021 ] [ UCLA NEWS ] [ SOFTWARE ] [ CODE ] | [ PDF ]
49. Li, J.J., Chen, Y.E., and Tong, X. (2021). A flexible model-free prediction-based framework for feature ranking. Journal of Machine Learning Research 22(124):1–54. [ SOFTWARE ]
48. Sun, Y.E., Zhou, H.J., and Li, J.J. (2021). Bipartite tight spectral clustering (BiTSC) algorithm for identifying conserved gene co-clusters in two species. Bioinformatics 37(9):1225–1233. [ SOFTWARE ]
47. Sun, M.Z., Babayan, D., Chen, J.-S., Wang, M.M., Naik, P.K., Reitz, K., Li, J.J., Pouratian, N., Kim, W. (2021). Postoperative admission of adult craniotomy patients to the neuroscience ward reduces length of stay and cost. Neurosurgery 89(1):85–93.
46. Song, D. and Li, J.J. (2021). PseudotimeDE: inference of differential gene expression along cell pseudotime with well-calibrated p-values from single-cell RNA sequencing data. Genome Biology 22:124. [ UCLA NEWS ] [ SOFTWARE ] [ CODE ]
45. Xi, N.M. and Li, J.J. (2021). Benchmarking computational doublet-detection methods for single-cell RNA sequencing data. Cell Systems 12(2):176–194. [ CODE ] [ DATA ] [ SSRN's Top Downloaded Paper of Apr 9 - Jun 7, 2021 in Computational Biology eJournal ]
44. Guo, Y., Xue, Z., Yuan, R., Li, J.J., Pastor, W.A., and Liu, W. (2021). RAD: a web application to identify region associated differentially expressed genes. Bioinformatics 37(17):2741–2743. [ WEBSITE ]
43. Xu, J., Shi, J., Cui, X., Cui, Y., Li, J.J., Goel, A., Chen, X., Issa, J.-P., Su, J., and Li, W. (2021). Cellular Heterogeneity–Adjusted cLonal Methylation (CHALM) improves prediction of gene expression. Nature Communications 12:400.
42. Wang, Y.X.R., Li, L., Li, J.J., and Huang, H. (2021). Network modeling in biology: statistical methods for gene and brain networks. Statistical Science 36(1):89–108.
39. Yu, C., Zhang, M., Song, J., Zheng, X., Xu, G., Bao, Y., Lan, J., Luo, D., Hu, J., Li, J.J., and Shi, H. (2020). Integrin-Src-YAP1 signaling mediates the melanoma acquired resistance to MAPK and PI3K/mTOR dual targeted therapy. Molecular Biomedicine 1:12.
38. Li, J.J. and Tong, X. (2020). Statistical hypothesis testing versus machine-learning binary classification: distinctions and guidelines. Patterns 1(7):110115. [ UCLA NEWS ] [ PODCAST ]
41. Li, J.J. (2021). A new bioinformatics tool to recover missing gene expression in single-cell RNA sequencing data. Journal of Molecular Cell Biology 13(1):1–2. (Highlight of the PBLR method by Zhang and Zhang)
37. Liu, H., Xu, X., and Li, J.J. (2020). A bootstrap lasso + partial ridge method to construct confidence intervals for parameters in high-dimensional sparse linear models. Statistica Sinica 30:1333–1355. [ SOFTWARE ]
36. Li, W.V.*, Li, S.*, Tong, X., Deng, L., Shi, H., and Li, J.J. (2019). AIDE: annotation-assisted isoform discovery with high precision. Genome Research 29:2056–2072. [ UCLA NEWS ] [ SOFTWARE ] [ DATA ] [ COVER ART ]
35. Li, J.J., Chew, G.-L., and Biggin, M.D. (2019). Quantitative principles of cis-translational control by general mRNA sequence features in eukaryotes. Genome Biology 20:162. [ CODE ]
34. Li, W.V. and Li, J.J. (2019). A statistical simulator scDesign for rational scRNA-seq experimental design. Bioinformatics 35(14):i41–i50. [ ISMB/ECCB 2019 ] [ SOFTWARE ]
33. Ge, X.*, Zhang, H.*, Xie, L., Li, W.V., Kwon, S.B., and Li, J.J. (2019). EpiAlign: an alignment-based bioinformatic tool for comparing chromatin state sequences. Nucleic Acids Research 47(13):e77. [ SOFTWARE ] [ WEBSITE ]
32. Razaee, Z.S., Amini, A.A., and Li, J.J. (2019). Matched bipartite block model with covariates. Journal of Machine Learning Research 20(34):1–44.
30. Duong, D., Ahmad, W.U., Eskin, E., Chang, K.-W., and Li, J.J. (2019). Word and sentence embedding tools to measure semantic similarity of Gene Ontology terms by their definitions. Journal of Computational Biology 26(1):38–52. [ SOFTWARE ]
29. Li, W.V. and Li, J.J. (2018). Modeling and analysis of RNA-seq data: a review from a statistical perspective. Quantitative Biology 6(3):195–209.
28. Burke, J.E., Longhurst, A.D., Merkurjev, D., Sales-Lee, J., Rao, B., Moresco, J.J., Yates III, J.R., Li, J.J., and Madhani, H.D. (2018). Spliceosome profiling visualizes operations of a dynamic RNP at nucleotide resolution. Cell 173(4):1014–1030.e17.
27. Li, W.V.*, Zhao, A., Zhang, S., and Li, J.J.* (2018). MSIQ: joint modeling of multiple RNA-seq samples for accurate isoform quantification. Annals of Applied Statistics 12(1):510–539. [ SOFTWARE ] [ COLOR PDF ]
26. Li, W.V. and Li, J.J. (2018). An accurate and robust imputation method scImpute for single-cell RNA-seq data. Nature Communications 9:997. [ UCLA NEWS ] [ SOFTWARE ]
25. Tong, X.*, Feng, Y.*, and Li, J.J. (2018). Neyman-Pearson classification algorithms and NP receiver operating characteristics. Science Advances 4(2):eaao1659. [ SOFTWARE ] [ VIDEO ] [ Francis X. Diebold's Blog on NP Classification ]
22. Li, J.J., Chew, G.-L., and Biggin, M.D. (2017). Quantitating translational control: mRNA abundance-dependent and independent contributions and the mRNA sequences that specify them. Nucleic Acids Research 45(20):11821–11836. [ Highlight talk at RECOMB 2018 ]
23. Jonassaint, C.R., Kang, C., Abrams, D.M., Li, J.J., Mao, J., Jia, Y., Long, Q., Sanger, M., Jonassaint, J.C., De Castro, L., and Shah, N. (2018). Understanding patterns and correlates of daily pain using the sickle cell disease mobile application to record symptoms via technology (SMART). British Journal of Haematology 183(2):306–308.
21. Clifton, S.M., Kang, C., Li, J.J., Long, Q., Shah, N., and Abrams, D.M. (2017). Hybrid statistical and mechanistic mathematical model guides mobile health intervention for chronic pain. Journal of Computational Biology 24(7):675–688.
19. Li, W.V., Chen, Y., and Li, J.J. (2017). TROM: a testing-based method for finding transcriptomic similarity of biological samples. Statistics in Biosciences 9(1):105–136. [ SOFTWARE ]
18. Gao, R. and Li, J.J. (2017). Correspondence of D. melanogaster and C. elegans developmental stages revealed by alternative splicing characteristics of conserved exons. BMC Genomics 18:234.
17. Yang, Y.*, Yang, Y.T.*, Yuan, J., Lu, Z.J., and Li, J.J. (2017). Large-scale mapping of mammalian transcriptomes identifies conserved genes associated with different cell states. Nucleic Acids Research 45(4):1657–1672. [ DATA ]
16. Li, J.J. and Tong, X. (2016). Genomic applications of the Neyman–Pearson classification paradigm. Big Data Analytics in Genomics. Springer (New York).
15. Ye, Y. and Li, J.J. (2016). NMFP: a non-negative matrix factorization based preselection method to increase accuracy of identifying mRNA isoforms from RNA-seq data. BMC Genomics 17(Supp 1):11. [ SOFTWARE ]
14. Li, W.V., Razaee, Z.S., and Li, J.J. (2016). Epigenome overlap measure (EPOM) for comparing tissue/cell types based on chromatin states. BMC Genomics 17(Supp 1):10. [ SOFTWARE ]
13. Li, J.J., Huang, H., Qian, M., and Zhang, X. (2015). Chapter 24: Transcriptome analysis using next-generation sequencing. Advanced Medical Statistics (2nd Edition).
12. Liu, Z., Dai, S., Bones, J., Ray, S., Cha, S., Karger, B. L., Li, J.J., Wilson, L., Hinckle, G., and Rossomando, A. (2015). A quantitative proteomic analysis of cellular responses to high glucose media in Chinese hamster ovary cells. Biotechnology Progress 31(4):1026–38.
11. Li, J.J. and Biggin, M.D. (2015). Statistics requantitates the central dogma. Science 347(6226):1066–1067. [ UCLA NEWS ] [ Interview at Significance 12(3):8 ]
10. Gerstein, M.B.*, Rozowsky, J.*, Yan, K.K.*, Wang, D.*, Cheng, C.*, Brown, J.B.*, Davis, C.A.*, Hillier, L.*, Sisu, C.*, Li, J.J.*, Pei, B.*, Harmanci, A.O.*, Duff, M.O.*, Djebali, S.*, and 82 other authors from the modENCODE consortium (2014). Comparative analysis of the transcriptome across distant species. Nature 512(7515):445–448. [ NIH NEWS ]
9. Boyle, A., Araya, C., Brdlik, C., Cayting, P., Cheng, C., Cheng, Y., Gardner, K., Hillier, L., Janette, J., Jiang, L., Kasper, D., Kawli, T., Kheradpour, P., Kundaje, A., Li, J.J., and 25 other authors from the modENCODE and ENCODE consortia (2014). Comparative analysis of regulatory information and circuits across distant species. Nature 512(7515):453–456. [ NIH NEWS ]
8. Li, J.J., Huang, H., Bickel, P.J., and Brenner, S.E. (2014). Comparison of D. melanogaster and C. elegans developmental stages, tissues, and cells by modENCODE RNA-seq data. Genome Research 24(7):1086–1101. [ Press release ] [ Top 10 papers selected at the 2014 RECOMB/ISCB Conference on Regulatory & Systems Genomics ] [ DATA ] [ SOFTWARE ]
7. Li, J.J., Bickel, P.J., and Biggin, M.D. (2014). System wide analyses have underestimated protein abundances and transcriptional importance in animals. PeerJ 2:e270. [ Press release ] [ Guest post on "Bits of DNA" blog ] [ "PeerJ Picks 2015" Collection ] [ "Top Bioinformatics Papers - June 2015" Collection ] [ Top 5 most cited PeerJ articles ]
6. Fisher, W.W., Li, J.J., Hammonds, A.S., Brown, J.B., Pfeiffer, B., Weiszmann, R., MacArthur, S., Thomas, S., Stamatoyannopoulos, J.A., Eisen, M.B., Bickel, P.J., Biggin, M.D., and Celniker, S.E. (2012). DNA regions bound at low occupancy by transcription factors do not drive patterned reporter gene expression in Drosophila. Proc Natl Acad Sci. USA 109(52):21330–21335.
5. The ENCODE Project Consortium (2012). An integrated encyclopedia of DNA elements in the human genome. Nature 489(7414):57–74. [ UC BERKELEY NEWS ]
4. Gao, Q., Ho, C., Jia, Y., Li, J.J., and Huang, H. (2012). Biclustering of linear patterns in gene expression data (CLiP). Journal of Computational Biology 19(6):619–631.
3. Li, J., Li, J., and Chen, B. (2012). Oct4 was a novel target of Wnt signaling pathway. Molecular and Cellular Biochemistry 362:233–240.
2. Li, J.J., Jiang, C.-R., Brown, J.B., Huang, H., and Bickel, P.J. (2011). Sparse linear modeling of RNA-seq data for isoform discovery and abundance estimation. Proc Natl Acad Sci. USA 108(50):19867–19872. [ SOFTWARE ]
1. MacArthur, S.*, Li, X.Y.*, Li, J.*, Brown, J.B., Chu, H.C., Zeng, L., Grondona, B.P., Hechmer, A., Simirenko, L., Keranen, S.V., Knowles, D.W., Stapleton, M., Bickel, P., Biggin, M.D., and Eisen, M.B. (2009). Developmental roles of 21 Drosophila transcription factors are determined by quantitative differences in binding to an overlapping set of thousands of genomic regions. Genome Biology 10:R80. [ Faculty of 1000 recommendation ]