Publications on PubMed and Google Scholar
Current and former members of the JSB Group are in bold
author* indicates equal contribution 
author indicates corresponding author(s)
Preprints
Zhou, H.J., Ge, X., and Li, J.J. (2023). ClipperQTL: ultrafast and powerful eGene identification method. bioRxiv. [ SOFTWARE ]

2024

80. Patowary, A., Zhang, P., Jops, C.*, Vuong, C.K., Ge, X., Hou, K., Kim, M., Gong, N., Margolis, M., Vo, D., Wang, X., Liu, C., Pasaniuc, B., Li, J.J., Gandal, M.J., and De La Torre-Ubieta, L. (2024). Developmental isoform diversity in the human neocortex informs neuropsychiatric risk mechanisms. Science 384(6698):eadh7688. | [ PDF ]
79. Wang, W.*, Cen, Y.*, Lu, Z.*, Xu, Y., Sun, T., Xiao, Y., Liu, W., Li, J.J., and Wang, C. (2024). scCDC: a computational method for gene-specific contamination detection and correction in single-cell and single-nucleus RNA-seq data. Genome Biology 25:136. [ SOFTWARE ] | [ PDF ]
78. Chen, Y.E.*, Ge, X.*, Woyshner, K.*, McDermott, M.*, Manousopoulou, A., Ficarro, S., Marto, J., Li, K., Wang, L.D., and Li, J.J. (2024). APIR: a universal FDR-control framework for boosting peptide identification power by aggregating multiple proteomics database search algorithms. Genomics, Proteomics & Bioinformatics. [ SOFTWARE ] [ CODE ] | [ PDF ]
77. Li, J.J., Zhou, H.J., Tong, X., and Bickel, P.J. (2024). Dissecting gene expression heterogeneity: generalized Pearson correlation squares and the K-lines clustering algorithm. Journal of the American Statistical Association. [ SOFTWARE ] | [ PDF ]
76. Cui, Y., Ye, W., Li, J.S., Li, J.J., Vilain, E., Sallam, T., and Li, W. (2024). A genome-wide spectrum of tandem repeat expansions in 338,963 humans. Cell 187(9):2336–2341. | [ PDF ]
75. Xia, L.*, Lee, C.*, and Li, J.J. (2024). Statistical method scDEED for detecting dubious 2D single-cell embeddings and optimizing t-SNE and UMAP hyperparameters. Nature Communications 15:1753. (Featured in Nature Communications Editors’ Highlights) [ Nature Methods: Seeing data as t-SNE and UMAP do ] [ SOFTWARE ] | [ PDF ]
74. Wang, L., Wang, Y.X.R., Li, J.J., and Tong, X. (2024). Hierarchical Neyman-Pearson classification for prioritizing severe disease categories in COVID-19 patient data. Journal of the American Statistical Association 119:39–51. | [ PDF ]
73. Song, D., Wang, Q., Yan, G., Liu, T., and Li, J.J. (2024). scDesign3 generates realistic in silico data for multimodal single-cell and spatial omics. Nature Biotechnology 42:247–252. [ SOFTWARE ] | [ PDF ]

2023

72. Zhang, C., Zhang, S., and Li, J.J. (2023). A Python package itca for information-theoretic classification accuracy: a criterion that guides data-driven combination of ambiguous outcome labels in multiclass classification. Journal of Computational Biology 30(11):1246–1249. (RECOMB 2023; software article; see Publication 65 for the method article) [ SOFTWARE ] | [ PDF ]
71. Yan, G., Song, D., and Li, J.J. (2023). scReadSim: a single-cell RNA-seq and ATAC-seq read simulator. Nature Communications 14:7482. [ SOFTWARE ] | [ PDF ]
70. Xi, N.M. and Li, J.J. (2023). Exploring the optimization of autoencoder design for imputing single-cell RNA sequencing data. Computational and Structural Biotechnology Journal 21:4079–4095. | [ PDF ]
68. Yang, L., Chen, X., Lee, C., Shi, J., Lawrence, E.B., Zhang, L., Li, Y., Gao, N., Jung, S.Y., Creighton, C.J., Li, J.J., Cui, Y., Arimura, S., Lei, Y., Li, W., and Shen, L. (2023). Functional characterization of age-dependent p16 epimutation reveals biological drivers and therapeutic targets for colorectal cancer. Journal of Experimental & Clinical Cancer Research 42:113. | [ PDF ]
67. Wu, Y., Jin, M., Fernandez, M., Hart, K.L., Liao, A., Ge, X., Fernandes, S.M., McDonald, T., Chen, Z., Röth, D., Ghoda, L.Y., Marcucci, G., Kalkum, M., Pillai, R.K., Danilov, A.V., Li, J.J., Chen, J., Brown, J.R., Rosen, S.T., Siddiqi, T., and Wang, L. (2023). METTL3-mediated m6A modification controls splicing factor abundance and contributes to aggressive CLL. Blood Cancer Discovery 4(3):228–245. | [ PDF ]
66. Zong, W., Rahman, T., Zhu, L., Zeng, X., Zhang, Y., Zou, J., Liu, S., Ren, Z., Li, J.J., Sibille, E., Lee, A.V., Oesterreich, S., Ma, T., and Tseng, G.C. (2023). Transcriptomic congruence analysis for evaluating model organisms. Proc Natl Acad Sci. USA 120(6):e2202584120. | [ PDF ]

2022

65. Zhang, C., Chen, Y.E., Zhang, S., and Li, J.J. (2022). Information-theoretic classification accuracy: a criterion that guides data-driven combination of ambiguous outcome labels in multi-class classification. Journal of Machine Learning Research 23(341):1–65. [ RECOMB 2023 ] [ SOFTWARE ] | [ PDF ]
64. Zhou, H.J., Li, L., Li, Y., Li, W., and Li, J.J. (2022). PCA outperforms popular hidden variable inference methods for QTL mapping. Genome Biology 23:210. [ Highlight talk at RECOMB 2023 ] [ SOFTWARE ] | [ PDF ]
63. Say, I., Chen, Y.E., Sun, M.Z., Li, J.J., and Lu, D.C. (2022). Machine learning predicts improvement of functional outcomes in traumatic brain injury patients after inpatient rehabilitation. Frontiers in Rehabilitation Sciences 3:1005168. | [ PDF ]
62. Cui, E.H.*, Song, D.*, Wong, W.K., and Li, J.J. (2022). Single-cell generalized trend model (scGTM): a flexible and interpretable model of gene expression trend along cell pseudotime. Bioinformatics 38(16):3927–3934. [ SOFTWARE ] [ CODE ] | [ PDF ]
61. Song, D.*, Xi, N.M.*, Li, J.J., and Wang, L. (2022). scSampler: fast diversity-preserving subsampling of large-scale single-cell transcriptomic data. Bioinformatics 38(11):3126–3127. [ PYTHON PACKAGE ] [ R PACKAGE ] | [ PDF ]
60. Eisen, T.J., Li, J.J., and Bartel, D.P. (2022). The interplay between translational efficiency, poly(A) tails, microRNAs, and neuronal activation. RNA 28:808–831. | [ PDF ]
59. Li, Y.*, Ge, X.*, Peng, F., Li, W., and Li, J.J. (2022). Exaggerated false positives by popular differential expression methods when analyzing human population samples. Genome Biology 23:79. [ UCLA NEWS ] [ CODE ] | [ PDF ]
58. Jiang, R., Sun, T., Song, D., and Li, J.J. (2022). Statistics or biology: the zero-inflation controversy about scRNA-seq data. Genome Biology 23:31. [ CODE ] | [ PDF ]
57. Sun, T., Song, D., Li, W.V., and Li, J.J. (2022). Simulating single-cell gene expression count data with preserved gene correlations by scDesign2. Journal of Computational Biology 29(1):23–26. (RECOMB 2021; software article; see Publication 50 for the method article) [ SOFTWARE ] | [ PDF ]

2021

56. Ge, X.*, Chen, Y.E.*, Song, D., McDermott, M., Woyshner, K., Manousopoulou, A., Wang, N., Li, W., Wang, L.D., and Li, J.J. (2021). Clipper: p-value-free FDR control on high-throughput data from two conditions. Genome Biology 22:288. [ UCLA NEWS ] [ SOFTWARE ] [ CODE ] [ VIDEO ] | [ PDF ]
55. Shi, J., Xu, J., Chen, Y.E., Li, J.S., Cui, Y., Shen, L., Li, J.J., and Li, W. (2021). The concurrence of DNA methylation and demethylation is associated with transcription regulation. Nature Communications 12:5285. | [ PDF ]
53. Song, D.*, Li, K.*, Hemminger, Z., Wollman, R., and Li, J.J. (2021). scPNMF: sparse gene encoding of single cells to facilitate gene selection for targeted gene profiling. Bioinformatics 37(Supplement_1):i358–i366. [ ISMB/ECCB 2021 ] [ SOFTWARE ] | [ PDF ]
52. Jiang, R., Li, W.V., and Li, J.J. (2021). mbImpute: an accurate and robust imputation method for microbiome data. Genome Biology 22:192. [ UCLA NEWS ] [ SOFTWARE ] | [ PDF ]
51. Wang, N., Lefaudeux, D., Mazumder, A., Li, J.J., and Hoffmann, A. (2021). Identifying the combinatorial control of signal-dependent transcription factors. PLOS Computational Biology 17(6):e1009095. | [ PDF ]
49. Li, J.J., Chen, Y.E., and Tong, X. (2021). A flexible model-free prediction-based framework for feature ranking. Journal of Machine Learning Research 22(124):1–54. [ SOFTWARE ] | [ PDF ]
48. Sun, Y.E., Zhou, H.J., and Li, J.J. (2021). Bipartite tight spectral clustering (BiTSC) algorithm for identifying conserved gene co-clusters in two species. Bioinformatics 37(9):1225–1233. [ SOFTWARE ] | [ PDF ]
47. Sun, M.Z., Babayan, D., Chen, J.-S., Wang, M.M., Naik, P.K., Reitz, K., Li, J.J., Pouratian, N., and Kim, W. (2021). Postoperative admission of adult craniotomy patients to the neuroscience ward reduces length of stay and cost. Neurosurgery 89(1):85–93. | [ PDF ]
44. Guo, Y., Xue, Z., Yuan, R., Li, J.J., Pastor, W.A., and Liu, W. (2021). RAD: a web application to identify region associated differentially expressed genes. Bioinformatics 37(17):2741–2743. [ WEBSITE ] | [ PDF ]
43. Xu, J., Shi, J., Cui, X., Cui, Y., Li, J.J., Goel, A., Chen, X., Issa, J.-P., Su, J., and Li, W. (2021). Cellular Heterogeneity–Adjusted cLonal Methylation (CHALM) improves prediction of gene expression. Nature Communications 12:400. | [ PDF ]
42. Wang, Y.X.R., Li, L., Li, J.J., and Huang, H. (2021). Network modeling in biology: statistical methods for gene and brain networks. Statistical Science 36(1):89–108. | [ PDF ]
41. Li, J.J. (2021). A new bioinformatics tool to recover missing gene expression in single-cell RNA sequencing data. Journal of Molecular Cell Biology 13(1):1–2. (Highlight of the PBLR method by Zhang and Zhang) | [ PDF ]

2020

40. Lyu, J.*, Li, J.J.*, Su, J., Peng, F., Chen, Y.E., Ge, X., and Li, W. (2020). DORGE: Discovery of Oncogenes and tumor suppressoR genes using Genetic and Epigenetic features. Science Advances 6(46):eaba6784. [ VIDEO ] | [ PDF ]
39. Yu, C., Zhang, M., Song, J., Zheng, X., Xu, G., Bao, Y., Lan, J., Luo, D., Hu, J., Li, J.J., and Shi, H. (2020). Integrin-Src-YAP1 signaling mediates the melanoma acquired resistance to MAPK and PI3K/mTOR dual targeted therapy. Molecular Biomedicine 1:12. | [ PDF ]

2019

36. Li, W.V.*, Li, S.*, Tong, X., Deng, L., Shi, H., and Li, J.J. (2019). AIDE: annotation-assisted isoform discovery with high precision. Genome Research 29:2056–2072. [ UCLA NEWS ] [ SOFTWARE ] [ DATA ] [ COVER ART ] | [ PDF ]
35. Li, J.J., Chew, G.-L., and Biggin, M.D. (2019). Quantitative principles of cis-translational control by general mRNA sequence features in eukaryotes. Genome Biology 20:162. [ CODE ] | [ PDF ]
34. Li, W.V. and Li, J.J. (2019). A statistical simulator scDesign for rational scRNA-seq experimental design. Bioinformatics 35(14):i41–i50. [ ISMB/ECCB 2019 ] [ SOFTWARE ] | [ PDF ]
33. Ge, X.*, Zhang, H.*, Xie, L., Li, W.V., Kwon, S.B., and Li, J.J. (2019). EpiAlign: an alignment-based bioinformatic tool for comparing chromatin state sequences. Nucleic Acids Research 47(13):e77. [ SOFTWARE ] [ WEBSITE ] | [ PDF ]
32. Razaee, Z.S., Amini, A.A., and Li, J.J. (2019). Matched bipartite block model with covariates. Journal of Machine Learning Research 20(34):1–44. | [ PDF ]
31. Li, J.J. (2019). Review of "Statistical modeling and machine learning for molecular biology" by Moses, A.M. The American Statistician 73(1):103–104. | [ PDF ]
30. Duong, D., Ahmad, W.U., Eskin, E., Chang, K.-W., and Li, J.J. (2019). Word and sentence embedding tools to measure semantic similarity of Gene Ontology terms by their definitions. Journal of Computational Biology 26(1):38–52. [ SOFTWARE ] | [ PDF ]

2018

29. Li, W.V. and Li, J.J. (2018). Modeling and analysis of RNA-seq data: a review from a statistical perspective. Quantitative Biology 6(3):195–209. | [ PDF ]
28. Burke, J.E., Longhurst, A.D., Merkurjev, D., Sales-Lee, J., Rao, B., Moresco, J.J., Yates III, J.R., Li, J.J., and Madhani, H.D. (2018). Spliceosome profiling visualizes operations of a dynamic RNP at nucleotide resolution. Cell 173(4):1014–1030.e17. | [ PDF ]
27. Li, W.V.*, Zhao, A., Zhang, S., and Li, J.J.* (2018). MSIQ: joint modeling of multiple RNA-seq samples for accurate isoform quantification. Annals of Applied Statistics 12(1):510–539. [ SOFTWARE ] [ COLOR PDF ] | [ PDF ]
26. Li, W.V. and Li, J.J. (2018). An accurate and robust imputation method scImpute for single-cell RNA-seq data. Nature Communications 9:997. [ UCLA NEWS ] [ SOFTWARE ] | [ PDF ]
24. Zhang, Y., Harris, C.J., Liu, Q., Liu, W., Ausin, I., Long, Y., Xiao, L., Feng, L., Chen, X., Xie, Y., Chen, X., Zhan, L., Feng, S., Li, J.J., Wang, H., Zhai, J., and Jacobsen, S.E. (2018). Large-scale comparative epigenomics reveals hierarchical regulation of non-CG methylation in Arabidopsis. Proc Natl Acad Sci. USA 115(5):E1069–E1074. | [ PDF ]
23. Jonassaint, C.R., Kang, C., Abrams, D.M., Li, J.J., Mao, J., Jia, Y., Long, Q., Sanger, M., Jonassaint, J.C., De Castro, L., and Shah, N. (2018). Understanding patterns and correlates of daily pain using the sickle cell disease mobile application to record symptoms via technology (SMART). British Journal of Haematology 183(2):306–308. | [ PDF ]

2017

22. Li, J.J., Chew, G.-L., and Biggin, M.D. (2017). Quantitating translational control: mRNA abundance-dependent and independent contributions and the mRNA sequences that specify them. Nucleic Acids Research 45(20):11821–11836. [ Highlight talk at RECOMB 2018 ] | [ PDF ]
21. Clifton, S.M., Kang, C., Li, J.J., Long, Q., Shah, N., and Abrams, D.M. (2017). Hybrid statistical and mechanistic mathematical model guides mobile health intervention for chronic pain. Journal of Computational Biology 24(7):675–688. | [ PDF ]
20. Tong, X. and Li, J.J. (2017). Discussion of "Random-projection ensemble classification" by Cannings, T.I. and Samworth, R.J. Journal of the Royal Statistical Society: Series B 79(4):1025–1026. | [ PDF ]
19. Li, W.V., Chen, Y., and Li, J.J. (2017). TROM: a testing-based method for finding transcriptomic similarity of biological samples. Statistics in Biosciences 9(1):105–136. [ SOFTWARE ] | [ PDF ]
17. Yang, Y.*, Yang, Y.T.*, Yuan, J., Lu, Z.J., and Li, J.J. (2017). Large-scale mapping of mammalian transcriptomes identifies conserved genes associated with different cell states. Nucleic Acids Research 45(4):1657–1672. [ DATA ] | [ PDF ]

2016

16. Li, J.J. and Tong, X. (2016). Genomic applications of the Neyman–Pearson classification paradigm. Big Data Analytics in Genomics. Springer (New York).
14. Li, W.V., Razaee, Z.S., and Li, J.J. (2016). Epigenome overlap measure (EPOM) for comparing tissue/cell types based on chromatin states. BMC Genomics 17(Supp 1):10. [ SOFTWARE ] | [ PDF ]

2015

13. Li, J.J., Huang, H., Qian, M., and Zhang, X. (2015). Chapter 24: Transcriptome analysis using next-generation sequencing. Advanced Medical Statistics (2nd Edition).
12. Liu, Z., Dai, S., Bones, J., Ray, S., Cha, S., Karger, B.L., Li, J.J., Wilson, L., Hinckle, G., and Rossomando, A. (2015). A quantitative proteomic analysis of cellular responses to high glucose media in Chinese hamster ovary cells. Biotechnology Progress 31(4):1026–38. | [ PDF ]
11. Li, J.J. and Biggin, M.D. (2015). Statistics requantitates the central dogma. Science 347(6226):1066–1067. [ UCLA NEWS ] [ Interview at Significance 12(3):8 ] | [ PDF ]

2014

10. Gerstein, M.B.*, Rozowsky, J.*, Yan, K.K.*, Wang, D.*, Cheng, C.*, Brown, J.B.*, Davis, C.A.*, Hillier, L.*, Sisu, C.*, Li, J.J.*, Pei, B.*, Harmanci, A.O.*, Duff, M.O.*, Djebali, S.*, and 82 other authors from the modENCODE consortium (2014). Comparative analysis of the transcriptome across distant species. Nature 512(7515):445–448. [ NIH NEWS ] | [ PDF ]
9. Boyle, A., Araya, C., Brdlik, C., Cayting, P., Cheng, C., Cheng, Y., Gardner, K., Hillier, L., Janette, J., Jiang, L., Kasper, D., Kawli, T., Kheradpour, P., Kundaje, A., Li, J.J., and 25 other authors from the modENCODE and ENCODE consortia (2014). Comparative analysis of regulatory information and circuits across distant species. Nature 512(7515):453–456. [ NIH NEWS ] | [ PDF ]

2012

6. Fisher, W.W., Li, J.J., Hammonds, A.S., Brown, J.B., Pfeiffer, B., Weiszmann, R., MacArthur, S., Thomas, S., Stamatoyannopoulos, J.A., Eisen, M.B., Bickel, P.J., Biggin, M.D., and Celniker, S.E. (2012). DNA regions bound at low occupancy by transcription factors do not drive patterned reporter gene expression in Drosophila. Proc Natl Acad Sci. USA 109(52):21330–21335. | [ PDF ]
4. Gao, Q., Ho, C., Jia, Y., Li, J.J., and Huang, H. (2012). Biclustering of linear patterns in gene expression data (CLiP). Journal of Computational Biology 19(6):619–631. | [ PDF ]
3. Li, J., Li, J., and Chen, B. (2012). Oct4 was a novel target of Wnt signaling pathway. Molecular and Cellular Biochemistry 362:233–240. | [ PDF ]

2011

2. Li, J.J., Jiang, C.-R., Brown, J.B., Huang, H., and Bickel, P.J. (2011). Sparse linear modeling of RNA-seq data for isoform discovery and abundance estimation. Proc Natl Acad Sci. USA 108(50):19867–19872. [ SOFTWARE ] | [ PDF ]

2009

1. MacArthur, S.*, Li, X.Y.*, Li, J.*, Brown, J.B., Chu, H.C., Zeng, L., Grondona, B.P., Hechmer, A., Simirenko, L., Keranen, S.V., Knowles, D.W., Stapleton, M., Bickel, P., Biggin, M.D., and Eisen, M.B. (2009). Developmental roles of 21 Drosophila transcription factors are determined by quantitative differences in binding to an overlapping set of thousands of genomic regions. Genome Biology 10:R80. [ Faculty of 1000 recommendation ] | [ PDF ]