Yudi Pawitan

Professor
Visiting address: Nobels väg 12a, 17165 Solna
Postal address: C8 Medicinsk epidemiologi och biostatistik, C8 MEB Pawitan, 171 77 Stockholm

About me

  • Education
    - 1982 BSc in Statistics from Bogor Agriculture Institute, Indonesia
    - 1984 MSc in Statistics from University of California at Davis
    - 1987 PhD in Statistics from University of California at Davis

Research

  • I work mostly on developing statistical methods for the analysis of high-throughput data as currently generated in genomic studies, including SNP and RNA arrays and next-generation sequencing.

Grants

  • Swedish Research Council
    1 January 2023 - 31 December 2025
    Fragile X syndrome (FXS) and Creatine Transporter Deficiency (CTD) are the two most common causes of X-linked intellectual disability. FXS and CTD share common clinical traits, such as cognitive dysfunction, autistic-like features, motor abnormalities and seizures. They also have similar pathophysiology, including alterations of brain energetics. There is no cure for these disorders, and efficacy studies of potential treatments are hindered by the scarcity of unbiased, quantitative, non-invasive biomarkers for monitoring brain function. This is an important problem, because behavioural endpoints are subjective, and the use of objective measures is crucial to assess the efficacy of new drugs. Since abnormal hemodynamic responses (HR) to sensory stimulation have been reported in preclinical studies of FXS and CTD, the objective of this project is to exploit optical imaging techniques to devise a non-invasive biomarker for these disorders. We will use imaging of intrinsic optical signals (IOS) in animal models and functional near-infrared spectroscopy (fNIRS) in patients. These non-invasive tools allow sensitive detection of changes in hemoglobin species and local blood flow inside the brain, thus providing an indirect measure of neuronal activity. We will:
    1) compare IOS responses between mutants and controls in animal models of FXS and CTD
    2) investigate molecular mechanisms underlying altered IOS
    3) validate the clinical relevance using fNIRS in a patient population.
  • Swedish Research Council
    1 January 2023 - 31 December 2025
    Heritability and genetic correlation are central parameters for understanding the genetic architecture of complex diseases. It is essential to assess the enrichment of these parameters in genomic functional regions. Making use of genome-wide association study summary statistics, linkage disequilibrium (LD) score regression was developed to estimate heritability and genetic correlation without access to individual-level data. Stratified LD score regression (S-LDSC) can assess heritability enrichment across various genome annotations. However, LD score regression does not consider the full linkage information in the genome, and certain types of genomic annotations, including pairwise topological interactions between DNA segments, can hardly be considered in S-LDSC. In 2020, the PI published the high-definition likelihood (HDL) method in Nature Genetics, improving on the estimation precision of LD score regression. However, the stratified likelihood model is yet to be developed. In response to demand from the research community, this project aims to develop the stratified high-definition likelihood (S-HDL) method, including both the statistical genetics model and an efficient computational algorithm. The method will be evaluated through both simulations and real data applications. The new method is expected to identify significantly more functional enrichment results for human complex diseases, and so better reveal the underlying genetic architecture and disease etiology.
  • Detection of biomarkers by analysis of "Next generation" sequencing data.
    Swedish Cancer Society
    1 January 2018
    Molecular research is currently dominated by sequence-based technology, partly because the cost has decreased dramatically but also because the technology promises very detailed information on individual cancers. But many challenges remain.
    Although we can produce a correct list of mutations in a single cancer, it is far from self-evident what they mean in terms of cancer biology or how they can be used clinically. There is still a large gap between the list of mutations and clinical decisions about which treatment to use for the individual. The overall goals of our research include: (i) computer modeling and analysis of the large-scale molecular data currently dominated by DNA and RNA sequence data, and (ii) integration of multiple omics data to improve disease prognosis. As cancer cells develop, many genomic changes such as mutations or copy number variation accumulate in the cell. An important step towards better biological understanding and treatment is to try to separate the genomic cause/effect changes from independent/accompanying ones. It is the effect changes that the cancer cells depend on for survival and growth. Our hope is to develop robust methods for identifying cancer cells by integrating omics data, not just the obvious genomics and transcriptomics data such as mutations, copy number variation and RNA expression, but also biological databases and interaction network data, such as gene or protein interactions. With the same approach, we have previously shown improved prognosis of breast cancer survival, and we plan to continue the research and development by applying the omics data technique to other cancers.
  • Detection of biomarkers by analysis of "Next generation" sequencing data.
    Swedish Cancer Society
    1 January 2017
  • Swedish Research Council
    1 January 2017 - 31 December 2020
  • Detection of biomarkers by analysis of "Next generation" sequencing data.
    Swedish Cancer Society
    1 January 2016
  • Use next-generation sequence data to find biomarkers for cancer
    Swedish Cancer Society
    1 January 2015
    The growth of large-scale data sets continues unabated with the advent of next-generation sequencing. Sequencing can reveal unsolicited information on a single tumor's mutation spectrum. One application of sequencing is to detect person-specific cancer biomarkers for treatment and follow-up decisions. This means that we have now reached the highly anticipated era of personalized medicine. Although the cost of sequencing is now considered affordable, around US$5,000 for the entire genome, the resulting data pose great challenges, both in basic IT infrastructure and in statistical analysis. The primary objective of our research group is to develop statistical and bioinformatic tools and methods for analyzing large-scale data sets within molecular medicine. Currently, we focus on the challenges we face when analyzing next-generation sequencing data. We pay special attention to searching for biomarkers for cancer. The specific problems we address are: (i) detection of mutations from sequence data, (ii) identification of so-called control genes and biological processes by integrating several different types of molecular data, (iii) identification of group-specific markers for breast cancer. We hope to address these problems by analyzing next-generation sequencing data on about 400 breast cancer patients. The data set has already been collected via The Cancer Genome Atlas, a large NIH-funded cancer genome study in which approximately 5,000 cancer samples from 20 different cancers have been sequenced. As this is one of the largest collections of sequencing data for breast cancer, we hope, with our analyses, to be able to identify new mutations in breast cancer and, among these mutations, determine which are most likely to drive the development of the individual cancer.
  • Use next-generation sequence data to find biomarkers for cancer
    Swedish Cancer Society
    1 January 2014
  • Swedish Research Council
    1 January 2014 - 31 December 2016
  • Swedish Research Council
    1 January 2012 - 31 December 2015
  • Swedish Research Council
    1 January 2010 - 31 December 2012

Employments

  • Professor, Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, 2002-
