Project: Using text mining to aid cancer risk assessment


The CRAB project investigates a novel approach to cancer risk assessment that could help risk assessors manage large volumes of textual data and aid knowledge discovery. The approach is based on text mining, a growing field of computer science that discovers new knowledge by automatically extracting information from written texts. We develop text mining technology for the needs of cancer risk assessment and research, with the aim of integrating this technology into a practical tool that can assist researchers and risk assessors in their work and contribute to effective management of health risks in the future.

In this project we collaborate with the Computer Laboratory, University of Cambridge, UK.


Funding

  • Swedish Research Council (VR)
  • The Swedish Governmental Agency for Innovation Systems (Vinnova)
  • Medical Research Council, UK
  • Royal Society, UK
  • Biotechnology and Biological Sciences Research Council, UK
  • Engineering and Physical Sciences Research Council, UK


Publications

Active learning-based information structure analysis of full scientific articles and two applications for biomedical literature review.
Guo Y, Silins I, Stenius U, Korhonen A
Bioinformatics 2013 Jun;29(11):1440-7

Text mining for literature review and knowledge discovery in cancer risk assessment and research.
Korhonen A, Ó Séaghdha D, Silins I, Sun L, Högberg J, Stenius U
PLoS ONE 2012;7(4):e33427

Data and literature gathering in chemical cancer risk assessment.
Silins I, Korhonen A, Högberg J, Stenius U
Integr Environ Assess Manag 2012 Jul;8(3):412-7

Weakly supervised learning of information structure of scientific abstracts--is it accurate enough to benefit real-world tasks in biomedicine?
Guo Y, Korhonen A, Silins I, Stenius U
Bioinformatics 2011 Nov;27(22):3179-85

A comparison and user-based evaluation of models of textual information structure in the context of cancer risk assessment.
Guo Y, Korhonen A, Liakata M, Silins I, Högberg J, Stenius U
BMC Bioinformatics 2011 Mar;12:69

Identifying the information structure of scientific abstracts: An investigation of three different schemes.
Guo Y, Korhonen A, Liakata M, Silins I, Sun L and Stenius U (2010).
In Proceedings of Bio-Natural Language Processing (BioNLP) Uppsala, Sweden.

The first step in the development of Text Mining technology for Cancer Risk Assessment: identifying and organizing scientific evidence in risk assessment literature.
Korhonen A, Silins I, Sun L, Stenius U
BMC Bioinformatics 2009 Sep;10:303

User-Driven Development of Text Mining Resources for Cancer Risk Assessment.
Sun L, Korhonen A, Silins I, and Stenius U. (2009).
In Proceedings of the Natural Language Processing in Biomedicine (BioNLP) 2009. Boulder, Colorado.

A New Challenge for Text Mining: Cancer Risk Assessment.
Lewin I, Silins I, Korhonen A, Högberg J and Stenius U. (2008).
In Proceedings of the ISMB BioLINK Special Interest Group on Text Data Mining. Toronto, Canada.