PROPOSAL:SCOPES 2013-2016: Joint Research Projects

JRP Title:

Decision Support System for Leukemia Cancer Diagnosis, Survival Prognosis, Treatment Monitoring, and Biomarkers Discovery Using Fusion of Artificial Intelligence Methods and Gene Microarray Technologies in the Area of Personalized Medicine

1. Summary

The sequencing of the human genome has increased our knowledge of cancer at the molecular level. The potential for translating this genomic information into biomedical and clinical research and into the therapeutic management of patients has generated great excitement. During the past decade, considerable progress has been made in defining new prognostic markers, diagnostic parameters, and treatment options, although progress has been slower than expected. Nevertheless, the concept of personalized or “precision” medicine remains attractive: it integrates genomic knowledge (such as microarray gene analysis of the patient’s tumor) and other laboratory research with health records and social and environmental data to select the optimal therapy for the individual patient. Leukemia research is currently one of the leading fields of clinical research. Individual leukemia subtypes differ in their response to a given type of treatment. Identifying the leukemia subtype is an imprecise and labor-intensive process that requires the combined expertise of clinical experts (hematologist, oncologist, pathologist, and cytogeneticist), molecular biology experts, and bioinformatics experts. The literature has shown that microarrays, which measure thousands of genes concurrently, can accurately identify the known important leukemia subtypes and may further enhance our ability to assess a patient's therapy success (risk of failing therapy, relapse of disease, resistance to therapy). Further, the identified gene expression profiles were found to include new diagnostic classifications and biomarkers. Data acquired in microarray experiments provide valuable information for cancer diagnosis, survival prognosis, treatment monitoring, and biomarker discovery. Microarray technology is also used to compare genome features among individuals and their tissues and cells, an example of personalized medicine.
The use of such powerful tools and their influence on the molecular approach to biomedical and clinical research is best realized through multi-disciplinary collaboration as proposed in Figure 1.1.

Figure 1.1 Integration of complementary knowledge of clinical and bioinformatics research groups


The aim of this project is to develop a system that uses a fusion of artificial intelligence methods, experts’ knowledge, and patient records to predict the diagnostic classification, identify tumor-specific markers (genes), monitor treatment comprehensively, and give a more precise patient survival prognosis. The created models will be validated and optimized according to feedback from clinical research, so that domain experts’ knowledge is transferred into the models through the learning process. In this way the system will be able to produce precise and reliable information about disease classification, novel tumor-specific markers (genes), the state of treatment, and patient survival prognosis.

The global diagram of the system (Figure 1.2) shows its three subsystems: (1) the Learning Diagnostic Classifier (LDC), (2) the Knowledge Based Treatment Monitor (KBTM), and (3) the Learning Prognostic Classifier (LPC).

Figure 1.2 Integration of intelligent subsystems, clinical knowledge, and doctors’ experience in the proposed decision support system (DSS)

The functions of the first subsystem, the LDC, are implemented in two modes. In the first (off-line) mode, the model is trained on microarray data sets available in NCBI, GEOSOFT, GEMS, UCI and other databases, or on data available from the clinic. In the second (on-line) mode, the subsystem receives real microarray data obtained from samples of the treated patient. Fusion methods will be applied to optimize the microarray input data set using (a) a Genetic Algorithm (GA) and an Artificial Neural Network (ANN), (b) a GA and Clustering Methods (CM), and (c) Co-evolutionary Fuzzy Logic (CFL). Further, the model will perform optimal clustering of the microarray data, select the genes most significantly related to the leukemia subtype, and reduce the classification subsets of genes to the discriminating genes (gene feature selection). The output of this subsystem will provide a predictive cancer diagnosis and predictive biomarkers that may improve the accuracy of the classical diagnostic techniques. The LDC is the first level of support in estimating the patient survival scenario.

The second subsystem, the KBTM, uses inputs from (a) gene expression (new clinical microarray data about over-expression and under-expression) and (b) therapy parameters. By applying pattern recognition algorithms optimized with a GA or an Adaptive Neuro-Fuzzy Inference System (ANFIS), together with intelligent control engineering algorithms, the subsystem generates an output signal called "success of one treatment phase (therapy)". The KBTM is the second level of support in estimating the patient survival scenario.

The third subsystem, the LPC, uses inputs from (a) biomarkers (discriminating genes), (b) prognostic factors (optional markers), (c) medical statistics data, and (d) the success of one treatment phase (therapy). By applying Self-Organizing Maps (SOM), ANFIS and ensemble methods, the subsystem generates an output signal called the predictive survival prognosis. The subsystem is capable of incorporating multiple factors into the prediction of medical prognosis, including cure prediction, unforeseen complications, disease recurrence, level of function, length of hospital stay, and patient survival. The LPC carries the main responsibility for estimating patient survival.

The doctor may use these data to be informed about the symptoms, the predictive cancer diagnosis, the predictive survival prognosis, the predictive biomarkers, and the success of treatment. Once the diagnosis is confirmed, the clinician may make a more accurate assessment of the survival rate, compose a probable survival scenario, incorporate the patient’s goals of treatment, and synthesize a treatment strategy.

Justification of the proposed system development within clinical research is based on the following conclusions.

1. To the best of our knowledge and with the resources available to us, the most recent research is based on:

• laboratory experiments following one specific method, which results in inadequate analysis, or

• applying one or more methods to existing data sets without real, continuous clinical validation.

Our research will create initial models from existing data sets and then validate those models in real clinical conditions. The model creation will be based on the fusion and ensemble of artificial intelligence methods, emphasizing the advantages and eliminating the disadvantages of single methods.

2. To the best of our knowledge and with the resources available to us, the most recent research on microarray data is based on:

• cancer classification based on patient-specific data from a single microarray, or

• integrated microarrays with samples from different patients collected over a prolonged period of time.

Our research will generate more than one microarray per patient during treatment and prior to a change in therapy. Further, the research will enable continuous therapy monitoring and patient survival/healing prognoses.

The fight against cancer is a marathon and requires the integration of multidisciplinary knowledge and the involvement of various scientists. This proposed project is one part of our long-term research goal, “Global Approach Based on Microarray, MicroRNA and Cell Signaling Pathways Using Fusion of Artificial Intelligence Methods, Bioinformatics Knowledge and Biomedical Engineering incorporated with Clinical Research Laboratories to Improve Early Diagnosis and Innovative Methods for Cancer Treatment in Personalized Medicine”.
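To make the gene feature selection step of the LDC subsystem more concrete, the following is a minimal sketch of a genetic algorithm that evolves binary gene masks. It is an illustration under stated assumptions, not the project's implementation: a leave-one-out nearest-centroid classifier stands in for the ANN named in the summary, and the function names, the size penalty, the population settings, and the toy expression values are all hypothetical.

```python
import random

def loo_accuracy(samples, labels, genes):
    """Leave-one-out accuracy of a nearest-centroid classifier on the chosen genes."""
    if not genes:
        return 0.0
    classes = sorted(set(labels))
    correct = 0
    for i, (x, y) in enumerate(zip(samples, labels)):
        best_cls, best_dist = None, float("inf")
        for cls in classes:
            # Class centroid computed without the held-out sample i.
            rows = [s for j, s in enumerate(samples) if labels[j] == cls and j != i]
            if not rows:
                continue
            centroid = [sum(r[g] for r in rows) / len(rows) for g in genes]
            dist = sum((x[g] - c) ** 2 for g, c in zip(genes, centroid))
            if dist < best_dist:
                best_cls, best_dist = cls, dist
        correct += best_cls == y
    return correct / len(samples)

def fitness(mask, samples, labels, penalty=0.01):
    """Accuracy minus a small cost per selected gene (penalty is an assumed value)."""
    genes = [g for g, bit in enumerate(mask) if bit]
    return loo_accuracy(samples, labels, genes) - penalty * len(genes)

def select_genes(samples, labels, pop_size=20, generations=30,
                 mutation_rate=0.1, seed=0):
    """Evolve binary gene masks; returns (best_mask, best_fitness_per_generation)."""
    rng = random.Random(seed)
    n_genes = len(samples[0])
    score = lambda m: fitness(m, samples, labels)
    population = [[rng.randint(0, 1) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        history.append(score(population[0]))
        elite = population[: pop_size // 2]       # survivors are kept unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n_genes)        # one-point crossover
            child = a[:cut] + b[cut:]
            child = [1 - bit if rng.random() < mutation_rate else bit
                     for bit in child]             # bit-flip mutation
            children.append(child)
        population = elite + children
    return max(population, key=score), history

# Hypothetical toy data: 6 samples x 5 genes; only gene 0 separates the classes.
toy_samples = [
    [0.1, 5.0, 3.2, 1.0, 2.2], [0.2, 4.1, 3.0, 1.4, 2.9], [0.0, 5.5, 2.8, 0.8, 2.5],
    [0.9, 4.8, 3.1, 1.1, 2.4], [1.0, 5.2, 2.9, 1.3, 2.7], [0.8, 4.3, 3.3, 0.9, 2.6],
]
toy_labels = ["ALL", "ALL", "ALL", "AML", "AML", "AML"]
best_mask, history = select_genes(toy_samples, toy_labels)
print(best_mask, history[-1])
```

Because the better half of each generation is carried over unchanged, the best fitness never decreases from one generation to the next. On real microarray data with thousands of genes, the mask length, population size, and fitness classifier would all need to be chosen and validated separately.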


2. Research plan

2.1 Current state of research in the field

The field of medicine has undergone major advances in recent decades. With the completion of the Human Genome Project (1), an enormous amount of new information has been gained about the human genome and the genetic variation between individuals. In parallel, great advances in computer systems, biomedical engineering and bioinformatics have resulted in a revolutionary shift in medical care toward the era of personalized medicine (PM) (2). The aim of PM is to match the right drug or the right treatment to the right patient at the right time, according to the patient's genotype (3). A complete process leading to effective personalized medicine will typically include the following five key elements: obtaining patient genetic/genomic data using array and other high-throughput technology; multi-category classification identifying one or more biomarkers; developing new or selecting available therapies; measuring the relationship between biomarkers and clinical outcomes, including the prognosis and response to therapy; and verifying the relationship in a prospective randomized clinical trial (4). The first step in realizing the promise of PM is to use patients' genomic (gene microarray) profiles. There are several platforms for microarray analysis, including the most popular mRNA microarrays, miRNA microarrays, DNA arrays, and protein microarrays. Microarray gene profiles are often used to analyze which genes are expressed specifically by tumors. In this way, doctors can distinguish between types of cancer and other diseases based on their gene expression signatures. The data obtained from a microarray experiment are extremely complex. In clinical bioinformatics, selecting appropriate software to analyze the microarray data for medical decision making is crucial.
Because microarray assays can analyze the expression of multiple genes in parallel, they have been proposed as a method for diagnosis in the clinical laboratory. Reference (6) reports "results from 3,334 patients who were analyzed as part of international study group formed around European LeukemiaNet (ELN) in 11 laboratories across three continents. The collaborative Microarray Innovations in leukemia study program was designed to assess the clinical accuracy of gene expression profiles (compared with current routine diagnostic work-up) of 16 acute and chronic leukemia subclasses, myelodysplastic syndromes, and so-called "none of the target classes" control group that included nonmalignant disorders and normal bone marrow." Studying cancer microarray gene expression data is a challenging task because a microarray data set is high-dimensional and low-sample, with many noisy or irrelevant genes and missing data (5). The ability to measure gene expression has resulted in data in which the number of genes far exceeds the number of samples. Standard statistical methods for classification and prediction do not perform well enough in that setting. The analysis of microarray data therefore requires modification of existing statistical methods, integration with other methods, the use of artificial intelligence methods, or the development of new methods. Many studies and papers describe and apply statistical, data mining, and artificial intelligence methods to the processing of microarray data. These methods mostly compete in solving the following objectives (7):

a) gene finding (feature selection), with the aim of reducing the dimensionality of the microarray data set by selecting the most informative genes,

b) class discovery (clustering), with the aim of determining new disease or cancer classes,

c) class prediction (classification), with the aim of classifying samples (cancerous or normal) or discriminating between different types or subtypes of cancer.

Reliability and reproducibility are very important in cancer diagnosis and prognosis, and more researchers are using classical statistical ensemble models (8), (9). The high-throughput (molecular "-omics") information associated with personalized medicine is clearly beyond our cognitive abilities. Thus, to achieve personalized medicine, we must also develop and implement personalized decision support systems. The development of integrated analyses that make use of complex data and of biomedical informatics methods, models, and algorithms is an issue of critical relevance to clinical medicine (10). "Following the screening of 3416 articles (MEDLINE and Embase were searched from 1999 to 2011) related to Clinical Decision Support (CDS) for Genetically Guided Personalized Medicine (GPM), 38 primary research articles were identified. Focal areas of research included history-driven CDS, cancer management, and pharmacogenomics. Nine randomized controlled trials of CDS interventions for GPM were identified, seven of which reported positive results. The majority of manuscripts were published on or after 2007, with increased recent focus on genotype-driven CDS and the integration of CDS with primary clinical information systems" (11). The main task of CDS is to act as a bridge that overcomes the barriers to GPM, as shown in Figure 2.1 (11).


Figure 2.1 Clinical Decision Support as bridge between current state and future state
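The gene-finding (feature selection) objective described in this section can be illustrated with a simple filter method for data in which genes far outnumber samples. The sketch below ranks genes by a signal-to-noise score, the difference of class means divided by the sum of class standard deviations, which is a filter commonly used for two-class expression data. It is only one of many possible filters and is not claimed to be the method of any cited study; the function names and toy values are hypothetical.

```python
from statistics import mean, stdev

def signal_to_noise(expr_a, expr_b):
    """(mean_a - mean_b) / (sd_a + sd_b) for one gene measured in two classes."""
    spread = stdev(expr_a) + stdev(expr_b)
    if spread == 0:
        return 0.0
    return (mean(expr_a) - mean(expr_b)) / spread

def rank_genes(samples, labels, class_a, class_b, top_k):
    """Return the indices of the top_k genes with the largest absolute score."""
    n_genes = len(samples[0])
    scores = []
    for g in range(n_genes):
        a = [s[g] for s, lab in zip(samples, labels) if lab == class_a]
        b = [s[g] for s, lab in zip(samples, labels) if lab == class_b]
        scores.append((abs(signal_to_noise(a, b)), g))
    scores.sort(reverse=True)
    return [g for _, g in scores[:top_k]]

# Hypothetical toy data: 6 samples x 3 genes; gene 1 differs between the classes.
expr = [
    [5.0, 1.0, 0.3], [5.1, 2.0, 0.1], [4.9, 3.0, 0.2],
    [5.0, 7.0, 0.2], [5.2, 8.0, 0.3], [4.8, 9.0, 0.1],
]
classes = ["ALL", "ALL", "ALL", "AML", "AML", "AML"]
print(rank_genes(expr, classes, "ALL", "AML", top_k=1))
```

A filter of this kind scores each gene independently, so it scales to thousands of genes but ignores interactions between genes; wrapper and ensemble approaches trade that speed for the ability to evaluate gene subsets jointly.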

References
1. International Human Genome Sequencing Consortium, Finishing the euchromatic sequence of the human genome, Nature 431, 931-945.
2. Maria Diamandis, Nicole M.A. White and George M. Yousef, Personalized Medicine: Marking a New Epoch in Cancer Patient Management, Mol Cancer Res 2010; 8: 1175-1187.
3. K. K. Jain, Personalized Medicine - Scientific and Commercial Aspects, Research and Market Review, Dublin, July 2013.
4. Zheng Ren, Marie Davidian, Stephen L. George, Richard M. Goldberg, Fred A. Wright, Anastasios A. Tsiatis, Michael R. Kosorok, Research Methods for Clinical Trials in Personalized Medicine: A Systematic Review, Biostatistics Technical Report Series, UNC at Chapel Hill, 2012.
5. Hala M. Alshamlan, Ghada H. Badr, and Yousef Alohali, A Study of Cancer Microarray Gene Expression Profile: Objectives and Approaches, Proceedings of the World Congress on Engineering 2013, Vol II, London.
6. Torsten Haferlach, Alexander Kohlmann, Lothar Wieczorek, Clinical Utility of Microarray-based Gene Expression Profiling in the Diagnosis and Subclassification of Leukemia: Report from the International Microarray Innovations in Leukemia Study Group, Journal of Clinical Oncology, May 2010.
7. Amphun Chaiboonchoe, Sandhya Samarasinghe and Don Kulasiri, Machine Learning for Childhood Acute Lymphoblastic Leukemia Gene Expression Data Analysis, Current Bioinformatics, 2010, 5, 118-133.
8. Mihir S. Sewak, Narender P. Reddy and Zhong-Hui Duan, Gene Expression Based Leukemia Sub-Classification Using Committee Neural Networks, Bioinformatics and Biology Insights 2009: 3, 89-98.
9. Fatemeh Aminzadeh, Bita Shadgar, Alireza Osareh, A Robust Model for Gene Analysis and Classification, The International Journal of Multimedia & Its Applications (IJMA), Vol. 3, No. 1, February 2011.
10. Ashwin Belle, Mark A. Kon, and Kayvan Najarian, Biomedical Informatics for Computer-Aided Decision Support Systems, The Scientific World Journal, Volume 2013, Article ID 769639.
11. Brandon M. Welch, Kensaku Kawamoto, Clinical Decision Support for Genetically Guided Personalized Medicine: a systematic review, J Am Med Inform Assoc, 2012.

2.2 Past performance of the applicants in the research field

2.2.1 University of Bern/Faculty of Medicine/Department of Clinical Research/Research Group Haematology/Oncology

During the past decade the research group of PD Dr. A. Arcaro has carried out several projects in the field of pediatric and adult cancer research. Small cell lung cancer (SCLC) represents 13% of all lung cancer cases and is the most aggressive form of lung cancer, with an overall 5-year survival of less than 5%. The combination of cisplatin or carboplatin with etoposide remains the standard treatment for SCLC. Despite a good initial response to therapy, most SCLC patients develop chemotherapy resistance and relapse. Second-line chemotherapy is then applied, but it frequently yields only a small increase in survival. Recent progress in the understanding of SCLC biology has led to the identification of critical signaling pathways, which has allowed the development of specific targeted therapies for the disease. A number of new molecules are currently under clinical evaluation in SCLC. These inhibitors target the proteasome, receptor tyrosine kinases, farnesyltransferase, Bcl-2, or angiogenic pathways. At Imperial College London (2000-2003), we identified and validated several molecular targets implicated in the growth, survival and chemoresistance of SCLC, including the fibroblast growth factor receptor (FGFR), ribosomal protein S6 kinases (S6K), mammalian target of rapamycin (mTOR), Ras, Src and PI3K isoforms (1-7). These studies have led to two clinical trials in lung cancer patients funded by Cancer Research UK, using (i) statins combined with chemotherapy and (ii) RAD001 (Everolimus) combined with chemotherapy. This work is of importance in view of the unsatisfactory prognosis of lung cancer patients under current therapeutic regimens.

Common tumors in children include leukemia, brain tumors and neuroblastoma. Current treatments of childhood malignancies, such as radiotherapy and chemotherapy, are inefficient because of the resistance of the tumor cells to apoptotic signals. Promising new therapies are, however, emerging that are based on blocking receptor tyrosine kinase (RTK) signalling to downstream targets such as phosphoinositide 3-kinase (PI3K), protein kinase B (PKB)/Akt, the mammalian target of rapamycin (mTOR) or mitogen-activated extracellular signal-regulated kinase activating kinase (MEK). At the University of Zurich (2004-2009), we identified and validated several molecular targets implicated in the growth, survival, chemoresistance and migration/invasion of neuroblastoma, brain tumors and acute leukemia, including the insulin-like growth factor-I receptor (IGF-IR) and PI3K isoforms (8-12). We have shown that the IGF-IR tyrosine kinase inhibitor NVP-AEW541 in combination with Akt or chemotherapeutic agents may represent a novel approach to target human neuroblastoma cell proliferation (8).
A subsequent study showed that the p110delta PI3K isoform contributes to neuroblastoma cell growth and survival by regulating the activation of the mTOR/S6K pathway and the expression levels of antiapoptotic Bcl-2 family proteins (11). In acute myeloid leukemia (AML), we have described a novel role for autocrine IGF-I signaling in the growth and survival of primary AML cells (10). This study has shown that IGF-IR inhibitors in combination with chemotherapeutic agents may represent a novel approach to target human AML (10). In a study in medulloblastoma, we could show that the PI3K isoform p110alpha represents a novel drug target for medulloblastoma in view of its increased expression in primary tumor samples and the ability of selective inhibitors to decrease medulloblastoma cell proliferation, survival, chemoresistance and migration (12). In AT/RTs (atypical teratoid/rhabdoid tumours) of the CNS (central nervous system) our results demonstrated a novel role for autocrine signalling by insulin and the IR in growth and survival of malignant human CNS tumour cells via the PI3K/Akt pathway (9). At the Department of Clinical Research of the University of Bern, we are continuing our work on the development of novel targeted therapies for childhood cancers, with a focus on embryonal tumors (Ewing's sarcoma, medulloblastoma and neuroblastoma) and acute leukemia (ALL and AML) (13-18). We have characterised the mechanism of action of the quassinoid analogue NBT-272 in embryonal tumors, with a particular emphasis on its cytotoxic effects (14). In medulloblastoma, we have identified the bone morphogenetic protein (BMP) pathway as a downstream target of c-Myc (16). Using RNA interference screening, we have identified novel molecular targets for medulloblastoma, such as the PI3K isoform p110gamma (18). Using RNA interference screening, we have also identified novel molecular targets for neuroblastoma, such as the fibroblast growth factor receptor-2 (FGFR2) (19). 
We have also further evaluated inhibitors of the IGF-IR/PI3K pathway as anti-proliferative agents in medulloblastoma and neuroblastoma (20). In addition we have characterized the first selective pharmacological inhibitors of the class II PI3KC2beta as anti-proliferative agents in a variety of human cancer cell lines (21). In parallel, we have described a novel role for PI3KC2beta in the organisation of the actin cytoskeleton, which involved a complex with Dbl, an exchange factor for Rho family small GTP-binding proteins (22). Finally, we have described a key role for the class IA PI3K isoform p110alpha in SCLC tumor growth and cell survival, through regulation of Bcl-2 family proteins (23).

References
1. Pardo, O.E., Arcaro, A., Salerno, G., Tetley, T.D., Valovka, T., Gout, I., and Seckl, M.J. Novel cross talk between MEK and S6K2 in FGF-2 induced proliferation of SCLC cells. Oncogene, 20, 7658-7667 (2001)
2. Pardo, O.E., Arcaro, A., Salerno, G., Raguz, S., Downward, J., and Seckl, M.J. Fibroblast growth factor-2 induces translational regulation of Bcl-XL and Bcl-2 via a MEK-dependent pathway: correlation with resistance to etoposide-induced apoptosis. J. Biol. Chem., 277, 12040-12046 (2002)
3. Arcaro, A., Khanzada, U.K., Vanhaesebroeck, B., Tetley, T.D., Waterfield, M.D., and Seckl, M.J. Two distinct phosphoinositide 3-kinases mediate polypeptide growth factor-stimulated PKB activation. EMBO J., 21, 5097-5108 (2002)
4. Pardo, O.E., Lesay, A., Arcaro, A., Lopes, R., Ng, B.L., Warne, P.H., McNeish, I.A., Tetley, T.D., Lemoine, N.R., Mehmet, H., Seckl, M.J., and Downward, J. Fibroblast Growth Factor 2-mediated translational control of IAPs blocks mitochondrial release of Smac/DIABLO and apoptosis in Small Cell Lung Cancer cells. Mol. Cell. Biol., 23, 7600-7610 (2003)
5. Khanzada, U.K., Pardo, O.E., Meier, C., Downward, J., Seckl, M.J., and Arcaro, A. Potent inhibition of small cell lung cancer cell growth by simvastatin reveals selective functions of Ras isoforms in growth factor signalling. Oncogene, 25, 877-887 (2006)
6. Arcaro, A., Aubert, M., Espinosa del Hierro, M.E., Khanzada, U.K., Angelidou, S., Tetley, T.D., Bittermann, A.G., Frame, M.C., and Seckl, M.J. Critical role for lipid raft-associated Src kinases in activation of PI3K-Akt signaling. Cell. Signal., 19, 1081-1092 (2007)
7. Marinov, M., Ziogas, A., Pardo, O.E., Tan, L.T., Lane, H.A., Lemoine, N.R., Zangemeister-Wittke, U., Seckl, M.J., and Arcaro, A. Akt/mTOR pathway activation and Bcl-2 family proteins modulate the sensitivity of human small cell lung cancer cells to RAD001 (Everolimus). Clinical Cancer Research, 15, 1277-1287 (2009)
8. Guerreiro, A.S., Boller, D., Shalaby, T., Grotzer, M.A., and Arcaro, A. Protein kinase B modulates the sensitivity of human neuroblastoma cells to insulin-like growth factor receptor inhibition. Int. J. Cancer, 119, 2527-2538 (2006)
9. Arcaro, A., Doepfner, K.T., Boller, D., Guerreiro, A.S., Shalaby, T., Jackson, S.P., Schoenwaelder, S.M., Delattre, O., Grotzer, M.A., and Fischer, B. Novel role for insulin as an autocrine growth factor for malignant brain tumour cells. Biochem. J., 406, 57-66 (2007)
10. Doepfner, K.T., Spertini, O., and Arcaro, A. Autocrine insulin-like growth factor-I signaling promotes growth and survival of human acute myeloid leukemia cells via the phosphoinositide 3-kinase/Akt pathway. Leukemia, 21, 1921–1930 (2007)
11. Boller, D., Schramm, A., Doepfner, K.T., Shalaby, T., von Bueren, A.O., Eggert, A., Grotzer, M.A., and Arcaro, A. Targeting the phosphoinositide 3-kinase isoform p110δ impairs growth and survival in neuroblastoma cells. Clinical Cancer Research, 14, 1172-1181 (2008)
12. Guerreiro, A.S., Fattet, S., Fischer, B., Shalaby, T., Jackson, S.P., Schoenwaelder, S.M., Grotzer, M.A., Delattre, O., and Arcaro, A. Targeting the PI3K p110α isoform inhibits medulloblastoma proliferation, chemoresistance and migration. Clinical Cancer Research, 14, 6761-6769 (2008)
13. Shalaby, T., von Bueren, A.O., Hürlimann, M.L., Fiaschetti, G., Castelletti, D., Tera, M., Nagasawa, K., Arcaro, A., Jelesarov, I., Shin-ya, K., and Grotzer, M.A. Disabling c-Myc in Childhood Medulloblastoma and Atypical Teratoid/Rhabdoid Tumor Cells by the Potent G-Quadruplex Interactive Agent S2T1-6OTD. Mol. Cancer. Ther., 9, 167-79 (2010)
14. Castelletti, D., Fiaschetti, G., Di Dato, V., Ziegler, U., Kumps, C., De Preter, K., Zollo, M., Speleman, F., Shalaby, T., De Martino, D., Berg, T., Eggert, A., Arcaro, A., and Grotzer, M.A. The quassinoid derivative NBT-272 targets both the Akt and Erk signaling pathways in embryonal tumors. Mol. Cancer. Ther., 9, 3145-3157 (2010)
15. De Laurentiis, A., Pardo, O.E., Palamidessi, A., Jackson, S.P., Schoenwaelder, S.M., Reichmann, E., Scita, G., and Arcaro, A. The Catalytic Class IA PI3K Isoforms Play Divergent Roles in Breast Cancer Cell Migration. Cell. Signal., 23, 529-41 (2011)
16. Fiaschetti, G., Castelletti, D., Zoller, S., Schramm, A., Schroeder, C., Nagaishi, M., Stearns, D., Mittelbronn, M., Eggert, A., Westermann, F., Ohgaki, H., Shalaby, T., Pruschy, M., Arcaro, A., and Grotzer, M.A. Bone morphogenetic protein-7 is a MYC target with pro-survival functions in childhood medulloblastoma. Oncogene, 30, 2823–2835 (2011)
17. Grunder, E., D'Ambrosio, R., Fiaschetti, G., Abela, L., Arcaro, A., Zuzak, T.J., Ohgaki, H., Lu, S.-Q., Shalaby, T., and Grotzer, M.A. MicroRNA-21 Suppression Impedes Medulloblastoma Cell Migration. Eur. J. Cancer, 16, 2479-2490 (2011)
18. Guerreiro, A.S., Fattet, S., Kulesza, D.W., Atamer, A., Elsing, A.N., Shalaby, T., Jackson, S.P., Schoenwaelder, S.M., Grotzer, M.A., Delattre, O., and Arcaro, A. A sensitized RNA interference screen identifies a novel role for the PI3K p110γ isoform in medulloblastoma cell proliferation and chemoresistance. Mol. Cancer Res., 9, 925-935 (2011)
19. Salm, F., Cwiek, P., Ghosal, A., Buccarello, A.-L., Largey, F., Wotzkow, C., Höland, K., Styp-Rekowska, B., Djonov, V., Zlobec, I., Bodmer, N., Gross, N., Westermann, F., Schäfer, S.C., and Arcaro, A. RNA interference screening identifies a novel role for autocrine fibroblast growth factor signaling in neuroblastoma chemoresistance. Oncogene, 32, 3944–3953 (2013)
20. Wojtalla, A., Salm, F., Christiansen, D.G., Cremona, T., Cwiek, P., Shalaby, T., Gross, N., Grotzer, M.A., and Arcaro, A. Novel Agents Targeting the IGF-1R/PI3K Pathway Impair Cell Proliferation and Survival in Subsets of Medulloblastoma and Neuroblastoma. Plos One, 7(10):e47109 (2012)
21. Boller, D., Doepfner, K.T., De Laurentiis, A., Guerreiro, A.S., Marinov, M., Shalaby, T., Depledge, P., Robson, A., Saghir, N., Hayakawa, M., Kaizawa, H., Koizumi, T., Ohishi, T., Fattet, S., Delattre, O., Schweri-Olac, A., Höland, K., Grotzer, M.A., Frei, K., Spertini, O., Waterfield, M.D., and Arcaro, A. Targeting PI3KC2β impairs proliferation and survival in acute leukemia, brain tumours and neuroendocrine tumours. Anticancer Research, 32, 3015-3028 (2012)
22. Błajecka, K., Marinov, M., Leitner, L., Uth, K., Posern, G., and Arcaro, A. Phosphoinositide 3-Kinase C2β Regulates RhoA and the Actin Cytoskeleton through an Interaction with Dbl. Plos One, 7(9):e44945 (2012)
23. Wojtalla, A., Fischer, B., Kotelevets, N., Mauri, F.A., Sobek, J., Rehrauer, H., Wotzkow, C., Tschan, M.P., Seckl, M.J., Zangemeister-Wittke, U., and Arcaro, A. Targeting the Phosphoinositide 3-Kinase p110α Isoform Impairs Cell Proliferation, Survival and Tumor Growth in Small Cell Lung Cancer. Clinical Cancer Research, 19, 96-105 (2013)

2.2.2 University of Sarajevo/Faculty of Electrical Engineering in Sarajevo/Department for Computing and Informatics/Research Group Artificial Intelligence and Bioinformatics

In the last decade the research group of Prof. Z. Avdagic has conducted several projects and studies in the field of bioinformatics and biomedical engineering (1-6). The following three projects describe the most important research.

Research project 1: Modeling of Computer Aided Simulator in Control of NSCLC Treatment Based on EGFR Gene Mutations’ Artificial Neural Network Classifier and Microarray Expression Analysis (1)

Cancer research is currently one of the leading fields of clinical research. Pulmonary malignancies, including Non-Small Cell Lung Cancer (NSCLC), are the most common cancers worldwide and the leading cause of cancer death. Mutations of the Epidermal Growth Factor Receptor (EGFR) gene within its tyrosine kinase domain have become a common topic in the lung cancer community at large, and particularly in the NCBI PubMed literature. Different combinations of mutations within this domain exist in the cancers of patients with NSCLC. The most frequently observed mutations are in exons 18, 19, 20, and 21. Currently, needle biopsy specimens are usually analyzed by pathologists. Since experienced pathologists and well-equipped institutions are rare and costly, a reliable pathological diagnosis is not always available. For this reason, a computer-aided simulator for the control of NSCLC treatment is needed for experimental and clinical support in lung cancer diagnosis and therapy. The purpose of this approach is to personalize therapy design by identifying the chemotherapy drug treatments that have the greatest likelihood of helping each individual patient. This is the opposite of empirical therapy selection, in which patients are treated blindly with drugs that worked in the past for some percentage of the patients who received them.

Methods

In the process of designing our system, the following artificial intelligence methods have been used: Artificial Neural Networks (ANN) and a fuzzy logic approach, along with statistical calculus and programming in the MATLAB/Simulink software package. An ANN is a massively parallel computational model that simulates the structure and the functional aspects of a biological nervous system. The MATLAB Neural Network Toolbox is used in this work because it contains various functions and backpropagation training algorithms for implementing feed-forward neural networks. There are various backpropagation training algorithms, with differing capabilities depending on the nature of the problem the network is designed to solve. Fuzzy logic is a highly flexible methodology for transforming linguistic observations into quantitative specifications. We have used the fuzzy logic approach to design a Sugeno Multiple Inputs - Multiple Outputs Fuzzy Logic Controller (FLC) that generates the most appropriate treatment from the difference between the EGFR gene expression of the control sample and that of the test sample. Besides this input, the physician can set some other inputs, and the fuzzy inference mechanism produces a treatment.
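The Sugeno-type inference used by such a controller can be sketched in a few lines. The example below is a zero-order Sugeno system with a single input (the expression difference) and two outputs, written in Python rather than MATLAB for compactness. It only illustrates the weighted-average inference step: the membership functions, the rule base, and the output names (dose level, duration in weeks) are hypothetical placeholders with no clinical meaning, and the project's actual FLC and its physician-set inputs are not reproduced here.

```python
def triangular(x, a, b, c):
    """Triangular membership function: 0 outside (a, c), 1 at the peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Hypothetical rule base: each rule pairs a membership function over the
# expression difference with constant outputs (dose_level, duration_weeks).
RULES = [
    (lambda d: triangular(d, -0.5, 0.0, 0.5), (0.0, 0.0)),  # "no difference"
    (lambda d: triangular(d, 0.0, 1.0, 2.0), (1.0, 4.0)),   # "moderate difference"
    (lambda d: triangular(d, 1.0, 2.0, 3.0), (2.0, 8.0)),   # "large difference"
]

def sugeno_infer(diff):
    """Zero-order Sugeno inference: firing-strength-weighted average of rule outputs."""
    weights = [membership(diff) for membership, _ in RULES]
    total = sum(weights)
    if total == 0.0:
        return None  # no rule fires for this input
    n_outputs = len(RULES[0][1])
    return tuple(
        sum(w * outputs[k] for w, (_, outputs) in zip(weights, RULES)) / total
        for k in range(n_outputs)
    )

print(sugeno_infer(1.5))  # halfway between the "moderate" and "large" rules
```

With these placeholder rules, an input of 1.5 fires the second and third rules equally, so each output is the average of the two rule constants. A multiple-input controller would combine several membership degrees per rule (for example with a minimum or product operator) before the same weighted average.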

Figure 2.2 Block structure of Artificial Neural Network Classifier

Result 1: Artificial Neural Network Classifier

The goal of the Artificial Neural Network Classifier (ANNC) is to improve the diagnosis of NSCLC based on sample patient data with microdeletion mutations (exons 18, 19, and 20) and nucleotide conversions (exon 21) extracted from an online EGFR mutation database, together with sample data with predicted microdeletion mutations produced by our own generator. We have developed an integrated software suite consisting of a module for data preprocessing (extraction, encoding, and normalization), a module for generating exon microdeletions (statistical and prediction databases), a module for training the ANN, and a module for post-processing (classification and evaluation). Experiments were carried out with eleven different training algorithms in combination with different numbers of neurons, layers, and activation functions. The best results were achieved with a cascade-forward backpropagation network trained with the Levenberg-Marquardt algorithm: the best performance (error 5e-031) was reached in the minimum number of epochs (6 training iterations), with regression fits of R=1 for training, validation, and testing. The whole data set was divided into 700 training pairs and 411 validation pairs. With freely selected validation pairs, the classifier successfully separated the positive cases (affected by the illness) from the negative (healthy) ones. However, this approach to gene-based cancer classification relies on data about exon mutations from public databases, so an exact classification is possible only if an exact match for a new patient is found in the database.
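The encoding step and the 700/411 split described above can be sketched as follows. This is an illustration in Python, not the project's MATLAB suite. The one-hot encoding of nucleotides is an assumption, since the text does not state which encoding was used, and the sequences and labels in the usage example are synthetic placeholders rather than records from the EGFR mutation database.

```python
import random

NUCLEOTIDES = "ACGT"


def one_hot(sequence):
    """Encode a DNA string as a flat list of 0/1 values, four per base."""
    encoded = []
    for base in sequence.upper():
        if base not in NUCLEOTIDES:
            raise ValueError(f"unexpected base: {base!r}")
        encoded.extend(1.0 if base == n else 0.0 for n in NUCLEOTIDES)
    return encoded


def split_pairs(pairs, n_train=700, seed=0):
    """Shuffle (input, target) pairs and split them into training and validation sets."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


if __name__ == "__main__":
    # Synthetic placeholder data: 1111 pairs, matching 700 + 411 in the text.
    pairs = [(one_hot("ACGT"), i % 2) for i in range(1111)]
    train, validation = split_pairs(pairs)
    print(len(train), len(validation))
```

Fixing the random seed makes the split reproducible, which helps when comparing several training algorithms on the same partition.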

Figure 2.3 Software suite of integrated modules for preprocessing, training and exploiting.

Result 2: Microarray Gene Comparator

Figure 2.4 Block diagram of MGC

The question arises: how can we follow the effects of a treatment? One possibility is to rely on microarray gene expression analysis, focusing on the EGFR gene and the genes related to the NSCLC cancer type. Deoxyribonucleic acid (DNA) microarray technology provides tools for examining the expression levels of a very large number of genes, making it possible to measure the expression of thousands of genes simultaneously in one experiment (sample). Gene expression data from DNA microarrays comprise many variables (genes) measured in only a few observations (experiments). These data are widely used in disease analysis and cancer diagnosis. In our research, microarray expression plays a crucial role in determining whether EGFR gene expression responds as expected to an assigned treatment. For this reason we have developed another functional module, the Microarray Gene Comparator (MGC). The MGC compares the wavelength of the control gene expression with the wavelength of the test gene expression. If the wavelength of the control EGFR gene equals that of the test EGFR gene, the expression intensity is the same and the difference between the wavelengths is zero. EGFR then generates the same quantity of mRNA in both samples, which leads us to conclude that the test sample is healthy. If the expression wavelengths of the control and test samples differ, then a treatment is needed. This difference serves as a signal to activate a particular FLC rule, whose parameters are the drug type, the amount, the duration of therapy, and the breaks. In other words, the activation of one IF-THEN rule with several input parameters (variables) corresponds to a specific treatment (or treatments) available in the knowledge base.
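The comparison performed by the MGC can be sketched in a few lines. The sketch below is in Python and is only an illustration of the principle. The function name, the dictionary layout, and the `tolerance` parameter are assumptions introduced here: the text describes an exact equality test, which corresponds to the default tolerance of zero. The numeric values in the usage example are placeholders, not measured data.

```python
def compare_samples(control, test, genes=("EGFR",), tolerance=0.0):
    """Compare control and test readings gene by gene.

    For each gene, report the test-minus-control difference and whether that
    difference exceeds the tolerance, which would signal that an FLC rule
    should be activated.
    """
    report = {}
    for gene in genes:
        difference = test[gene] - control[gene]
        report[gene] = {
            "difference": difference,
            "treatment_signal": abs(difference) > tolerance,
        }
    return report


if __name__ == "__main__":
    # Placeholder readings for one control and one test sample.
    control = {"EGFR": 532.0}
    test = {"EGFR": 560.0}
    print(compare_samples(control, test))
```

In practice a nonzero tolerance may be preferable to exact equality, because repeated measurements of the same sample rarely agree to the last digit. The reported difference is the value that would be passed to the fuzzy controller as its input.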

Result 3: Integration of ANNC and MGC

Knowing the principles of the individual modules, we can now follow the functioning of the complete system. For an input DNA sample, the ANNC reveals whether the observed gene is healthy or mutated.

Figure 2.5 Integration of ANNC, MGC, FLC and treatment knowledge base

Or, to be more precise, it identifies the exon, or several exons, on which mutations are present. According to the mutations found, the algorithm selects a treatment, or a combination of treatments, from the treatment knowledge base. The physician is free to accept that treatment as it is, or to adjust it based on previous experience. Today's cancer treatments target and turn off EGFR signals. Some of these treatments use antibodies that recognize only EGFR, stick to it, and block it from sending messages. Once the signals are interrupted, cancer cells are no longer told to overgrow, and they eventually die. These treatments are called EGFR-targeted therapies. When the binding of adenosine triphosphate (ATP) is inhibited, phosphotyrosine residues cannot form in EGFR and the signal cascades are not initiated. The first treatment, based on Gefitinib (Iressa), inhibits the EGFR tyrosine kinase domain by binding to the ATP-binding site of the enzyme [3]. The function of the EGFR tyrosine kinase in activating the anti-apoptotic RAS signal transduction cascade is thus blocked, and the growth of malignant cells is inhibited. The second treatment, Erlotinib (Tarceva), specifically targets the EGFR tyrosine kinase, which is highly expressed and occasionally mutated. It binds reversibly to the ATP-binding site of the receptor (7).
