Program

9:00

Registration and poster hanging

9:30

Opening address

9:40

Session 1 Chair: Djordje Djordjevic

Michael Silk - Walter & Eliza Hall Institute, The University of Melbourne

Mapping human variation across the exome to identify regions intolerant to change and assist predictions of deleteriousness.        

10:00

Rebecca Poulos - UNSW

Transcription and differential DNA repair underlies promoter mutation hotspots in cancer genomes        

10:20

Andrew Pattison - Monash University, VIC

Get your Wet-Lab Collaborators to Make their Own Figures: Making Interactive Web Applications with RStudio and Shiny         

10:40

Morning tea

11:00

Poster Session 1

11:30

Session 2 Chair: Shila Ghazanfar

Nathan Bachmann - University of the Sunshine Coast

Culture-independent genome sequencing of clinical samples reveals an unexpected heterogeneity of infections by Chlamydia pecorum

11:50

Stuart Lee - Walter & Eliza Hall Institute, The University of Melbourne

Assessing clonality in malaria parasites from massively parallel sequencing data

12:10

Nikeisha Caruana - La Trobe University

A combined transcriptomic and proteomic approach reveals putative toxins in the mucus secretions of the Southern Bottletail Squid, Sepiadarium austrinum.

12:30

Lunch

13:10

Poster Session 2

13:40

Session 3 Chair: Scott Ritchie

Helen Dockrell - Flinders University

Computational Modeling of Serotonin Movement Through the Human Colonic Mucosa

14:00

Lingxiao Zhou - UNSW

Combining spatial and chemical information for clustering pharmacophores

14:20

Andrew Lonsdale - The University of Melbourne, VIC

COMBINE: how to get involved

14:40

Afternoon tea

15:00

Walk to Forest Lodge Hotel - 117 Arundel St, Forest Lodge

15:20

Careers Panel - panellist biographies are given below.

Aaron Darling

Rob Davidson

Jean Yang

Bruno Gaeta

Eleni Giannoulatou

16:50

Prizes and Symposium close

17:00

COMBINE Social Event at the Forest Lodge Hotel

Careers Panel

We are proud to announce the panellists for the COMBINE Symposium 2015 Careers Panel:

Aaron Darling

Aaron Darling is an Associate Professor in Computational Genomics and Bioinformatics in the UTS Faculty of Science's ithree institute. He has over a decade of experience developing computational methods for comparative genomics and evolutionary modeling and in 2013 moved from the University of California-Davis to start a computational genomics group at UTS.

Darling embarked on his research career at the University of Wisconsin-Madison. Following a bachelor's degree in Computer Science, he worked with members of the UW-Madison Genome Center to sequence and analyze the first genomes of pathogenic E. coli. During this time Darling led the development of some widely used computational methods for analysing genomic data, including the mpiBLAST open source parallel BLAST software and the Mauve software for comparing multiple genome sequences.

Following the award of a Ph.D. at UW-Madison, Darling received a fellowship from the US National Science Foundation to pursue postdoctoral studies at The University of Queensland. After two years at UQ he then returned to UC Davis to develop a research program in computational metagenomics -- the study of uncultivated microorganisms from the environment using computational methods.

Darling now brings his experience to understand the relationship between humans and microorganisms in collaboration with microbiologists at the ithree institute.

Rob Davidson

Rob Davidson is a data scientist at the journal GigaScience. Prior to this he was a bioinformatician at the University of Birmingham, UK. He would like to point out that bioinformatics has been doing data science for a couple of decades now but no one called that the sexiest job of the 21st Century. He admits that he asked for the title ‘data scientist’ when he moved to GigaScience. In his current role, he develops systems for sharing scientific ‘big-data’ and software so that scientific publications can become reproducible once more. These systems include GigaDB, a database that stores all data associated with GigaScience articles and GigaGalaxy, a platform for re-running analyses found in the articles. His current project is an image viewing system for easy access to full resolution versions of images stored in GigaDB. In his spare time he is also heavily involved in the Open Data community of Hong Kong.

Jean Yang

Associate Professor Jean Yang is an applied statistician with expertise in translational bioinformatics. She was awarded the 2015 Moran Medal in statistics by the Australian Academy of Science, in recognition of her work on developing methods for molecular data arising in cutting-edge biological and medical research.

Her research stands at the interface between medicine and methodology development and has centered on the development of statistical methodology and the application of statistics to problems in genomics, proteomics and biomedical research. The overarching goal of her research work is to develop statistical methods for integrating –omics data from multiple platforms and devising models for incorporating related biological information with such data. As a statistician who works in the bioinformatics area, she enjoys research in a collaborative environment, working closely with scientific investigators from diverse backgrounds.

Bruno Gaeta

Bruno Gaëta obtained his PhD in molecular genetics from UNSW in 1992. After a post-doctoral research fellowship in gene regulation, he turned away from the lab and joined the Australian Genomic Information Centre at the University of Sydney, to provide training and education in the use of computer software for analysing molecular biology data. For several years he travelled around Australia and the world, introducing biologists to the then nascent field of bioinformatics.

In 1999 his group at the University of Sydney spun off a company to commercialise some of its software, and Bruno joined the newly formed eBioinformatics as Director of Education and Senior Scientist. During this time with the company he took sabbatical leave several times to teach AusAid-sponsored bioinformatics workshops in South-East Asia and introduce local biologists to new computer-based research methods.

In 2002 Bruno returned to academia and his alma mater UNSW, as senior lecturer in computer science and director of the newly established Bachelor of Engineering (Bioinformatics) program. He has established a network of collaborations with biologists and computer scientists both within and outside UNSW. His research interests cover multiple areas of bioinformatics including gene regulation and protein structure, currently with a focus on the immune system, antibody genes and the generation of antibody diversity. He maintains his interest in bioinformatics education and in introducing bioinformatics in developing countries, and is currently a director and treasurer-elect of the International Society for Computational Biology, as well as vice-president (Education) of the Asia-Pacific Bioinformatics Network.

Eleni Giannoulatou

Eleni studied Engineering in Greece before moving to the UK, where she received her MPhil in Computational Biology from Cambridge University and her DPhil in Bioinformatics from Oxford University in 2010. During her postdoctoral work, she was involved in genome-wide association studies as part of the Wellcome Trust Case Control Consortium.  In her last years in Oxford, she was also involved in genetic studies of rare diseases by investigating paternal-age-effect mutations in sperm. She moved to Australia in late 2013 and joined the Bioinformatics lab in the Victor Chang Cardiac Research Institute where she is working on the identification of mutations causing congenital heart defects.

How to find the Forest Lodge Hotel

Talk Abstracts

Michael Silk - Walter & Eliza Hall Institute, The University of Melbourne

Mapping human variation across the exome to identify regions intolerant to change and assist predictions of deleteriousness.        

Personal genomics is rapidly advancing in its efficacy in identifying causative variants of Mendelian disorders. To isolate these variants from the background of benign variation, in silico scores of deleteriousness are used to prioritise those most likely to affect protein function. While progress has been made in developing these scores, no score or combination of scores has a level of accuracy truly fit for clinical use.

Many in silico scores currently in use measure how conserved a variant’s position is at the amino acid and genetic levels. This approach is confounded by the necessity to compare across species, where functionally relevant mutations in humans may not be conserved in other species. Instead, we propose a novel scoring approach using a newly released variant database of human exomes to identify regions intolerant to variation. We explore different approaches to summarising variability within predefined regions, within and across genes and exons, and discuss the complexities of this strategy. We also consider ways to assess our results using well-studied genes where the consequences of variation are well known. We expect that this information can be used to improve in silico scoring metrics and, in turn, patient diagnoses from personal genomic pipelines.
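As a rough illustration of what summarising variability within predefined regions could look like, the sketch below compares each region's variant density with the density pooled over all regions. It is not the scoring approach presented in the talk: the function names, the region tuples and the ratio itself are assumptions made for this example only.

```python
from bisect import bisect_left, bisect_right


def count_in_region(sorted_positions, start, end):
    """Count positions falling in the closed interval [start, end]."""
    return bisect_right(sorted_positions, end) - bisect_left(sorted_positions, start)


def intolerance_ratios(variant_positions, regions):
    """Return observed variant density of each region divided by the pooled density.

    `regions` is a list of (name, start, end) tuples with inclusive coordinates.
    A ratio below 1 means the region carries fewer variants per base than the
    regions taken together, which is one crude signal of intolerance to change.
    """
    positions = sorted(variant_positions)
    counts = {name: count_in_region(positions, s, e) for name, s, e in regions}
    lengths = {name: e - s + 1 for name, s, e in regions}
    pooled = sum(counts.values()) / sum(lengths.values())
    if pooled == 0:
        # No variants anywhere: the ratio is undefined for every region.
        return {name: float("nan") for name in counts}
    return {name: (counts[name] / lengths[name]) / pooled for name in counts}


if __name__ == "__main__":
    # Hypothetical coordinates and variant positions, for illustration only.
    example_regions = [("exon1", 1, 100), ("exon2", 101, 200)]
    example_variants = [5, 10, 20, 50, 60, 70, 80, 90, 150, 160]
    print(intolerance_ratios(example_variants, example_regions))
```

A real score would also need to account for sequence context, coverage and region length, which this ratio ignores.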

Rebecca Poulos - UNSW

Transcription and differential DNA repair underlies promoter mutation hotspots in cancer genomes

Promoters are DNA sequences that play an essential role in controlling gene expression. While recent whole cancer genome analyses have identified numerous hotspots of somatic mutations within promoters, most do not appear to be functional as they do not perturb gene expression. As such, positive selection does not adequately explain the frequency of promoter mutations in cancer genomes. We have found that increased mutation density at gene promoters is in fact linked to transcriptional activity and differential DNA repair. By analysing 1,163 cancer genomes, we find evidence for increased local density of somatic point mutations within the DNase I hypersensitive centre of gene promoters across 14 cancer types. Mutated promoters are strongly associated with transcriptional activity, with mutation density highest within transcription factor binding sites. By analysing genome-wide maps of nucleotide excision repair (NER), we find that NER is impaired within the DNase I hypersensitive centre of active gene promoters, inversely mirroring the increase in somatic mutation density. Thus, our analysis has uncovered the presence of a previously unknown mechanism linking transcription initiation and DNA repair, thereby implicating localised differential DNA repair as the underlying cause for the somatic mutation hotspots observed at gene promoters of cancer genomes.

Andrew Pattison - Monash University

Get your Wet-Lab Collaborators to Make their Own Figures: Making Interactive Web Applications with RStudio and Shiny         

We specialise in studying the 3’ end of adenylated RNA to identify changes to 3’UTR choice (alternative polyadenylation) and the length distribution of the poly(A) tail, in addition to the normal digital gene expression. We do this using a process called PAT-seq (for Poly(A)-Test sequencing) and a sophisticated bespoke bioinformatics pipeline. However, to extract the full meaning from these data and to ask some more complex questions, it is important to analyse single genes. For example, to understand changes to poly(A) length distribution, the data need to be manually explored in R. Because biologists tend not to interact with “that computer stuff”, generating never-ending plots to show this can be tedious. A simple solution is to turn your R scripts into an interactive web application with Shiny, from the makers of RStudio (http://shiny.rstudio.com/), using nothing but the R programming language. The interactive widgets that Shiny provides enable even early-career bioinformaticians to build interactive online applications that allow wet-lab biologists to explore their data. I will present an app that I developed to interactively display changes to 3’UTR dynamics. I will show how this simple app has transformed how we explore our sequencing data and how, using download functions, it can produce publication-ready plots.

Nathan Bachmann - University of the Sunshine Coast

Culture-independent genome sequencing of clinical samples reveals an unexpected heterogeneity of infections by Chlamydia pecorum

Chlamydia pecorum is an important global pathogen of livestock and it is also a significant threat to the long-term survival of Australia’s koala populations. This study employed a culture-independent DNA capture approach to sequence C. pecorum genomes directly from clinical swab samples collected from koalas with chlamydial disease as well as from sheep with arthritis and conjunctivitis. Investigations into single nucleotide polymorphisms within each of the swab samples revealed that a portion of the reads in each sample belonged to separate C. pecorum strains, suggesting that all of the clinical samples analysed contained mixed populations of genetically distinct C. pecorum isolates. Using the genomes of strains identified in each of these samples, whole genome phylogenetic analysis revealed that a clade containing a bovine and a koala isolate is distinct from other clades comprising livestock or koala C. pecorum strains. Providing additional evidence to support exposure of koalas to Australian livestock strains, two ‘minor’ strains assembled from the koala swab samples clustered with livestock strains rather than koala strains. Culture-independent probe-based genome capture and sequencing of clinical samples provides the strongest evidence yet to suggest that naturally occurring chlamydial infections comprise multiple genetically distinct strains.

Stuart Lee - Walter & Eliza Hall Institute, The University of Melbourne

Assessing clonality in malaria parasites from massively parallel sequencing data

Parasite resistance to drug treatments for malaria has begun to emerge in south-east Asian and Pacific populations, highlighting the importance of methods for detecting which genes in the parasites are under selection pressure. In highly endemic areas of malaria, hosts may harbour multiple parasite (clonal) infections at the same time. This multiplicity of infection (MOI) complicates detection of genes under selection because current methods implicitly assume that hosts have single-clone infections. Samples with MOI > 1 may be removed from downstream analysis but this results in decreased power and systematic bias.

Here we describe a new approach for estimating MOI from massively parallel sequencing (MPS) data using probabilistic clustering. This technique takes advantage of the expected symmetry in the distribution of read counts supporting single nucleotide variants (SNVs) under MOI. We evaluate our method by simulation of MPS data with known MOI under varying coverage, sequencing error and mixtures of clones. We also apply our method to a set of MPS data from the malaria parasite Plasmodium falciparum sampled from Papua New Guinea. To assess our method's performance we compare our estimates to several other available software packages and wet-lab data. Our simulation and estimation methods are available in an R package called moimix. This MOI detection method has potential applications as the basis for a selection detection method that does not require removal of samples with MOI > 1.
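The symmetry mentioned above can be seen in a toy simulation. In a haploid parasite sample made of two clones at proportions p and 1 - p, SNVs that differ between the clones should show alternate-allele read fractions near p or near 1 - p, whereas a single-clone sample should show fractions near 0 or 1. The sketch below simulates that and applies a simple threshold heuristic; it is not the probabilistic clustering implemented in moimix, it ignores sequencing error, and all names and thresholds are assumptions made for this example.

```python
import random


def simulate_alt_counts(n_snvs, depth, clone_fraction, seed=0):
    """Simulate (alt_reads, depth) pairs at SNVs where two haploid clones differ.

    At each SNV the alternate allele is carried either by the clone present at
    `clone_fraction` or by the other clone, chosen with equal probability, so
    the expected alt-read fraction is clone_fraction or 1 - clone_fraction.
    """
    rng = random.Random(seed)
    counts = []
    for _ in range(n_snvs):
        p = clone_fraction if rng.random() < 0.5 else 1.0 - clone_fraction
        alt = sum(rng.random() < p for _ in range(depth))  # binomial draw
        counts.append((alt, depth))
    return counts


def fraction_intermediate(counts, low=0.1, high=0.9):
    """Share of SNVs whose alt-read fraction lies strictly between low and high."""
    hits = sum(1 for alt, depth in counts if low < alt / depth < high)
    return hits / len(counts)


def mean_folded_fraction(counts):
    """Mean of min(f, 1 - f): folds the symmetric distribution onto [0, 0.5]."""
    folded = [min(alt / depth, 1 - alt / depth) for alt, depth in counts]
    return sum(folded) / len(folded)


if __name__ == "__main__":
    single = simulate_alt_counts(200, 100, clone_fraction=1.0)
    mixed = simulate_alt_counts(200, 100, clone_fraction=0.3)
    print("single clone, intermediate share:", fraction_intermediate(single))
    print("mixed sample, intermediate share:", fraction_intermediate(mixed))
    print("mixed sample, folded mean:", mean_folded_fraction(mixed))
```

In this toy setting the folded mean of the mixed sample sits close to the minor clone proportion, which is the kind of signal a clustering method could exploit.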

Nikeisha Caruana - La Trobe University

A combined transcriptomic and proteomic approach reveals putative toxins in the mucus secretions of the Southern Bottletail Squid, Sepiadarium austrinum.

Cephalopods comprise over 800 species, possess advanced nervous, cardiovascular and visual systems and are masters of camouflage. Perhaps less well appreciated are the adhesive and/or toxic secretions produced by many cephalopod species. These secretions are often involved in defending the species from predators. One such species, Sepiadarium austrinum, the Southern bottletail squid, secretes viscous mucus from its underside, which is potentially toxic. Identifying the proteins in cephalopod secretions is important from both an ecological and an evolutionary perspective, and the proteins involved in the toxic, mucus and adhesive components of the secretions have high potential as a source of biomedical products. Until recently proteomics studies have largely been restricted to model organisms because protein sequences must be known beforehand in order to be identified or quantified by mass spectrometry. To solve this problem we used predicted proteins for S. austrinum based on a de novo assembled transcriptome and were able to identify 4,695 proteins, with ten of these being flagged for putative toxicity within the mucus of S. austrinum. The workflow for this analysis, incorporating both transcriptomic and proteomic tools, was implemented in Galaxy. This method will be used on further species to create a comparative proteomic study of the defence secretions of cephalopods.

Helen Dockrell - Flinders University

Computational Modeling of Serotonin Movement Through the Human Colonic Mucosa

The control of gut motility remains poorly defined even in healthy individuals, and this limited understanding makes it difficult to treat disorders in patient populations. This is particularly true in the human colon, where we still have very little information on normal or abnormal motor patterns, and even less information on the mechanisms that control them. A preliminary computational model of the human colonic mucosa has been developed to characterise the three-dimensional spread of serotonin, which is influenced by pressure changes, tissue structure and serotonin release rates. The model integrates data on the kinetics of serotonin secretion from individual human enterochromaffin cells with optical fiber readings of intracolonic pressure patterns and assumes that the mucosal mesentery acts as a compressible porous medium. Preliminary findings indicate that serotonin concentrates in the mucus and moves quickly to the lumen, rather than moving deeper into the colonic tissue. The concentration of serotonin in the mucus may explain the contrast between serotonin concentrations measured in whole colon and those measured in single-cell analyses. Future development of this model will involve alternative mathematical modeling of the mucosal mesentery, improving the statistical robustness of the source data and, ideally, refinement until a link between serotonin release kinetics, muscle movement pattern and patient presentation is found. Alternatively, serotonin release kinetics and muscle movement patterns will be characterised separately, for individual application to patient presentation.

Lingxiao Zhou - UNSW

Combining spatial and chemical information for clustering pharmacophores

Background

A pharmacophore model consists of a group of chemical features arranged in three-dimensional space that can be used to represent the biological activities of the described molecules. Clustering of molecular interactions of ligands on the basis of their pharmacophore similarity provides an approach for investigating how diverse ligands can bind to a specific receptor site or different receptor sites with similar or dissimilar binding affinities. However, efficient clustering of pharmacophore models in three-dimensional space is currently a challenge.

Results

We have developed a pharmacophore-assisted Iterative Closest Point (ICP) method that is able to group pharmacophores in a manner relevant to their biochemical properties, such as binding specificity. The implementation of the method takes pharmacophore files as input and produces distance matrices. The method integrates both alignment-dependent and alignment-independent concepts.

Conclusions

We apply our three-dimensional pharmacophore clustering method to two sets of experimental data: 31 globulin-binding steroids and 4 groups of selected antibody-antigen complexes. Results are translated from distance matrices to Newick format and visualised using dendrograms. For the steroid dataset, the resulting classification of ligands shows good correspondence with existing classifications. For the antigen-antibody datasets, the classification of antigens reflects both antigen type and binding antibody. Overall, the method is fast and classifies the data accurately according to their binding affinities or antigens.
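The abstract does not say how distance matrices are turned into Newick trees. One common way, assumed here purely for illustration, is average-linkage (UPGMA) hierarchical clustering. The minimal sketch below emits topology only, without branch lengths; the function name and the example labels and distances are hypothetical.

```python
def upgma_newick(labels, dist):
    """Cluster items by average linkage and return a Newick topology string.

    `labels` is a list of item names and `dist` a symmetric square matrix
    (list of lists) of pairwise distances in the same order.
    """
    # Active clusters: id -> (newick fragment, number of leaves).
    clusters = {i: (labels[i], 1) for i in range(len(labels))}
    # Pairwise distances between active clusters, keyed by (smaller id, larger id).
    d = {
        (i, j): dist[i][j]
        for i in range(len(labels))
        for j in range(i + 1, len(labels))
    }
    next_id = len(labels)
    while len(clusters) > 1:
        a, b = min(d, key=d.get)  # closest pair of active clusters
        name_a, size_a = clusters.pop(a)
        name_b, size_b = clusters.pop(b)
        del d[(a, b)]
        merged = {}
        for c in clusters:
            dac = d.pop((min(a, c), max(a, c)))
            dbc = d.pop((min(b, c), max(b, c)))
            # Size-weighted mean keeps this equal to the average leaf-to-leaf distance.
            merged[(c, next_id)] = (size_a * dac + size_b * dbc) / (size_a + size_b)
        d.update(merged)
        clusters[next_id] = (f"({name_a},{name_b})", size_a + size_b)
        next_id += 1
    newick, _ = next(iter(clusters.values()))
    return newick + ";"


if __name__ == "__main__":
    # Hypothetical ligands and distances, for illustration only.
    names = ["A", "B", "C", "D"]
    matrix = [
        [0, 1, 4, 5],
        [1, 0, 4, 5],
        [4, 4, 0, 2],
        [5, 5, 2, 0],
    ]
    print(upgma_newick(names, matrix))
```

The resulting string can be loaded into any tree viewer that reads Newick to draw the dendrogram.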