My PhD project uses bioinformatics techniques to understand the progression of glioblastoma multiforme. Using whole genome and whole exome sequencing of matched primary and recurrent samples, I aim to investigate the mode of evolution of glioblastoma and the effect that treatment has on disease progression.
This involves detecting both single nucleotide and copy number variants from the sequencing data and using them for subclonal deconvolution, to determine which subclones are present in the tumours. However, computational methods for both detecting variants and deconvoluting them are susceptible to high error rates. These processes therefore need to be benchmarked against simulated sequencing datasets with known ground truths in order to determine their reliability. Existing programs for simulating sequencing datasets were previously unable to capture the noise and complexity seen in real tumours, making them unsuitable for benchmarking subclonal deconvolution methods.
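As a minimal sketch of what benchmarking against a known ground truth involves (an illustration only, not the pipeline used in this project), the Python snippet below compares a set of called variants with the set of simulated variants and reports precision and recall. The function name `benchmark_calls` and the tuple layout `(chromosome, position, ref, alt)` are assumptions made for this example.

```python
def benchmark_calls(truth, called):
    """Compare called variants with the simulated ground truth.

    Both arguments are iterables of hashable variant keys, here assumed
    to be (chromosome, position, ref, alt) tuples.
    """
    truth, called = set(truth), set(called)
    tp = len(truth & called)   # variants that were simulated and called
    fp = len(called - truth)   # variants called but never simulated
    fn = len(truth - called)   # simulated variants that were missed
    precision = tp / (tp + fp) if called else 0.0
    recall = tp / (tp + fn) if truth else 0.0
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall}


# Hypothetical toy data: three simulated variants, three calls.
truth = {("chr1", 1000, "A", "T"),
         ("chr1", 2000, "G", "C"),
         ("chr2", 500, "C", "A")}
called = {("chr1", 1000, "A", "T"),
          ("chr2", 500, "C", "A"),
          ("chr3", 42, "T", "G")}

result = benchmark_calls(truth, called)
print(result)
```

In this toy case two calls match the ground truth, one call is spurious and one simulated variant is missed, so both precision and recall come out at two thirds.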
The first part of my PhD project therefore consisted of developing programs to enable me to create suitable simulated sequencing datasets:
- I created HeteroGenesis to simulate genome sequences for multiple clones in a tumour and matched germline. It also provides the overall variant profiles for individual clones, as well as for bulk samples with specified proportions of each clone and contamination from a matched germline.
- I modified an existing program to create w-Wessim, an in silico whole exome sequencing tool that simulates reads from a genome with realistic error rates. Unlike its predecessor (Wessim), w-Wessim produces highly realistic distributions of reads when aligned to a reference genome and also captures the effect of copy number variants in a genome.
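To illustrate the idea of a bulk sample made up of specified proportions of each clone plus germline contamination, the sketch below computes the expected variant allele frequency (VAF) of a single variant in such a mixture. It is a simplified model written for this page, not the HeteroGenesis implementation: the function `bulk_vaf`, the dictionary keys and the diploid germline default are all assumptions of the example.

```python
def bulk_vaf(clones, germline_fraction, germline_copies=2):
    """Expected VAF of one variant in a mixed bulk sample.

    clones: list of dicts with keys
        "proportion"    - share of the tumour cells made up by this clone
        "mutant_copies" - copies of the locus carrying the variant
        "total_copies"  - total copies of the locus in this clone
    germline_fraction: share of all cells that are matched germline
        (assumed to carry no mutant copies).
    """
    if not 0.0 <= germline_fraction < 1.0:
        raise ValueError("germline_fraction must be in [0, 1)")
    if abs(sum(c["proportion"] for c in clones) - 1.0) > 1e-9:
        raise ValueError("clone proportions must sum to 1")

    tumour_fraction = 1.0 - germline_fraction
    # Weight each clone's allele counts by its share of all cells.
    mutant = sum(tumour_fraction * c["proportion"] * c["mutant_copies"]
                 for c in clones)
    total = germline_fraction * germline_copies + sum(
        tumour_fraction * c["proportion"] * c["total_copies"]
        for c in clones)
    return mutant / total


# Hypothetical example: a heterozygous variant present only in clone A,
# with 20% germline contamination.
clones = [
    {"proportion": 0.6, "mutant_copies": 1, "total_copies": 2},  # clone A
    {"proportion": 0.4, "mutant_copies": 0, "total_copies": 2},  # clone B
]
vaf = bulk_vaf(clones, germline_fraction=0.2)
print(round(vaf, 3))
```

Here clone A makes up 48% of all cells and every cell is diploid at the locus, so the expected VAF is 0.24. A copy number gain in any clone would change the denominator and shift the VAF, which is why copy number variants need to be simulated alongside single nucleotide variants.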
Both HeteroGenesis and w-Wessim are publicly available on my GitHub page:
Barthel FP, Johnson KC, Varn FS, Moskalik AD, Tanner G, Kocakavuk E, Anderson KJ, Abiola O, Aldape K, Alfaro KD, Alpar D, Amin SB, Ashley DM, Bandopadhayay P, Barnholtz-Sloan JS, Beroukhim R, Bock C, Brastianos PK, Brat DJ, Brodbelt AR, Bruns A, Bulsara KR, Chakrabarty A, Chakravarti A, Chuang JH, Claus EB, Cochran EJ, Connelly J, Costello JF, Finocchiaro G, Fletcher MN, French PJ, Gan HK, Gilbert MR, Gould PV, Grimmer MR, Iavarone A, Ismail A, Jenkinson MD, Khasraw M, Kim H, Kouwenhoven MCM, LaViolette PS, Li M, Lichter P, Ligon KL, Lowman AK, Malta TM, Mazor T, McDonald KL, Molinaro AM, Nam DH, Nayyar N, Ng HK, Ngan CY, Niclou SP, Niers JM, Noushmehr H, Noorbakhsh J, Ormond DR, Park CK, Poisson LM, Rabadan R, Radlwimmer B, Rao G, Reifenberger G, Sa JK, Schuster M, Shaw BL, Short SC, Sillevis Smitt PA, Sloan AE, Smits M, Suzuki H, Tabatabai G, Van Meir EG, Watts C, Weller M, Wesseling P, Westerman BA, Widhalm G, Woehrer A, Alfred Yung WK, Zadeh G, GLASS Consortium, Huse JT, de Groot JF, Stead L, Verhaak RGW. (2019)
Longitudinal Molecular Trajectories of Diffuse Glioma in Adults.
Tanner, G., Westhead, D.R., Droop, A. and Stead, L.F. (2019)
Simulation of Heterogeneous Tumour Genomes with HeteroGenesis and In Silico Whole Exome Sequencing.
Bioinformatics. 35(16), pp. 2850-2852
Droop, A., Bruns, A.F., Tanner, G., Rippaus, N., Morton, R., Harrison, S., King, H., Ashton, K., Syed, K., Jenkinson, M.D., Brodbelt, A., Chakrabarty, A., Ismail, A., Short, S. and Stead, L.F. (2018)
How to Analyse The Spatiotemporal Tumour Samples Needed To Investigate Cancer Evolution: A Case Study using Paired Primary and Recurrent Glioblastoma
International Journal of Cancer. 142(8), pp. 1620-1626
- BSc Biology - University of Leeds 2013
- MRes Post-Genomic Biology - University of York 2014