<p dir="ltr">I graduated from the University of York with a degree in Biology and Computer Science and a DPhil in ecology. I have maintained interdisciplinary research interests ever since, in a field that has become known as biodiversity informatics.</p><p dir="ltr">My teaching experience has been equally wide-ranging. I taught biological computation at York, and computing at undergraduate and postgraduate level at the University of Kent and at the Open University. Having chaired modules on virtual team-working since 2003 and co-authored others, I have published extensively on collaboration in virtual teams. I am currently chairing the development and delivery of a module on IT project and service management, the latter half of which focuses on ITIL. Also at the Open University, I have contributed to courses on Java programming, introductory computing, project management and the environment.</p>
<p>My primary research interest is in biodiversity informatics, specifically in information extraction from the legacy scientific literature. The research challenges that I am investigating include:</p><ul><li><strong>Big data</strong>: the source data is the biodiversity literature. The legacy print literature in this domain has been estimated to run to 300 million pages.</li><li><strong>Noisy data</strong>: Optical Character Recognition (OCR) errors introduced during the scanning process mean that up to two thirds of named entities (e.g. scientific names) are rendered incorrectly. Simple spell checking or look-up against an authority is not sufficient to address this problem, so the context of use is very important.</li><li><strong>Disambiguation</strong>: taxonomic nomenclature calls for unique names only within Kingdoms, hence there is a bacterial genus '<em>Bacillus</em>' and an insect genus '<em>Bacillus</em>' that need to be distinguished.</li><li><strong>Domain-specific terminology</strong>: the literature makes extensive use of terse language, abbreviations and special characters such as male ♂ and female ♀, and also mixes Latin formal descriptions with vernacular text.</li></ul><h3>SysMIC</h3><p>I am involved in the BBSRC-funded SysMIC project (Systems Training in Maths, Informatics and Computational Biology; <a href="http://www.sysmic.ac.uk" rel="nofollow">http://www.sysmic.ac.uk</a>), which has developed and is presenting distance learning courses in the fundamentals of systems biology. The project is a consortium of UCL, Birkbeck, Edinburgh and the Open University.</p>
<p dir="ltr">My primary research interest is in information extraction, particularly from the legacy scientific literature. My research focus has been, and remains, biodiversity informatics, so I have been investigating the problems and challenges surrounding information extraction from the old biodiversity literature.</p><p dir="ltr">In the past five years I have won four grants in this research area, worth approximately £590,000. Two of these grants have been funded by JISC and the other two by the EU, so I am experienced in working on both nationally and internationally funded research projects.</p><p dir="ltr">The research challenges that are being investigated by these projects include:</p><p dir="ltr"><em>Big data</em>: our source data is the biodiversity literature. The legacy print literature in this domain has been estimated to run to 300 million pages.</p><p dir="ltr"><em>Noisy data</em>: Optical Character Recognition (OCR) errors introduced during the scanning process mean that up to two thirds of named entities (e.g. scientific names) are spelt incorrectly. Simple spell checking or look-up against an authority is not sufficient to address this problem. For example, 'Homo', the genus name for humans, can be misinterpreted by an OCR engine as the butterfly genus 'Homa', so the context of use is very important.</p><p dir="ltr"><em>Disambiguation</em>: taxonomic nomenclature calls for unique names only within Kingdoms, hence there is a bacterial genus 'Bacillus' and an insect genus 'Bacillus'.</p><p dir="ltr"><em>Domain-specific terminology</em>: the domain makes extensive use of terse language, abbreviations and special characters such as male ♂ and female ♀, and mixes Latin formal descriptions with vernacular text.</p><p dir="ltr">The four grants that I have won recently are:</p><p dir="ltr">A Community-driven Curation Process for Taxonomic Databases.
<em>JISC Digital Infrastructure Programme: Managing Research Data</em> call, for £85,902.</p><p dir="ltr">A data infrastructure to support agricultural scientific communities. Promoting data sharing and development of trust in agricultural sciences. <em>EU Seventh Framework Programme, Capacities – Research Infrastructures</em>. Principal investigator at the OU. The total budget is €4 million, with the OU share being £222,745.</p><p dir="ltr">Virtual Biodiversity Research and Access Network for Taxonomy. <em>EU Seventh Framework Programme, Capacities – Research Infrastructures</em>. Principal investigator at the OU and work package leader. The total budget is €4.75 million, with the OU share being £207,685.</p><p dir="ltr">Automatic Biodiversity Literature Enhancement. <em>JISC Digitisation Programme: Enhancing Digital Resources</em> call, for £73,261.</p>
<p dir="ltr">I have taught a range of courses in computing and related subjects, from biological computation at York, to service management at the Open University, via computer graphics and other computing courses at both undergraduate and postgraduate level at Kent.</p><p dir="ltr">At the OU I have chaired modules on virtual team-working since 2003 and co-authored others. I am currently chairing the development and delivery of a module on IT project and service management, the latter half of which focuses on ITIL. Also at the Open University, I have contributed to courses on Java programming, introductory computing, project management and the environment.</p><h3 dir="ltr">Virtual teamworking</h3><p dir="ltr">I have published extensively in the area of virtual teamworking, building on one of my major teaching commitments at the OU. Issues I have investigated include the identification and dissemination of good practice in encouraging effective team formation and in maintaining effective team dynamics. The principal motivation for this work has been to improve the experience of students working in virtual teams. I am also interested in virtual teamworking more widely, both in the workplace and in research.</p><h3 dir="ltr">SysMIC</h3><p dir="ltr">I am a co-investigator on the BBSRC-funded SysMIC project (Systems Training in Maths, Informatics and Computational Biology; <a href="http://www.sysmic.ac.uk/"><span style="color: rgb(0,0,255)">http://www.sysmic.ac.uk</span></a>). This project has developed and is delivering distance learning courses in the fundamentals of systems biology to postgraduate students. Led by University College London, the consortium includes Birkbeck College, the University of Edinburgh and the Open University.</p>