Thesis Defence: Brandon Heung

April 19, 2017

PhD Thesis Defence

  • Name: Brandon Heung
  • Date: April 19, 2017
  • Time: 11:00 am
  • Location: Library Thesis Defence Room (WAC 2020)
  • Thesis Title: "Regional-Scale Digital Soil Mapping in British Columbia using Legacy Soil Survey Data and Machine-Learning Techniques"
  • Examining Committee
    • Margaret Schmidt, Senior Supervisor
    • Anders Knudby (U Ottawa), Supervisor
    • Chuck Bulmer (BC Ministry of Forests), Supervisor
    • Suzana Dragicevic, Internal Examiner
    • Brian Klinkenberg (UBC), External Examiner


Digital soil mapping (DSM) lies at the intersection of geographical information systems (GIS) and (spatial) statistics, and is a sub-discipline of soil science that has become increasingly relevant in helping to address emerging issues such as food production, climate change, land resource management, and the management of earth systems. Despite the need for digital soil information in raster format, such information is limited for British Columbia (BC), where much of it is digitized from legacy soil survey maps with inherent spatial problems related to polygon boundaries; attribute specificity, due to multi-component map units; and map scale, where small-scale surveys have limited use in addressing local and regional needs. In spite of these issues, legacy soil survey data are still useful as sources of training data: machine-learning techniques may be used to extract soil-environmental relationships from a survey and a suite of digital environmental covariates.
This dissertation describes a framework for developing training data from conventional soil survey maps and compares various machine-learning techniques for predicting the spatial patterns of qualitative soil data such as soil parent material and soil classes. Results of this research included maps of soil parent material, Great Groups, and Orders for the Lower Fraser Valley and a soil Great Group map for the Okanagan-Kamloops region at a 100 m spatial resolution. Key findings included (1) that Random Forest was the most effective machine learner, based on two model comparison studies; (2) that model choice greatly impacted the accuracy of predictions; (3) that the method for developing training data greatly impacted accuracy, based on a comparison of four methods; and (4) that training data derived from soil survey maps were more effective in representing the feature space of various classes than training data derived from soil pits. This study advances the understanding of model selection and training data development in DSM and may facilitate the future development of methodologies for provincial maps of BC.