• Expert Defined Networks: curate a diverse range of known BNs (and underlying data);
• Algorithm Defined Networks: use existing algorithms (Chow-Liu, K2, BDeu etc.) to predict BNs from real data sets and include these ‘known’ BN structure data sets;
• Categorical Variable Support: extend the code base to support categorical, non-ordinal variables within the CNN, e.g. through one-hot encoding;
• Real Data Testing: a comprehensive assessment of the method (and future refinements) against a set of true BN datasets (with known DAG structures).

Conversely, a variety of techniques are available to improve the neural network’s training quality:

• CNN generalisation: to support the generalisability of the approach, and minimize the requirement for individual networks, a single large network is envisioned. By padding/filling the image, a single neural network could cater for BN predictions from 2 (i.e. the smallest possible network size) to an indeterminate n;
• Tuned pre-processing: vary the pre-processing techniques to improve the signal-to-noise ratio within the training data, for example by altering the image compression and normalisation strategy;
• Alternate architectures: further alterations to the neural network architecture (e.g. increasing width and depth, additional pooling layers etc.);
• Transfer learning: where a pre-trained CNN, e.g. the VGG16 architecture (Simonyan & Zisserman 2014b), is re-tuned to this problem space.

The predictive power of the output network model could also be compared to that of a variety of common machine learning classification algorithms. While exploration of the relative merits of BNs versus more modern techniques was out of the scope of this thesis, establishing a baseline of comparison does offer an opportunity to assess the relative impact of a BN’s structural accuracy.
This is especially important as the author’s broad hypothesis is that nodes that have a strong influence on other nodes (and hence a strong signal for the CNN to observe) are likely to possess more important edges, and hence greater influence over the network’s predictive power. Conversely, nodes with very little influence on others are unlikely to be identified as possessing edges by the CNN and so, while their absence is strictly erroneous, it is in fact of little consequence. Furthermore, the presence of such edges becomes problematic for the network’s interpretability, negatively impacting what is arguably the primary strength of the BN approach.
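
Two of the refinements listed above, one-hot encoding of categorical variables and padding of the input image to a common size, are simple enough to illustrate. The following is a minimal sketch only, not the implementation used in this thesis. It assumes the CNN input is a square n x n image held as a list of lists and that zero is an acceptable fill value. The helper names `one_hot` and `pad_image` are hypothetical.

```python
def one_hot(values):
    """Encode a categorical, non-ordinal column as one-hot rows.

    Hypothetical helper: categories are sorted so the column order is
    reproducible between runs.
    """
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    rows = [
        [1 if index[value] == j else 0 for j in range(len(categories))]
        for value in values
    ]
    return categories, rows


def pad_image(image, target_size, fill=0.0):
    """Pad a square n x n image to target_size x target_size.

    Assumption: the original content stays in the top-left corner and the
    new cells are filled with `fill`; other placements are equally possible.
    """
    n = len(image)
    if any(len(row) != n for row in image):
        raise ValueError("expected a square image")
    if n > target_size:
        raise ValueError("image is larger than the target size")
    extra = target_size - n
    # Extend each existing row to the right, then append full filler rows.
    padded = [list(row) + [fill] * extra for row in image]
    padded += [[fill] * target_size for _ in range(extra)]
    return padded


if __name__ == "__main__":
    categories, rows = one_hot(["red", "green", "red", "blue"])
    print(categories, rows)
    print(pad_image([[1.0, 0.5], [0.5, 1.0]], 4))
```

Under this scheme a two-node network and a larger network would present images of the same shape to a single CNN. Whether the filler cells should be zero, noise, or some masked value is an open design choice that would need to be tested empirically.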