1. Introduction
Collectively, the gastrointestinal tract makes up one of the largest organs in the body and over 30% of body weight [
1]. It is, therefore, of concern that anomalies of the gastrointestinal tract have been identified in association with prenatal or community cannabis exposure in several studies, including in reports from the Centres for Disease Control (CDC) Atlanta, Georgia with esophageal atresia with or without tracheoesophageal fistula [
2], from Hawaii where pyloric stenosis and anal, rectal, large bowel atresia/stenosis were identified [
3], from Australia where small bowel atresia and stenosis and anal stenosis were identified [
4], and from the USA where esophageal atresia with or without tracheoesophageal fistula, rectal, large bowel atresia/stenosis, Hirschsprung disease and biliary atresia were identified [
5]. Similarly, a relationship between cannabidiol and small bowel atresia and stenosis were positively identified in the USA [
5]. Naturally, these reports considered collectively raise great concern as they identify most of the major organs along the length of the gastrointestinal tract (GIT), namely, esophagus, small and large intestines, anorectum and bile duct. We were naturally keen to study these issues further in the rich European datasets, particularly given that new data on European exposure to cannabinoids have recently been made available [
6].
It is important to appreciate that research over several decades has identified multiple mechanisms by which cannabis exerts its genotoxic effects including grossly abnormal sperm morphology [
7], high rates of abnormal oocyte division and oocyte loss [
8], single- and double-stranded DNA breaks [
9], chromosomal breaks [
10,
11], end-to-end chromosomal fusions and translocations [
7,
12], ring and chain chromosomal formation [
7], double minute chromosomes [
7,
13], micronucleus formation [
13,
14], oxidation of DNA bases [
9], epigenetic changes including reduced histone synthesis and post-translational acetylation and phosphorylation [
15,
16,
17,
18] and altered patterns of DNA methylation with both hyper-methylation and hypomethylation being reported [
19,
20,
21,
22,
23,
24,
25,
26,
27]. Importantly, both the changes to the DNA methylome and those to the histones have been shown to be heritable via sperm and to affect the behaviour and immune response of offspring in rats [
18,
25,
26,
27]. The altered mental development of children prenatally exposed to cannabis has also been reported in all four of the long-term studies to have examined this association [
28,
29,
30,
31,
32,
33,
34], and close relationships with autistic-like intellectual disabilities have also been reported [
27,
35,
36,
37,
38,
39].
It is important to bear in mind in discussing pathways towards and the phenomenology of cannabinoid teratogenesis that this forms a part of the overall picture of cannabis-related genotoxicity, which also includes cannabinoid-induced carcinogenesis [
40,
41,
42,
43,
44,
45,
46,
47,
48,
49] and cannabinoid-accelerated cellular and organismal ageing, which has also been demonstrated clinically [
50,
51]. It is important that this broader literature is also considered in the present context.
One of the key substrates of epigenomic reactions is the metabolic state of the cell and its mitochondrial metabolism. This is because mitochondria not only supply energy and substrates to the nucleus for genomic and epigenomic reactions, but they are also in close communication with the nucleus via mitonuclear and mitohormetic balance and can induce powerful cellular stress reactions when perturbed; thus, the disruption of mitochondrial metabolism necessarily modifies epigenomic stability. Many papers demonstrate that the dose–response relationship of cannabinoids with both genomic mutagenicity and mitochondrial toxicity is strongly exponential [
52,
53,
54,
55,
56]. Moreover, this exponentiation of the dose–response effect has been extended to epidemiological studies where it has been repeatedly demonstrated that the passage from the fourth to fifth quintile of cannabinoid exposure is accompanied by a discontinuous quantum jump in congenital anomaly rates [
5,
57,
58,
59,
60].
This is presumably also well demonstrated by the recent French experience—parts of France where large crops of cannabis are grown have suddenly reported 60-fold increases in the rates of calves and human babies being born without limbs [
61,
62,
63], whilst this has not been reported in Switzerland, which is nearby where cannabis is not permitted to become involved in the food chain. Similar features presumably occur in cannabis-growing parts of the USA where atrial septal defects in states such as Kentucky and Mississippi have suddenly leaped to rates 20 time those of five years ago [
64].
Of concern is the concomitant increase in prevalence rates of cannabis use, increased intensity of use on all or nearly all days, and increasing potency of Δ9-tetrahydrocannabinol (THC) in cannabis preparations—all of which imply a greatly increased community cannabinoid exposure [
6,
65] in a manner which launches society relatively abruptly into a higher cannabinoid dose-exposure range, where genotoxic effects will be more common. Given the multiple earlier reports noted above, the present study investigated continental European trends for digestive system disorders in the context of the changing continental cannabinoid environment in a formal, causal, inferential analytical paradigm and in its native space-time context.
2. Methods
Data: The data, which were analysed in this study on congenital anomalies, were obtained by direct download from the European Network of Population-Based Registries for Epidemiological Surveillance of Congenital Anomalies (EUROCAT) website [
66]. The data variable which was the centre of our analytical attention was the total congenital anomaly rate, which is defined as being the sum total of the total live birth rate, the total miscarriage rate after 20 weeks of gestation and the total number of early termination for anomaly (ETOPFA) which was practised. Thus, this total congenital anomaly rate very usefully captured all forms of live births and major birthing complications.
Nations were chosen based upon the availability of their congenital anomaly data across the period from 2010 to 2019. Data on tobacco and alcohol consumption were sourced from online databases at the World Health Organization [
67]. The unit of tobacco measured was the percentage of daily tobacco usage. The unit of alcohol measured was the amount in litres of pure alcohol consumed per capita per year. Drug-use data were taken from the European Monitoring Centre for Drugs and Drug Addiction (EMCDDA) [
68]. The drugs of interest were amphetamines, cocaine and cannabis. The unit measured for amphetamine and cocaine was the prevalence of use in the last year. The major index of cannabis studied was the prevalence of use in the past month (prior to the completion of the data survey in each country). These data were supplemented by recent published descriptions of the mean THC concentration of the cannabis herb and resin available in each country [
6,
68], which are covariates described, respectively, as Herb_THC and Resin_THC, in the present report. Data on the prevalence of daily cannabis use were also taken from these sources—the data were also available on the EMCDDA website. Data on the median household income (measured in USD) were taken from the World Bank online sources [
69].
Countries were divided in to two groups based on their cannabis status, as described in recent leading epidemiological reports of cannabis use in Europe [
6]. Nations were either assigned to those with high and/or rising levels of daily cannabis use or low and/or falling rates of daily cannabis use based on the levels and use; the trends and results are reported in
Supplementary Figure S4 in reference [
6]. In this manner, Croatia, Germany, Italy, Belgium, the Netherlands, Norway, Portugal, France and Spain were categorized as nations which were undergoing increasing daily use, and Hungary, Bulgaria, Finland, Poland and Sweden were countries which were categorized as experiencing low or falling levels of daily cannabis use.
Derived Data: Because multiple measurements could be used to measure cannabis use and THC exposure, this created some ambiguities for analysis. These indices could evidently be combined in different ways. Hence, the past month cannabis use was multiplied by the THC concentration of cannabis herb and resin to derive a product for each. This metric was then multiplied by the interpolated daily-cannabis-use rate for both herb and resin products to derive further compound indices.
Data Imputation: Missing data were addressed by the use of linear interpolation. This technique was mainly applied to the data on rates of daily cannabis use. The EMCDDA dataset had only 59 datapoints in this dataset for this covariate for all these nations across this period. The dataset was partially completed by linear interpolation and an extra 70 datapoints were added, totalling 129 datapoints in all. Further details are provided in the
Section 3. We were not able to identify any Swedish data for the THC concentration in cannabis resin in any of the years studied. However, it was noticed that the ratio of the THC concentration of cannabis resin to cannabis herb was quite steady in nearby Norway, with a value of 17.7, thus, this ratio was used together with the Swedish data for cannabis herb THC concentration to calculate an estimate of Swedish resin THC concentration. In a similar regard, Polish data for the THC concentration of cannabis resin were also unavailable. The ratio of the THC concentration of cannabis resin to cannabis herb in nearby Germany was available. This was used with the Polish herb THC concentration data to calculate and estimate the concentration of THC in cannabis resin.
Currently, extant geospatial techniques do not permit missing data. For this reason, absent data for Croatia for 2018 and 2019 were completed by the last observation carried forwards method. Absent data for the Netherlands were completed by the last observation carried backwards method. It was not possible to use the techniques of multiple imputation on this data as panel and geospatial methods have not been developed which accept imputed datasets neither at the time of conducting this analysis nor at the time of writing.
Statistics. R Studio version 1.4.1717 based on R version 4.1.1 from the Comprehensive R Archive Network and the R Foundation for Statistical Computing was used for data processing [
70]. The statistical analysis of this data was conducted in December 2021. Data input and wrangling was performed using dplyr from the tidyverse [
71]. Data were log-transformed as needed in the interests of approximating the normal distribution, as indicated by the results of the Shapiro–Wilks test. This test was performed in the R Base module. Graphs were drawn using ggplot2 also from the tidyverse [
71]. ggplot2 was used together with rnaturalearth and sf (simple features) for map drawing [
72]. Colour palettes which were used in maps were taken from the viridis and vidirislite packages [
73]. Some original colour palettes were also used. Bivariate maps were filled with a colour system derived from the colorplaner package [
73]. The illustrations presented are all original. They have not been previously published.
Linear regression was performed in the R Base module. Mixed-effects regression was performed using the R package nlme [
74]. All original full multivariate models were reduced by the classical technique of model reduction in sequential dropping of the least significant term. This yielded a final model where all terms are significant, and this is the model which is presented. Using the R Packages purrr and broom together, it was possible to process multiple linear, mixed-effects or panel models at a single pass by techniques which have been previously published [
71,
75,
76].
Covariate Selection: The existence of multiple covariates as measurements of cannabis use caused a dilemma for statistical investigations in terms of over-controlling, redundant covariates and unnecessary consumption of degrees of freedom. This latter problem would have the effect of forcing the omission of covariate or interactions from initial regression formulae. This issue was thus directly addressed by using random forest regression which was conducted in the R package range [
77]. Tables of relative variable importance were also drawn up for each GCA, which was calculated using the R package vip (variable importance plot) [
78]. The most high-ranking cannabis metrics were then entered into the regression equations which have been presented. These variable importance tables are presented below.
Panel and Geospatial Analysis: Panel analysis was conducted using the R package plm [
79]. Analysis was conducted across both space and time using the “twoways” effect. A weighting matrix for spatial regression was constructed using the edge and corner “queen” relationships (so-called from the analogy regarding the moves of the chess piece of that name) in the R packed spdep (spatial dependencies) [
80]. Geospatial modelling was conducted in spml (spatial panel maximum likelihood) using the spreml (spatial panel random effects maximum likelihood) function which allows for and facilitates detailed modelling of the spatial error structure of the model constructs [
81,
82]. Such models can produce up to four model coefficients which can be used as a diagnostic to determine the factors operating in the error structure of the nominated models. These four coefficients returned are rho, the spatial coefficient; psi, the serial correlation effect; phi, the random error effect; and theta, the spatial autocorrelation coefficient. This process was investigated carefully for each GCA; the optimum model error structure is presented for each GCA. We have also striven to endure that the model error structure is similar across related models for the same GCA so that the fitting performance models of the models can be directly compared. The optimum error structure was determined by the backwards error method, as has been previously described [
83]. Temporal lagging to one or two years was applied to panel and geospatial models, as detailed in the
Section 3.
Causal Inference: The formal techniques of causal inference were employed to quantitatively assess the potentially causal nature of the relationships described. Inverse probability weighting (ipw) is the technique of choice applied to observationally derived data to transform it into a pseudo-randomized framework from which it is entirely appropriate to draw causal conclusions. As has been described in the New England Journal of Medicine and many other sources, it is entirely appropriate to draw causal inferences from such models [
84]. All the panel models which were performed have inverse probability weighting applied to them. The R package ipw was used to calculate the inverse probability weights [
84]. Similarly, E-values (or expected values) quantify the correlation which would be required by some hypothetical extraneous and unmeasured variable, with both the exposure concerned and the outcome in which we were interested in, in order to explain a relationship which might initially otherwise appear to be causally related [
85,
86,
87]. Thus, the E-value provides a quantitative metric for sensitivity analysis to determine the susceptibility of the model to outside variables that were not included in the regression modelling procedures. Associated with E-values is a confidence interval, and the lower limit of this interval is particularly important to causal inference. For this reason, both the E-value estimate and its 95% lower bound is extensively reported in this present paper. The threshold for causality is usually described as being 1.25 [
88]. The E-value for the relationship between tobacco and lung cancer is nine, and this is generally considered to be high [
89]. The R Package EValue was used to calculate E-values from the odds ratios and regression coefficients described in the present report [
90]. Both E-values and inverse probability weighting are very important devices employed in quantitative causal inferential methods and allow for causality to be formally studied and assessed from observational studies performed in the real world.
Ethics: Ethical approval for this study was provided from the Human Research Ethics Committee of the University of Western Australia, number RA/4/20/4724, on 24 September 2021.
3. Results
The plan of analysis for this section is straightforward. Data are first presented in univariate and bivariate form, then the analysis moves to multivariate adjustment first by panel regression, and secondly, by spatiotemporal regression. The advantage of panel regression is that inverse probability weighting can be applied; therefore, data can be analysed in a strict pseudo-randomized, and thus, causal inferential framework. The advantage of spatiotemporal analysis is that the non-random effects of space and time can be formally accounted for in the analysis, thereby formally establishing spatiotemporal relationships. Temporal lagging can be studied in both panel and spatiotemporal regression models. Finally, the formal techniques of quantitative causal inferential analysis are explored, which allow for the presentation to move from the consideration of mere associations to formally assessing the potential role of causality in the relationships which have been observed.
Supplementary Table S1 presents the overview of the dataset in the present analysis. As can be seen, 961 datapoints relating to eight CA’s in the digestive system were downloaded from the EUROCAT database. Most of these anomalies had 122 datapoints in each set. This table also provides information on drug use including various cannabis-exposure metrics and median household income.
As shown in
Supplementary Table S2, the dataset for daily cannabis use was largely incomplete when obtained from the EMCDDA and recently published epidemiological data resources [
6,
68]. The 59 datapoints are listed in
Supplementary Table S2. To enable this important data source to be used in analysis, the missing data were completed by linear interpolation, as shown in
Supplementary Table S3, with the addition of a further 70 datapoints.
Figure 1 shows the relationship of the eight anomalies of interest with the various substances: tobacco, alcohol, daily cannabis use interpolated, amphetamine and cocaine. Interestingly, tobacco and alcohol do not appear to be strongly related to any of these CA’s. Amphetamine exposure does appear to be positively related to annular pancreas and anorectal stenosis or atresia. Cocaine appears to be significantly related to most of the anomalies on this list. Daily cannabis use appears to be positively related to annular pancreas, bile duct atresia, digestive system disorders and Hirschsprungs disease with weaker relationships with some other CA’s.
Figure 2 continues this graphical analysis by listing the various anomalies against the different metrics for cannabis exposure. The last month cannabis use appears to be positively related to bile duct atresia and Hirschsprungs disease. Cannabis herb THC concentration appears to be related to small intestinal stenosis or atresia. Cannabis resin THC concentration appears to be strongly related to anorectal stenosis or atresia and small intestinal stenosis or atresia. Daily cannabis use interpolated shows strong positive relationships with annular pancreas, bile duct atresia and Hirschsprungs disease. Many of the compound metrics derived from these primary covariates also show strong positive slopes, as indicated. Some of the regression lines for bile duct atresia and Hirschsprungs disorder appear to be particularly steep.
The correlations from this Figure are shown graphically in
Figure 3 and listed in
Supplementary Table S6. The significance levels of these correlations are shown quantitatively in
Figure 4 and
Supplementary Table S7 and semi-quantitatively in
Supplementary Figure S4. It appears from these results that many of the congenital anomalies have moderate correlations with various cannabis metrics including bile duct atresia, oesophageal stenosis or atresia, digestive system anomalies, anorectal stenosis or atresia, Hirschsprungs disease, annular pancreas and duodenal stenosis or atresia, and small intestinal stenosis or atresia.
Figure 5 shows the rate of congenital gastrointestinal disorders across Europe over the last decade. As one reviews these maps, it is useful to keep in mind the nations which were noted to have high or rising daily use as described in the
Section 2. Rates seem to have increased across the continent and particularly in France, Spain, Germany and Croatia. They have remained fixed and high in the Netherlands and Norway. It is noted that all of these nations are in the high or rising daily cannabis-use group.
Figure 6 shows the rates of esophageal stenosis or atresia across Europe. Fluctuations in several countries are evident.
The rates of small intestinal stenosis or atresia are depicted in
Figure 7. Increased rates in Spain, Croatia, Sweden, Germany and Bulgaria are apparent. The rates in Italy have declined. The rates in the low countries have fluctuated across this period.
The rates of anorectal stenosis or atresia are illustrated in
Figure 8. The rates are noted to have increased in Spain, Sweden, Poland and Bulgaria. The rates in the Netherlands were often high. The rates in Germany declined across this decade.
The rates of the compound cannabis metric, last month cannabis use: resin THC concentrations over time, are shown across Europe in
Figure 9. The rates have increased in most places, with particularly marked rises in Spain, France and the Netherlands.
Figure 10 is a bivariate plot of the coincident rate of digestive system disorders and the compound cannabis-exposure metric of last month cannabis use: cannabis resin THC concentration. The plot is read by observing the areas which have turned pink or purple, which indicate simultaneously high rates of both covariates. Other colours have meanings, as shown in the colorplane key. Thus, the plot explains clearly the convergence of elevated rates of this cannabis-exposure metric and congenital digestive disorders over most of the continent including France, Spain, Bulgaria and the Netherlands. High rates in nations with rising daily cannabis use are clearly shown.
Figure 11 shows the rates of esophageal stenosis or atresia against the compound cannabis metric of last month cannabis use: cannabis resin THC concentration: daily cannabis use interpolated. The areas of land covered by Spain and France are noted to turn purple towards the end of the decade, indicating simultaneously elevated rates of both covariates.
Figure 12 performs a similar exercise for small intestinal stenosis or atresia. Coincident elevations are not seen in this plot.
When a similar exercise was repeated for anorectal stenosis or atresia, the area of France was noted to have turned purple in the later years of this decade (
Figure 13).
When bile duct atresia was considered along with last month cannabis resin THC concentration and daily use interpolated, the appearance shown in
Figure 14 is seen. The clear emergence of confluent trends in Spain and France are evident. Similar trends are seen in Spain and France for Hirschsprungs disease, as shown in
Figure 15.
Based on recently published epidemiological reports, European nations were categorized as having either high and rising daily cannabis use or low and/or falling daily cannabis use in the last decade [
6]. When all the CA’s are aggregated together, the appearances illustrated in
Figure 16 are shown. Countries in which daily use is increasing have higher rates than those of low and/or falling daily cannabis use. This feature is confirmed at linear regression (β-est. = 0.2273, t = 1.959,
p = 3.15 × 10
−3; model Adj.R.Squ. = 0.0080, F = 8376, df = 1, 959,
p = 0.0032). The E-values applicable to these effects are E-value estimate = 1.88 and minimum E-Value (mEVv) = 1.46.
When this exercise was undertaken for each CA separately, the appearances disclosed in
Figure 17 are found. It appears that for several CA’s, the nations with increasing daily use have higher digestive CA rates. This view was also established at mixed-effects regression (using the anomaly as the random effect; β-est. = 5.49 × 10
−5, t = 2.909,
p = 0.0037; model AIC = 1557.610, Log.Lik. = 158.935).
Supplementary Table S8 presents the 96 regression models implied from
Figure 1 and
Figure 2 along with their slopes, significance levels and E-values. The table is ordered by descending minimum E-values. It is of interest that daily cannabis use interpolated occupies the first entries in the table and that the highest anomalies listed are Hirschsprungs disease, bile duct atresia and digestive system disorders.
When the 26 models with positive regression coefficients and significant
p-values are extracted from this list of models,
Table 1 is obtained. The very high E-values reported here clearly imply that a causal relationship is in operation.
The next step in the analysis is to move into a multivariable framework where the importance of these various covariates can be compared. However, in the presence of 13 variables for substance exposure, it is most appropriate for use in the specific regressions.
This issue was directly addressed using random forest regression followed up with variable importance calculations to generate variable importance tables, of which are listed as
Supplementary Tables S9–S13.
Five specific CA’s were chosen for detailed study for the reasons outlined in the
Section 4.
Supplementary Table S14 presents several final inverse-probability-weighted-panel regression models for esophageal atresia (with or without tracheo-esophageal fistula) which are additive, interactive or temporally lagged, respectively. The inverse probability weighting is a technique of considerable importance in that it allows for the analysis to progress beyond a merely observational study into a pseudo-randomized theoretical conception from which causal inferences can be drawn. It is noted that in this table with additive and lagged models that various terms including cannabis metrics survive model reduction, and thus are significant after controlling for all other covariates, have positive regressions coefficients and are statistically highly significant (from
p = 1.83 × 10
−5).
Supplementary Tables S15–S18 continue this panel analysis for bile duct atresia, small intestinal stenosis or atresia, anorectal atresia and Hirschsprungs disease, as indicated. In each of these models, the metrics for cannabis exposure appear in the final models after model reduction and full adjustment, have positive regression coefficients and are highly statistically significant. The sole exception to this is bile duct atresia at two years of temporal lag.
The next issue relates to the performance of these models in a geospatial regression paradigm which appropriately and adequately controls for methodological issues such as random error effects, serial autocorrelation, spatial correlation and spatial autocorrelation in the data.
Supplementary Figure S5 shows the initial, adjusted and finished geospatial international links which were calculated and modified and used to form the spatial weightings which were employed in the geospatial regression modelling. The figure represents in a map-graphical format that the spatial relationships which are digitized in a sparse matrix form to generate the spatial weight matrix used in the geospatial regression equations.
Table 2 shows the final geo-spatiotemporal models for additive, interactive and temporally lagged models for esophageal atresia. The terms for cannabis exposure appear in the model lagged to two years.
For bile duct atresia, no cannabis terms remain in the final models (
Table 3). However, this may be an artefactual issue related to the fact that 49 of the 110 values were at zero, which is likely a coding artefact. As a result, the error structure of these models was constrained to be an ordinary least squares error structure which is a less sensitive analytical framework.
Geospatial models for small intestinal stenosis or atresia (SISA) and anorectal stenosis or atresia are presented in
Table 4 and
Table 5. In all of the models, listed terms incorporating cannabis metrics appear in the final models after full adjustment, have positive coefficients, are overwhelmingly positive and are highly statistically significant.
The Hirschsprungs disease dataset was also grossly incomplete with 30 values of the total 110 values set at zero, which was clearly a coding artefact. Data for Croatia, Bulgaria, Germany, the Netherlands, Portugal and Belgium were completed by mean substitution for each country. For Bulgaria, all data were absent, so the data were set at the overall mean of the total sample (0.932/10,000 births). When these adjustments were made for data imputation, the pattern seen for SISA and anorectal stenosis or atresia was confirmed in this dataset when adjustments for serial correlation were incorporated into the error structure of the models (
Table 6).
E-values are applicable to each of the multivariable models presented. E-values for panel models are presented in
Table 7 and for geospatial models in
Table 8.
Table 9 lists all 35 together in descending order of the minimum E-value. It is noted that Hirschsprungs disease, SISA and anorectal stenosis or atresia head up the list of anomalies in this table, and the first 10 all relate to daily cannabis exposure. However, the results relating to Hirschsprungs disease should be interpreted with some caution in view of the large number of missing data in this dataset.
In
Table 10, these 35 E-values are listed in descending order of mEVv. From this Table, it is apparent that 25/35 (71.4%) of E-value estimates exceed 9 and are, therefore, in the high zone [
89], and all 35 (100%) exceed the threshold for causality of 1.25 [
88]. For the list of mEVv’s, 19/35 (54.3%) exceed nine and are in the high zone, and 34/35 (97.1%) exceed the 1.25 threshold for causality.
E-values may be listed in order of the anomaly concerned and this listing is shown in
Table 11. Summary data for these E-value results are shown by anomaly in
Table 12 where they are also listed in descending order of median mEVv. Hirschsprungs disease and esophageal stenosis and atresia are noted to move up this table and have high median E-value estimates and median mEVv’s.
Table 9 can also be listed in order of the independent variable term. This is shown in
Table 13. These terms may be grouped by the major primary cannabis-related variable of interest, which is, respectively, daily use, or the THC concentration of cannabis resin or herb. This assignment is coded in as the new added “Group” column in this table.
Table 14 summarizes the selected descriptive statistics from this group and again lists the regression terms in descending order of the mEVv. Daily use moves up this table ahead of herb THC concentration and resin THC concentrations.
These data may be formally compared using Wilcoxon tests with results, as indicated in
Table 15. From this table, it is apparent that the statistical comparisons between daily use and both herb and resin THC concentration are significant for both the estimate of the E-value itself and the mEVv, but the comparisons between herb and resin did not achieve statistical significance.
When interpreting these results, it is important to bear in mind the above-cited shortcoming with the many absent data points, particularly in the Hirschsprungs and bile duct atresia datasets.
4. Discussion
4.1. Main Results
The main result of this study was that seven gastrointestinal CA’s (GCA’s) were significantly related on either bivariate or multivariable testing to different measurements of cannabis consumption. The GCA’s identified at bivariate analysis were bile duct atresia, Hirschsprungs disease, digestive system disorders, annular pancreas and anorectal stenosis or atresia. The two additional GCA’s which were shown to be cannabis-related on the multivariable analysis were esophageal stenosis or atresia and small intestinal stenosis or atresia. Hence, the results from this large and versatile dataset confirmed the findings of other recent published series from many places [
3,
4,
5,
57,
91]. Two new disorders were added to the list which has previously been described (see Introduction), namely, annular pancreas and digestive system disorders generally.
Tobacco and alcohol were not strongly related to any of these CA’s. Amphetamine exposure was positively related to annular pancreas and anorectal stenosis or atresia, and cocaine was significantly related to most of the anomalies.
Interestingly, a companion paper to the present paper was recently published and showed that VACTERL syndrome (vertebral, anorectal, cardiac, tracheo-esophageal fistulae/esophageal atresia, renal and limb anomalies) was strongly and causally linked with European cannabinoid exposure [
57,
92]. In that esophageal atresia is part of the VACTERL syndrome this finding provides evidence from this other reference, confirming the present findings reported herein.
4.2. Main Results in Detail
In the bivariate analysis, it was shown that the relationships of tobacco and alcohol to GCA’s were very weak or negative. The relationship of bile duct atresia with last month cannabis use was strong positive. The relationship of digestive system anomalies with cannabis herb was strong positive. The relationship of anorectal atresia with cannabis resin THC concentration was strong positive. The relationship of daily cannabis use with bile duct atresia and Hirschsprungs disease was strong positive.
Bivariate maps were considered. When the relationship between gastrointestinal disorders and last month cannabis use x cannabis resin THC was considered, both variables were noted to increase together in the Netherlands, France, Spain, Italy and Bulgaria (
Figure 10). When the bivariate relationship of esophageal stenosis or atresia to last month cannabis use x cannabis resin THC concentration x daily cannabis use interpolated was considered, both covariates were noted to increase together in Spain and France (
Figure 11). When the bivariate relationship of large intestinal stenosis or atresia to last month cannabis use x cannabis resin THC concentration x daily cannabis use interpolated was considered, both covariates were noted to increase together in France (
Figure 13).
When countries with featured higher daily cannabis exposure were compared with countries which were not, the overall rate of GCA’s was higher in the former group when considered overall (p = 0.0032) and when considered on a by GCA basis (p = 0.0037).
In the bivariate analysis, five GCA’s were related to metrics of cannabis exposure ordered by median mEVv bile duct atresia (12.5), Hirschsprungs (9.27), digestive disorders (5.48), annular pancreas (1.67) and anorectal stenosis or atresia (1.18) (
Table 1).
When the sequence of GCA’s was considered by inverse-probability-weighted-panel modelling, as follows, they were noted to be significantly linked with cannabis metrics, as indicated: esophageal stenosis or atresia, bile duct atresia, small intestinal stenosis or atresia, anorectal stenosis or atresia, Hirschsprungs disease: 1.83 × 10−5, 0.0046, 3.55 × 10−12, 7.35 × 10−6 and 2.00 × 10−12, respectively. When the same series of anomalies was considered in geospatial modelling, they were significantly cannabis-related from 0.0003, N.S., 0.0086, 6.652 × 10−5, 0.0002.
A total of 25/35 (71.4%) E-value estimates and 19/35 54.3% minimum E-values (mEVv’s) exceeded 9.0, and therefore, were in the higher zones [
89]. A total of 100% E-value estimates and 97.1% mEVv’s were greater than 1.25, which is said to be the cut-off limit for causality [
88]. The order of cannabis sensitivity by median mEVv was Hirschsprungs > esophageal atresia > small intestinal atresia > anorectal atresia > bile duct atresia.
It was fascinating to us that the top position in
Table 1, which lists the significant bivariate regression coefficients, was taken by Hirschsprungs disease, confirming a recent report from the USA, especially in view of the finding detailed below that cannabis dependence is characterized by differential DNA methylation affecting three genes (GDNF, KIF26A and PSAP) which control enteric ganglion formation (supplementary material Page 308,
p = 0.000274).
As noted, there were some shortcomings in the GCA rates. In the Hirschsprungs dataset, 30/122 (24.6%) zeros were replaced by mean substitution to allow for the analysis to proceed. In the biliary atresia dataset, 49/122 (40.2%) of the GCA rates were zero and were not able to be meaningfully substituted. These limitations of the data must be kept in mind when interpreting the present results.
4.3. Choice of Anomalies
The anomalies chosen for further study were selected because they had been identified in previous published series and/or they demonstrated strong signals in the bivariate analysis (
Figure 2).
4.4. Qualitative Causal Inference
The Hill criteria [
93] are accepted criteria for determining when an association may be considered to be causal in nature. Nine criteria are listed, including strength of association, consistency amongst studies, specificity, temporality, coherence with known data, biological plausibility, biological dose–response curve, analogy with similar situations and experimental confirmation. All of these criteria are fulfilled by the five GCA’s positively identified in this study which have previously been described in this literature. We would be inclined to accept the two which have not been previously described based on the external validity of the other results with the published literature overall and described in this report.
4.5. Quantitative Causal Inference
This study has employed two key tools of formal quantitative causal analysis. The first one, inverse probability weighting, answers the common criticism of observational studies—that the various experimental groups are not comparable across the population. Inverse probability weighting is the technique of choice to address this issue and it was applied to all the panel regression multivariable models presented. Its effect is to transform the analysis from a mere observational study into a pseudo-randomized study from which causal inferences may properly be drawn.
A second major criticism of observational studies is that they may be subject to uncontrolled confounding and that the apparently casual effect they report may in fact be due to some extraneous external covariate which has not been controlled. E-values (or expected values) are used to set quantitative constraints on this hypothetical external confounder by determining the degree of association required of it with both the exposure of concern and the outcome of interest. E-values over 9.0 are described as being in the high zone [
89] and values greater than 1.25 are generally said to be required to attribute to causation [
88]. The tobacco–lung cancer relationship has an E-value of nine, so this datum provides some appreciation of the real world meaning of these numbers. E-values also have confidence intervals. The 95% lower confidence interval is of particular interest in this context as it demonstrates quantitatively the confidence which can be had that the E-value estimate is different from unity. Therefore, the generally elevated E-value estimates reported in the present study are 71.4% in the very elevated zone, thus providing confidence regarding the robustness of these results.
4.6. Biological Mechanisms
Because pathophysiological mechanisms are central to the argument for causality, it is worthwhile considering a few of the mechanisms which have been described in some detail. Thus, these findings from the basic laboratory sciences are germane and become central to the present epidemiological discussion.
4.7. Cannabinoid Inhibition of Morphogens
It is important to appreciate that embryonic patterning and morphogenesis happens largely under the control, guidance and specification of gradients of various tissue morphogens. Many of the key morphogens are disrupted by cannabinoids including bone morphogenetic proteins [
94,
95,
96], fibroblast growth factor [
97,
98], sonic hedgehog [
99] and retinoic acid [
100,
101,
102]. This may occur either directly or via epigenomic means, especially in the case of sonic hedgehog.
4.8. Epigenomic Control of Gastrointestinal Morphogenesis
A fascinating serial epigenomic study from Schrott and colleagues was recently published, which is packed with important information pertinent to the present discussion and is, therefore, worth exploring in some detail [
25]. The researchers compared epigenome-wide DNA methylation patterns in human and rat sperm both in cannabis dependence and eleven weeks later in cannabis withdrawal. The period of one human sperm cycle is 11 weeks. Comparisons were made both between groups and over time within subjects.
In the body of their report, investigators noted from functional annotations from Ingenuity Pathway Analysis that pathways relating to growth of an organism, liver lesions and agenesis were differentially methylated regions (DMR’s). During cannabis withdrawal, the featured annotations included organismal death.
The researchers also published a 359-page supplementary appendix providing further details of these findings. Remarkably, there were 112 functional annotations pertaining to gastrointestinal disease. Clearly, there is too many to describe in detail, but by the way of example, one such finding noted 333 genes involved in cannabis dependence in gastrointestinal tumours (page 253, p = 1.02 × 10−15).
There were nine annotations relating to esophageal lesions (e.g., 31 genes, page 324; 16 genes, page 348).
We were not able to identify any annotations for the small intestine.
Many annotations were identified for the large intestines, including malignant large intestinal neoplasm (306 genes p = 7.65 × 10−15 in cannabis dependence, page 257; 258 genes, page 258, p = 1.11 × 10−14; 314 genes, p = 6.00 × 10−14, page 260, cannabis dependence; 313 genes, p = 7.15 × 10−14 page 261; 120 genes, p = 7.45 × 10−6, page 326, cannabis withdrawal; 122 genes 6.79 × 10−5, page 331 cannabis withdrawal; and 121 genes, page 333, p = 0.000108 in cannabis withdrawal).
Gastrointestinal tumour was also noted (74 genes, p = 3.42 × 10−5, pages 328; 121 genes, p = 3.56 × 10−5, cannabis withdrawal).
In view of the strong signal detected in this study for Hirschsprungs disease, which was concordant with other studies [
5], it was fascinating that three genes were identified controlling the enteric ganglion formation (GDNF, KIF26A and PSAP, page 308,
p = 0.000274).
There were two hits for hepatobiliary carcinoma (176 genes, page 284, cannabis dependence; and 179 genes, page 286). Pancreatobiliary tumour was also identified (24 genes, page 288, cannabis dependence).
Pathways in pancreatic disease were also identified in pancreatic cancer (95 genes, p = 1.19 × 10−5, page 297 in cannabis dependence; 106 genes, page 298; and 90 genes, page 299; 85 genes page 300).
From this brief survey, it is clear that epigenomic signals can effectively explain many of the epidemiological findings in the present report.
4.9. Generalizability
Since this dataset utilizes one of the largest and most comprehensive CA datasets in the world, the data are inherently robust. We have also used advanced modelling techniques such as inverse probability weighting, random Forrest regression and geotemporospatial regression to further explore using multivariable adjustment relationships which appeared to be robust at bivariate regression. The study as a whole uses the formal quantitative techniques of causal inferential modelling throughout so that the analytical paradigm within which we are working transitions from the merely observational to the pseudorandomized environment from which it is quite appropriate to draw causal conclusions. Since our study demonstrates causal effects and is consistent with other reports in the world literature, these results are likely to be broadly generalizable with the sole caveat that the results reported for Hirschsprungs disease and biliary atresia likely represent lower bounds on the real effects due to the limitations of these particular datasets.
4.10. Strengths and Limitations
This study had a number of strengths and limitations. Its strengths included using one of the largest and most comprehensive CA datasets in the world and the use of advanced causal inferential modelling techniques, particularly inverse probability weighting and E-values, along with geospatiotemporal regression. We have also used maps to illustrate many key trends, including bivariate maps which allow for the clear and explicit display of bivariate simultaneous trends. We used random forest regression for the covariate selection, which adds discipline and rigour to this complex process. The study’s limitations include the unavailability of individual exposure data to the study’s investigators, of which is a limitation shared by many epidemiological studies. Some exposure data were missing and had to be completed by linear interpolation. This applied especially to the daily exposure data. The quality of some of the GCA data was also questionable. As noted, 24.6% of the Hirschsprungs disease rates were reported as zeros and were replaced by mean substitution. A total of 40.2% of the bile duct atresia data were also zeros, and hence, the analysis was greatly weakened. These technical limitations must be kept in mind when considering the present results.