[shared0]    Shared component model for mapping
                  multiple diseases: Oral cavity cancer and
                  lung cancer cancer in West Yorkshire, UK


Knorr-Held and Best (2001) analysed data on mortality from oral cavity and oesophageal cancer in Germany using a shared component model. This model is similar in spirit to conventional factor analysis, and partitions the geographical variation in two diseases into a common (shared) component ( f ), and two disease-specific (residual) components ( y 1 and y 2 ). Making the rare disease assumption, the likelihood for each disease is assumed to be independent Poisson, conditional on an unknown mean m ik

          Y ik ~ Poisson( m ik )
          log m i1 = log E i1 + a 1 + f i *d + y i1
          log m i2 = log E i2 + a 2 + f i / d + y i2

where Y
ik and E ik are the observed and age/sex standardised expected counts for cancer k in area i respectively, a k is an intercept term representing the baseline (log) relative risk of cancer k across the study region, and d is a scaling factor to allow the risk gradient associated with the shared component to be different for each disease (this is in some sense similar to the factor loadings in conventional factor analysis - see Knorr-Held and Best (2001) for more details). Each of the three components ( f , y 1 and y 2 ) is assumed to be spatially structured with zero mean; the components are assumed to be independent of each other. Knorr-Held and Best (2001) used a spatial partition model as a prior for each component. In this example, we fit a similar model, but assume an BYM convolution prior for each component. Here we re-consider the data on incidence of oral cavity cancer and lung cancer in 126 electoral wards in the West Yorkshire region of England:


          model {

          # Likelihood
              for (i in 1 : Nareas) {
                for (k in 1 : Ndiseases) {
                   Y[i, k] ~ dpois(mu[i, k])
                   log(mu[i, k]) <- log(E[i, k]) + alpha[k] + eta[i, k]
                }
             }

             for(i in 1:Nareas) {
          # Define log relative risk in terms of disease-specific (psi) and shared (phi)
         # random effects
         # changed order of k and i index for psi (needed because car.normal assumes
         # right hand index is areas)
                eta[i, 1] <- phi[i] * delta + psi[1, i]
                eta[i, 2] <- phi[i] / delta + psi[2, i]
             }

          # Spatial priors (BYM) for the disease-specific random effects
              for (k in 1 : Ndiseases) {
                for (i in 1 : Nareas) {
         # convolution prior = sum of unstructured and spatial effects
                   psi[k, i] <- U.sp[k, i] + S.sp[k, i]
         # unstructured disease-specific random effects
                   U.sp[k, i] ~ dnorm(0, tau.unstr[k])
                }
         # spatial disease-specific effects
                S.sp[k,1 : Nareas] ~ car.normal(adj[], weights[], num[], tau.spatial[k])
             }
          # Spatial priors (BYM) for the shared random effects
              for (i in 1:Nareas) {
         # convolution prior = sum of unstructured and spatial effects
                phi[i] <- U.sh[i] + S.sh[i]
         # unstructured shared random effects
                U.sh[i] ~ dnorm(0, omega.unstr)         
             }
         # spatial shared random effects    
             S.sh[1:Nareas] ~ car.normal(adj[], weights[], num[], omega.spatial)

             for (k in 1:sumNumNeigh) {
               weights[k] <- 1
            }
                
          # Other priors
              for (k in 1:Ndiseases) {
                    alpha[k] ~ dflat()
                   tau.unstr[k] ~ dgamma(0.5, 0.0005)
                   tau.spatial[k] ~ dgamma(0.5, 0.0005)
                }
                omega.unstr ~ dgamma(0.5, 0.0005)
                omega.spatial ~ dgamma(0.5, 0.0005)
          # scaling factor for relative strength of shared component for each disease
                logdelta ~ dnorm(0, 5.9)
         # (prior assumes 95% probability that delta^2 is between 1/5 and 5;
                delta <- exp(logdelta)
          # lognormal assumption is invariant to which disease is labelled 1
          # and which is labelled 2)
         # ratio (relative risk of disease 1 associated with shared component) to
         # (relative risk of disease 2 associated with shared component)
         # - see Knorr-Held and Best (2001) for further details
             RR.ratio <- pow(delta, 2)

          # Relative risks and other summary quantities
          # The GeoBUGS map tool can only map vectors, so need to create separate vector
          # of quantities to be mapped, rather than an array (i.e. totalRR[i,k] won't work!)
              for (i in 1 : Nareas) {
                SMR1[i] <- Y[i,1] / E[i,1]         # SMR for disease 1 (oral)
                SMR2[i] <- Y[i,2] / E[i,2]         # SMR for disease 2 (lung)

                   totalRR1[i] <- exp(eta[i,1])         # overall RR of disease 1 (oral) in area i
                   totalRR2[i] <- exp(eta[i,2])         # overall RR of disease 2 (lung) in area i
          # residulal RR specific to disease 1 (oral cancer)
                   specificRR1[i]<- exp(psi[1,i])
         # residulal RR specific to disease 2 (lung cancer)
                   specificRR2[i]<- exp(psi[2,i])         
         # shared component of risk common to both diseases
                   sharedRR[i] <- exp(phi[i])
         # Note that this needs to be scaled by delta or 1/delta if the
         # absolute magnitude of shared RR for each disease is of interest
                   logsharedRR1[i] <- phi[i] * delta
                   logsharedRR2[i] <- phi[i] /delta
                }
         # empirical variance of shared effects (scaled for disease 1)
             var.shared[1] <- sd(logsharedRR1[])*sd(logsharedRR1[])
         # empirical variance of shared effects (scaled for disease 2)
                var.shared[2] <- sd(logsharedRR2[])*sd(logsharedRR2[])
         # empirical variance of disease 1 specific effects
                var.specific[1] <- sd(psi[1,])*sd(psi[1,])
         # empirical variance of disease 2 specific effects
                var.specific[2] <- sd(psi[2,])*sd(psi[2,])            

          # fraction of total variation in relative risks for each disease that is explained
         # by the shared component
             frac.shared[1] <- var.shared[1] / (var.shared[1] + var.specific[1])
             frac.shared[2] <- var.shared[2] / (var.shared[2] + var.specific[2])
          }


Data     (Click to open)

Inits for chain 1     Inits for chain 2     (Click to open)

Results

A 2000 iteration burn-in followed by a further 30000 updates gave the following results


[shared1]

These indicate that for oral cancer, about 75% of the total between-area variation in risk is captured by the shared component, while for lung cancer about 64% of the total between-area variation in risk is captured by the shared component, although the 95% CI for frac.shared are very wide. RR.ratio is slightly less than 1 (although again with a wide credible interval), indicating that the shared component has a slightly weaker association with risk of oral cancer (disease 1) than with risk of lung cancer (disease 2).

Maps showing the spatial pattern of the shared component and the disease-specific residual components are shown below. The map file for this study region is called
WestYorkshire , and is available in the maps directory of GeoBUGS 1.2.



[shared2]


[shared3]

[shared4]