This vignette is written for *baggr* users who want to analyse
binary data. For more general introduction to meta-analysis concepts and
*baggr* workflow, please read `vignette("baggr")`

.

Here, we will show how to:

- Run meta-analysis of odds ratio (OR), risk ratio (RR) or risk differences (RD).
- Prepare standard outputs for the above.
- Convert individual-level data to summary-level and apply corrections for rare events.
- Run meta-analysis directly on individual-level data.
- Model impact of covariates.

There are certain aspects of modelling binary data that are not yet
covered by *baggr*, but may be important for your data:

- Directly modelling parameters relating to proportions (see “Model 3” in this comprehensive tutorial with Stan)
- Understanding biases in reporting
- Modelling data on rates, ordered categorical data and more with generalised linear models. A good overview and examples are provided by NICE DSU at University of Sheffield.

Typically, a meta-analysis of binary data is done on summary statistics of ratios such as \(\log(OR)\) or \(\log(RR)\) or of differences in risks, \(RD\). The reason for this is two-fold: 1) they are the statistics most commonly reported by studies and 2) they are approximately normally distributed. The second assumption needs a bit more attention, as we will show later.

For the running example in this vignette we will use a dataset based on (part of) Table 6 in Yusuf et al. (1985), a famous meta-analysis of the impact of beta blockers on occurrence of strokes and mortality.

```
df_yusuf <- read.table(text="
trial a n1i c n2i
Balcon 14 56 15 58
Clausen 18 66 19 64
Multicentre 15 100 12 95
Barber 10 52 12 47
Norris 21 226 24 228
Kahler 3 38 6 31
Ledwich 2 20 3 20
", header=TRUE)
```

In a typical notation, `a`

(`c`

) are numbers of
events in treatment (control) groups, while `n1`

(`n2`

) are total patients in treatment (control) group. We
can also calculate \(b = n_1 - a\) and
\(d = n_2 - c\), i.e. numbers of
“non-events” in treatment and control groups respectively.

In our examples below we will focus on the OR metric; \(\log(OR)\) and its SE can easily be calculated from the counts above (if you’re not familiar with OR, see here for an introduction):

```
# This is a calculation we could do by hand:
# df <- df_yusuf
# df$b <- df$n1i-df$a
# df$d <- df$n2i-df$c
# df$tau <- log((df$a*df$d)/(df$b*df$c))
# df$se <- sqrt(1/df$a + 1/df$b + 1/df$c + 1/df$d)
# But prepare_ma() automates these operations:
df_ma <- prepare_ma(df_yusuf, group = "trial", effect = "logOR")
df_ma
#> group a n1 c n2 b d tau se
#> 1 Balcon 14 56 15 58 42 43 -0.04546237 0.4303029
#> 2 Clausen 18 66 19 64 48 45 -0.11860574 0.3888993
#> 3 Multicentre 15 100 12 95 85 83 0.19933290 0.4169087
#> 4 Barber 10 52 12 47 42 35 -0.36464311 0.4855042
#> 5 Norris 21 226 24 228 205 204 -0.13842138 0.3147471
#> 6 Kahler 3 38 6 31 35 25 -1.02961942 0.7540368
#> 7 Ledwich 2 20 3 20 18 17 -0.46262352 0.9735052
```

We use `tau`

and `se`

notation for our effect,
same as we would for analysing continuous data with
`baggr()`

. In fact, the model we use for \(\log(OR)\) is the same default “Rubin”
model (Rubin (1974)) with partial pooling.
Once \(\log(OR)\) and is SE have been
calculated, there are no differences between this process and analysis
continuous quantities in *baggr*.

The argument we specified, `effect`

, does not impact
results, but it makes for nicer output labels, as we will see in the
next section.

As a side note, you don’t even have to use the
`prepare_ma`

step. The following one-liner would have worked
just the same:

As with continuous quantities, the prior is chosen automatically.
However, it is important to review the automatic prior choice when
analysing quantities on log scale, because the priors may be needlessly
diffuse (in other words, when using the command above *baggr*
does not “know” that data are logged, although it does try to adjust the
scale of priors).

We summarised our data as \(\log(OR)\). Alternatively, we could work with \(\log(RR)\) or \(RD\). Each model makes different assumptions on how the treatment of interest works. How do we choose the appropriate model for our data?

The basic tool that can help us choose between OR, RD and RR is a
`labbe`

plot (introduced by L’Abbé,
Detsky, and O’Rourke (1987)), which is built into
*baggr*:

```
labbe(df_ma, plot_model = TRUE, shade_se = "or")
#> Warning: There were 5 divergent transitions after warmup. See
#> https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
#> to find out why this is a problem and how to eliminate them.
#> Warning: Examine the pairs() plot to diagnose sampling problems
#> Warning: There were 6 divergent transitions after warmup. See
#> https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
#> to find out why this is a problem and how to eliminate them.
#> Warning: Examine the pairs() plot to diagnose sampling problems
#> Warning: Removed 1 row containing missing values (`geom_line()`).
```

When we make the plot, two Bayesian models are run, one for OR and one for RR. The mean treatment effect from these 2 lines is then used to plot the dotted/dashed lines corresponding to OR and RR estimates. In our case we can see that there is no meaningful difference between the two models, given the size of the effect and estimates.

A very good discussion of these choices and intuitive plots are provided by Deeks (2002) (see especially Figures 1-3).

Authors of Cochrane Handbook for Systematic Reviews of Interventions summarise two important differences between using RD versus RR or OR:

However, the clinical importance of a risk difference may depend on the underlying risk of events in the population. For example, a risk difference of 0.02 (or 2%) may represent a small, clinically insignificant change from a risk of 58% to 60% or a proportionally much larger and potentially important change from 1% to 3%.

The risk difference is naturally constrained which may create difficulties when applying results to other patient groups and settings. For example, if a study or meta-analysis estimates a risk difference of –0.1 (or –10%), then for a group with an initial risk of, say, 7% the outcome will have an impossible estimated negative probability of –3%. Similar scenarios for increases in risk occur at the other end of the scale.

In terms of the L’Abbe plot, RD implies an effect line which runs parallel to the 45 degree line.

As with ORs, risk ratios are also normally distributed and can be
easily calculated from contingency tables (i.e. `a`

,
`b`

, `c`

, `d`

columns). While we will
focus on the OR example below, all of the analytic workflow would be
identical in the RR case—with exception of interpretation of treatment
effect, of course. Before proceeding let us quickly show how risk and
odd ratios compare. If an event is rare (rule of thumb: up to 10%), OR
and RR will be similar. For really rare events there is no difference.
The higher the event rate, the more discrepancy, e.g.

```
a <- 9; b <- 1; c <- 99; d <- 1
cat("Risk ratio is", (a/(a+b))/(c/(c+d)), "\n" )
#> Risk ratio is 0.9090909
cat("Odds ratio is", a*d/(b*c), "\n")
#> Odds ratio is 0.09090909
```

```
a <- 10; b <- 20; c <- 100; d <- 100
cat("Risk ratio is", (a/(a+b))/(c/(c+d)), "\n" )
#> Risk ratio is 0.6666667
cat("Odds ratio is", a*d/(b*c), "\n")
#> Odds ratio is 0.5
```

```
par(mfrow = c(2,3), oma = rep(2,4))
for(es in c(1, .9, .8, .5, .25, .1)){
p_bsl <- seq(0,1,length=100)
p_trt_rr <- es*p_bsl
odds_trt <- es*(p_bsl/(1-p_bsl))
p_trt_or <- odds_trt / (1 + odds_trt)
plot(p_trt_or ~ p_bsl, type = "l",
xlab = "control event rate", ylab = "treatment event rate", main = paste0("RR=OR=",es))
lines(p_trt_rr ~ p_bsl, lty = "dashed")
}
title(outer = TRUE, "Compare RR (dashed) and OR (solid) of the same magnitude")
```

We will return to this topic at the end of the vignette when we discuss leave-one-out cross-validation approaches.

We now continue discussion of analytic workflow using the OR models.
All of the outputs we will show are explained in more detail in the
“main” package vignette, `vignette("baggr")`

. Here, we simply
illustrate their behaviour when applied to binary data.

```
bg_model_agg
#> Model type: Rubin model with aggregate data
#> Pooling of effects: partial
#>
#> Aggregate treatment effect (on logarithm of odds ratio), 7 groups:
#> Hypermean (tau) = -0.18 with 95% interval -0.68 to 0.22
#> Hyper-SD (sigma_tau) = 0.2563 with 95% interval 0.0089 to 0.9051
#> Total pooling (1 - I^2) = 0.78 with 95% interval 0.21 to 1.00
#>
#> Treatment effects on logarithm of odds ratio:
#> mean sd 2.5% 50% 97.5% pooling
#> Balcon -0.133 0.25 -0.63 -0.138 0.37 0.76
#> Clausen -0.153 0.25 -0.64 -0.150 0.35 0.73
#> Multicentre -0.064 0.27 -0.55 -0.081 0.52 0.75
#> Barber -0.211 0.28 -0.82 -0.196 0.30 0.79
#> Norris -0.152 0.22 -0.58 -0.153 0.28 0.67
#> Kahler -0.267 0.34 -1.10 -0.223 0.28 0.88
#> Ledwich -0.193 0.35 -1.00 -0.171 0.44 0.91
```

Default plot (`plot(bg_model_agg)`

) will show pooled group
results. If you prefer a typical forest plot style, you can use
`forest_plot`

, which can also plot comparison with source
data (via `print`

argument):

Hypothetical treatment effect in a new trial is obtained through
`effect_plot`

.

We can transform printed and plotted outputs to show exponents (or other transforms); for example:

```
gridExtra::grid.arrange(
plot(bg_model_agg, transform = exp) + xlab("Effect on OR"),
effect_plot(bg_model_agg, transform = exp) + xlim(0, 3) + xlab("Effect on OR"),
ncol = 2)
```

We can compare with no pooling and full pooling models using
`baggr_compare`

```
# Instead of writing...
# bg1 <- baggr(df_ma, pooling = "none")
# bg2 <- baggr(df_ma, pooling = "partial")
# bg3 <- baggr(df_ma, pooling = "full")
# ...we use this one-liner
bg_c <- baggr_compare(df_ma, effect = "logarithm of odds ratio")
```

We can also examine the compared models directly, by accessing them
via `$models`

. This is useful e.g. for
`effect_plot`

:

```
effect_plot(
"Partial pooling, default prior" = bg_c$models$partial,
"Full pooling, default prior" = bg_c$models$full) +
theme(legend.position = "bottom")
```

You can use leave-one-out cross-validation to compare the models
quantitatively. See documentation of `?loocv`

for more
details of calculation.

Let’s assume that you have access to underlying individual-level
data. In our example above, we do not, but we can create a data.frame
with individual level data since we know the contingency table. In
practice this is not necessary as *baggr* will do the appropriate
conversions behind the scenes, but for our example we will convert our
data by hand. Fortunately, a built-in function,
`binary_to_individual`

can do the conversion for us:

```
df_ind <- binary_to_individual(df_yusuf, group = "trial")
head(df_ind)
#> group treatment outcome
#> 1 Balcon 1 1
#> 2 Balcon 1 1
#> 3 Balcon 1 1
#> 4 Balcon 1 1
#> 5 Balcon 1 1
#> 6 Balcon 1 1
```

We can now use a logistic regression model

Note: as in other cases,

`baggr()`

will detect model appropriately (in this case by noting that`outcome`

has only 0’s and 1’s) and notify the user, so in everyday use, it is not necessary to set`model = "logit"`

. Alternatively, if we wrote`baggr(df_yusuf, model = "logit")`

, the conversion to individual-level data would be done automatically behind the scenes.

Note: The results of this vignette hosted on CRAN have very short run time (

`iter = 2000, chains = 4`

) due to server constraints. We recommend replicating this example yourself.

If we denote binary outcome as \(y\), treatment as \(z\) (both indexed over individual units by \(i\)), under partial pooling the logistic regression model assumes that

\[ y_i \sim \text{Bernoulli}(\text{logit}^{-1}[\mu_{\text{group}(i)} + \tau_{\text{group}(i)}z_i]) \]

where \(\mu_k\)’s and \(\tau_k\)’s are parameters to estimate. The former is a group-specific mean probability of event in untreated units. We are primarily interested in the latter, the group-specific treatment effects.

Under this formulation, odds ratio of event between treatment and
non-treatment are given by \(\tau_k\).
This means, the modelling assumption is the same as in the model of
summary data we saw above. Moreover, *baggr*’s default prior for
\(\tau_k\) is set in the same way for
summary and individual-level data (you can see it as
`bg_model_ind$formatted_prior`

). One difference is that
parameter \(\mu\) is estimated.
However, unless dealing with extreme values of \(\mu\), i.e. with rare or common events
(which we discuss below), this should not impact the modelled
result.

Therefore, the result we get from the two models will be close (albeit with some numerical variation):

```
baggr_compare(bg_model_agg, bg_model_ind)
#>
#> Mean treatment effects:
#> 2.5% mean 97.5% median sd
#> Model 1 -0.676378 -0.178309 0.216140 -0.168606 0.228784
#> Model 2 -0.591938 -0.168874 0.255116 -0.166212 0.215986
#>
#> SD for treatment effects:
#> 2.5% mean 97.5% median sd
#> Model 1 0.00885086 0.256308 0.905122 0.187025 0.252931
#> Model 2 0.00931181 0.254885 0.899649 0.185259 0.242173
#>
#> Posterior predictive effects:
#> 2.5% mean 97.5% median sd
#> Model 1 -1.06702 -0.17638 0.594144 -0.158556 0.420455
#> Model 2 -1.07007 -0.17034 0.645039 -0.156770 0.408368
```

There will be difference in speed – in our example logistic model has to work with 1101 rows of data, while the summary level model only had 7 datapoints. Therefore in “standard” cases you are better off summarising your data and using the faster aggregate data model.

To sum up:

- If you have no covariates and no particular priors on proportion of events in control groups, you are better off using a summary-level model. We will show how you can summarise your data in the next section.
- If your data includes covariates or events are rare, consider the logistic model. We will show why below.

Note: as shown above, you can only create input data for the logistic model from

`a`

,`b`

/`n1`

,`c`

,`d`

/`n2`

columns. You cannot do it from OR data only, as the probability in the untreated is then unknown.

In the previous example we analysed individual-level data and suggested that typically it is sufficient to work with summarised data, e.g. to speed up modelling. Summarising is also essential as it allows us to better understand the inputs and use diagnostics such as L’Abbe plot.

The generic function for doing this in *baggr* is
`prepare_ma`

. It can be used with both continuous and binary
data. In our case we can summarise `df_ind`

to obtain either
odds ratios or risk ratios:

```
prepare_ma(df_ind, effect = "logOR")
#> group a n1 c n2 b d tau se
#> 1 Balcon 14 56 15 58 42 43 -0.04546237 0.4303029
#> 2 Clausen 18 66 19 64 48 45 -0.11860574 0.3888993
#> 3 Multicentre 15 100 12 95 85 83 0.19933290 0.4169087
#> 4 Barber 10 52 12 47 42 35 -0.36464311 0.4855042
#> 5 Norris 21 226 24 228 205 204 -0.13842138 0.3147471
#> 6 Kahler 3 38 6 31 35 25 -1.02961942 0.7540368
#> 7 Ledwich 2 20 3 20 18 17 -0.46262352 0.9735052
prepare_ma(df_ind, effect = "logRR")
#> group a n1 c n2 b d tau se
#> 1 Balcon 14 56 15 58 42 43 -0.03390155 0.3209310
#> 2 Clausen 18 66 19 64 48 45 -0.08483888 0.2782276
#> 3 Multicentre 15 100 12 95 85 83 0.17185026 0.3598245
#> 4 Barber 10 52 12 47 42 35 -0.28341767 0.3779232
#> 5 Norris 21 226 24 228 205 204 -0.12472076 0.2836811
#> 6 Kahler 3 38 6 31 35 25 -0.89674614 0.6643991
#> 7 Ledwich 2 20 3 20 18 17 -0.40546511 0.8563488
```

In both cases the effect (OR or RR) and its SE were renamed to
`tau`

and `se`

, so that the resulting data frame
can be used directly as input to `baggr()`

.

For already summarised data, you can use the same function to move
between different effect measures. For example you can take OR measures
`a <- prepare_ma(df_ind, effect = "logOR")`

and convert
them to RR by using `prepare_ma(a, effect = "logRR")`

.

Consider data on four fictional studies:

```
df_rare <- data.frame(group = paste("Study", LETTERS[1:5]),
a = c(0, 2, 1, 3, 1), c = c(2, 2, 3, 3, 5),
n1i = c(120, 300, 110, 250, 95),
n2i = c(120, 300, 110, 250, 95))
df_rare
#> group a c n1i n2i
#> 1 Study A 0 2 120 120
#> 2 Study B 2 2 300 300
#> 3 Study C 1 3 110 110
#> 4 Study D 3 3 250 250
#> 5 Study E 1 5 95 95
```

In Study A you can see that no events occurred in `a`

column.

We have shown above how `prepare_ma()`

can be used to
summarise the rates (if we work with individual-level data) and
calculate/convert between log OR’s and log RR’s. It can also be used to
apply corrections to event rates, which it does automatically:

```
df_rare_logor <- prepare_ma(df_rare, effect = "logOR")
#> Applied default rare event correction (0.25) in 1 study
# df_rare_logor <- prepare_ma(df_rare_ind, effect = "logOR")
df_rare_logor
#> group a n1 c n2 b d tau se
#> 1 Study A 0.25 120.25 2.25 120.25 120.25 118.25 -2.213996 2.1121593
#> 2 Study B 2.00 300.00 2.00 300.00 298.00 298.00 0.000000 1.0033501
#> 3 Study C 1.00 110.00 3.00 110.00 109.00 107.00 -1.117131 1.1626923
#> 4 Study D 3.00 250.00 3.00 250.00 247.00 247.00 0.000000 0.8214401
#> 5 Study E 1.00 95.00 5.00 95.00 94.00 90.00 -1.652923 1.1053277
```

Note how the output of `prepare_ma`

now differs from the
original `df_rare`

for “Study A”: a (default) value of 0.25
was added, because there were no events in treatment arm. That means
\(\log(OR)=\log(0)=-\infty\). It is
typical to correct
for rare events when analysing summary level data. A great overview
of the subject and how different meta-analysis methods perform is
provided by Bradburn et al. (2007). You
can modify the amount of correction by setting the
`rare_event_correction`

argument.

```
pma01 <- prepare_ma(df_rare, effect = "logOR",
rare_event_correction = 0.1)
pma1 <- prepare_ma(df_rare, effect = "logOR",
rare_event_correction = 1)
pma01
#> group a n1 c n2 b d tau se
#> 1 Study A 0.1 120.1 2.1 120.1 120.1 118.1 -3.061315 3.2392876
#> 2 Study B 2.0 300.0 2.0 300.0 298.0 298.0 0.000000 1.0033501
#> 3 Study C 1.0 110.0 3.0 110.0 109.0 107.0 -1.117131 1.1626923
#> 4 Study D 3.0 250.0 3.0 250.0 247.0 247.0 0.000000 0.8214401
#> 5 Study E 1.0 95.0 5.0 95.0 94.0 90.0 -1.652923 1.1053277
```

It is important to compare different models with different
`rare_event_correction`

values:

```
bg_correction01 <- baggr(pma01, effect = "logOR")
bg_correction025 <- baggr(df_rare_logor, effect = "logOR")
bg_correction1 <- baggr(pma1, effect = "logOR")
bg_rare_ind <- baggr(df_rare, model = "logit", effect = "logOR")
```

```
bgc1 <- baggr_compare(
"Correct by .10" = bg_correction01,
"Correct by .25" = bg_correction025,
"Correct by 1.0" = bg_correction1,
"Individual data" = bg_rare_ind
)
bgc1
#>
#> Mean treatment effects:
#> 2.5% mean 97.5% median sd
#> Correct by .10 -2.23302 -0.676312 0.743550 -0.651132 0.748400
#> Correct by .25 -2.19922 -0.688957 0.862837 -0.702436 0.754231
#> Correct by 1.0 -1.99862 -0.674965 0.627155 -0.667651 0.631653
#> Individual data -2.84815 -0.969296 0.822638 -0.926458 0.864047
#>
#> SD for treatment effects:
#> 2.5% mean 97.5% median sd
#> Correct by .10 0.0329695 1.201240 4.71467 0.765106 1.662980
#> Correct by .25 0.0310607 1.203540 6.15245 0.799620 1.444410
#> Correct by 1.0 0.0238649 0.806168 2.99630 0.598175 0.761054
#> Individual data 0.0466459 1.311940 4.79798 0.963777 1.238470
#>
#> Posterior predictive effects:
#> 2.5% mean 97.5% median sd
#> Correct by .10 -4.56300 -0.632432 3.04614 -0.649115 2.24564
#> Correct by .25 -4.38997 -0.697637 3.28813 -0.707071 2.00664
#> Correct by 1.0 -3.50090 -0.684390 1.88305 -0.646239 1.29757
#> Individual data -5.46903 -0.995414 2.97378 -0.903370 1.97591
plot(bgc1) + theme(legend.position = "right")
```

> Note: The
results of this vignette hosted on CRAN have very short run time
(`iter = 2000, chains = 4`

) due to server constraints. We
recommend replicating this example yourself.

Note in the result above that:

- The heterogeneity estimate (“SD for treatment effects”) is impacted by the corrections
- Mean treatment effect estimates are less prone to the correction, but there is a sizeable difference between the logistic model and the aggregate data models
- As corrections tend to 0, the estimates of \(\log(OR)\) for Study A are more diffuse, but there is still a big difference between the logistic model and the aggregate data model with correction of 0.1.

We will explore the last point in more detail soon. First, however,
let us note that the issue with modelling rare events is not limited to
zero-events. As mentioned, \(\log(OR)\)
is approximately Gaussian. The quality of the approximation will depend
on probabilities in all cells of the contingency table (which we
estimate through `a`

, `b`

, `c`

,
`d`

). Therefore, treating \(\log(OR)\) as Gaussian might lead to
different results in the individual-level vs summary-level models if the
events are rare. With low counts in our example it will definitely be
the case.

Let us generate similar data as above but now without the 0 cell:

```
df_rare <- data.frame(group = paste("Study", LETTERS[1:5]),
a = c(1, 2, 1, 3, 1), c = c(2, 2, 3, 3, 5),
n1i = c(120, 300, 110, 250, 95),
n2i = c(120, 300, 110, 250, 95))
df_rare_logor <- prepare_ma(df_rare, effect = "logOR")
```

```
bg_rare_agg <- baggr(df_rare_logor, effect = "logOR")
bg_rare_ind <- baggr(df_rare, effect = "logOR", model = "logit")
```

Let’s compare again, both on group level but also in terms of hypothetical treatment effect:

```
bgc2 <- baggr_compare(
"Summary-level (Rubin model on logOR)" = bg_rare_agg,
"Individual-level (logistic model)" = bg_rare_ind
)
bgc2
#>
#> Mean treatment effects:
#> 2.5% mean 97.5% median sd
#> Summary-level (Rubin model on logOR) -2.02986 -0.608230 0.755365 -0.606898 0.684658
#> Individual-level (logistic model) -3.62767 -0.810261 0.686309 -0.720548 0.895789
#>
#> SD for treatment effects:
#> 2.5% mean 97.5% median sd
#> Summary-level (Rubin model on logOR) 0.0296311 0.84085 3.27987 0.593070 0.846950
#> Individual-level (logistic model) 0.0383252 1.00042 3.48474 0.732055 0.915191
#>
#> Posterior predictive effects:
#> 2.5% mean 97.5% median sd
#> Summary-level (Rubin model on logOR) -3.26276 -0.585702 2.27612 -0.590107 1.40043
#> Individual-level (logistic model) -4.42009 -0.817129 2.18869 -0.713853 1.59739
plot(bgc2)
```

The results are still a bit different.

So far we’ve been using the logistic model with default priors (that are zero-centered and scaled to data). Our justification was that the priors on treatment effects are the same for both individual-level and aggregate-level data models. However, recall the logistic model includes additional \(K\) parameters, log odds of event in the control arms. (For simplicity we refer to them as baselines.) There are different ways in which we can assign them priors:

- Default independent prior for each baseline, which is scaled to input data
- Custom independent prior for baseline event rates, specified by
using
`prior_control`

argument when callin`baggr`

, using the same syntax as for other priors, e.g.`normal(0, 5)`

. - Assuming hierarchical structure on baselines and specifying a prior
using
`prior_control`

(which is now interpreted as hypermean, rather than \(K\) independent priors) and`prior_control_sd`

(hyper-SD for baseline parameters). To enable the hierarchical structure of baselines we need to specify`pooling_control = "partial"`

(analogously to specifying`pooling`

argument for treatment effects).

Choosing between the three is especially important in the case of rare events, as this will have a potentially large impact on the results. Let’s continue with the example from previous section and consider three models:

- default settings (already fitted above,
`bg_rare_ind`

), - strongly informative independent prior on baselines: centered on 1% event rate, \(\mathcal{N}(-4.59, 2^2)\), which translates 95% confidence that baseline event rate is between about 0.02% and 55%
- hierarchical prior on baselines (mean \(\mathcal{N}(-4.59, 1)\), SD \(\mathcal{N}(0, 2^2)\)).

We exaggerate our prior choices on purpose, to best illustrate the differences.

```
bg_rare_pool_bsl <- baggr(df_rare, effect = "logOR", model = "logit",
pooling_control = "partial",
prior_control = normal(-4.59, 1), prior_control_sd = normal(0, 2))
bg_rare_strong_prior <- baggr(df_rare, effect = "logOR", model = "logit",
prior_control = normal(-4.59, 10))
```

```
bgc3 <- baggr_compare(
"Rubin model" = bg_rare_agg,
"Independent N(0,10^2)" = bg_rare_ind,
# "Prior N(-4.59, 2^2)" = bg_rare_prior2,
"Hierarchical prior" = bg_rare_pool_bsl,
"Independent N(-4.59, 10^2)" = bg_rare_strong_prior
)
bgc3
#>
#> Mean treatment effects:
#> 2.5% mean 97.5% median sd
#> Rubin model -2.02986 -0.608230 0.755365 -0.606898 0.684658
#> Independent N(0,10^2) -3.62767 -0.810261 0.686309 -0.720548 0.895789
#> Hierarchical prior -1.98819 -0.686223 0.454052 -0.667927 0.621197
#> Independent N(-4.59, 10^2) -3.41717 -0.778947 0.853450 -0.706826 1.013460
#>
#> SD for treatment effects:
#> 2.5% mean 97.5% median sd
#> Rubin model 0.0296311 0.840850 3.27987 0.593070 0.846950
#> Independent N(0,10^2) 0.0383252 1.000420 3.48474 0.732055 0.915191
#> Hierarchical prior 0.0177607 0.675636 2.54531 0.475771 0.726157
#> Independent N(-4.59, 10^2) 0.0345881 1.039860 4.02034 0.710351 1.066950
#>
#> Posterior predictive effects:
#> 2.5% mean 97.5% median sd
#> Rubin model -3.47727 -0.614096 2.29151 -0.610400 1.44826
#> Independent N(0,10^2) -4.64276 -0.821637 2.10816 -0.703896 1.58941
#> Hierarchical prior -3.05701 -0.672248 1.59197 -0.651758 1.15323
#> Independent N(-4.59, 10^2) -5.01152 -0.813682 2.59469 -0.713356 1.82096
plot(bgc3) + theme(legend.position = "right")
```

Note: The results of this vignette hosted on CRAN have very short run time (

`iter = 2000, chains = 4`

) due to server constraints. We recommend replicating this example yourself.In particular for hierarchical prior on baselines, we recommend longer runs.

We can see that

- The hierarchical prior offers us balance between the rare event corrections on aggregate data and the logit models without any corrections.
- The default prior choice can have different behaviour in tails (for specific studies) than the prior that explicitly moved the baseline to 1%.
- The differences between different logistic models may be smaller than between all logistic models and the aggregate data model, but there are still differences. Remember that all of the logistic models above make exactly the same assumptions on treatment effect and they differ only in their priors concerning baseline probabilities.

Each of the models considered could now be compared using
`loocv()`

(see the example earlier in the vignette) and their
out-of-sample performance can be assessed.

This section provides only a short example. To learn more about meta-regression see e.g. Baker et al. (2009)

Two types of covariates may be present in your data:

- Covariates that
**change according to group unit**. In that case, the model accounting for the group covariates is a meta-regression model. It can be modelled on summary-level data. - Covariates that
**change according to individual unit**. Then, the model can be called a mixed model. It has to be fitted to individual-level data. Note that the first case can also be accounted for by using a mixed model.

In both cases we only need to name one extra argument to
`baggr`

: `covariates=`

, followed by a vector of
column names in input data. You should remember that your treatment
effect estimate will vary a lot not only with choice of covariates, but
also the contrasts that you use.

```
#let's use the data.frame we created from Yusuf et al earlier
df_ma$study_grouping <- c(1,1,1,0,0,0,0)
df_ma$different_contrasts <- c(1,1,1,0,0,0,0) - .5
bg_cov1 <- baggr(df_ma, covariates = c("study_grouping"), effect = "logarithm of odds ratio")
```

```
baggr_compare("No covariate" = bg_model_agg,
"With covariates, 0-1 coding" = bg_cov1)
#> Compared models use different covariates. Hyperparameters are not directly comparable.
#>
#> Mean treatment effects:
#> 2.5% mean 97.5% median sd
#> No covariate -0.676378 -0.178309 0.216140 -0.168606 0.228784
#> With covariates, 0-1 coding -0.983304 -0.341271 0.252122 -0.333130 0.311910
#>
#> SD for treatment effects:
#> 2.5% mean 97.5% median sd
#> No covariate 0.00885086 0.256308 0.905122 0.187025 0.252931
#> With covariates, 0-1 coding 0.00954359 0.287942 0.997505 0.212520 0.274785
#>
#> Posterior predictive effects:
#> 2.5% mean 97.5% median sd
#> No covariate -1.03299 -0.179489 0.650847 -0.163442 0.434541
#> With covariates, 0-1 coding -1.35930 -0.339569 0.648485 -0.340654 0.519976
#>
#> Mean (SD) for covariates:
#> model study_grouping
#> No covariate 0.331005 (0.456074)
```

To access the covariate estimates, see `fixed_effects()`

function. They are also printed out by default:

```
bg_cov1
#> Model type: Rubin model with aggregate data
#> Pooling of effects: partial
#>
#> Aggregate treatment effect (on logarithm of odds ratio), 7 groups:
#> Hypermean (tau) = -0.34 with 95% interval -0.98 to 0.25
#> Hyper-SD (sigma_tau) = 0.2879 with 95% interval 0.0095 to 0.9975
#> Total pooling (1 - I^2) = 0.74 with 95% interval 0.18 to 1.00
#>
#> Treatment effects on logarithm of odds ratio:
#> mean sd 2.5% 50% 97.5% pooling
#> Balcon -0.007 0.30 -0.62 -0.0033 0.58 0.73
#> Clausen -0.039 0.29 -0.64 -0.0271 0.51 0.70
#> Multicentre 0.051 0.30 -0.54 0.0480 0.68 0.72
#> Barber -0.344 0.32 -1.01 -0.3446 0.26 0.76
#> Norris -0.259 0.28 -0.78 -0.2587 0.30 0.64
#> Kahler -0.436 0.39 -1.31 -0.4086 0.25 0.85
#> Ledwich -0.348 0.41 -1.24 -0.3394 0.44 0.90
#>
#> Covariate (fixed) effects on logarithm of odds ratio:
#> mean sd
#> study_grouping 0.33 0.46
```

Note: in the example above we did not manually set priors for \(\beta\) coefficients. Users can do it by passing argument

`prior_beta`

to`baggr()`

.

Baker, W. L., C. Michael White, J. C. Cappelleri, J. Kluger, C. I.
Coleman, and Health Outcomes, Policy, and Economics (HOPE) Collaborative
Group. 2009. “Understanding Heterogeneity in Meta-Analysis: The
Role of Meta-Regression.” *International Journal of Clinical
Practice* 63 (10): 1426–34.

Bradburn, Michael J., Jonathan J. Deeks, Jesse A. Berlin, and A. Russell
Localio. 2007. “Much Ado about Nothing: A Comparison of the
Performance of Meta-Analytical Methods with Rare Events.”
*Statistics in Medicine* 26 (1): 53–77.

Deeks, Jonathan J. 2002. “Issues in the Selection of a Summary
Statistic for Meta-Analysis of Clinical Trials with Binary
Outcomes.” *Statistics in Medicine* 21 (11): 1575–1600.

L’Abbé, K. A., A. S. Detsky, and K. O’Rourke. 1987. “Meta-Analysis
in Clinical Research.” *Annals of Internal Medicine* 107
(2): 224–33.

Rubin, Donald B. 1974. “Estimating Causal Effects of
Treatments in Randomized and
Nonrandomized Studies.” *Journal of Educational
Psychology*.

Yusuf, S., R. Peto, J. Lewis, R. Collins, and P. Sleight. 1985.
“Beta Blockade During and After Myocardial Infarction: An Overview
of the Randomized Trials.” *Progress in Cardiovascular
Diseases* 27 (5): 335–71.