Welcome to **RprobitB**! This vignette introduces the R package and defines the underlying model.

With **RprobitB**^{1} you can

- analyze choices made by deciders among a discrete set of alternatives,
- estimate (latent class) (mixed) (multinomial) probit models in a Bayesian framework,
- model heterogeneity by approximating any underlying mixing distributions through a mixture of normal distributions,
- identify latent classes of decision makers.

Run `install.packages("RprobitB")` in your R console to install the latest version of **RprobitB**.

Why the notation **(latent class) (mixed) (multinomial) probit model**? Because **RprobitB** can fit probit models of increasing complexity:

- Most basic is the **probit model**, which models the choice between two alternatives.
- Considering more than two alternatives leads to the **multinomial probit model**.
- If we incorporate random effects, the model gets the prefix *mixed*.
- The most general model is the **latent class mixed multinomial probit model**, which approximates the mixing distribution through a mixture of normal distributions.

Assume that we observe the choices of \(N\) decision makers who choose among \(J\) alternatives at each of \(T\) choice occasions.^{2} Specific to each decision maker, alternative and choice occasion, we furthermore observe \(P_f+P_r\) choice attributes that we use to explain the choices. The first \(P_f\) attributes are connected to fixed coefficients, the other \(P_r\) attributes to random coefficients that follow a joint mixing distribution across decision makers.

Person \(n\)’s utility \(\tilde{U}_{ntj}\) for alternative \(j\) at choice occasion \(t\) is modeled as \[\begin{equation} \tilde{U}_{ntj} = \tilde{W}_{ntj}'\alpha + \tilde{X}_{ntj}'\beta_n + \tilde{\epsilon}_{ntj} \end{equation}\]

for \(n=1,\dots,N\), \(t=1,\dots,T\) and \(j=1,\dots,J\), where

- \(\tilde{W}_{ntj}\) is a vector of \(P_f\) characteristics of \(j\) as faced by \(n\) at \(t\) corresponding to the fixed coefficient vector \(\alpha \in {\mathbb R}^{P_f}\),
- \(\tilde{X}_{ntj}\) is a vector of \(P_r\) characteristics of \(j\) as faced by \(n\) at \(t\) corresponding to the random, decision maker-specific coefficient vector \(\beta_n \in {\mathbb R}^{P_r}\), where \(\beta_n\) is distributed according to some \(P_r\)-variate distribution \(g_{P_r}\),
- and \((\tilde{\epsilon}_{nt:}) = (\tilde{\epsilon}_{nt1},\dots,\tilde{\epsilon}_{ntJ})' \sim \text{MVN}_{J} (0,\tilde{\Sigma})\) is the model's error term vector for \(n\) at \(t\), which in the probit model is assumed to be multivariate normally distributed with zero mean and covariance matrix \(\tilde{\Sigma}\).
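The utility equation above can be illustrated with a minimal base R sketch for a single decision maker \(n\) at a single choice occasion \(t\). All numeric values, and the object names `W`, `X`, `Sigma_tilde` and `U_tilde`, are illustrative assumptions and not part of the **RprobitB** interface.

```r
# Sketch only: simulate the J utilities of one decision maker at one occasion.
# All values below are made up for illustration.
set.seed(1)
J   <- 3                       # number of alternatives
P_f <- 1                       # attributes with fixed coefficients
P_r <- 2                       # attributes with random coefficients

alpha  <- c(-1)                # fixed coefficient vector
beta_n <- c(0.5, 2)            # one realization of the random coefficient vector

W <- matrix(rnorm(J * P_f), nrow = J, ncol = P_f)  # row j holds W_ntj'
X <- matrix(rnorm(J * P_r), nrow = J, ncol = P_r)  # row j holds X_ntj'

# Error covariance: 1.5 on the diagonal, 0.5 elsewhere (positive definite)
Sigma_tilde <- diag(J) + 0.5

# Draw eps ~ MVN_J(0, Sigma_tilde) via the Cholesky factor
eps <- drop(t(chol(Sigma_tilde)) %*% rnorm(J))

# Utility vector: W alpha + X beta_n + eps
U_tilde <- drop(W %*% alpha + X %*% beta_n) + eps
U_tilde
```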

As is well known, any utility model needs to be normalized with respect to level and scale in order to be identified. Therefore, we consider the transformed model

\[\begin{equation} U_{ntj} = W_{ntj}'\alpha + X_{ntj}'\beta_n + \epsilon_{ntj}, \end{equation}\]

for \(n=1,\dots,N\), \(t=1,\dots,T\) and \(j=1,\dots,J-1\), where (choosing \(J\) as the reference alternative) \(U_{ntj}=\tilde{U}_{ntj} - \tilde{U}_{ntJ}\), \(W_{ntj}=\tilde{W}_{ntj}-\tilde{W}_{ntJ}\), \(X_{ntj}=\tilde{X}_{ntj}-\tilde{X}_{ntJ}\) and \(\epsilon_{ntj}=\tilde{\epsilon}_{ntj}-\tilde{\epsilon}_{ntJ}\). Here \((\epsilon_{nt:}) = (\epsilon_{nt1},\dots,\epsilon_{nt(J-1)})' \sim \text{MVN}_{J-1} (0,\Sigma)\), and \(\Sigma\) denotes a covariance matrix with the top-left element restricted to one.^{3}
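As a rough illustration of the differencing step, the covariance of the differenced errors can be computed from \(\tilde{\Sigma}\) with a differencing matrix. The sketch below uses base R only. The matrix `Delta`, the numeric values, and the final rescaling by the top-left element are assumptions made for illustration (rescaling is one common way to obtain the unit top-left element, and the coefficients would have to be rescaled accordingly); this is not necessarily how **RprobitB** implements the normalization.

```r
# Sketch only: difference a J x J error covariance with respect to alternative J.
J <- 3
Sigma_tilde <- matrix(c(2.0, 0.5, 0.3,
                        0.5, 1.5, 0.2,
                        0.3, 0.2, 1.0), nrow = J, ncol = J)

# (J-1) x J differencing matrix: Delta %*% u returns u_j - u_J for j < J
Delta <- cbind(diag(J - 1), -1)

# Covariance of the differenced errors eps_j - eps_J
Sigma_diff <- Delta %*% Sigma_tilde %*% t(Delta)

# Assumed scale normalization: divide by the top-left element
Sigma <- Sigma_diff / Sigma_diff[1, 1]
Sigma
```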

Let \(y_{nt}=j\) denote the event that decision maker \(n\) chooses alternative \(j\) at choice occasion \(t\). Assuming utility-maximizing behavior of the decision makers, the decisions are linked to the utilities via \[\begin{equation} y_{nt} = \sum_{j=1}^{J-1} j\cdot 1 \left (U_{ntj}=\max_i U_{nti}>0 \right) + J \cdot 1\left (U_{ntj}<0 ~\text{for all}~j\right), \end{equation}\] where \(1(A)\) equals \(1\) if condition \(A\) is true and \(0\) otherwise.
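The decision rule above translates directly into a small helper. The function name `choose_alternative` is hypothetical and is used only for this sketch; it takes the vector of \(J-1\) differenced utilities for one decision maker at one choice occasion.

```r
# Sketch only: map differenced utilities (length J - 1) to the chosen alternative.
choose_alternative <- function(U) {
  J <- length(U) + 1
  if (all(U < 0)) {
    J             # every differenced utility is negative: reference alternative J
  } else {
    which.max(U)  # otherwise the alternative with the largest differenced utility
  }
}

choose_alternative(c(0.3, 1.2))    # alternative 2
choose_alternative(c(-0.3, -1.2))  # reference alternative 3
```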

We approximate the mixing distribution \(g_{P_r}\) for the random coefficients^{4} \(\beta=(\beta_n)_{n}\) by a mixture of \(P_r\)-variate normal densities \(\phi_{P_r}\) with mean vectors \(b=(b_c)_{c}\) and covariance matrices \(\Omega=(\Omega_c)_{c}\) using \(C\) components, i.e. \[\begin{equation}
\beta_n\mid b,\Omega \sim \sum_{c=1}^{C} s_c \phi_{P_r} (\cdot \mid b_c,\Omega_c),
\end{equation}\] where \((s_c)_{c}\) are weights satisfying \(0 < s_c\leq 1\) for \(c=1,\dots,C\) and \(\sum_c s_c=1\).

One interpretation of the latent class model is obtained by introducing variables \(z=(z_n)_n\) allocating each decision maker \(n\) to class \(c\) with probability \(s_c\), i.e. \[\begin{equation} \text{Prob}(z_n=c)=s_c \quad \text{and} \quad \beta_n \mid z,b,\Omega \sim \phi_{P_r}(\cdot \mid b_{z_n},\Omega_{z_n}). \end{equation}\]

We call this model the **latent class mixed multinomial probit** model.^{5}
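The class-allocation interpretation suggests a simple two-step way to simulate random coefficients: first draw the class \(z_n\), then draw \(\beta_n\) from the normal distribution of that class. The following base R sketch assumes \(C=2\) classes and \(P_r=2\). The weights, means, and covariances are made-up values, and the object names are illustrative rather than part of **RprobitB**.

```r
# Sketch only: draw beta_n from a mixture of normals via latent class allocation.
set.seed(1)
N   <- 500
C   <- 2
P_r <- 2

s     <- c(0.7, 0.3)                                   # class weights, sum to 1
b     <- list(c(-1, 0), c(2, 1))                       # class means b_c
Omega <- list(diag(2),
              matrix(c(1, 0.5, 0.5, 1), nrow = 2))     # class covariances Omega_c

# Step 1: allocate each decision maker to a class with probabilities s
z <- sample(seq_len(C), size = N, replace = TRUE, prob = s)

# Step 2: draw beta_n ~ N(b_{z_n}, Omega_{z_n}) using the Cholesky factor
beta <- t(sapply(z, function(c_n) {
  b[[c_n]] + drop(t(chol(Omega[[c_n]])) %*% rnorm(P_r))
}))

table(z) / N   # empirical class shares, should be close to s
dim(beta)      # N rows, one per decision maker, and P_r columns
```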

1. The package name **RprobitB** is a portmanteau, combining **R** (the programming language), **probit** (the model class) and **B** (for Bayes, the estimation method).
2. For notational simplicity, the number of choice occasions \(T\) is assumed to be the same for each decision maker here. However, **RprobitB** allows for a different number of choice occasions for each decision maker.
3. As an alternative to fixing an error term variance, **RprobitB** also allows normalizing with respect to scale by fixing an element of \(\alpha\).
4. We use the abbreviation \((\beta_n)_n\) as a shortcut for \((\beta_n)_{n=1,\dots,N}\), the collection of vectors \(\beta_n\), \(n=1,\dots,N\).
5. Note that the model collapses to the (normally) mixed multinomial probit model if \(P_r>0\) and \(C=1\), to the multinomial probit model if \(P_r=0\) and to the basic probit model if additionally \(J=2\).