The present study used a more efficient and consistent approach, the two-step Bourguignon, Fournier, and Gurgand (BFG) approach, because the respondents were classified into four groups, as shown earlier. Ma and Abdulai [36] and Khanal and Mishra [37] adopted this approach for the same reason, having classified their respondents into different categories. This two-stage selectivity correction method was put forward by Bourguignon, Fournier, and Gurgand [38] to detect and address selectivity effects created by different choices or classifications. Park and others [39] explained that the BFG approach has an advantage over other methods of dealing with selectivity effects because it shows not only the direction of the bias but also its source.
Bourguignon, Fournier, and Gurgand (BFG)
The first step of our two-step approach consisted of estimating the factors that influence the use of the selected technologies, together with the selectivity correction factors that address sample selection bias. We used a multinomial probit (MNP) model to determine the factors influencing adoption in this first stage of the estimation process. The empirical MNP model used to examine the effects of the socio-economic variables on the choice of improved rice technologies (Y) is given as follows:
$$Y_{i} = 1 \ldots j = \beta _{0} + \beta _{1} \chi _{{1i}} + \beta _{2} \chi _{{2i}} + \ldots + \beta _{n} \chi _{{ni}} + \varepsilon$$
(1)
where \(\beta _{0}\) is the intercept, \(\beta _{{1 - n}}\) are the coefficients of the explanatory variables, \(\chi _{{1 - n}}\) are the explanatory variables and ε is the error term. The explanatory variables age, education, gender, number of infants, number of adults, time allocated to other economic activities, total rice land holding, number of hours of family labour, total livestock unit and hours of communal activities are continuous non-negative variables, while marriage, land ownership, participation in relevant extension training programmes, and perceptions of the capital-intensive and labour-demanding nature of improved technologies are dummy variables coded 1 for yes and 0 otherwise. All these variables were included in the first stage (the selection equations) of the BFG estimation. However, in the second stage, which is the focus of this study, number of years of schooling, marriage, number of hours of communal activities, hours spent on other economic activities, and farmers' perceptions of the capital-intensive and labour-demanding nature of the chosen improved rice technologies were excluded, since these variables are more closely associated with the first stage, which is the selection equation (Table 1 provides a detailed description of the variables).
The second step, which determined the effect of the uptake of the selected technologies on rice farmers’ net revenue, was the focus of the current study. Net revenue was obtained by subtracting the cost of producing paddy rice per hectare of rice field from the gross revenue obtained from sales of the paddy rice produced on that field. At the time of the field work, gh¢3.8148 = $1. If the choice to use a technology were independent of the factors associated with a farmer's income, a standard OLS regression, free of misspecification, would provide an unbiased estimate of the “average treatment effect” associated with technology usage. However, this independence assumption is questionable. Therefore, we augmented the OLS analysis with the selectivity correction factors produced in stage one of the BFG model estimation. Our OLS regression was specified to determine the factors that affected rice farmers’ net revenue while controlling for other important variables. Accordingly, we estimated a specification linking rice farmers' net revenue to farmer and household characteristics as well as institutional factors. We assumed that the net revenue of respondents is a linear function of a vector of explanatory variables (\(X_{{ij}}\)) and a user dummy (\(C_{{ij}}\)), estimated as follows:
$$Y_{{ij}} = {\beta }X_{{ij}} +{\delta }C_{{ij}} + {\mu }_{i} ,$$
(2)
where \(Y_{{ij}}\) is the net revenue, computed in Ghana Cedis (gh¢), for uptake of improved seed (\(j = 1\)), fertilizer (\(j = 2\)), and improved seed and fertilizer combined (\(j = 3\)); δ and β are parameters to be estimated; and \({\mu }_{i}\) and \(\eta _{i}\) denote the residual terms in Eqs. (2) and (5), respectively, with \({\mu }_{i}\)∼ N(0, σ).
The main variable of interest is the net revenue of the rice farmers. The explanatory variables (\(X_{{ij}}\)) are introduced as controls. Age is the age of the respondent in years; Adult Size is the number of adult persons in the household; Family Labour hours is the total number of hours of labour contributed by family members; Farm size is the total area of the farmer's land holdings; Attend Relevant Extension Training is a dummy variable showing participation in relevant extension training; child refers to the number of children below 5 years; and the location dummy variable takes the value of 1 for the Ashanti region.
The choice of the selected technologies is expected to increase rice farmers’ net revenue. However, its impact on other crop income is a priori ambiguous. Adult size (holding the proportion of labour constant) is expected to have a positive effect on farmers’ income. Farm size represents an input into the farm production function, so an increase in farm size is expected to result in higher output. Participation in relevant extension training is expected to increase farmers’ incomes. The resources required to raise children and livestock are expected to decrease rice farmers’ income (holding all other things constant). The location dummy variable captures unmeasured characteristics of the quality of agricultural inputs, and its effect is a priori ambiguous.
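The net revenue calculation and the linear specification in Eq. (2) can be illustrated with a minimal sketch. All figures below are invented for illustration and are not taken from the survey; the helper names (`net_revenue`, `ols`, `solve_linear_system`) are hypothetical, and a single control (farm size) stands in for the full vector \(X_{{ij}}\). The selectivity correction terms from the first stage are deliberately omitted here.

```python
# Sketch of Eq. (2): Y = beta * X + delta * C + mu, estimated by OLS.
# All data are made up for illustration; they are not from the study.

GHS_PER_USD = 3.8148  # exchange rate reported for the time of the field work


def net_revenue(gross_revenue, production_cost):
    """Net revenue per hectare (gh¢): gross revenue minus production cost."""
    return gross_revenue - production_cost


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def ols(rows, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical farmers: farm size (ha), technology user dummy C, revenue and cost.
farm_size = [1.0, 2.0, 1.5, 3.0, 2.5, 0.5]
user = [0, 1, 0, 1, 1, 0]
gross = [450.0, 700.0, 475.0, 750.0, 725.0, 425.0]
cost = [300.0] * 6

net = [net_revenue(g, c) for g, c in zip(gross, cost)]
rows = [[1.0, f, float(u)] for f, u in zip(farm_size, user)]  # intercept, X, C

intercept, beta, delta = ols(rows, net)
net_usd = [y / GHS_PER_USD for y in net]  # same outcome expressed in US dollars
print(round(intercept, 6), round(beta, 6), round(delta, 6))
```

In this toy data set the coefficient on the user dummy (`delta`) is the naive difference in net revenue between users and non-users after controlling for farm size; it is only unbiased if technology choice is independent of the unobservables, which is the assumption the selectivity correction is meant to relax.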
The problem of selection bias arises if unobservable characteristics affect the residual terms and produce a correlation between them, i.e. corr \((\eta _{i} ,{\mu }_{i} ){ \ne }0\). If any of the options is chosen (\(j = 1\)), the outcome equation for net revenue, \({\gamma }_{1}\), is specified as:
$$\gamma _{1} = X{\beta }_{1} + {\delta }_{1} \left[ {\rho _{1}^{*} m(P_{1} ) + \rho _{2}^{*} m(P_{2} )\frac{{P_{2} }}{{P_{2} - 1}} + \rho _{3}^{*} m(P_{3} )\frac{{P_{3} }}{{P_{3} - 1}} + \rho _{4}^{*} m(P_{4} )\frac{{P_{4} }}{{P_{4} - 1}}} \right] + {\omega }_{1} .$$
(3)
In order to obtain an unbiased and consistent estimation, we simultaneously introduced the selectivity correction terms \(\left( {\eta _{1}^{*} {,}\eta _{2}^{*}{,}\eta _{3}^{*}{,}\eta _{4}^{*} } \right)\) estimated in the first-stage in Eq. (4) below:
$$\gamma _{1} = X{\beta }_{1} + {\delta }_{1} \left[ {\rho _{1}^{*} m(P_{1} ) + \rho _{2}^{*} m(P_{2} )\frac{{P_{2} }}{{P_{2} - 1}} + \rho _{3}^{*} m(P_{3} )\frac{{P_{3} }}{{P_{3} - 1}} + \rho _{4}^{*} m(P_{4} )\frac{{P_{4} }}{{P_{4} - 1}}} \right] + {\eta }_{1}^{*} { + \eta }_{2}^{*} { + \eta }_{3}^{*} { + \eta }_{4}^{*} { + \omega }_{1} ,$$
(4)
where \(m(P_{1} )\), \(m(P_{2} )\), \(m(P_{3} )\) and \(m(P_{4} )\) are the conditional expectations, \(\eta _{1}^{*} ,\eta _{2}^{*}{,}\eta _{3}^{*} {\text{ and }}\eta _{4}^{*}\) are termed selectivity effects; the standard deviation of the error term from the net revenue equation is denoted as σ; \(\omega _{1}\) is the residual term and \(\rho\) signifies correlation coefficients between η and μ.
The selectivity correction terms in Eq. (4) can be interpreted econometrically as follows: (1) if at least one of the terms is significant, sample selectivity effects arising from unobservable factors are present; and (2) if all of the selectivity terms are insignificant, selectivity effects are absent. In the first case, the Endogenous Switching Regression (ESR) model is the best choice for determining the causal effect of the given technology choice. In the second case, the probit model and the PSM approach are the most appropriate methods for assessing the related causal effects [40, 41].
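The decision rule above can be written as a short sketch. The function name `choose_estimator` and the 5% significance threshold are assumptions made for illustration; the text does not state which significance level was applied to the selectivity terms.

```python
def choose_estimator(selectivity_p_values, alpha=0.05):
    """Pick the second-stage estimator from the selectivity-term p-values.

    alpha is an assumed significance level, not one reported in the text.
    """
    if not selectivity_p_values:
        raise ValueError("at least one selectivity term is required")
    # Case (1): any significant term signals selection on unobservables.
    if any(p < alpha for p in selectivity_p_values):
        return "ESR"
    # Case (2): no significant term, so matching on observables is adequate.
    return "PSM"


# Hypothetical p-values for the four selectivity terms in Eq. (4).
print(choose_estimator([0.41, 0.03, 0.27, 0.66]))
print(choose_estimator([0.41, 0.13, 0.27, 0.66]))
```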
The ESR model
From the above outcome and choice equations specified, the respective relationship between the two regimes is represented as:
$$C_{1}^{*} = Z{\gamma }_{1} +{\eta }_{1} ,$$
(5)
$$Y_{1} = X{\beta }_{1} + {\varphi }_{1} \quad {\text{if }}C_{1} = 1,$$
$$Y_{0} = X{\beta }_{0} + {\varphi }_{0} \quad {\text{if }}C_{0} = 0,$$
where \(Y_{1}\) is net revenue given that any of the options is selected (\(j = 1\)), and \(Y_{0}\) is net revenue when none of the options is selected (\(j \ne 1\)); X is the vector of exogenous variables that affect net revenue; and \(\varphi _{1}\) and \({\varphi }_{0}\) are residual terms that are normally distributed with a mean of zero.
After the model in Eq. (5) was estimated, the inverse Mills ratios \(\lambda _{1}\) and \(\lambda _{0}\) and the covariance terms \({\sigma }_{{{\eta }1}} = {\text{cov}}({\eta }_{1} ,{\varphi }_{1} )\) and \({\sigma }_{{{\eta }0}} = {\text{cov}}({\eta }_{1} ,{\varphi }_{0} )\) were calculated and substituted into Eqs. (6) and (7):
$$Y_{1} = X{\beta }_{1} + {\sigma }_{{{\eta }1}} {\lambda }_{1} + {\zeta }_{1} \quad {\text{if }}C_{1} = 1,$$
(6)
$$Y_{0} = X{\beta }_{0} + {\sigma }_{{{\eta 0}}}{\lambda }_{{0}} + {\zeta }_{{0}} \quad {\text{if }}C_{0} = 0,$$
(7)
where \(\lambda _{1}\) and \(\lambda _{{0}}\) are used to control for selection bias arising from unobservable factors, which include farmers’ inheritability and the local institutional environment; the error terms \({\zeta }_{1}\) and \({\zeta }_{{0}}\) have conditional means of zero.
The effect of uptake of any of the technologies on net revenue was examined by specifying the expected values of the outcomes. The change in the outcome attributable to a specific choice relative to another choice is the difference between the expected outcomes under the two options. This difference is termed the Average Treatment Effect on the Treated (ATT).
The ATT, \(t_{{ATT}}^{{ESR}}\), in this case is:
$$t_{{ATT}}^{{ESR}} = E\left[ {Y_{1} {|}C_{1} = 1} \right] - E\left[ {Y_{0} {|}C_{1} = 1} \right] = X({\beta }_{{1}} { - \beta }_{{0}} ) + {\lambda }_{1} ({\sigma }_{{{\eta 1}}} { - \sigma }_{{{\eta 0}}} ).$$
(8)
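A minimal sketch of how the ATT can be computed from estimated ESR parameters is given below. It rests on several assumptions not stated in the text: the inverse Mills ratio for the selected regime is taken in its conventional form \(\lambda _{1} = \phi (Z\gamma )/\Phi (Z\gamma )\), the covariance difference is scaled by \(\lambda _{1}\) as implied by taking expectations of Eqs. (6) and (7) conditional on \(C_{1} = 1\), and all coefficient values and function names are hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for phi and Phi


def inverse_mills_selected(index):
    """lambda_1 = phi(index) / Phi(index) for observations with C_1 = 1."""
    return _STD_NORMAL.pdf(index) / _STD_NORMAL.cdf(index)


def inverse_mills_unselected(index):
    """lambda_0 = -phi(index) / (1 - Phi(index)) for observations with C_1 = 0."""
    return -_STD_NORMAL.pdf(index) / (1.0 - _STD_NORMAL.cdf(index))


def esr_att(x_rows, z_index, beta1, beta0, sigma_eta1, sigma_eta0):
    """Average over treated units of X(beta1 - beta0) + lambda_1 (sigma_eta1 - sigma_eta0).

    x_rows  : covariate rows for treated units only
    z_index : first-stage linear index Z * gamma for the same units
    """
    effects = []
    for x, zi in zip(x_rows, z_index):
        xb_diff = sum(xk * (b1 - b0) for xk, b1, b0 in zip(x, beta1, beta0))
        lam1 = inverse_mills_selected(zi)
        effects.append(xb_diff + lam1 * (sigma_eta1 - sigma_eta0))
    return sum(effects) / len(effects)


# Hypothetical treated sample: intercept plus one covariate.
x_treated = [[1.0, 2.0], [1.0, 4.0]]
z_treated = [0.0, 0.5]
att_no_selection = esr_att(x_treated, z_treated, [10.0, 5.0], [4.0, 3.0], 0.3, 0.3)
att_with_selection = esr_att(x_treated, z_treated, [10.0, 5.0], [4.0, 3.0], 0.5, 0.3)
print(att_no_selection, att_with_selection)
```

When the two covariance terms are equal, the selection component drops out and the ATT reduces to the average of \(X({\beta }_{1} - {\beta }_{0})\) over the treated units.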
The PSM technique
The propensity score used in PSM can be written as:
$$P(Z_{1} ) = \Pr (C_{1} = 1{|}Z_{1} ) = E(C_{1} {|}Z_{1} ),$$
(9)
where \(C_{1} = \left\{ {0,1} \right\}\) represents an indicator for selecting the given type of option (\(j = 1\)) and \(Z_{1}\) is the vector of pre-choice characteristics. After estimating the propensity scores, we estimated the ATT, \(t_{{ATT}}^{{PSM}}\), as shown below:
$$t_{{{\text{ATT}}}}^{{{\text{PSM}}}} = E_{{P\left( {Z_{1} } \right)|C_{1} = 1}} \left\{ {E\left[ {Y_{1} {|}C_{1} = 1,P\left( {Z_{1} } \right)} \right] - E\left[ {Y_{0} {|}C_{1} = 0,P\left( {Z_{1} } \right)} \right]} \right\}$$
(10)
Various approaches have been adopted to match adopters and non-adopters with similar propensity scores: nearest neighbour matching (NNM), kernel-based matching (KBM) and radius matching. As a robustness check, the matching techniques were considered jointly (attached as Annex 1, Table 9). It is worth mentioning that none of the methods proposed in the literature is a priori superior to the others. We therefore interpreted the results generated by the radius matching approach.
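Radius matching can be sketched as follows, assuming the propensity scores have already been estimated. The function name `radius_matching_att`, the caliper of 0.05 and the data are all hypothetical; the text does not report the radius used in the study.

```python
def radius_matching_att(treated, controls, radius=0.05):
    """ATT by radius matching on the propensity score.

    treated, controls : lists of (propensity_score, outcome) pairs
    radius            : assumed caliper; not a value reported in the text
    """
    effects = []
    for p_t, y_t in treated:
        # Every control whose score lies within the radius is used as a match.
        matches = [y_c for p_c, y_c in controls if abs(p_c - p_t) <= radius]
        if matches:
            effects.append(y_t - sum(matches) / len(matches))
        # Treated units with no control inside the radius are left unmatched.
    if not effects:
        raise ValueError("no treated unit has a control within the radius")
    return sum(effects) / len(effects)


# Hypothetical (propensity score, net revenue) pairs.
adopters = [(0.60, 500.0), (0.30, 400.0), (0.95, 900.0)]
non_adopters = [(0.58, 300.0), (0.63, 340.0), (0.31, 350.0), (0.10, 200.0)]

att = radius_matching_att(adopters, non_adopters, radius=0.05)
print(att)
```

In this example the third adopter has no non-adopter within the radius and therefore does not contribute to the estimate, which illustrates how radius matching enforces common support at the cost of discarding unmatched treated units.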
Estimation of Gini coefficients
To address the second objective of the study, we first estimated the income of users of the selected technologies and then simulated what farmers' incomes would be without technology usage (that is, we estimated the incomes of non-users). We then calculated the Gini coefficient for the two scenarios: (i) technology users; and (ii) non-users of the technology. The Gini coefficient was computed using the formula below:
$$G_{c} = \frac{1}{n}\left[ {n + 1 - 2\left[ {\frac{{\mathop \sum \nolimits_{{i = 1}}^{n} \left( {n + 1 - i} \right)y_{i} }}{{\mathop \sum \nolimits_{{i = 1}}^{n} y_{i} }}} \right]} \right].$$
(11)
This may be simplified to:
$$G_{c} = \frac{{2\mathop \sum \nolimits_{{i = 1}}^{n} iy_{i} }}{{n\mathop \sum \nolimits_{{i = 1}}^{n} y_{i} }} - \frac{{n + 1}}{n},$$
(12)
where \(G_{c}\) = Gini coefficient; \(n\) = number of individuals; and \(y_{i}\) = income or wealth of the individual farmer, for a population uniform on the values \(y_{i}\), \(i = 1\) to \(n\), indexed in increasing order \(\left( {y_{i} \le y_{{i + 1}} } \right)\). The value of the Gini index ranges from 0 (complete equality) to 1 (complete inequality) of income distribution. In this study, this equation was used to calculate the Gini coefficient without direct reference to the Lorenz curve.
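The two expressions for the Gini coefficient can be checked numerically with a short sketch. The function names and income values are hypothetical, and the sketch assumes non-negative incomes with a positive total; net revenues that include negative values would need separate treatment.

```python
def gini_eq11(incomes):
    """Gini coefficient following Eq. (11); incomes are sorted in increasing order."""
    y = sorted(incomes)
    n = len(y)
    total = sum(y)
    weighted = sum((n + 1 - i) * yi for i, yi in enumerate(y, start=1))
    return (n + 1 - 2 * weighted / total) / n


def gini_eq12(incomes):
    """Gini coefficient following the simplified form in Eq. (12)."""
    y = sorted(incomes)
    n = len(y)
    ranked_sum = sum(i * yi for i, yi in enumerate(y, start=1))
    return 2 * ranked_sum / (n * sum(y)) - (n + 1) / n


# Hypothetical net revenues for two groups of four farmers.
users = [1.0, 2.0, 3.0, 4.0]
non_users = [5.0, 5.0, 5.0, 5.0]

print(gini_eq11(users), gini_eq12(users))
print(gini_eq12(non_users))
```

Both functions return the same value because \(\sum (n + 1 - i)y_{i} = (n + 1)\sum y_{i} - \sum iy_{i}\). With a finite sample, the largest value the formula can take is \((n - 1)/n\), reached when one farmer holds all the income, so the upper bound of 1 is approached only as \(n\) grows.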
Farmers’ rice income distribution gives a picture of how the net revenue from rice production is shared between users and non-users of the selected agricultural technologies in the study area. According to Buchan [42], the Gini coefficient is a significant indicator for estimating a society’s income distribution, a significant attribute that reflects economic sustainability of that society.