# «International Trade Efficiency, the Gravity Equation, and the Stochastic Frontier Heejoon Kang and Michele Fratianni Business Economics and Public ...»

International Trade Efficiency, the Gravity Equation, and the Stochastic Frontier

Heejoon Kang and Michele Fratianni

Business Economics and Public Policy

Kelley School of Business

Indiana University

Bloomington, IN 47405

Phone: (812)855-9219

Fax: (812)855-3354

kang@indiana.edu

fratiann@indiana.edu

This version: June 2006

Abstract

In the gravity equation of international trade, bilateral trade flows are regressed on trading partners’ income and the distance that separates them along with other variables. This widely used equation is traditionally estimated by the ordinary least squares method. We employ an alternative technique of stochastic frontier estimation to assess the potential bilateral trade flows from the same gravity equation. Countries are shown to have low efficiencies in their international trade as the predicted trade from frontier estimation is generally far greater than actual trade. Trade efficiencies are computed and ranked for individual countries, ten geographical regions, and eleven regional trade agreements.

Key Words: Efficiency coefficients; OLS residuals; trade gravity; trade potentials.

JEL Classification: F10, F14, C13

Acknowledgement: We thank the Kelley School of Business Center for International Business Education and Research (CIBER) at Indiana University for financial support and Anand Jha for research assistance.

I. Introduction

The gravity equation (GE) is widely used in explaining bilateral trade flows. The GE has been derived from diverse international trade models, ranging from models of complete specialization and identical consumers’ preferences (Anderson 1979; Bergstrand 1985; Deardorff 1998) to models of product differentiation in a regime of monopolistic competition (Helpman 1987), to hybrid models of different factor proportions and product differentiation (Bergstrand 1989), and to models of incomplete specialization and trade costs (Haveman and Hummels 2004). Under complete specialization in production, identical consumers’ preferences, and zero barriers to trade, country i imports from all other countries, and its imports from country j are equal to $Y_i Y_j / Y_w$, where $Y$ is income and the subscripts refer to country i, country j, and the world, respectively (see, for example, Deardorff 1998, eq. (2)).

Typically, ordinary least squares (OLS) estimation of the gravity equation yields an R-squared of about 0.65, so actual trade often deviates considerably from the values predicted by the model. Furthermore, the prediction that each country imports from all other countries does not hold in reality. Haveman and Hummels (2004, p. 211) show that four-fifths of importers at the four- and five-digit SITC (Standard International Trade Classification) level buy from fewer than 10 per cent of available suppliers. One way to cope with this fact is to introduce trade frictions and let importers purchase from the cheapest exporters. Denoting by $t_{ij}$ the ratio of prices paid by country i to prices charged by country j, the imports of i from j become equal to $Y_i Y_j / (t_{ij} Y_w)$ (Deardorff 1998, eq. (11)). Trade frictions are unobservable, but are empirically related to distance and national borders. The relationship between trading costs and distance is assumed to be continuous, whereas the relationship between trading costs and national borders is discontinuous: a sort of jump due, among other things, to differences in legal systems and practices, languages, networks, competitive policies, monetary regimes, tariffs, and other restrictions that discriminate against foreign producers.
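The two Deardorff predictions quoted above differ only by the friction term, which a few lines of arithmetic make concrete. The sketch below is a minimal illustration; the function names and the income figures are hypothetical and are not taken from the paper's data.

```python
def frictionless_imports(y_i: float, y_j: float, y_world: float) -> float:
    """Imports of country i from country j with zero trade barriers: Y_i * Y_j / Y_w."""
    return y_i * y_j / y_world


def imports_with_frictions(y_i: float, y_j: float, y_world: float, t_ij: float) -> float:
    """Same prediction, scaled down by the price ratio t_ij (paid by i / charged by j)."""
    if t_ij <= 0:
        raise ValueError("t_ij must be positive")
    return y_i * y_j / (t_ij * y_world)


# Hypothetical incomes in billions of U.S. dollars (illustrative values only).
y_i, y_j, y_world = 2000.0, 500.0, 50000.0

base = frictionless_imports(y_i, y_j, y_world)
with_costs = imports_with_frictions(y_i, y_j, y_world, t_ij=1.25)
print(base, with_costs)
```

With a price ratio of 1.25, predicted imports fall by one-fifth relative to the frictionless case, which is the sense in which distance and borders pull actual trade below the frictionless benchmark.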

In addition to income, distance, and borders, the explanatory variables of the GE include a host of other factors that influence bilateral trade. The GE has been very successful in explaining actual trade patterns; in fact, it is considered to be state of the art for the determination of bilateral trade (Leamer and Levinsohn 1995, p. 1384; Feenstra et al. 2001, p. 431). Traditionally, the GE has been estimated by OLS under the assumption that the differences between the actual and the predicted values of trade between country i and country j at time t are purely random; that is,

$$y_{ij,t} = f(X_{ij,t}) + \varepsilon_{ij,t}, \tag{1}$$

where $\varepsilon_{ij,t}$ is the disturbance term, assumed to be independently and identically distributed (iid). In (1), $y_{ij,t}$ is the actual value of bilateral trade, $X_{ij,t}$ is the vector of explanatory variables mentioned above, and $f(X_{ij,t})$ is the value of predicted bilateral trade. With OLS, we assume that actual values of trade deviate from their predicted values by a random amount. The disturbance $\varepsilon_{ij,t}$ may reflect luck or measurement errors. It is therefore sometimes positive and sometimes negative but, by construction, its average value is zero.
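The zero-mean property of OLS residuals can be checked on a toy regression. The sketch below is a minimal illustration, assuming a single regressor and an intercept; `ols_fit` is an illustrative helper, and the data are made up rather than drawn from the paper's sample.

```python
from statistics import fmean


def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    x_bar, y_bar = fmean(x), fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical observations: log bilateral trade against log income product.
log_income = [4.0, 5.0, 6.0, 7.0, 8.0]
log_trade = [1.1, 2.3, 2.8, 4.2, 4.6]

intercept, slope = ols_fit(log_income, log_trade)
residuals = [y - (intercept + slope * x) for x, y in zip(log_income, log_trade)]
mean_residual = fmean(residuals)
print(round(slope, 4), [round(r, 4) for r in residuals], mean_residual)
```

The residuals take both signs and average to zero up to rounding error, so an OLS fit describes average trade rather than the maximum attainable trade.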

The fundamental question we try to answer in this paper is: what would bilateral trade flows be if countries operated at the frontier of the GE model? It is not obvious that all optimizing agents, countries in our case, can operate at the frontier. To begin with, critical inputs may be missing from the empirical specification of the model: two obvious examples, in this regard, are the managerial input and the country’s infrastructure. Second, utilization rates of specified inputs may differ across countries because of differences in the quality of institutions. Countries with good institutions have higher marginal input productivities than countries with poor institutions. Finally, trading costs reflect, to some extent, the rent that domestic producers can extract by erecting barriers to trade. Rent extraction will differ across countries and will depend on a host of factors.

Missing inputs, differences in input utilization rates, and differences in rent extraction are for the econometrician a source of misspecification that is hard to correct because it is driven by difficult-to-measure variables. Only by comparing, ex post, the performance of the best against the performance of a particular trading partner can one infer a degree of efficiency of the particular performer with respect to the best possible performer. To be sure, both the representative and the potentially best-performing trading partner are optimizing, but the former faces tighter constraints than the latter.

Newton’s gravity equation shows the maximum force between two masses that are spatially separated. The trade gravity equation in (1) can be interpreted the same way. Two countries try to maximize their trade given the distance between them, the sizes of their economies, and the other factors in the equation. Equation (1) can be viewed as a production function of bilateral trade between the two countries. Alternatively, (1) can be viewed as an outcome of cost minimization in which two trading partners try to minimize transaction or transportation costs in international trade. In fact, under the title of “a spatial theory of trade,” Rossi-Hansberg (2005) shows, in developing his general trade theory, how countries make their optimum decisions on mutual trade and argues that “[his] model is consistent with estimations of the gravity equation both within and across countries” (p. 1485). When the trade gravity equation is viewed as the outcome of cost minimization, the use of stochastic frontier estimation is justified.

In sum, to answer our question we need to apply a methodology that is able to differentiate the performance of a given trading partner from that of the potentially best, and measure the gap between the two, which we call efficiency. This is the role of stochastic frontier estimation. For example, Aigner et al. (1977), Charnes et al. (1978), and Schmidt (1985) use stochastic frontier estimation to calculate efficiency scores obtained from the deviation of actual production or cost values from frontier estimates. Zak et al. (1979) apply the same methodology to evaluate efficiency in professional basketball, Porter and Scully (1982) in professional baseball, Huang and Bagi (1984) on farms in Northwest India, Cummins and Weiss (1993) in the U.S. insurance industry, Zuckerman et al. (1994) in hospitals, Kaparakis et al. (1994) and Berger and Humphrey (1997) in commercial banks, Hay and Liu (1997) in the UK manufacturing sector, and Worthington (1998) in non-bank financial institutions. Lovell (1993) reviews the methods used in other industries. Some researchers have applied stochastic frontier estimation to compare efficiencies and performances across countries. Allen and Rai (1996) have done so for banks across 15 countries, Beccalli (2004) for investment firms in the United Kingdom and Italy, and Weill (2004) for European corporations.

Under stochastic frontier estimation, $\varepsilon_{ij,t}$ is decomposed into two parts, $v_{ij,t}$ and $u_{ij,t}$:

$$y_{ij,t} = f(X_{ij,t}) + \varepsilon_{ij,t} = f(X_{ij,t}) + v_{ij,t} - u_{ij,t}, \tag{2}$$

where $v_{ij,t}$ is assumed to have an iid normal distribution and $u_{ij,t}$ to have an iid nonnegative half normal distribution. That is,

$$v_{ij,t} \sim \text{iid } N(0, \sigma_v^2), \tag{3}$$

$$u_{ij,t} \sim \text{iid } N^{+}(0, \sigma_u^2), \tag{4}$$

closely following Kumbhakar and Knox Lovell (2000, pp. 74-78). We further assume, following the literature, that $u_{ij,t}$ and $v_{ij,t}$ are distributed independently of each other and of the regressors $X_{ij,t}$ in (2).
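A short simulation shows what the decomposition implies for the composed error: because the half normal term is subtracted, the sum of the two parts has a negative mean, equal to minus the half normal scale times the square root of 2/π. The sketch below is illustrative only; the parameter values, sample size, and function name are assumptions, not estimates from the paper.

```python
import math
import random
from statistics import fmean


def simulate_composed_errors(n: int, sigma_v: float, sigma_u: float, seed: int = 0):
    """Draw n values of v - u with v ~ N(0, sigma_v^2) and u ~ half normal(sigma_u)."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        v = rng.gauss(0.0, sigma_v)       # two-sided noise
        u = abs(rng.gauss(0.0, sigma_u))  # one-sided, nonnegative shortfall
        draws.append(v - u)
    return draws


# Hypothetical scale parameters chosen for illustration.
sigma_v, sigma_u = 0.5, 1.0
eps = simulate_composed_errors(100_000, sigma_v, sigma_u)

sample_mean = fmean(eps)
theoretical_mean = -sigma_u * math.sqrt(2.0 / math.pi)
print(round(sample_mean, 3), round(theoretical_mean, 3))
```

An OLS intercept absorbs this negative mean, which is why OLS traces average trade while the frontier lies above it.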

The two-sided error term, $v_{ij,t}$, is the normal statistical noise due to luck or measurement errors, whereas the one-sided error term, $u_{ij,t}$, represents the measure of performance or, in the case of production functions, the degree to which actual output falls short of the potential output given by the stochastic frontier equation (2). The nonnegative $u_{ij,t}$ in (2) represents the “inefficiency” of a country in its foreign trade arising from its lack of proper infrastructure or managerial expertise. According to Jondrow et al. (1982), the technical inefficiency of each observation is estimated by $E[u_{ij,t} \mid \varepsilon_{ij,t}]$, where the residuals from the stochastic frontier estimation of (2) serve as the estimates of $\varepsilon_{ij,t}$. In particular, the stochastic frontier estimation of (2) yields estimates of $\sigma_v^2$, $\sigma_u^2$, and $\varepsilon_{ij,t}$. From these estimates, the following quantities can be computed: $\sigma^2 = \sigma_v^2 + \sigma_u^2$ and $\lambda = \sigma_u / \sigma_v$.
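The variance parameters are typically obtained by maximum likelihood. As a sketch of what is being maximized, the code below evaluates the standard normal-half normal density of the composed error, f(ε) = (2/σ) φ(ε/σ) Φ(−ελ/σ), as given by Aigner et al. (1977). The function names and the numerical values are illustrative assumptions; a real estimation would maximize this log-likelihood jointly over the regression coefficients and the two scale parameters.

```python
import math


def std_normal_cdf(z: float) -> float:
    """Standard normal CDF via erfc, which stays positive far into the lower tail."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def composed_error_logpdf(eps: float, sigma_u: float, sigma_v: float) -> float:
    """Log density of eps = v - u under the normal-half normal model."""
    sigma = math.hypot(sigma_u, sigma_v)  # sqrt(sigma_u^2 + sigma_v^2)
    lam = sigma_u / sigma_v
    z = eps / sigma
    log_phi = -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)
    return math.log(2.0 / sigma) + log_phi + math.log(std_normal_cdf(-z * lam))


def log_likelihood(residuals, sigma_u: float, sigma_v: float) -> float:
    """Sum of log densities over a list of frontier residuals."""
    return sum(composed_error_logpdf(e, sigma_u, sigma_v) for e in residuals)


# Hypothetical residuals and scale parameters, for illustration only.
ll = log_likelihood([-0.9, -0.2, 0.3], sigma_u=1.0, sigma_v=0.5)
print(ll)
```

The density is skewed to the left whenever λ is positive, and it collapses to the symmetric normal density as λ approaches zero, in which case the frontier model reduces to OLS.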

From these estimates, the technical efficiency, TE, of each observation is computed as $TE_{ij,t} = \exp\{-E[u_{ij,t} \mid \varepsilon_{ij,t}]\}$, that is,

$$TE_{ij,t} = \exp\left\{-\frac{\sigma_u \sigma_v}{\sigma}\left[\frac{\phi(\eta_{ij,t})}{1 - \Phi(\eta_{ij,t})} - \eta_{ij,t}\right]\right\}, \tag{5}$$

where $\eta_{ij,t} = \varepsilon_{ij,t}\lambda / \sigma$, $\phi$ is the standard normal density function, and $\Phi$ is the cumulative standard normal distribution function. Once efficiency is computed for each observation, average technical efficiency can be calculated for any country or for any group of countries.
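The efficiency calculation can be sketched in a few lines. The code below assumes the Jondrow et al. (1982) conditional mean with scale factor σ* = σuσv/σ and takes TE as the exponential of minus that mean. The residuals, group labels, and parameter values are hypothetical.

```python
import math
from statistics import fmean


def std_normal_pdf(z: float) -> float:
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def std_normal_upper_tail(z: float) -> float:
    """1 - Phi(z), computed with erfc so it stays accurate for large z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def technical_efficiency(eps: float, sigma_u: float, sigma_v: float) -> float:
    """TE = exp(-E[u | eps]) for the normal-half normal model with eps = v - u."""
    sigma = math.hypot(sigma_u, sigma_v)
    lam = sigma_u / sigma_v
    sigma_star = sigma_u * sigma_v / sigma  # assumed Jondrow et al. scale factor
    eta = eps * lam / sigma
    expected_u = sigma_star * (std_normal_pdf(eta) / std_normal_upper_tail(eta) - eta)
    return math.exp(-expected_u)


# Hypothetical frontier residuals grouped by country or region.
residuals_by_group = {"group_a": [-0.9, -0.4, 0.1], "group_b": [-0.2, 0.3, 0.5]}
sigma_u_hat, sigma_v_hat = 1.0, 0.5  # illustrative estimates

average_te = {
    name: fmean([technical_efficiency(e, sigma_u_hat, sigma_v_hat) for e in eps])
    for name, eps in residuals_by_group.items()
}
print(average_te)
```

Each score lies strictly between zero and one and rises with the residual, so observations far below the frontier receive low efficiency and group averages can be ranked directly.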

Several papers have employed the approach of Jondrow et al. (1982) or of Kumbhakar and Knox Lovell (2000). For instance, Hunt-McCool et al. (1996) calculate the maximum stock price for initial public offerings and compare the OLS and stochastic frontier estimates to see whether the offerings have been systematically underpriced. Kaparakis et al. (1994) compare cost efficiencies of commercial banks in different U.S. states and calculate the technical inefficiency of each individual bank. Huang and Bagi (1984) compute the level of inefficiency for 151 individual farms in Northwest India and find that it ranges from 4.0 to 22.38 per cent. In all these papers, the dependent variables are expressed in logarithmic terms.

In this paper, we have adopted the normal-half normal distribution of vij,t and uij,t. Other distributional assumptions can be made. Instead of the half normal distribution, Kumbhakar and Knox Lovell (2000, pp. 80-89) suggest exponential, truncated normal, or gamma distributions and show that the numerical values of the technical efficiency are sensitive to the choice of the particular distributions. Yet, relative efficiency measures across observations are shown not to be critically dependent on the particular distribution: see Kumbhakar and Knox Lovell (2000, p. 90). Use of the normal-half normal in this paper will provide useful relative efficiencies of a given country or of a given group of countries.

II. The Trade Gravity Equation

The GE takes the form

$$\ln T_{ij,t} = \beta_0 + \beta_1 \ln (Y_{i,t}Y_{j,t}) + \beta_2 (Y_{i,t}Y_{j,t}/N_{i,t}N_{j,t}) + \beta_3 D_{ij} + \beta_4 F_{ij,t} + \varepsilon_{ij,t}, \tag{6}$$

where $T_{ij,t}$ is the value of bilateral trade between country i and country j in year t measured in constant U.S. dollars, $Y_{i,t}$ is real gross domestic product (GDP) of country i in year t also measured in U.S. dollars, $N_{i,t}$ is the population of country i in year t, $D_{ij}$ is the distance between i and j, $F_{ij,t}$ is a vector of other factors, and $\varepsilon_{ij,t}$ is a disturbance term.
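To make the continuous regressors concrete, the sketch below assembles them for one country pair in one year. The class and function names, and all numbers, are hypothetical. The regressors follow equation (6) as printed; in Rose (2000) the per capita income product and distance enter in logs, which would only require wrapping those two terms in `math.log`.

```python
import math
from dataclasses import dataclass


@dataclass
class CountryYear:
    gdp: float         # real GDP in U.S. dollars
    population: float  # number of inhabitants


def gravity_regressors(i: CountryYear, j: CountryYear, distance: float) -> dict:
    """Continuous right-hand-side variables of the gravity equation for one pair-year."""
    income_product = i.gdp * j.gdp
    per_capita_product = income_product / (i.population * j.population)
    return {
        "ln_income_product": math.log(income_product),
        "per_capita_income_product": per_capita_product,
        "distance": distance,
    }


# Hypothetical pair: values are made up for illustration.
home = CountryYear(gdp=1000.0, population=10.0)
partner = CountryYear(gdp=10.0, population=1.0)
row = gravity_regressors(home, partner, distance=500.0)
print(row)
```

Because the income product is symmetric in i and j, the same row describes the pair regardless of which country is listed first.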

Equation (6) is the same as the GE derived by Bergstrand (1989, equation 1), except for the fact that Bergstrand’s is expressed in nominal rather than in real terms. Equation (6) is also the same equation used by, among others, Rose (2000, 2002, and 2003).

Vector $F$ includes a fairly comprehensive list of variables that affect bilateral trade, such as dummy variables indicating whether the two countries belong to the same regional trade agreement (RTA), share a common currency, a common border, a common language, or a common colonizer in the past, or whether one country colonized the other. The RTA dummy proxies for a tariff variable indicating preferential trading. Its coefficient should be positive if countries belonging to RTAs trade more than countries that do not belong to an RTA. We also add an “interregional” dummy variable, which is equal to one if the two countries belong to two separate RTAs. Its coefficient is negative if there is trade diversion. Common currency, common border, common language, and common colonizer are all trade enhancing. Finally, we add year dummies to control for idiosyncratic differences across calendar years.
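The two RTA-related dummies can be illustrated with a small helper. The sketch below makes a simplifying assumption that is not stated in the paper: each country belongs to at most one RTA, recorded as a label or `None`. The labels and the function name are hypothetical.

```python
from typing import Optional


def rta_dummies(rta_i: Optional[str], rta_j: Optional[str]) -> dict:
    """Return the same-RTA and interregional dummies for one country pair.

    Assumes each country belongs to at most one RTA (None means no membership).
    """
    both_members = rta_i is not None and rta_j is not None
    return {
        "rta": int(both_members and rta_i == rta_j),            # same agreement
        "interregional": int(both_members and rta_i != rta_j),  # two separate agreements
    }


# Hypothetical memberships, using placeholder agreement labels.
same_bloc = rta_dummies("RTA_A", "RTA_A")
across_blocs = rta_dummies("RTA_A", "RTA_B")
one_outsider = rta_dummies("RTA_A", None)
print(same_bloc, across_blocs, one_outsider)
```

Under this coding the two dummies are never both equal to one, and pairs with at least one non-member form the reference group against which trade creation and trade diversion are measured.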