Least-squares estimation of trip distribution parameters


FRANK J. CESARIO
Cornell University, Ithaca, New York 14850, U.S.A.


Abstract-A least-squares approach for estimating parameters of trip distribution models commonly used in urban and regional transportation studies is developed. For a small reference system with three origins and three destinations, parameters estimated by least squares are compared to parameters estimated by the maximum likelihood procedure developed by Hyman and Evans. Some ideas for further research are given.

1. INTRODUCTION


Recent spatial interaction literature has contained some important and interesting approaches for parameter estimation which systematically make use of observed interaction data (see Hyman, 1969; Evans, 1971; Batty and Mackie, 1972). These approaches have tended to focus on the employment of the principle of maximum likelihood. In this paper, the alternative principle of least squares is examined. Specifically, a method for obtaining least-squares estimates of spatial interaction parameters is developed and results from using this technique are compared with results obtained from maximum-likelihood estimation. Trip-making for some specified purpose will be the interaction of interest.

2. A LEAST-SQUARES APPROACH

In typical fashion, we let $t_{ij}$ represent the number of trips observed to take place between origin $i$ and destination $j$ during some time period of finite duration. If there are $N$ origins and $M$ destinations we can write

$$O_i = \sum_{j=1}^{M} t_{ij}, \quad i = 1, 2, \ldots, N \tag{1}$$

$$D_j = \sum_{i=1}^{N} t_{ij}, \quad j = 1, 2, \ldots, M \tag{2}$$




to denote the total number of trips emanating from origin $i$ and the total number of trips terminating at destination $j$, respectively. The total number of trips made in the system will be denoted by $W$. The above quantities are depicted in Table 1. In addition, if we let $c_{ij}$ represent the generalized cost of travel between $i$ and $j$, the total cost of travel in the system, $C$, is given by

$$C = \sum_{i=1}^{N} \sum_{j=1}^{M} t_{ij} c_{ij} \tag{3}$$

for the observed trip distribution.
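As a small numerical illustration of (1), (2) and (3), the following Python sketch computes the origin totals, destination totals, total trips and total cost for a 3 x 3 system. The cell values `t` and the cost matrix `c` are assumptions made for illustration only. They are chosen so that the margins equal the totals 7, 12, 17 and 19, 11, 6 that appear in Table 4, but they are not taken from Table 2.

```python
import numpy as np

# Hypothetical data (assumed for illustration; not the paper's Table 2).
t = np.array([[4.0, 2.0, 1.0],
              [5.0, 4.0, 3.0],
              [10.0, 5.0, 2.0]])   # t[i, j]: observed trips from origin i to destination j
c = np.array([[1.0, 2.0, 3.0],
              [2.0, 1.0, 2.0],
              [3.0, 2.0, 1.0]])    # c[i, j]: generalized cost of travel from i to j

O = t.sum(axis=1)    # equation (1): trips emanating from each origin
D = t.sum(axis=0)    # equation (2): trips terminating at each destination
W = t.sum()          # total number of trips made in the system
C = (t * c).sum()    # equation (3): total cost of travel

print(O, D, W, C)
```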


Table 1. Trip distribution tableau



Let $\hat{t}_{ij}$ be an estimate of $t_{ij}$. A general class of trip distribution models can be written (Hyman, 1969; Wilson, 1971; Cesario, 1973):

$$\hat{t}_{ij} = G U_i V_j f(c_{ij}) \tag{4}$$


where $G$ is a constant of proportionality; $U_i$ is an origin-specific term; $V_j$ is a destination-specific term; and $f$ is an arbitrary function with one or more unknown parameters. The terms $U_i$ and $V_j$ are subject to several different interpretations, but in this paper it will be assumed that these quantities are unknown and must be estimated from the data. For convenience, we assume sets $\{U_i\}$ and $\{V_j\}$ to be each specified up to a multiplicative constant (Cesario, 1973).

It has been shown by Hyman (1969) and by Evans (1971) that if the $t_{ij}$'s are considered as independent observations and $q_{ij}$ represents the probability that a trip is made from origin $i$ to destination $j$, it follows that the $t_{ij}$'s can be considered as samples from a multinomial distribution. The maximum likelihood estimates of $U_i$ and $V_j$ are therefore obtained by maximizing the appropriate likelihood function subject to any relevant constraints (see the appendix for more details).

Another approach might involve the principle of least squares. At the very least, it would be informative to compare results obtained by least squares with those obtained by this maximum-likelihood estimation procedure. Thus,

$$t_{ij} = G U_i V_j f(c_{ij}) + e_{ij} \tag{5}$$


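may be taken to represent the data, where $e_{ij}$ is an error term. The least-squares principle involves minimization of the function

$$\sum_{i=1}^{N} \sum_{j=1}^{M} e_{ij}^{2} = \sum_{i=1}^{N} \sum_{j=1}^{M} (t_{ij} - \hat{t}_{ij})^{2} = \sum_{i=1}^{N} \sum_{j=1}^{M} \left[ t_{ij} - G U_i V_j f(c_{ij}) \right]^{2} \tag{6}$$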


subject to any relevant constraints. While no further assumptions need to be made in order to obtain point estimates of parameters, it should be pointed out that in the special case where it is assumed that the eij terms are independent and normally distributed with a zero mean and constant variance, least-squares estimates of parameters are also maximum-likelihood estimates (Draper and Smith, 1967). Such an assumption is required in order to “test” parameters for statistical significance. [Because these new distributional assumptions (i.e. normally-distributed errors) differ from those of Hyman and Evans, who used multinomial probabilities, we do not expect estimates obtained by both approaches to agree.]
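The error sum of squares in (6) is straightforward to evaluate once trial values of the parameters are in hand. The sketch below is only an illustration: the function name `sum_of_squares`, the convention of passing the cost-function values as a precomputed array `f`, and the 2 x 2 numbers are all assumptions, not part of the original analysis.

```python
import numpy as np

def sum_of_squares(t, U, V, f, G=1.0):
    """Error sum of squares for the model t_hat[i, j] = G * U[i] * V[j] * f[i, j]."""
    t_hat = G * np.outer(U, V) * f   # estimated trips for every origin-destination pair
    e = t - t_hat                    # error terms e_ij
    return float((e ** 2).sum())

# Hypothetical 2 x 2 illustration (all values assumed).
t = np.array([[4.0, 2.0],
              [5.0, 4.0]])
U = np.array([1.0, 2.0])
V = np.array([2.0, 1.0])
f = np.ones((2, 2))                  # cost function values, here all equal to one

S = sum_of_squares(t, U, V, f)
print(S)   # 10.0, since the errors are [[2, 1], [1, 2]]
```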

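Assume a cost function with a single parameter $\beta$, for example, $f(c_{ij}) = c_{ij}^{\beta}$ or $f(c_{ij}) = \exp(\beta c_{ij})$. We now denote the cost function by $f(c_{ij}; \beta)$. Further, with no loss of generality (since $U_i$ and $V_j$ are each specified up to multiplicative constants) we can let $G = 1$ in (4). Least-squares estimates of parameters are now obtained by minimizing with respect to $\{U_i\}$, $\{V_j\}$ and $\beta$ the function

$$S = \sum_{i=1}^{N} \sum_{j=1}^{M} \left[ t_{ij} - U_i V_j f(c_{ij}; \beta) \right]^{2} \tag{7}$$

subject to any relevant constraints. At this point we note that the usual constraints are denoted by (1), (2) and (3) where $t_{ij}$ is replaced by $\hat{t}_{ij}$. That is, it is generally required that the sum of the estimated numbers of trips emanating from each origin and terminating at each destination equal the corresponding observed sums, and that the estimated total trip cost equals the observed total (Wilson, 1970). It is important to point out that the use of these constraints is bound in tradition and at the outset of the data analysis there is no compelling reason why they should be imposed on the estimation process. The required satisfaction of these constraints is viewed to be relevant only in the case when no $t_{ij}$ data are observed. When such data are available the goal is to obtain "good" estimates of $t_{ij}$. It follows that the "better" the estimates, the more closely will the above constraints be satisfied. Nevertheless, it will be seen that unconstrained minimization of (7) will naturally result in a certain degree of consistency between observed and estimated origin and destination totals.

Performing the appropriate optimizing calculations on (7) we get

$$\frac{\partial S}{\partial U_i} = -2 \sum_{j=1}^{M} \left[ t_{ij} - U_i V_j f(c_{ij}; \beta) \right] V_j f(c_{ij}; \beta) = 0, \quad i = 1, 2, \ldots, N \tag{8}$$

$$\frac{\partial S}{\partial V_j} = -2 \sum_{i=1}^{N} \left[ t_{ij} - U_i V_j f(c_{ij}; \beta) \right] U_i f(c_{ij}; \beta) = 0, \quad j = 1, 2, \ldots, M \tag{9}$$

$$\frac{\partial S}{\partial \beta} = -2 \sum_{i=1}^{N} \sum_{j=1}^{M} \left[ t_{ij} - U_i V_j f(c_{ij}; \beta) \right] U_i V_j \frac{\partial f(c_{ij}; \beta)}{\partial \beta} = 0 \tag{10}$$

as first order conditions. (It can easily be shown that the second order conditions necessary and sufficient for a minimization of $S$ hold.) The above equations can be rewritten

$$U_i = \frac{\sum_{j=1}^{M} t_{ij} V_j f(c_{ij}; \beta)}{\sum_{j=1}^{M} V_j^{2} f(c_{ij}; \beta)^{2}}, \quad i = 1, 2, \ldots, N \tag{11}$$

$$V_j = \frac{\sum_{i=1}^{N} t_{ij} U_i f(c_{ij}; \beta)}{\sum_{i=1}^{N} U_i^{2} f(c_{ij}; \beta)^{2}}, \quad j = 1, 2, \ldots, M \tag{12}$$

$$\sum_{i=1}^{N} \sum_{j=1}^{M} \left[ t_{ij} - U_i V_j f(c_{ij}; \beta) \right] U_i V_j \frac{\partial f(c_{ij}; \beta)}{\partial \beta} = 0 \tag{13}$$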





Making use of some of the Evans (1971) results which show that $S$ increases monotonically with $\beta$, it has been found that the solution of simultaneous nonlinear equations (11), (12) and (13) can be obtained efficiently by the iterative procedure depicted in the flow chart of Fig. 1. An initial $\beta$, $\beta^{(0)}$, and an initial destination vector, $V^{(0)}$, are selected arbitrarily. An initial origin vector $U^{(0)}$ is then calculated using (11). Then, alternately using equations (12) and (11), successive vectors $[U^{(1)}, V^{(1)}], [U^{(2)}, V^{(2)}], \ldots, [U^{(k)}, V^{(k)}]$ are computed until convergence to a stable solution is achieved for the given $\beta$ (that is, $|U^{(k)} - U^{(k-1)}| < \delta$ and $|V^{(k)} - V^{(k-1)}| < \delta$ for small $\delta$). Next, an error sum of squares $S^{(0)}$ is calculated using (7). The value of $\beta^{(0)}$ is incremented to $\beta^{(1)}$ and the above procedure is repeated to obtain an estimate $S^{(1)}$. The value of $\beta^{(1)}$ is now incremented to $\beta^{(2)}$, the direction of change being given by the sign of the difference between $S^{(1)}$ and $S^{(0)}$, so that $\beta$ moves in the direction in which $S$ decreases. Now that the appropriate direction for changes in $\beta$ has been found, the above process is repeated until, eventually, it is the case that $[S^{(k-2)} - S^{(k-1)}]$ is of a different sign than $[S^{(k-1)} - S^{(k)}]$. This indicates that the optimal $\beta$ to satisfy (14) lies somewhere in the range $[\beta^{(k-2)}, \beta^{(k)}]$. Any unidimensional search technique (e.g. quadratic search, Newton-Raphson) can be used to find it.

Fig. 1. (Flow chart of the iterative procedure; its final step searches the interval $(\beta^{(k-2)}, \beta^{(k)})$ for the optimal $\beta$.)

It was mentioned previously that solutions obtained by means of simultaneous equations (11), (12) and (13) will naturally result in a certain degree of consistency between observed and estimated origin and destination trip totals. To see this, we note that (8) and (9) can be written

$$\sum_{j=1}^{M} t_{ij} \hat{t}_{ij} = \sum_{j=1}^{M} \hat{t}_{ij}^{2}, \quad i = 1, 2, \ldots, N$$

$$\sum_{i=1}^{N} t_{ij} \hat{t}_{ij} = \sum_{i=1}^{N} \hat{t}_{ij}^{2}, \quad j = 1, 2, \ldots, M$$

respectively. Consistency is achieved not on the simple origin and destination sums but, rather, on the squared sums of row and column elements. If we let

$$\hat{t}_{ij} = k_{ij} t_{ij}$$

where

$$k_{ij} \begin{cases} < 1 & \text{if } \hat{t}_{ij} < t_{ij} \\ = 1 & \text{if } \hat{t}_{ij} = t_{ij} \\ > 1 & \text{if } \hat{t}_{ij} > t_{ij} \end{cases}$$

we can write

$$\sum_{j=1}^{M} k_{ij} t_{ij}^{2} = \sum_{j=1}^{M} k_{ij}^{2} t_{ij}^{2}, \quad i = 1, 2, \ldots, N \tag{17}$$

$$\sum_{i=1}^{N} k_{ij} t_{ij}^{2} = \sum_{i=1}^{N} k_{ij}^{2} t_{ij}^{2}, \quad j = 1, 2, \ldots, M \tag{18}$$

From (17) and (18) it can be inferred that the $k_{ij}$'s along any row or column of the matrix $\{\hat{t}_{ij}\}$ cannot all be greater than 1 or less than 1; unless a "perfect fit" is obtained (in which case all $k_{ij} = 1$) it will always be the case that for every origin and for every destination some trip quantities will be overestimated and some trip quantities will be underestimated.

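In this section the computational algorithm for least-squares estimation depicted in Fig. 1 is used to estimate parameters of the hypothetical (3 x 3) trip distribution system of Table 2. Corresponding results using the maximum-likelihood estimation procedure developed by Hyman are also computed and compared with the least-squares estimates. The algorithm developed by Hyman is described in the appendix. In both cases we let $f(c_{ij}; \beta) = c_{ij}^{\beta}$. Thus, the simultaneous equations corresponding to (11), (12) and (13) may be written as

$$U_i = \frac{\sum_{j=1}^{M} t_{ij} V_j c_{ij}^{\beta}}{\sum_{j=1}^{M} V_j^{2} c_{ij}^{2\beta}}, \quad i = 1, 2, \ldots, N$$

$$V_j = \frac{\sum_{i=1}^{N} t_{ij} U_i c_{ij}^{\beta}}{\sum_{i=1}^{N} U_i^{2} c_{ij}^{2\beta}}, \quad j = 1, 2, \ldots, M$$

$$\sum_{i=1}^{N} \sum_{j=1}^{M} t_{ij} U_i V_j c_{ij}^{\beta} \ln c_{ij} = \sum_{i=1}^{N} \sum_{j=1}^{M} U_i^{2} V_j^{2} c_{ij}^{2\beta} \ln c_{ij}$$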
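The estimation procedure for the power cost function can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the original program: the function names (`fit_uv`, `profile_S`, `bracket_beta`, `golden_section`, `estimate`), the starting value and step for $\beta$, and the tolerances are all hypothetical, and a golden-section search is substituted for the unspecified unidimensional search. The data are synthetic, generated from an assumed model with $\beta = -0.5$ and no error, so that the known answer can be recovered; they are not the data of Table 2.

```python
import numpy as np

def fit_uv(t, f, tol=1e-10, max_iter=10_000):
    """For fixed f[i, j] = f(c_ij; beta), alternate equations (12) and (11)."""
    tf, f2 = t * f, f ** 2
    V = np.ones(t.shape[1])                      # arbitrary initial destination vector
    U = tf @ V / (f2 @ V ** 2)                   # equation (11)
    for _ in range(max_iter):
        V_new = tf.T @ U / (f2.T @ U ** 2)       # equation (12)
        U_new = tf @ V_new / (f2 @ V_new ** 2)   # equation (11)
        change = max(np.abs(U_new - U).max(), np.abs(V_new - V).max())
        U, V = U_new, V_new
        if change < tol:
            break
    return U, V

def profile_S(t, c, beta):
    """Equation (7), evaluated at the U and V fitted for this beta."""
    f = c ** beta
    U, V = fit_uv(t, f)
    return float(((t - np.outer(U, V) * f) ** 2).sum())

def bracket_beta(t, c, beta0=0.0, step=0.1, max_steps=100):
    """Step beta until S stops decreasing; return an interval holding a minimum."""
    b0, b1 = beta0, beta0 + step
    s0, s1 = profile_S(t, c, b0), profile_S(t, c, b1)
    if s1 > s0:                                  # first step went uphill: reverse
        b0, b1, s0, s1, step = b1, b0, s1, s0, -step
    for _ in range(max_steps):
        b2 = b1 + step
        s2 = profile_S(t, c, b2)
        if s2 >= s1:                             # the change in S has switched sign
            return min(b0, b2), max(b0, b2)
        b0, b1, s0, s1 = b1, b2, s1, s2
    raise RuntimeError("S kept decreasing; no bracket found")

def golden_section(g, lo, hi, tol=1e-6):
    """Golden-section search for a minimum of g on [lo, hi]."""
    r = (np.sqrt(5.0) - 1.0) / 2.0
    x1, x2 = hi - r * (hi - lo), lo + r * (hi - lo)
    g1, g2 = g(x1), g(x2)
    while hi - lo > tol:
        if g1 < g2:                              # minimum lies in [lo, x2]
            hi, x2, g2 = x2, x1, g1
            x1 = hi - r * (hi - lo)
            g1 = g(x1)
        else:                                    # minimum lies in [x1, hi]
            lo, x1, g1 = x1, x2, g2
            x2 = lo + r * (hi - lo)
            g2 = g(x2)
    return 0.5 * (lo + hi)

def estimate(t, c):
    lo, hi = bracket_beta(t, c)
    beta = golden_section(lambda b: profile_S(t, c, b), lo, hi)
    U, V = fit_uv(t, c ** beta)
    return beta, U, V

# Synthetic, error-free data (assumed): t_ij = u_i * v_j * c_ij ** -0.5
c = np.array([[1.0, 3.0, 5.0], [2.0, 1.0, 4.0], [6.0, 2.0, 1.0]])
t = np.outer([2.0, 3.0, 5.0], [4.0, 2.0, 1.0]) * c ** -0.5
beta_hat, U_hat, V_hat = estimate(t, c)
print(beta_hat)
```

Because $U_i$ and $V_j$ are each determined only up to a multiplicative constant, the fitted vectors depend on the arbitrary starting vector; the products $U_i V_j$ and the estimated trips do not.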


The estimates of $\beta$ obtained were -0.202 for the least-squares approach and -0.123 for the maximum-likelihood approach. The estimates of $\{U_i\}$ and $\{V_j\}$ obtained by both methods are given in Table 3. It will be noted that considerable correspondence between these origin and destination parameter estimates is apparent. The estimated matrices $\{\hat{t}_{ij}\}$ are given in Table 4. The sum of the squared deviations, $S$, was 10.440 for the least-squares approach and 11.245 for the maximum-likelihood method. Again, the two methods produced similar results. We note that, as expected, the maximum-likelihood procedure resulted in an "exact" satisfaction of the usual constraints on row and column totals (see appendix), while the least-squares method satisfied these constraints only approximately.†

The major difference in the parameter values obtained by both methods appears to be in the estimated values of the cost function parameter, $\beta$. This discrepancy, which is accentuated by the particular numbers chosen in this example problem, can be attributed to the different distributional assumptions employed in the two analyses. When $S$ is large relative to the average $t_{ij}$, large differences in the estimates of $\beta$ can be expected. On the other hand, as $S \to 0$ (i.e. as the model fits the data more and more closely) the estimates of $\beta$ obtained by least squares and by maximum likelihood will converge to the same value.

It is well-known that "significance" tests for the above parameters are not nearly as clearcut as they are in the case of models which are linear in parameters. Nevertheless, approximate tests can be employed (see Draper and Smith, 1967; Mandel, 1969, 1971). Such tests were not conducted and are left as a piece of unfinished business for the inquisitive reader to pursue. In making these tests, the prospective researcher will want to keep in mind that the constancy of the variance in $e_{ij}$ will likely not hold if there is considerable variation in the magnitudes of observed $t_{ij}$'s; it is likely that the variance of $t_{ij}$ increases with its mean. Consequently, some suitable transformation of the dependent variable, in the manner suggested, perhaps, by Box and Cox (1964), might be used (see Cesario (1974)).

Table 2. Hypothetical trip distribution

Table 3. Numerical ...†

Methods: Least-squares ($\beta = -0.202$); Maximum likelihood ($\beta = -0.123$)‡
Estimates for rows 1, 2, 3, one line per column of the table:

1.000   0.604   0.292
1.000   1.795   2.373
1.000   0.583   0.311
1.000   1.894

† For convenience of interpretation and with no loss of generality, quantities in this table have been normalized by letting the first element in each column be equal to unity.
‡ Refer to the appendix for the derivation of the $U_i$ and $V_j$ vectors for this method.



† The computational times for estimating parameters by both methods were nearly identical and are not given here.

In this paper a least-squares approach to parameter estimation in spatial interaction modeling has been explored. Results from using this approach have been compared with the results from use of a maximum-likelihood procedure employing multinomial probability assumptions. It was seen that for the small problem considered the results are similar and, from a computational viewpoint, there are only subtle practical differences in these two approaches. It would be interesting and fruitful, of course, to compare results for larger problems.

A final note is in order. Previous papers have used the terminology "best possible" to describe the results of employing the maximum likelihood principle with multinomial assumptions to the estimation of spatial interaction parameters. In view of the results of this investigation it may be stated that the term "best possible" must be interpreted carefully with respect to the particular estimation principle being employed. The "best possible" estimates using multinomial distributional assumptions differ from "best possible" estimates obtained by using least squares.

Table 4. Estimated trip distributions†

                                   Destinations
Origins          1                 2                 3                 Row totals
1                3.325 (3.428)     2.779 (2.436)     1.075 (1.136)     7.178 (7.000)
2                5.476 (5.651)     4.577 (4.015)     2.539 (2.334)     12.592 (12.000)
3                9.964 (9.921)     4.069 (4.549)     2.101 (2.530)     16.134 (17.000)
Column totals    18.764 (19.000)   11.424 (11.000)   5.715 (6.000)     35.904 (36.000)

† Estimates obtained by maximum likelihood are given in parentheses. These quantities have not been rounded off, as is the usual custom, to demonstrate the relationship between estimates obtained by both methods. Rounding would make results obtained by both methods appear equivalent.
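The statement that the maximum-likelihood estimates satisfy the row and column constraints exactly, while the least-squares estimates do so only approximately, can be checked from the cell entries of Table 4. The Python sketch below does this; the variable names are hypothetical, and the observed totals (7, 12, 17 and 19, 11, 6) are taken to be the maximum-likelihood totals shown in parentheses in the table, which is an inference from the text rather than a value read from Table 2.

```python
import numpy as np

# Cell entries of Table 4 (rows: origins 1-3, columns: destinations 1-3).
ls = np.array([[3.325, 2.779, 1.075],
               [5.476, 4.577, 2.539],
               [9.964, 4.069, 2.101]])   # least-squares estimates
ml = np.array([[3.428, 2.436, 1.136],
               [5.651, 4.015, 2.334],
               [9.921, 4.549, 2.530]])   # maximum-likelihood estimates

# Observed totals, assumed equal to the parenthesized totals in Table 4.
observed_rows = np.array([7.0, 12.0, 17.0])
observed_cols = np.array([19.0, 11.0, 6.0])

ml_row_gap = ml.sum(axis=1) - observed_rows   # zero up to floating-point error
ml_col_gap = ml.sum(axis=0) - observed_cols
ls_row_gap = ls.sum(axis=1) - observed_rows   # nonzero: constraints hold only approximately
ls_col_gap = ls.sum(axis=0) - observed_cols

print(np.round(ls_row_gap, 3), np.round(ls_col_gap, 3))
```

The largest least-squares discrepancy is in the third origin total, which the printed table gives as 16.134 against an observed 17.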

6. REFERENCES

Batty M. and Mackie S. (1972) The calibration of gravity, entropy and related models of spatial interaction. Environment and Planning 4, 205-234.
Box G. E. P. and Cox D. R. (1964) An analysis of transformations. J. R. Statist. Soc. B 26, 211-243.
Cesario F. J. (1973) A generalized trip distribution model. J. Regional Sci. 12, 233-248.
Cesario F. J. (1974) More on the generalized trip distribution model. J. Regional Sci. 13, to be published.
Draper N. R. and Smith H. (1967) Applied Regression Analysis. Wiley, New York.
Evans A. W. (1970) Some properties of trip distribution methods. Transpn Res. 4, 19-36.
Evans A. W. (1971) The calibration of trip distribution models with exponential or similar cost functions. Transpn Res. 5, 15-38.
Hyman G. M. (1969) The calibration of trip distribution models. Environment and Planning 1, 105-112.
Mandel J. (1969) The partitioning of interaction in analysis of variance. J. Res. natn. Bur. Stand. 73B, 309-328.
Mandel J. (1971) A new analysis of variance model for non-additive data. Technometrics 13, 1-18.
Wilson A. G. (1970) Entropy in Urban and Regional Modelling. Pion, London.
Wilson A. G. (1971) A family of spatial interaction models and associated developments. Environment and Planning 3, 1-32.

APPENDIX

This appendix presents in modified form the essential ingredients of the method developed by Hyman (1969) for estimating by maximum likelihood the parameters of the general function

$$q_{ij} = U_i V_j f(c_{ij}; \beta) \tag{A1}$$

where $q_{ij}$ is the probability of a trip from $i$ to $j$ and other terms are as defined in the main body of this paper. We let $p_{ij} = t_{ij}/W$ and $q_{ij} = \hat{t}_{ij}/W$. It must be the case, of course, that

$$\sum_{i=1}^{N} \sum_{j=1}^{M} q_{ij} = 1 \tag{A2}$$

Assuming that the $t_{ij}$'s are independent, and ignoring sample size issues, the likelihood function may be presumed to be proportional to the multinomial distribution, viz:

$$L = \prod_{i=1}^{N} \prod_{j=1}^{M} q_{ij}^{p_{ij}} \tag{A3}$$

Maximum likelihood estimates are obtained by maximizing (A3) subject to constraint (A2) on the $q_{ij}$'s. (It is usually easier, however, to maximize the logarithm of $L$ rather than $L$ itself.) Hence, letting $\lambda$ be a Lagrangean multiplier, we find values of $\{U_i\}$, $\{V_j\}$ and $\beta$ which maximize

$$L^{*} = \sum_{i=1}^{N} \sum_{j=1}^{M} p_{ij} \ln q_{ij} - \lambda \left( \sum_{i=1}^{N} \sum_{j=1}^{M} q_{ij} - 1 \right) = \sum_{i=1}^{N} \sum_{j=1}^{M} p_{ij} \left[ \ln U_i + \ln V_j + \ln f(c_{ij}; \beta) \right] - \lambda \left[ \sum_{i=1}^{N} \sum_{j=1}^{M} U_i V_j f(c_{ij}; \beta) - 1 \right] \tag{A4}$$

If we let

$$f(c_{ij}; \beta) = c_{ij}^{\beta}$$

and maximize (A4) by the usual methods, the following three equations are obtained (summing the condition for $U_i$ over $i$ and using (A2) gives $\lambda = 1$, since the $p_{ij}$ also sum to unity):

$$U_i = \frac{\sum_{j=1}^{M} p_{ij}}{\sum_{j=1}^{M} V_j c_{ij}^{\beta}}, \quad i = 1, 2, \ldots, N \tag{A8}$$

$$V_j = \frac{\sum_{i=1}^{N} p_{ij}}{\sum_{i=1}^{N} U_i c_{ij}^{\beta}}, \quad j = 1, 2, \ldots, M \tag{A9}$$

$$\sum_{i=1}^{N} \sum_{j=1}^{M} p_{ij} \ln c_{ij} = \sum_{i=1}^{N} \sum_{j=1}^{M} q_{ij} \ln c_{ij} \tag{A10}$$

Equations (A8), (A9) and (A10) can be solved simultaneously by an iterative process similar to that depicted in Fig. 1. It is easily inferred from these equations that

$$\sum_{j=1}^{M} q_{ij} = \sum_{j=1}^{M} p_{ij}, \quad i = 1, 2, \ldots, N; \qquad \sum_{i=1}^{N} q_{ij} = \sum_{i=1}^{N} p_{ij}, \quad j = 1, 2, \ldots, M \tag{A11}$$

Thus, the estimated proportions of trips emanating from each origin and terminating at each destination are equal to the observed proportions. It follows that if it is assumed that $W$ total trips are made, then estimates of origin and destination trip totals will equal observed totals.
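The property that maximum-likelihood estimates reproduce the observed origin and destination proportions can be illustrated numerically. The Python sketch below alternates the two balancing conditions for a fixed $\beta$, reading (A8) as $U_i = \sum_j p_{ij} / \sum_j V_j c_{ij}^{\beta}$ and (A9) as the corresponding expression for $V_j$; that reading, the function name `ml_fit_uv`, the trip and cost matrices, and the tolerance are all assumptions. The value $\beta = -0.123$ is borrowed from the reported estimate purely for illustration and is not re-estimated here, so the condition on $\beta$ in (A10) is not imposed.

```python
import numpy as np

def ml_fit_uv(p, c, beta, tol=1e-12, max_iter=10_000):
    """Alternate the origin and destination balancing conditions for fixed beta."""
    f = c ** beta
    row, col = p.sum(axis=1), p.sum(axis=0)   # observed origin and destination proportions
    V = np.ones(p.shape[1])                   # arbitrary initial destination vector
    for _ in range(max_iter):
        U = row / (f @ V)                     # condition for U_i (read here as (A8))
        V_new = col / (f.T @ U)               # condition for V_j (read here as (A9))
        change = np.abs(V_new - V).max()
        V = V_new
        if change < tol:
            break
    return U, V

# Hypothetical data (assumed for illustration; not the paper's Table 2).
t = np.array([[4.0, 2.0, 1.0],
              [5.0, 4.0, 3.0],
              [10.0, 5.0, 2.0]])
c = np.array([[1.0, 2.0, 3.0],
              [2.0, 1.0, 2.0],
              [3.0, 2.0, 1.0]])
W = t.sum()
p = t / W                                     # observed proportions p_ij

beta = -0.123                                 # illustrative value, not re-estimated
U, V = ml_fit_uv(p, c, beta)
q = np.outer(U, V) * c ** beta                # estimated proportions q_ij

print(np.round(W * q.sum(axis=1), 6))         # estimated origin totals
print(np.round(W * q.sum(axis=0), 6))         # estimated destination totals
```

With these assumed data the estimated origin and destination totals agree with the observed totals to within the convergence tolerance, as (A11) implies.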