Thanks Smarmet, this helps. Yes, I had assumed the total Ti was invariant, which is not correct.

The denominator (and the mi*nj in the numerator) comes from two probabilities: first, the probability that a commuter does not find a good job closer than the target area (mi/[mi + sij]), and second, the probability that they do find a good job in the target area (nj/[mi + nj + sij]). The product of those two probabilities gives the probability that a commuter from county i will end up in target area j. That probability is multiplied by the number of commuters from county i (Ti) to give the number of commuters expected from i to j (Tij).
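A minimal sketch of that two-step reading in Python. The function names and the population figures are made up for illustration; only the two ratios and their product come from the explanation above.

```python
def p_no_closer_job(m_i, s_ij):
    """Chance that nothing among the s_ij people living closer than the
    target beats the best offer at home (home population m_i)."""
    return m_i / (m_i + s_ij)


def p_job_in_target(m_i, n_j, s_ij):
    """Chance that the target area (population n_j) then supplies the job."""
    return n_j / (m_i + n_j + s_ij)


def expected_commuters(T_i, m_i, n_j, s_ij):
    """Tij = Ti * mi * nj / [(mi + sij) * (mi + nj + sij)]."""
    return T_i * p_no_closer_job(m_i, s_ij) * p_job_in_target(m_i, n_j, s_ij)


# Hypothetical numbers: 1,000 commuters leave a county of 50,000 people;
# 30,000 people live closer than the target area, which holds 20,000.
flow = expected_commuters(T_i=1000, m_i=50_000, n_j=20_000, s_ij=30_000)
print(round(flow, 2))  # 1000 * 0.625 * 0.2, i.e. about 125 commuters
```

Keeping the two probabilities as separate functions makes it easy to see which part of the denominator comes from which step.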

"Tij = Ti mi nj / (mi + sij)(mi + nj + sij)"

Why the addition of the Ti term here which wasn't in the gravity model? That seems unrelated to the gravity v radiation issue.

Is there a simple explanation for why the denominator takes the form it does?

Just to clarify what I wrote above: if the model predicts 10 people will commute from Torrance to Pasadena, it should also predict 1 commuter from Central Torrance to Pasadena, and 9 from CT to other pieces of Torrance. Changing the borders you consider changes the total number of commuters: someone going from Central Torrance to North Torrance is a commuter if the two are treated separately, but is staying home if your model only considers Torrance as a whole. Adding or subtracting 90 commuters changes the numbers a bit (by changing who is considered a commuter), but the real-world predictions of the model are invariant to county boundaries.

Thanks Smarmet. I admit that I haven't engaged with the derivation or even with your summary yet, but I do feel that they will have quite a burden of proof.

Again, assuming I understand the formula, it predicts that people from each tenth of Torrance (assuming Torrance is divided into 10 arbitrary regions) are ~1/10 as likely to commute to Pasadena as the people of Torrance are. This despite the fact that the people from each tenth of Torrance *are* in aggregate the people of Torrance.

OK, maybe the model assumes something special about county boundaries, such that I'm not free to apply it to arbitrarily defined regions like this. Guess I do have some reading to do...

Within a large country, the relevant distances tend to be "practical to drive to" and "impractical to drive to" - if you live in New York State, moving to California and moving to Iowa both involve about the same amount of disruption...

Oops, messed up the link formatting something awful. Can I (or a mod) edit posts?

This bothered me for a while too. There is a derivation of the formula in section 2 of this pdf [link missing], but it didn't make things much clearer for me. After thinking for a while, these are my thoughts: If Torrance and Central Torrance are considered separate, adjacent counties, closer to each other than to Pasadena, then I think the model does predict 100 times more commuters from T to P than from CT to P. Basically the distance (in population) is scaled by the population of the home county (a more populous county produces higher z values, and thus its commuters travel over more people to find a commute-worthy spot). The reason splitting Torrance up into 10 pieces and adding each piece's commuters doesn't add up to Torrance's commuters seems to be that the total number of commuters changes. Most of Central Torrance's commuters go to other parts of Torrance, and the same for each other piece. When Torrance is taken as a whole those commuters disappear. So while CT has an mi 1/10th of Torrance's, it has the same Ti (and all the pieces combined have 10X the Ti). Adding up all the pieces seems to produce the desired result then.
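Here is a rough numerical check of that argument. All figures are invented, and the sketch assumes the other nine pieces of Torrance are each piece's nearest neighbours and all lie closer than Pasadena. It compares two readings of Ti for a piece: the same Ti as the whole county (the reading argued for above) versus Ti split in proportion to population.

```python
def radiation_flow(T_i, m_i, n_j, s_ij):
    """Tij = Ti * mi * nj / [(mi + sij) * (mi + nj + sij)]."""
    return T_i * m_i * n_j / ((m_i + s_ij) * (m_i + n_j + s_ij))


# Hypothetical figures, chosen only to make the arithmetic easy to follow.
TORRANCE_POP = 100_000
TORRANCE_COMMUTERS = 10_000   # Ti when Torrance is one unit
PASADENA_POP = 50_000
POP_BETWEEN = 200_000         # people outside Torrance but closer than Pasadena
PIECES = 10

whole = radiation_flow(TORRANCE_COMMUTERS, TORRANCE_POP, PASADENA_POP, POP_BETWEEN)

# For one piece, the other nine pieces now count as population "in between".
piece_pop = TORRANCE_POP / PIECES
piece_s = POP_BETWEEN + (TORRANCE_POP - piece_pop)

# Reading 1: each piece has the same Ti as Torrance as a whole.
same_ti_total = PIECES * radiation_flow(
    TORRANCE_COMMUTERS, piece_pop, PASADENA_POP, piece_s)

# Reading 2: each piece gets one tenth of Torrance's Ti.
split_ti_total = PIECES * radiation_flow(
    TORRANCE_COMMUTERS / PIECES, piece_pop, PASADENA_POP, piece_s)

# Share of a piece's commuters absorbed by the other nine pieces
# before they ever leave Torrance.
stay_share = 1 - piece_pop / TORRANCE_POP

print(whole, same_ti_total, split_ti_total, stay_share)
```

Under the first reading the ten pieces add back up to the whole-county prediction, and nine tenths of each piece's commuters stay inside Torrance. Under the second reading the pieces sum to only a tenth of it, which is the discrepancy that started this sub-thread.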

Distance is paramount. It provides latency.

Offtopic - have you imagined replacing traders of the City of London with emulations? :)

Brilliant. I've been saying something similar for years. Laws are the topography of our otherwise flat world.

I would be curious to see what happens if you use travel time rather than straight-line distance to do this analysis. The more important issue, however, is that the gravity model ignores the presence of other alternatives (anything other than m_{i} or n_{j}), so in many ways it is a straw man and its poor performance is no surprise. Something like the Huff model would have been a better choice for comparison (http://www.esri.com/library...), especially since that is a probabilistic model. The other issue that I can't seem to work out is whether you can relax the assumption that one of these variables is people. Can we make n_{j} a store, and use square footage instead of population?
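For readers who haven't met it, here is a rough sketch of a Huff-style calculation as it is usually stated: each store's pull is its size divided by travel time raised to a decay exponent, normalised over all the alternatives. This is not taken from the paper under discussion; the function name, the exponent of 2.0 and the store figures are all placeholders.

```python
def huff_probabilities(sizes, travel_times, decay=2.0):
    """P_j = (S_j / t_j**decay) / sum_k (S_k / t_k**decay).

    sizes        -- attractiveness of each store, e.g. square footage
    travel_times -- travel time from the customer to each store
    decay        -- assumed distance-decay exponent (a free parameter)
    """
    weights = [s / t ** decay for s, t in zip(sizes, travel_times)]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical stores: bigger ones are further away.
store_sqft = [10_000, 20_000, 40_000]
minutes_away = [5.0, 10.0, 20.0]

probs = huff_probabilities(store_sqft, minutes_away)
print(probs)  # shares of 4/7, 2/7 and 1/7 for these inputs
```

The normalisation over every store is what brings the other alternatives into the picture, and the size term is where square footage stands in for population.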

The radiation model is very interesting but I don't understand the mi in the numerator.

Assume that Central Torrance contains 10% of Torrance's residents.

Then if I understand it, the radiation model predicts that the number of people who commute from Central Torrance to Pasadena is only 1/100 the number of people who commute from Torrance to Pasadena (since Central Torrance's Ti and mi are each 1/10 of Torrance's, while each factor in the denominator is the same for Torrance vs Central Torrance).
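To make the worry concrete, here is a small sketch with invented numbers. It assumes, as the question does, that Central Torrance's Ti is one tenth of Torrance's, and it counts the other nine tenths of Torrance as people living between Central Torrance and Pasadena.

```python
def radiation_flow(T_i, m_i, n_j, s_ij):
    """Tij = Ti * mi * nj / [(mi + sij) * (mi + nj + sij)]."""
    return T_i * m_i * n_j / ((m_i + s_ij) * (m_i + n_j + s_ij))


# Hypothetical figures.
TORRANCE_POP = 100_000
TORRANCE_COMMUTERS = 10_000
PASADENA_POP = 50_000
POP_BETWEEN = 200_000  # people outside Torrance but closer than Pasadena

whole = radiation_flow(TORRANCE_COMMUTERS, TORRANCE_POP, PASADENA_POP, POP_BETWEEN)

# Central Torrance: a tenth of the residents and (by assumption) of Ti.
ct_pop = TORRANCE_POP // 10
ct_s = POP_BETWEEN + (TORRANCE_POP - ct_pop)  # rest of Torrance is now "closer"
central = radiation_flow(TORRANCE_COMMUTERS / 10, ct_pop, PASADENA_POP, ct_s)

ratio = central / whole
print(ratio)  # close to 0.01: both Ti and mi shrank tenfold
```

Note that mi + sij comes out the same in both cases, so the denominator really is unchanged and the whole factor of 100 comes from Ti and mi.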

Am I missing something?

"The f() function is only in the gravity model, not the radiation model."

Yes, but you (and I) are comparing the dependency on r of the two models, so we have to compare f(r) in the gravity model to the population density in the radiation model.

"And if you’d look at the paper, you’d see that they don’t need to add anything more to the model I described to fit the data well."

I see Dimitri put the link up, and yes, I must say I'm pleasantly surprised by the results in the graphs. Without tweaking, the radiation model does seem to fit the empirical data to a high degree (though the logarithmic graphs could hide smaller discrepancies from the naked eye). Oh well, you learn something new every day...

The f() function is only in the gravity model, not the radiation model. And if you'd look at the paper, you'd see that they don't need to add anything more to the model I described to fit the data well.

"Amazingly, this better fitting radiation model only depends on distance indirectly, via population density."

Right.

"It suggests that while distance matters, it is almost never an overwhelming consideration."

Nope, that depends on f(rij) and the population density. For example, if f(r) is the travel time or fuel cost, then it is proportional to r, while the population within distance r (which is what sij measures) is probably proportional to r^2, so the radiation model depends on r more strongly than the gravity model. Distance can matter; that's also in the text:

"Step two, the individual chooses the closest job to his/her home, whose benefits z are higher than the best offer available in his/her home county"

This suggests that distance is not just a detail in the radiation model: the agents clearly prefer staying closer to home.
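A quick sketch of that comparison, under two loudly stated assumptions: the gravity model uses f(r) = r, and population is spread uniformly, so the number of people within distance r grows like r^2. The density and populations are made up, and the uniform disc is only an approximation of sij.

```python
import math


def gravity_flow(m_i, n_j, r):
    """Gravity-style flow with the assumed deterrence function f(r) = r."""
    return m_i * n_j / r


def radiation_share(m_i, n_j, r, density):
    """mi * nj / [(mi + sij) * (mi + nj + sij)], with sij approximated as the
    population of a uniform disc of radius r around the home county."""
    s_ij = density * math.pi * r ** 2
    return m_i * n_j / ((m_i + s_ij) * (m_i + n_j + s_ij))


m_i = n_j = 1_000
density = 100.0               # people per unit area, hypothetical
near, far = 1_000.0, 2_000.0  # doubling the distance

gravity_ratio = gravity_flow(m_i, n_j, far) / gravity_flow(m_i, n_j, near)
radiation_ratio = (radiation_share(m_i, n_j, far, density)
                   / radiation_share(m_i, n_j, near, density))

print(gravity_ratio)    # flow halves when distance doubles
print(radiation_ratio)  # flow drops to roughly a sixteenth
```

With these assumptions, doubling the distance costs the gravity model a factor of 2 and the radiation model a factor of about 16, so the radiation model's dependence on distance is indirect but not weak.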

The radiation model makes sense (in theory), but unless I'm missing something, Robin's conclusion about distance doesn't. Also, I'm not convinced that such simple models are useful for modeling a complex economy of emotional human beings. I suspect that in order to fit the empirical data you have to deform any of these models to such an extent that not much of the original formula remains, and you might as well have started with a random function (I mean, it's always possible to find some polynomial that interpolates the empirical data for a certain time period, but that doesn't really teach you much about the underlying dynamics and it sure as hell doesn't allow you to predict the future), but maybe that's just my distrust of economists speaking.