Interpreting Models: Coefficients, Marginal Effects or Elasticities?

I’ve spoken about interpreting models before. I think communicating results is the most important part of our work, yet it’s often overlooked when discussing the how-to of data science. For this purpose, marginal effects and elasticities are better than coefficients alone.

Model build, selection and testing is complex and nuanced. Communicating the model is sometimes harder, because a lot of the time your audience has no technical background whatsoever. Your stakeholders can’t go up the chain with, “We’ve got a model. And it must be a good model because we don’t understand any of it.”

Our stakeholders also have limited attention spans, so the explanation process is twofold: explain the model and do it fast.

For these reasons, I usually interpret models for my stakeholders with marginal effects and elasticities, not coefficients or log-odds. How you interpret a regression coefficient depends on the functional form, and if you have interactions or polynomials built into your model, then the coefficient is only part of the story. If you have a more complex model like a tobit, conditional logit or other option, then interpretation of coefficients is different for each one.

I don’t know about your stakeholders and reporting chains: mine can’t handle that level of complexity.

Marginal effects and elasticities are also different for each of these models but they are by and large interpreted in the same way. I can explain the concept of a marginal effect once and move on. I don’t even call it a “marginal effect”: I say “if we increase this input by a single unit, I expect [insert thing here]” and move on.

Marginal effects and elasticities are often variable over the range of your sample: they may be different at the mean than at the minimum or maximum, for example. If you have interactions and polynomials, they will also depend on covarying inputs. Some people see this as added layers of complexity.
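To make that concrete, here is a minimal sketch of how a marginal effect and an elasticity change over the sample in a simple logit model with one input. The intercept, slope and the minimum/mean/maximum values are made-up numbers for illustration, not estimates from any real model.

```python
import math

def logistic(z):
    """Logistic function: maps a linear index to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def logit_marginal_effect(x, intercept, slope):
    """Change in predicted probability for a small change in x (dP/dx)."""
    p = logistic(intercept + slope * x)
    return slope * p * (1.0 - p)

def logit_elasticity(x, intercept, slope):
    """Percentage change in predicted probability per one percent change in x."""
    p = logistic(intercept + slope * x)
    return logit_marginal_effect(x, intercept, slope) * x / p

# Hypothetical coefficients and sample points, for illustration only.
INTERCEPT, SLOPE = -2.0, 0.5
for label, x in [("minimum", 0.0), ("mean", 4.0), ("maximum", 10.0)]:
    me = logit_marginal_effect(x, INTERCEPT, SLOPE)
    el = logit_elasticity(x, INTERCEPT, SLOPE)
    print(f"{label:>8}: x={x:5.1f}  marginal effect={me:.4f}  elasticity={el:.4f}")
```

The coefficient is the same 0.5 at every point, but the marginal effect is largest near the middle of the range and shrinks towards either end. That variation is what a chart can show a stakeholder at a glance.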

In the age of data visualisation, I see it as an opportunity to chart these relationships and visualise how your model works for your stakeholders.

We all know they like charts!

Bonds: Prices, Yields and Confusion - a Visual Guide

Bonds have been the talk of the financial world lately. One minute it’s a thirty-year bull market, the next it’s a bondcano. Prices are up, yields are down and that’s bad. But then in the last couple of months, prices are down and yields are up and that’s bad too, apparently. I’m going to take some of the confusion out of these relationships and give you a visual guide to what’s been going on in the bond world.

The mathematical relationship between bond prices and yields can be a little complicated and I know very few people who think their lives would be improved by more algebra in them. So for our purposes, the fundamental relationship is that bond prices and yields move in opposite directions. If one is going up, the other is going down. But it’s not a simple 1:1 relationship and there are a few other factors at play.

There are several different types of bond yields that can be calculated:

  • Yield to maturity: the yield you would get if you hold the bond until it matures.
  • Yield to call: the yield you would get if you hold the bond until its call date.
  • Yield to worst: the lowest yield you could receive on a bond, whether it is called or held to maturity.
  • Running yield: this is roughly the yield you would get from holding the bond for a year.

We are going to focus on yield to maturity here, but a good overview of yields generally can be found at FIIG. Another good overview is here.


To explain all this (without algebra), I’ve created two simulations. These show the approximate yield to maturity against the time to maturity, coupon rate and the price paid for the bond. For the purposes of this exercise, I’m assuming that our example bonds have a face value of $100 and a single annual payment.

The first visual shows what happens as we change the price we pay for the bond. When we buy a bond below face value (at, say $50 when its face value is $100), yield is higher. But if we buy that bond at $150, then yield is much lower. As price increases, yield decreases.

The time the bond has until maturity matters a lot here, though. If there is only a short time to maturity then the differences between below/above face value can be very large. If there are decades to maturity, then these differences tend to be much smaller. The shading of the blue dots represents the coupon rate that might be attached to a bond like this: the darkest colours have the highest coupon rates and the lightest colours have the lowest. Again, the differences matter more when there is less time for a bond to mature.
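For readers who do want a little algebra, the pattern described above can be sketched with a common textbook approximation to yield to maturity: the annual coupon plus the annualised gain or loss to face value, divided by the average of face value and price. This is an assumption for illustration and may not be the exact formula behind the animations. The 5% coupon rate is a made-up example; the $100 face value and single annual payment follow the set-up above.

```python
def approx_ytm(price, coupon_rate, years, face=100.0):
    """Textbook approximation to yield to maturity for an annual-coupon bond."""
    coupon = coupon_rate * face                # annual coupon payment in dollars
    annual_gain = (face - price) / years       # capital gain or loss spread over the bond's life
    average_value = (face + price) / 2.0       # midpoint of purchase price and face value
    return (coupon + annual_gain) / average_value

# Hypothetical 5% coupon bond bought below, at and above face value.
for years in (1, 10, 30):
    yields = [approx_ytm(price, 0.05, years) for price in (50.0, 100.0, 150.0)]
    print(years, "years:", ", ".join(f"{y:.1%}" for y in yields))
```

With one year to run, the gap between buying at $50 and buying at $150 is enormous. With thirty years to run, the same price difference moves the approximate yield by only a few percentage points.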

Prices gif

The second animation is a representation of what happens as we change the coupon rate (i.e. the interest rate the debtor is paying to the bond holder). The lines of dots represent differences in the price paid for the bond. The lighter colours represent a cheaper purchase below face value (better yields, great!). The darker colours represent an expensive purchase above face value (lower yields, not so great).

If we buy a bond cheaply, then the yield may be higher than the coupon rate. If we buy it over the face value, then the yield may be lower than the coupon rate. The difference between them is less the longer the bond has to mature. When the bond is very close to maturity those differences can be quite large.

Coupon Gif

When discussing bonds, we often mention something called the yield curve. This describes how the yield on a bond (or group of bonds) varies with the time left until maturity.

If you’d like to have a go at manipulating the coupon rate and the price to manipulate an approximate yield curve, you can check out this interactive I built here.

Remember that all of these interactives and animations are approximate. If you want to calculate yield to maturity exactly, you can use an online calculator like the one here.
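If you would rather see what such a calculator is doing, here is a minimal sketch that finds yield to maturity numerically. It assumes a bond with one coupon per year and a whole number of years to maturity, prices it as the discounted value of its payments, and uses bisection to find the yield at which that value equals the price paid. The function names and the search bounds are my own choices for illustration.

```python
def bond_price(yield_rate, coupon_rate, years, face=100.0):
    """Present value of annual coupons plus face value at a given yield."""
    coupon = coupon_rate * face
    pv_coupons = sum(coupon / (1.0 + yield_rate) ** t for t in range(1, years + 1))
    return pv_coupons + face / (1.0 + yield_rate) ** years

def exact_ytm(price, coupon_rate, years, face=100.0, tol=1e-10):
    """Solve bond_price(y) == price for y by bisection."""
    lo, hi = -0.99, 10.0                      # assumed search bounds: -99% to 1000%
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if bond_price(mid, coupon_rate, years, face) > price:
            lo = mid                          # model price too high, so the yield must be higher
        else:
            hi = mid
    return (lo + hi) / 2.0

# Hypothetical 5% coupon bond with ten years to run, bought at three prices.
for price in (50.0, 100.0, 150.0):
    print(f"price ${price:.0f}: yield to maturity {exact_ytm(price, 0.05, 10):.2%}")
```

Bisection works here because the price falls steadily as the yield rises, so there is only one yield that matches any given price within the search range.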

So how does this match the real data that gets reported on daily? Our last chart shows the data from the US Treasury 10-year notes that were sold on the 25th of November 2016. The black observations are bonds maturing within a year, the blue are those that have longer to run. Here I’ve charted the “Asked Yield”, which is the yield a buyer would receive if the seller sold their bond at the price they were asking. Sometimes, however, the bond is bought at a lower bid, so the actual yield would be a little higher. I’ve plotted this against the time until the bond matures. We can see that the actual yield curve produced is pretty similar to our example charts.

This was the yield curve from one day. The shape of the yield curve will change on a day-to-day basis depending on the prevailing market conditions (e.g. prices). It will also change more slowly over time as the US Treasury issues bonds with higher or lower coupon rates, depending on economic conditions.

yield curve

Data: Wall Street Journal.

Bond yields and pricing can be confusing, but hopefully as you’re reading the financial pages they’re a lot less so now.

A huge thanks to my colleague, Dr Henry Leung at the University of Sydney for making some fantastic suggestions on this piece.


Political Donations 2015/16

Yesterday, the ABC released a dataset detailing donations made to political parties in Australia during the 2015-16 period. You can find their analysis and the data here. The data isn’t a complete picture of what was happening during the period: there isn’t a single donation to the One Nation Party among the lot of them, for example.

While the ABC made a pretty valiant effort to categorise where the donations were coming from, “uncategorised” was the last resort for many of the donors.

Who gets the money?

In total, there were 49 unique groups that received money. Many of these were state branches of national parties, for example the Liberal Party of Australia – ACT Division, Liberal Party of Australia (S.A. Division) and so on. I’ve grouped these and others like them together under their national party. Other groups included small narrowly-focussed parties like the Shooters, Fishers and Farmers Party and the Australian Sex Party. Micro parties like the Jacqui Lambie Network, Katter’s Australian Party and so on were grouped together. Parties with a conservative focus (Australian Christians, Family First, Democratic Labor Party) were grouped and those with a progressive focus (Australian Equality Party, Socialist Alliance) were also grouped together. Parties focused on immigration were combined.
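A grouping like this can be sketched as a small lookup. The group labels, the name lists and the prefix rule below are hypothetical and cover only the parties named above. They illustrate the idea rather than reproduce the mapping actually used for the charts.

```python
# Hypothetical rule: any recipient whose name starts with a national party's
# name is treated as a branch of that party.
NATIONAL_PREFIXES = {
    "liberal party of australia": "Liberal",
}

# Hypothetical group labels for the smaller parties named in the text.
NAMED_GROUPS = {
    "Jacqui Lambie Network": "Micro parties",
    "Katter's Australian Party": "Micro parties",
    "Australian Christians": "Conservative",
    "Family First": "Conservative",
    "Democratic Labor Party": "Conservative",
    "Australian Equality Party": "Progressive",
    "Socialist Alliance": "Progressive",
}

def recipient_group(name):
    """Map a recipient name from the data to a broader group label."""
    cleaned = name.strip()
    for prefix, group in NATIONAL_PREFIXES.items():
        if cleaned.lower().startswith(prefix):
            return group
    # Parties without a rule keep their own name as their group.
    return NAMED_GROUPS.get(cleaned, cleaned)

for recipient in ("Liberal Party of Australia – ACT Division",
                  "Liberal Party of Australia (S.A. Division)",
                  "Family First",
                  "Shooters, Fishers and Farmers Party"):
    print(recipient, "->", recipient_group(recipient))
```

Keeping the rules in plain dictionaries makes the grouping easy to audit: anyone can see which parties ended up in which bucket.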

The following chart shows the value of each declared donation and the group that received it.

Scatter plot

Only one individual donation exceeded $500 000 and that was to the Liberal Party. It’s obscuring the rest of the distribution, so I’ve removed it in the next chart. Both the major parties receive more donations than the other parties, which comes as no surprise to anyone. However, the Greens have a proportion of very generous givers ($100 000+) which is quite substantial. The interesting question is not so much who received the money as who gave it.

Scatter plot with outlier removed


Who gave the money?

This is probably the more interesting point. The following charts use the ABC’s categories to see if we can break down where the (declared) money trail lies (for donations $500 000 and under). Again, the data confirmed what everyone already knew: unions give to the Labor party. Finance and insurance gave heavily to the Liberal Party (among others). Several clusters stand out, though: uncategorised donors give substantially to minor parties and the Greens have two major clusters of donors: individuals and a smaller one in the agriculture category.

Donor categories and value scatter plot

Breaking this down further, if we just look at where the money came from and who it went to, we can see that the immigration-focused parties are powered almost entirely by individual donations with some from uncategorised donors. Minor parties are powered by family trusts, unions and uncategorised donors. The Greens are powered by individuals, uncategorised donors and agriculture, with some input from unions. What’s particularly interesting is the difference between Labor and Liberal donors. Compared to Liberal, Labor has no donors in the tobacco industry and fewer donations (by number) from agriculture, alcohol, advocacy/lobby groups, sports and water management. Labor also has fewer donations from uncategorised donors and more from unions.

Donors and Recipients Scatterplot

What did we learn?

Some of what we learned here was common knowledge: Labor doesn’t take donations from tobacco, but it does from unions. The unions don’t donate to Liberal, but advocacy and lobby groups do. The more interesting observations are focussed on the smaller parties: the cluster of agricultural donations for the Greens Party – normally LNP heartland; and the individual donations powering the parties focussed on immigration. The latter may have something to say about the money powering the far right.


Productivity: In the Long Run, It’s Nearly Everything.

“Productivity … isn’t everything, but in the long run it’s nearly everything.” Paul Krugman, The Age of Diminished Expectations (1994)

So in the very long run, what’s the Australian experience? I recently did some work with the Department of Communications and the Arts on digital techniques and developments. Specifically, we were looking at the impacts advances in fields like machine learning, artificial intelligence and blockchain may have on productivity in Australia. I worked with a great team at the department led by the Chief Economist Paul Paterson and we’re looking forward to our report being published.

In the meantime, here’s the very long run on productivity down under.

Australian Productivity Chart

Statistical model selection with “Big Data”: Doornik & Hendry’s New Paper

The claim that causation has been ‘knocked off its pedestal’ is fine if we are making predictions in a stable environment but not if the world is changing …. or if we ourselves hope to change it. – Harford, 2014

Ten or fifteen years ago, big data sounded like the best thing ever in econometrics. When you spend your undergraduate career learning that (almost!) everything can be solved in classical statistics with more data, it sounds great. But big data comes with its own issues. No free lunch, too good to be true and your mileage really does vary.

In a big data set, statistical power isn’t the issue. You have power enough for just about everything. But that comes with problems of its own. The probability of a Type I error may be very high. In this context, that means concluding that a parameter estimate is significant when in fact it is not. Spurious relationships are likely, and separating the truly significant relationships from the spurious ones is difficult. Model selection in the big data context is complex!
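A quick simulation shows how easily spurious relationships turn up. The sketch below tests 200 pure-noise variables against a pure-noise outcome at the 5% level, so every "significant" result is a false positive. The sample sizes and seed are arbitrary choices for illustration, and the p-values use a normal approximation to the t distribution, which is reasonable with 500 observations.

```python
import math
import random

def correlation(xs, ys):
    """Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def two_sided_p_value(r, n):
    """Approximate p-value for H0: no correlation (normal approximation)."""
    t_stat = r * math.sqrt((n - 2) / (1.0 - r * r))
    return math.erfc(abs(t_stat) / math.sqrt(2.0))

def count_false_positives(n_obs=500, n_vars=200, alpha=0.05, seed=1):
    """Count noise variables that look 'significant' against a noise outcome."""
    rng = random.Random(seed)
    y = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
    hits = 0
    for _ in range(n_vars):
        x = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
        if two_sided_p_value(correlation(x, y), n_obs) < alpha:
            hits += 1
    return hits

print("Noise variables flagged as significant:", count_false_positives())
```

On average about 5% of the noise variables, roughly ten of the 200, will clear the bar by chance alone. Scale that up to thousands of candidate variables and the problem of telling real relationships from spurious ones becomes obvious.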

David Hendry is one of the powerhouses of modern econometrics and the fact that he is weighing in on the big data model selection problem is a really exciting proposition. This week he published a paper with Jurgen Doornik in the Cogent Journal of Economics and Finance. You can find it here.

Doornik and Hendry propose a methodology for big data model selection. Big data comes in many varieties and in this paper, they consider only cross-sectional and time series data of “fat” structure: that is, more variables than observations. Their results generalise to other structures, but not always. Doornik and Hendry describe four key issues for big data model selection in their paper:

  • Spurious relationships
  • Mistaking correlations for causes
  • Ignoring sampling bias
  • Overestimating significance of results.

So, what are Doornik and Hendry’s suggestions for model selection in a big data context? Their approach has several pillars to the overall concept:

  1. They calculate the probabilities of false positives in advance. It’s long been possible in statistics to set the significance level to control multiple simultaneous tests. This is an approach taken in both ANOVA testing for controlling the overall significance level when testing multiple interactions and in some panel data approaches when testing multiple cross-sections individually. The Bonferroni inequality is the simplest of this family of techniques, though Doornik and Hendry are suggesting a far more sophisticated approach.
  2. Test “causation” by evaluating super exogeneity. In many economic problems especially, a randomised controlled trial is infeasible. Super exogeneity is an added layer of sophistication on the correlation/causation spectrum of which Granger causation was an early addition.
  3. Deal with hidden dependence in cross-section data. Not always an easy prospect to manage, cross-sectional dependence usually has no natural or obvious ordering as in time series dependence: but controlling for this is critical.
  4. Correct for selection biases. Often, big data arrives not out of a careful sampling design, but on a “whoever turned up to the website” basis. Recognising, controlling and correcting for this is critical to good model selection.
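To illustrate the first pillar, here is a minimal sketch of the Bonferroni adjustment mentioned above. It is only the simple baseline; Doornik and Hendry’s own approach is more sophisticated and is not reproduced here. The family-wise error calculation assumes the tests are independent, and the 200 tests are a made-up example.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the overall level at or below alpha."""
    return alpha / n_tests

def family_wise_error_rate(per_test_alpha, n_tests):
    """Chance of at least one false positive across independent tests."""
    return 1.0 - (1.0 - per_test_alpha) ** n_tests

ALPHA, N_TESTS = 0.05, 200   # hypothetical: 200 candidate variables tested at 5%

unadjusted = family_wise_error_rate(ALPHA, N_TESTS)
adjusted_level = bonferroni_threshold(ALPHA, N_TESTS)
adjusted = family_wise_error_rate(adjusted_level, N_TESTS)

print(f"No adjustment: chance of at least one false positive = {unadjusted:.4f}")
print(f"Bonferroni per-test level = {adjusted_level:.5f}")
print(f"With Bonferroni: chance of at least one false positive = {adjusted:.4f}")
```

Testing each variable at 5% makes at least one false positive all but certain. Dividing the significance level by the number of tests pulls the overall rate back under 5%, at the cost of making each individual test much harder to pass.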

Doornik and Hendry advocate the use of Autometrics in the presence of big data, without abandoning statistical rigour. Failing to understand the statistical consequences of our modelling techniques makes poor use of data assets that otherwise have immense value. Doornik and Hendry propose a robust and achievable methodology. Go read their paper!

No man should be an island

Insularity may be the catchword in the aftershock analysis of Brexit, but it’s not confined to elites. We are a society clustered into islands of opinion. Our insular communities are separated from each other by opportunity and circumstance. They develop their own novel views of the world we live in. As the British referendum last week showed, these can be fundamentally different world views not easily bridged by arguments of economic rationality.

There has been much made of the insularity of elites and failure to heed the rage of the disenfranchised that resulted in the Brexit decision. There have been parallels in the U.S. with Trumpism while here in Australia too, there are similarities to be drawn.

Focus has been on the elites and their system, but little attention has yet been paid to the scaffolding by which a system fails to engage with its constituents to its own detriment. The phenomenon of insularity reaches much further and goes much deeper down the chain.

We have developed insular communities within our own cities. Communities where we think similar thoughts, have similar incomes and even speak in similar ways. The radial income distributions of our major capital cities (higher in the centre, decreasing as you move away from the centre) are striking. These patterns of income mean that the people at the school gate are more likely to come from similar incomes to you than not. Given our understanding of the relationships between education and income, there is also a good chance that the person on the treadmill next to you at the gym is also from a similar socioeconomic background.

The problem is not just insular elites ignoring the constituents that scaffold their power. The problem is that housing affordability has reduced our opportunity to interact on a daily basis with friends who are not like us. Conversation with friends is very different to professionally interacting with clients, our barista or our child’s school teacher: all of whom may live at a distance from us.

We form our world views with our friends as a social barometer. Social influence has a noted relationship with behaviour. If our social circle narrows to those only living a similar life to our own, our exposure to differing opinion may do the same. We may find ourselves living in an echo chamber of our own views, unable to understand the passionate and different opinion of our fellow citizens without recourse to simplistically ascribing their beliefs to a lack of understanding.

The mocking commentary of regretful and fearful Brexiteers illustrates our inability to understand what compelled them to make their decision in the first place. Our eye-rolling supersedes our desire to engage. The memes were eye-wateringly funny, but when our examination of the phenomenon stops with the retweet button, we are further insulating ourselves from the uncomfortable reality of disagreement.

In Australia, our election campaign has been seven long weeks of avoiding uncomfortable disagreement. The government has not been willing to push for a sweeping policy agenda. It’s a move designed to keep the focus off the possibility of disagreement and firmly on the reassuring mantra of “jobs and growth”. Avoiding electoral discomfort only serves to further isolate communities already unheard and unremarked upon by party influencers. We are literally being asked to “stick with the current mob for awhile”.

Why do we congregate into islands of opinion? Neil Gaiman suggests that we are fearful of the consequences of disagreement or, simply, being wrong. This is an age when every opinion is recorded forever and those disagreed with may be mercilessly abused with consequences for harassers rare enough to be newsworthy. Threats of violence are as commonplace as they are unacceptable. Opinions are also laudable if static, but not if they change. Elite changes in opinion are categorised as backflips and turnarounds suggesting a gymnastic talent not otherwise known amongst the blue tie wearing classes. Mere ordinary citizens must make do with online mockery in all its forms.

Caution in expressing opinion or inviting disagreement is a reasonable response. We retreat geographically in the face of economic pressures and we retreat socially, online and otherwise, in the face of object lessons meted out daily in our social media channel of choice.

What should we, as a society, do? Withdrawing into our islands of opinion, we risk failing to understand each other. Brexit is an example of the ultimate consequence of this. Trumpism may be another. In Australia, the proliferation of far right microparties, each certain of saving Australia from some peril, is one manifestation.

Neil Gaiman suggests that it was in part a deep rage that allowed his friend and co-author Terry Pratchett to create as prolifically and effectively as he did. As individuals we need to risk even the deep discomfort of rage in the pursuit of understanding our neighbouring islands of opinion. Shaping society is a creative pursuit, after all.