I recently chatted with my older granddaughter about statistics. She is planning to take a course in the subject and is a little concerned. Statistical math can be quite challenging, but for the rest of us, the math is not the problem. Everyone should understand statistics better because people use them to lie.
You should have an intuition about what people present to you.
“Facts are stubborn things, but statistics are more pliable.” – Mark Twain
There is nothing wrong with inferences if you know how they came to be. Thus the course in statistics. To draw a reliable inference you need several things:
Example #1 – Suppose I told you that 20% of American children lived in poverty while only 13% of children in the OECD did. Would you want government programs, or an explanation of where the numbers came from? The key element is how you define poverty. The liars know people have simple ideas about poverty, so if they can recast the meaning of the word, they can get a reaction. In this case, the term poverty means a family income less than 50% of the mean income in the population. In the United States mean household income is $72,000. In Mexico it is a little under $10,000. A household in the US just at the border of poverty, at $36,000, has nearly four times the income of the average Mexican household and seven times the threshold of the poor Mexican household. You will notice two things.
Example #2. The three richest Americans have more wealth than the bottom half. Technically true but a little meaningless. If you have $100 in a bank account, you are wealthier than about 50,000,000 Americans. Why? Because many Americans have no wealth accumulated. Very young children, for example. The bottom half is a net number. All of the below zero amounts are netted with the above zero amounts. The population of billionaires is nothing like the rest of the population, and a single number does not meaningfully connect them.
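To see how netting works, here is a small sketch using a toy population of ten people. The net worth figures are invented purely for illustration and are not real data; the names `net_worths` and `bottom_half_total` are my own.

```python
# Hypothetical net worths (assets minus debts) for ten made-up people.
# These figures are invented for illustration only.
net_worths = [-40_000, -15_000, -5_000, 0, 0,
              100, 20_000, 80_000, 250_000, 1_000_000]

# Rank from poorest to richest and take the poorer half.
ranked = sorted(net_worths)
bottom_half = ranked[: len(ranked) // 2]

# The "wealth of the bottom half" is a net figure: debts offset savings.
bottom_half_total = sum(bottom_half)

# One person with $100 in the bank.
person_with_100 = 100

print("Bottom half, netted:", bottom_half_total)
print("Does $100 beat the bottom half combined?",
      person_with_100 > bottom_half_total)
```

In this toy population the bottom half nets out below zero, so anyone with a positive balance "has more wealth than the bottom half combined." The comparison says more about debt than about the rich.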
Example #3. Ignoring the base rate. Suppose I may have some unusual disease. The disease occurs in one in ten thousand people. There is a test and it has no false negatives and 5% false positives. I take the test and it shows positive. What should I do then? Step one. Establish the meaning. My chance of having the disease is now about 1 in 500. In 10,000 tests you should expect about 500 positives, only one of them real, and I am one of the 500. I suppose I could take the test again, and if it is positive my chance is now about 1 in 25. A third time, if still positive, and it is pretty close to 50-50. That assumes the errors in each test are independent of the others. Be very cautious relying on one outcome when the incidence is rare to begin with. When the condition is rare, the false positive rate matters far more than the test's ability to catch real cases.
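The arithmetic in Example #3 can be checked with Bayes' rule. This is a minimal sketch, not a medical tool. It assumes the figures given above (1 in 10,000 incidence, no false negatives, 5% false positives) and, importantly, that repeated tests make independent errors, which real tests may not. The function name `posterior_after_positive` is my own.

```python
def posterior_after_positive(prior, false_positive_rate, sensitivity=1.0):
    """Chance of having the disease after one positive result (Bayes' rule).

    sensitivity=1.0 means the test has no false negatives.
    """
    true_positives = prior * sensitivity
    false_positives = (1.0 - prior) * false_positive_rate
    return true_positives / (true_positives + false_positives)


base_rate = 1 / 10_000       # incidence of the disease
false_positive_rate = 0.05   # 5% of healthy people test positive

# Update the probability after each of three positive results in a row.
# Assumption: each test's errors are independent of the others.
probability = base_rate
results = []
for test_number in range(1, 4):
    probability = posterior_after_positive(probability, false_positive_rate)
    results.append(probability)
    print(f"After positive test {test_number}: {probability:.4f}")
```

The three results come out at roughly 0.2%, 4%, and 44%, which is where the "1 in 500, 1 in 25, close to 50-50" figures come from.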
Example #4. Notice the presentation. I once saw a graph where there were two lines, one going up and the other, unexpectedly, going down. Upon examination I noticed the graph had two y-axes. The left side one went up as you would expect. The right side, however, went down. Higher numbers at the bottom. In fairness it did, in tiny print, note “inverted axis.” The story that came with the graph was written as if both axes were in traditional form. Misleading.
Example #5. Misused regression. Regression often fits a line to data. The trend line. People like trends, but regression analysis is not valid outside the range of the data you have. It does not predict the future. Suppose you had information about the average number of people per car on the highway. The data runs from 1920 to 1970. We observe that the number has fallen each decade. From the data, in what year would the number of people per car fall below 1? Clearly the data predicts driverless cars.
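Here is a sketch of how that absurd prediction falls out of a trend line. The occupancy figures below are invented for illustration; I have no real data for 1920 to 1970. The helper `fit_line` is an ordinary least squares fit written by hand so the example needs no libraries.

```python
# Hypothetical, made-up data: average people per car, by decade.
years = [1920, 1930, 1940, 1950, 1960, 1970]
occupancy = [3.2, 2.9, 2.6, 2.3, 2.0, 1.7]


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(years, occupancy)

# Solve slope * year + intercept = 1 for the year the line crosses 1.
year_below_one = (1.0 - intercept) / slope

print(f"Trend: {slope:.3f} people per car per year")
print(f"Line crosses 1 person per car around {year_below_one:.0f}")
print("Inside the data range?", min(years) <= year_below_one <= max(years))
```

The line fits the invented data perfectly and still gives a silly answer, because the crossing point lies beyond 1970, outside anything the data can speak to.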
Example #6. Absence of evidence is not evidence of absence. In simple terms, nothing proves nothing. You could argue that as there is no evidence of old telephone lines and switchboards in the Sahara, the people in the Sahara must have been the first to use wireless telephones. It is the same kind of error as treating a correlation as proof of some fact. There are many spurious correlations available.
Assess the purpose of the presentation. If you see an argument based on statistics, assume the provider has a reason to present them as they are and in the way they have done so. Assume they want to convince you of something. Be skeptical and assess whether the methods to collect the data and create the inference make sense. If not, assume they intend to deceive.
Always know you can create statistics to prove anything if you are willing to tinker with their collection and analysis. Correlation is interesting but does not, by itself, imply cause. Trend lines have meaning only within the range of the data.
You can soon become intuitive about what is possible and what is not. Keep in mind that statisticians can prove that 42.3% of all statistics are made up on the spot.
In short. Beware received wisdom.
I help business owners, professionals, and others understand and manage risk and other financial issues. To help them achieve their goals, I use tax efficiencies and design advantages to acquire more efficient income and larger, more liquid estates.
In previous careers, I have been a partner in a large, international public accounting firm, CEO of a software start-up, a partner in an energy management system importer, and briefly in the restaurant business.
Please be in touch if I can help you. firstname.lastname@example.org 705-927-4770