Trego's Mountain Ear

"Serving North Lincoln County"

Category: A Science for Everyone

  • Polling Responses over Time

    Polling response rates also affect polling results.  When I was State Demographer in a rural state (South Dakota), changes in how the Census obtained and disseminated data made us increasingly dependent on the American Community Survey.  It was definitely more current than the decennial census – but the small numbers of participants made it less reliable.  Deborah Griffin, in “Measuring Survey Nonresponse by Race and Ethnicity,” concluded: “The data suggest that special efforts are needed to address differential survey response rates – to increase the rates for areas with high concentrations of AIANs, Blacks and Hispanics . . . New methods to address low mail response must be developed.”

    On February 27, 2019, Pew published “Response rates in telephone surveys have resumed their decline.”  The critical part of the article is shown in the graph below – in 1997, the response rate was 36%; twenty years later it was 6%.  Kind of makes polling more difficult – particularly when you project the decline down to 2020.

    On 10/26, PEW published “What the 2020 electorate looks like by party, race and ethnicity, age, education and religion” and said, “Around a third of registered voters in the U.S. (34%) identify as independents, while 33% identify as Democrats and 29% identify as Republicans, according to a Center analysis of Americans’ partisan identification based on surveys of more than 12,000 registered voters in 2018 and 2019.”   Contrast that with Gallup’s findings.

    In 2013, the Research Council of Norway published, “Fewer Willing to Participate in Surveys.”  The most relevant observation to political polling is “In general, NSD sees that young, single men living in urban areas are the least likely to respond, while older women are the most willing.” 

    Polling accuracy depends on random selection.  As the proportion of people who respond becomes smaller, the sample behaves less like a random one.  Correlation does not prove causation; causation must be inferred.  My inference is that the middle has stopped responding to polls – pollsters are getting responses from the same ideologically extreme friends who post political memes, and not from the center.  When 94% of those surveyed do not respond, we’re looking at some extreme nonresponse bias.
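
    The inference above can be put into numbers. What follows is a minimal sketch in Python, using group sizes and response rates invented purely for illustration (they are not figures from Pew or Gallup), chosen so that the overall response rate works out to 6%. It shows how a center that answers less often shrinks among respondents even when the people contacted are a perfect random draw.

```python
def observed_shares(population_shares, response_rates):
    """Return (share of each group among respondents, overall response rate)."""
    responders = {
        group: population_shares[group] * response_rates[group]
        for group in population_shares
    }
    overall_rate = sum(responders.values())
    shares = {group: r / overall_rate for group, r in responders.items()}
    return shares, overall_rate

# Hypothetical electorate: half in the center, a quarter at each end.
population = {"left": 0.25, "center": 0.50, "right": 0.25}
# Hypothetical response rates: the ends answer three times as often.
rates = {"left": 0.09, "center": 0.03, "right": 0.09}

shares, overall = observed_shares(population, rates)
print(f"overall response rate: {overall:.0%}")
for group, share in shares.items():
    print(f"{group}: {population[group]:.0%} of voters, {share:.1%} of respondents")
```

    With these made-up inputs the center is half of the electorate but only a quarter of the people who answer. Weighting can correct for this only if the pollster knows the true shares, which is the hard part.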

  • Party Affiliation over Time

    It seems, especially in the midst of an election year, that the political parties are long established and permanent. While we don’t have an especially high rate of turnover in major political parties, we do have one.

    The Republicans became the Democratic Republicans, which eventually became the Democrats (an extremely brief summary of a rather lengthy chunk of American History). The Republican party, as we know it today, actually came out of the Whig party (well, a splinter faction, sort of. No one said politics was straightforward).

    At any rate, political parties are not constant and neither is their membership. Gallup has a nice collection of data on party affiliation that we referenced last week in things that make surveys hard. Since they’ve provided it as a table, here is the graph:

    Party Affiliation over time; data from Gallup

    Looking at Gallup’s data, we can make several observations. Since 2004, the general trend has been an increase in people identifying as Independents, and a decrease in both Republicans and Democrats. We also notice that declines in either major party tend to coincide with increases in Independents.

    The top of the graph is 50%, and while none of our three categories makes it that high, Independents come the closest (the highest share of Independents was 47%, which occurred in both October of 2013 and October of 2014).

    A “Zoomed in” version of the previous graph, from 2012 on

    Taking a closer look at things (note that I’ve changed the vertical scale as well, the bottom is now 15%) from 2012 on, we can see a large drop in both Republicans and Democrats in 2013 that has a corresponding rise in independents. 2018 had a decline in Independents that mirrors a rise in Democrats.

    The difficulty with examining trends is “How far do we have to zoom out?” Over a large amount of time, it’s difficult to see the impacts of smaller events but easier to examine long term trends. Another consideration is that what looks like a clear trend on the small scale may not reflect the trend in the long term.

    Political polling doesn’t give us all that much long term data. Do we have enough to make predictions from? Well, the people making predictions certainly seem to think so!

  • Things that make Surveys Hard

    I was asked to describe the problems with political polls. It is a great year for showing the problems in predicting from opinion polls. Projecting isn’t the problem – we take partial duration series (like flood data) and project the likelihood of larger and smaller events occurring. In my 70 years on the planet, I’ve seen a couple of hundred-year floods on the same river – and it isn’t a big deal. When you project a hundred-year occurrence from 38 years of data, it is a question of how wrong you’re going to be. Poker odds are easy – there are only 52 cards (unless you play with a joker). A pair of dice has only 36 potential combinations (and 11 possible totals). The potential combinations of weather and climate during our planet’s existence aren’t quite infinite, but they approach it.
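
    The flood arithmetic can be sketched with a simple binomial model. It assumes every year is independent and carries the same 1-in-100 chance, which is exactly the kind of assumption that 38 years of data cannot confirm. The function name is illustrative.

```python
from math import comb

def prob_at_least(k, return_period_years, span_years):
    """Chance of k or more 'return period' events within span_years.

    Assumes independent years with a constant annual probability.
    """
    p = 1.0 / return_period_years
    fewer_than_k = sum(
        comb(span_years, i) * p**i * (1.0 - p) ** (span_years - i)
        for i in range(k)
    )
    return 1.0 - fewer_than_k

one_or_more = prob_at_least(1, 100, 70)
two_or_more = prob_at_least(2, 100, 70)
print(f"at least one hundred-year flood in 70 years: {one_or_more:.1%}")
print(f"at least two hundred-year floods in 70 years: {two_or_more:.1%}")
```

    Under those assumptions, seeing at least one hundred-year flood in 70 years is about an even bet, and seeing two or more is roughly a 16% chance. Unusual, but nothing to write the newspaper about.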

    In January 2016, Gallup announced that “Democratic, Republican Identification Near Historical Lows”, and explained that 26% identified as Republicans, 29% as Democrats. On January 16, 2020, 30% identified as Republicans, 27% as Democrats. Gallup’s most recent stats were on September 14, with 28% identifying as Republicans and 27% as Democrats. If I start with a good model based on the 2016 election results, I have a problem in 2020.

    For political polls, our universe consists of registered voters – but that gets to be a problem: “The Public Interest Legal Foundation (PILF) found that 244 counties across the United States exceed 100% voter registration. Counties in 28 states plus the District of Columbia and Alaska have more voters registered than adults living in those jurisdictions.

    “After a review of records submitted to the federal government, The Public Interest Legal Foundation (PILF) discovered 244 counties in which voter registration levels exceed the number of living adults in the jurisdiction. Additionally, 279 counties have registration rates ranging from 95%-99%, which PILF determines are ‘implausibly high.’”

    Polling is based on “best available data.” It is a coincidence that the initials are BAD. Starting from poor data makes it hard to develop a way to project with accuracy, and it’s hard enough anyway.

    California has more immigrants than any other state – in 2017, 27% of California’s residents were foreign born – and a little over half of them are US citizens. About one of eight contacts is a non-citizen and not eligible to vote. If you survey Montana, 2% of the residents are immigrants, and 58% of those are naturalized citizens. Less than 1 percent of Montana residents aren’t citizens. Few calls reach non-voters. It isn’t easy to develop a national model that projects surveys accurately.
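
    The non-citizen shares follow from multiplying two percentages. Here is a small sketch of that arithmetic; the 52% naturalized figure for California is an assumption standing in for “a little over half,” and the calculation treats residents as if they were survey contacts, which ignores age and phone coverage.

```python
def noncitizen_share(foreign_born_share, naturalized_share):
    """Share of all residents who are foreign born and not naturalized."""
    return foreign_born_share * (1.0 - naturalized_share)

# California: 27% foreign born; 52% naturalized is an assumed stand-in
# for "a little over half".
california = noncitizen_share(0.27, 0.52)
# Montana: 2% immigrants, 58% of them naturalized (figures from the text).
montana = noncitizen_share(0.02, 0.58)

print(f"California: {california:.1%} non-citizens, about 1 in {1 / california:.0f}")
print(f"Montana:    {montana:.2%} non-citizens, about 1 in {1 / montana:.0f}")
```

    The same survey script lands on an ineligible respondent about one time in eight in California and about one time in a hundred or more in Montana, so a single national adjustment fits neither.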

    And then there are the folks who lie to pollsters – in 2012, polls in South Dakota had shown strong support for legislation that would limit abortion access – but the vote turned out the other way. It was the first time I encountered what is now called “the shy Trump voter.” When you think about it, it isn’t particularly rational to believe the guy who calls you and interrupts dinner has your privacy as a main concern. On that issue, it looks like 3% or more of the survey respondents weren’t truthful. Face it, there was more than a zero chance that the voice on the other end of the phone might report your comments back to your Aunt Sally!

    I am glad I never had to make a living polling and predicting elections. It’s easy to look at the data and predict Trump will carry Montana and Biden will carry California. It’s a bit more risky to project Florida, or North Carolina, or Ohio.

  • Political Junk Mail

    I took 19 political ads out of the mailbox, and one piece of mail from the American Association of Retired Persons offering me the opportunity to become one of them.  It was 11:15 am, and there was no room left in the post office trash can, so I reluctantly took them all home.

    Now if I take them all as valid, we have the most rotten group of candidates ever fielded in Montana.  The descriptions of their character flaws would make the devil himself wonder if these despicable characters could be safely stored in his operation.  I share this observation merely because most people had already put their advertisements in the post office trash, leaving me unable to do so.

    I examined the return addresses, etc.  I’m pretty sure that they’re funded by folks from outside Montana who are looking to keep their side in the majority.  OK – an “F” rating from the NRA is usually earned and I can check that.  Most of the other accusations seemed a bit less solid.  Well, I could probably check on the attorney general candidate who never prosecuted a case – but at least he is no less qualified than I am.

    I don’t have a problem with non-Montanans trying to influence our elections.  I can live with it.  But I would like to see them have to provide larger garbage receptacles for our post offices.

  • Covid’s Mask and Pascal’s Wager

    According to the Internet Encyclopedia of Philosophy, “Blaise Pascal (1623-1662) offers a pragmatic reason for believing in God: even under the assumption that God’s existence is unlikely, the potential benefits of believing are so vast as to make betting on theism rational.” As a stats guy, I could write this from memory; as a scientist, I need to cite a source.

    Pascal’s statistical argument is a gambler’s view of the universe – the cost of believing, the ante, is tiny compared to the infinite reward (the size of the pot).  I worked with an accountant who had a system for buying lottery tickets – his break from understanding Pascal was that both cost and reward in the statistics of lottery tickets are finite – the odds really can be calculated.  Lotteries are a tax on people who don’t want to do the math.
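
    The difference between the wager and the lottery is the difference between an infinite and a finite expected value. The sketch below uses a made-up ticket price and prize table (no real lottery is being described) to show the calculation the accountant skipped.

```python
import math

def expected_net(ticket_cost, prizes):
    """Expected winnings minus cost; prizes is a list of (probability, payout)."""
    return sum(prob * payout for prob, payout in prizes) - ticket_cost

# Hypothetical lottery: $2 ticket with three prize tiers.
ticket_cost = 2.0
prizes = [(1 / 1000, 500.0), (1 / 100, 10.0), (1 / 10, 2.0)]
lottery = expected_net(ticket_cost, prizes)

# Pascal's wager: any nonzero probability times an infinite reward.
wager = expected_net(1.0, [(1e-9, math.inf)])

print(f"lottery, expected net per ticket: {lottery:+.2f}")
print(f"wager, expected net: {wager}")
```

    With finite prizes the expected net comes out negative here, about $1.20 lost per ticket. With an infinite pot, any chance above zero makes the bet look good, which is the whole of Pascal’s argument.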

    Covid is also a game for statisticians.  It’s still at a point where we have a bunch of unknowns, but there are fewer unknowns than there were 6 months ago.  Then the Diamond Princess was a horrifying news story – now it is data, as taken from statista.com: “A total of 712 people were infected with COVID-19 on the Diamond Princess cruise ship – 567 passengers and 145 crew members. The cruise ship, which had more than 3,500 people on board, was quarantined for around two weeks. All passengers and crew members had finally disembarked the ship by March 1, 2020.”

    Wikipedia shows 14 deaths among the 712 infected people on the Diamond Princess.  Somewhere right around 2%.  About the same as Texas and California, and lower than New York, New Jersey, and Massachusetts.

    We’re still looking at less than perfect representative numbers – but Diamond Princess has provided some data:  roughly 20% of those exposed between January 20 and February 19 wound up infected.  In March, we had estimated R0 values from 1.5 to 3.5.  Now, we have Rt values (Average number of people who become infected by an infectious person with COVID-19 in the U.S. as of October 17, 2020).  Those numbers vary from 0.91 in Mississippi to 1.31 in New Mexico.  Montana scored 1.2. 

    Generally speaking, in the absence of data, we have a tendency to assume the worst.  We have data now.  The actual infectivity is lower than the initial data – perhaps because the precautions have been effective, perhaps it is related to the fact that 80% of the people on Diamond Princess did not catch covid.  Correlation is not causation.  Causation is inferred from statistics, not proven.
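
    The Diamond Princess percentages quoted above can be checked in a few lines. This is a sketch, not an epidemiological estimate: the 3,500 on board is a floor taken from “more than 3,500,” and the Wilson score interval is a standard method brought in here to show how much uncertainty 14 deaths carry.

```python
from math import sqrt

def wilson_interval(events, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion events/n."""
    p = events / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half

infected = 712   # from the statista.com quote
deaths = 14      # from Wikipedia, as cited in the text
on_board = 3500  # assumption: the quote says "more than 3,500"

fatality = deaths / infected
attack = infected / on_board
low, high = wilson_interval(deaths, infected)

print(f"infected among those on board: {attack:.1%}")
print(f"deaths among the infected: {fatality:.1%}")
print(f"rough 95% interval on that rate: {low:.1%} to {high:.1%}")
```

    The point estimate is close to 2%, but with only 14 deaths the plausible range runs from roughly 1% to 3%, so comparisons with individual states deserve some caution.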

    This week, an article from the American Society of Hematology stated: “Blood type O may offer some protection against COVID-19 infection, according to a retrospective study. Researchers compared Danish health registry data from more than 473,000 individuals tested for COVID-19 to data from a control group of more than 2.2 million people from the general population. Among the COVID-19 positive, they found fewer people with blood type O and more people with A, B, and AB types.”

    Making statistics personal is a challenge – data suggests that my risk factors are increased by age (70), height (6’3”), asthma, and diabetes.  By how much, we don’t know – neither my asthma nor my diabetes scores particularly high.  My risk factors are reduced by my blood type.  So let’s look at masking.

    My mask is like Pascal’s wager – it seems logical that any level of masking will reduce transmission.  The question is: “How much?”  I don’t have that answer.  Does my mask protect me significantly?  When I have been in surgery, the surgeons and medical staff were masked to protect me.  Similarly, is my mask to protect others?   Business Insider offers an article comparing mask effectiveness, but cautions that “Mask studies should be taken with a grain of salt.”  My mask is like Pascal’s wager – and I hope wearing it adds a sense of security. It costs me little to wear it.

  • Do you believe in science?

    I listened to a presidential debate question, “Do you believe in science?”  It seems a simple, yes or no question.  It isn’t.  I’m a sociologist, and retired demographer.  I believe that scientific method is the best way to move toward understanding and describing the world.  I’m a numbers, statistics and data guy.  A positivist.  I like my theories to be supported by numbers.  I tend to use functionalism and conflict theory – both provide frameworks that can give me the numbers that make my science work.  There is a whole lot of Karl Marx in conflict theory.

    Jürgen Habermas was at the other side of the theoretical spectrum – developing modern Critical Theory.  In his work, the model focused on language, symbolism, communication and social construction.  Horkheimer described critical theory as seeking “to liberate human beings from the circumstances that enslave them.” There is a whole lot of Karl Marx in critical theory. It’s not the sort of approach that gets a lot of measurable data.  I’m pretty sure that “positivist” is not a word that critical theorists use to describe approval.

    Our chosen theoretical approaches limit our research. I believe in the value of scientific method.  I also believe anything we accept as fact is tentative – my scientific facts are the best explanation available, with the data we have now.  Critical theorists may develop an explanation that positivists can quantify.  Most of the time that doesn’t occur.  I like my approach better – but “Do you believe in science?” is not a good question.