Monday, April 23, 2018

The Challenges of Measuring Discrimination Against LGBTI Individuals

It seems quite clear (at least to me) that there is often discriminatory feeling against lesbians, gay men, bisexuals, transgender and intersex people. One can also observe a range of survey evidence on outcomes for people in these categories in terms of family life (including marriage and parenthood), education, health, and economic outcomes. But for economists, at least, drawing a firm connection from discrimination to outcomes can be tough. Marie-Anne Valfort has written "LGBTI in OECD Countries: A Review," which appears as OECD Social, Employment and Migration Working Paper No. 198 (June 22, 2017).

The lengthy report pulls together a considerable body of evidence that exists on the topic, and is also clear-eyed and thoughtful about the analytical difficulties that arise in this area. Here, I'll sidestep her discussion of family life, education, and health issues, and focus on economic outcomes.

One problem in this area is limitations on data. In survey data, for example, people give dramatically different answers to whether they identify as LGB, whether they have participated in same-sex sexual behavior, or whether they have sometimes felt a same-sex attraction. If it is hard to define a group, then coming up with summary statistics to characterize outcomes for that group will be difficult. And carrying out studies that seek to isolate the effects of discrimination will be difficult, too.

After reviewing the evidence for the US, where the data is better than in many places, Valfort offers this summary (references to later sections of the paper are omitted from the quotation):
"Tentative but conservative measures suggest that LGBTI stand for a sizeable minority. They represent approximately 4.5% of the total population in the US, a proportion that can be broken down as follows among LGBTI subgroups (bearing in mind that these subgroups partly overlap): 3.5% for lesbians, gay men and bisexuals if one relies on sexual self-identification known to yield lower estimates than sexual behaviour or attraction, 0.6% for transgender people and 1.1% for intersex people."
As Valfort summarizes, there have been three broad ways to look at the extent to which differences across groups are due to discrimination. One approach looks at "observational" data, and tries to adjust for factors that seem likely to matter. For example, one could look at income for people, making a statistical adjustment for levels of education, job experience, age, occupation type, and so on. If there is a wage gap remaining after taking these other factors into account, then there is at least some reason to suspect that discrimination might be an issue. However, drawing firm conclusions from such studies is difficult, for a number of reasons that Valfort describes:
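To make the "observational" approach concrete, here is a minimal sketch of the kind of wage regression involved. The data below is entirely synthetic, and the variable names, coefficients, and the 5% "built-in" group gap are my own illustrative assumptions, not estimates from Valfort's review:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5_000

# Synthetic worker data (purely illustrative, not from the paper).
education = rng.normal(14, 2, n)     # years of schooling
experience = rng.normal(10, 5, n)    # years of experience
minority = rng.random(n) < 0.05      # indicator for the minority group

# Build log wages with a 5% penalty for the minority group baked in.
log_wage = (1.0 + 0.08 * education + 0.02 * experience
            - 0.05 * minority + rng.normal(0, 0.3, n))

# OLS: regress log wage on the controls plus the group dummy.
# The dummy's coefficient is the "wage gap remaining after adjustment."
X = np.column_stack([np.ones(n), education, experience,
                     minority.astype(float)])
coef, *_ = np.linalg.lstsq(X, log_wage, rcond=None)

print(f"estimated group gap: {coef[3]:.3f}")  # close to the true -0.05
```

The bullet points that follow explain why, in real data, this estimated gap can be a misleading measure of discrimination: omitted variables, selective disclosure, and adjustments that "control away" earlier discrimination all bias the dummy's coefficient.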

-- It seems likely that LGBTI people tend to move to places where social acceptance of their group is greater and discrimination is less. "Failing to control for this geographic sorting could therefore lead to conclude that LGBT people do not face discrimination while they actually do, an error better known as the “omitted variables bias”." The underlying problem is that factors not observed in the data can make a difference.

-- Data is weak, and "disclosure of sexual orientation, gender identity or intersex status of LGBTI to their social environment is not a given." As Valfort writes: "In other words, only the most successful gay men and lesbians (those suffering the least from discrimination) may disclose their sexual orientation to the interviewer."

-- Valfort points out that a number of studies measure the LGBTI population indirectly, based on surveys where people say they are living with a same-sex partner. "Put differently, most population-based surveys only allow for identifying partnered homosexuals and comparing how they fare relative to their heterosexual counterparts ... that is surely not representative of the LGBTI population as a whole."

-- Adjusting for other factors isn't as simple as it seems, either. For example, say for the sake of argument that there is discrimination against LGBTI individuals in school, while growing up, and when thinking about occupational possibilities. Then if a researcher comes along later and makes a statistical adjustment for level of education and occupation, that researcher is (in a statistical sense) wiping out any discrimination that occurred at that earlier stage.

-- There is an issue of "household specialization bias." In heterosexual households, it is still fairly common for the man to have a longer-term and heavier-hour commitment to the (paid) labor force than the woman. "In heterosexual households, men are indeed typically more engaged in market activities than are women. Therefore, the average partnered heterosexual man should be more involved in the labour market than the average partnered gay man, while the average partnered heterosexual woman should be less involved in this market than the average partnered lesbian." Thus, findings of a wage penalty for gay men and a wage premium for lesbians are common: "However, multivariate analyses of individual labour earnings with couples-based survey data do not provide results consistent with lower job satisfaction among both gay men and lesbians. These analyses, which amount to 18 studies (26 estimates for gay men and 30 estimates for lesbians) ... reveal an earnings penalty for partnered gay men but an earnings premium (or no effect) for partnered lesbians. ... [T]his pattern is observed irrespective of the country where, or the time when, the data used in these studies were collected. More precisely, partnered gay men suffer an average penalty of 8% while partnered lesbians enjoy an average premium of 7%." Sorting out how to think about this household specialization bias, and how to adjust for it, isn't an easy task.

Another broad approach to looking at discrimination is "experimental" studies. Broadly speaking, these fall into two categories. In "correspondence" studies, researchers send out a bunch of job applications that are meant to be essentially the same, except that some of them have a fairly clear identifier that the applicant is likely to be LGBTI (or in other studies, there will be information to reveal race/ethnicity or male/female). Valfort reports:
"[T]he 13 correspondence studies that have tested for hiring discrimination based on sexual orientation typically point to an unfair treatment of the gay male and lesbian applicants: on average, they are 1.8 times less likely to be called back by the recruiter than are their heterosexual counterparts. For gay men, the heterosexual-to-homosexual callback rates ratio varies from 1.1 (Sweden – Ahmed, Andersson and Hammarstedt (2013b) and the UK - Drydakis (2016)) to 3.7 (Cyprus – Drydakis (2014b)) with an average at 1.9. For lesbians, it varies from 0.9 (Belgium – Baert (2014)) to 4.6 (Cyprus – Drydakis (2014b)) with an average at 1.7. Consistent with attitudes toward gay men being more negative than attitudes toward lesbians, homosexual men face slightly stronger hiring discrimination than do homosexual women."
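Just to pin down the statistic being reported, the heterosexual-to-homosexual callback ratio is computed like this (the counts below are hypothetical, chosen only for illustration):

```python
# Hypothetical correspondence-study counts (illustrative, not Valfort's
# data): each profile sends the same number of essentially identical CVs.
applications = 500

callbacks_hetero = 75   # callbacks to the heterosexual applicant profile
callbacks_gay = 40      # callbacks to the gay applicant profile

rate_hetero = callbacks_hetero / applications   # 0.15
rate_gay = callbacks_gay / applications         # 0.08

# The statistic in the review: heterosexual-to-homosexual callback ratio.
ratio = rate_hetero / rate_gay
print(f"callback ratio: {ratio:.2f}")  # 1.88, i.e. "about 1.9 times less likely"
```

A ratio of 1.0 would mean equal treatment; the review's averages of 1.9 (gay men) and 1.7 (lesbians) mean the heterosexual profile was called back roughly twice as often.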
Such studies offer compelling evidence that discrimination exists, but by the nature of such studies, they can only look at the non-face-to-face part of the job market. As Valfort writes:
"Moreover, this weakness implies that discrimination in the labour market is measured at only one point of an individual’s career, i.e. his/her access to a job interview. It says nothing however about his/her likelihood of being hired, or paid equally and promoted once hired. Nevertheless, audit studies indicate that, conditional on being interviewed, individuals from the minority (i.e. the group that typically receives the lowest rate of invitation to a job interview) are also less likely to be hired (e.g. Cédiey and Foroni (2008)). These findings suggest that correspondence studies underestimate hiring discrimination."
The other experimental approach is "audit" studies, which involve people who have been trained to play the role of a person with a certain background who is applying for a job, or for a mortgage, or trying to rent an apartment, and so on. Audit studies have been a powerful way of revealing racial discrimination in a US context, but they have difficulties. Because they involve real people doing real-life applications and waiting for answers, such studies are often time-consuming and expensive. But they are workable in certain contexts. Valfort gives many examples, but here are two of them:
"Various field experiments have shown that sexual minorities face discrimination in their everyday life. For instance, Jones (1996) sends letters from either a same-sex or opposite-sex couple, requesting weekend reservations for a one-bed room in hotels and bed-and-breakfast establishments in the US. His results show that opposite-sex couples are granted 20% more reservations than both male and female same-sex couples. Similarly, Walters and Curran (1996) conduct an audit study where same-sex and opposite-sex couples enter retail stores in the US while an observer measures the time it takes for the staff to welcome them. They find this time to be significantly less for heterosexual than for homosexual couples, who often were not assisted and who were more likely to be repudiated."
Discrimination can manifest itself in many ways: in social settings, education, health, family life, occupational pressures, job interviews, promotions and wage raises, and more. Understanding where its manifestations are more powerful can be an important step in thinking about how best to address it. 

I do wonder if changes in the legal status of LGBTI individuals may offer a handle for looking at different types of discrimination. For example, the number of same-sex marriages after legalization reveals something about how many such marriages would have been blocked earlier. Similarly, changes in occupations and pay patterns that happen after legal changes will reveal something about earlier patterns of discrimination, too.

Those interested in this subject might also want to check the post on "Some Patterns for Same-Sex Households" (February 19, 2018).

Saturday, April 21, 2018

Most Global Violent Deaths are Murder, Not War

I did not know that, by a wide margin, most violent deaths in the world are the result of murder, not war. The pattern is reported in Global Violent Deaths 2017: Time to Decide, by Claire Mc Evoy and Gergely Hideg. It's a report from the Small Arms Survey, a research center at the Graduate Institute of International and Development Studies in Geneva, Switzerland. The report notes:
"In 2016, interpersonal and collective violence claimed the lives of 560,000 people around the world. About 385,000 of them were the victims of intentional homicides, 99,000 were casualties of war, and the rest died in unintentional homicides or due to legal interventions. ...

"In 2016, firearms were used to kill about 210,000 people—38 per cent of all victims of lethal violence. About 15 per cent of these individuals died in direct conflict, while the majority fell victim to intentional homicide (81 per cent). ...

"In terms of homicides alone, states could save up to 825,000 lives between 2017 and 2030 if they gradually stepped up their approach to crime control and prevention to reach the violence reduction levels of the top performers in their respective world regions. In so doing, states in the subregion of Latin America and the Caribbean would benefit most, saving as many as 489,000 lives in total by 2030, followed by states in South-eastern Asia (86,000 lives) and Eastern Africa (56,000 lives) ..."
This report doesn't present country-by-country data on homicide rates. But the World Bank DataBank website tabulates country-by-country rates of intentional homicide, using data from the UN Office on Drugs and Crime's International Homicide Statistics database. In 2015, for example, the global intentional homicide rate in this dataset was 5.3 per 100,000, while the US intentional homicide rate was 4.9 per 100,000.
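As a back-of-envelope check on these figures (the world-population number is my own assumption, roughly 7.4 billion in 2016):

```python
# Figures from the Small Arms Survey report, plus an assumed world
# population of about 7.4 billion for 2016.
world_population = 7.4e9
intentional_homicides = 385_000
violent_deaths = 560_000
firearm_deaths = 210_000

rate_per_100k = intentional_homicides / world_population * 100_000
firearm_share = firearm_deaths / violent_deaths

print(f"homicide rate: {rate_per_100k:.1f} per 100,000")  # about 5.2
print(f"firearm share: {firearm_share:.1%}")  # 37.5%, reported as 38 per cent
```

The implied global homicide rate of roughly 5.2 per 100,000 lines up closely with the 5.3 per 100,000 from the World Bank/UNODC data for 2015.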

For the situation of deaths in armed conflict, the SAS report shows that in recent years by far the largest share are represented by events in Syria, Iraq, and Afghanistan.

Homage: I ran into the SAS report because it was a lead story in the April 5 issue of the Economist magazine.

Friday, April 20, 2018

The Clean Cooking Problem: 2.8 Million Deaths Annually

"Today around 2.8 billion people – 38% of the global population and almost 50% of the population in developing countries – lack access to clean cooking. Most of them cook their daily meals using solid biomass in traditional stoves. In 25 countries, mostly in sub-Saharan Africa, more than 90% of households rely on wood, charcoal and waste for cooking. Collecting this fuel requires hundreds of billions of hours each year, disproportionately affecting women and children. Burning it creates noxious fumes linked to 2.8 million premature deaths annually."

Thus reports "Chapter 3: Access to Clean Cooking," from Energy Access Outlook 2017: From Poverty to Prosperity, published in October 2017 by the International Energy Agency and the OECD. The report continues:
"Progress on access to clean cooking has been gathering momentum in parts of Asia, backed by targeted policies focussed mainly on the use of LPG [liquified petroleum gas]. In China, the share of the population relying on solid fuels for cooking declined from over one-half in 2000 to one-third in 2015. In Indonesia, the share of the population using solid biomass and kerosene fell from 88% in 2000 to 32% in 2015. Despite these efforts, the number of people without clean cooking access has stayed flat since 2000, with population growth outstripping progress in many countries. In sub-Saharan Africa, there were 240 million more people relying on biomass for cooking in 2015 compared to 2000."

The report estimates that an additional investment of $42 billion, above and beyond what is already happening, would be needed by 2030 to provide access to clean cooking for the 2.3 billion people who otherwise will not have it by that time. At one level, $42 billion is a lot of money; at another level, it's almost an absurdly cheap price to pay for the potential benefits.
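The "absurdly cheap" point is easy to verify with the report's own numbers:

```python
# Figures from the Energy Access Outlook 2017 report.
additional_investment = 42e9   # USD, cumulative extra investment by 2030
people_without_access = 2.3e9  # people otherwise lacking clean cooking

cost_per_person = additional_investment / people_without_access
print(f"about ${cost_per_person:.0f} per person, cumulative through 2030")
```

That works out to roughly $18 per person, spread over more than a decade.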

Other chapters of the report have a useful overview of the progress toward all people having access to electricity. The big success story in the last 20 years or so is India. The lagging region is sub-Saharan Africa.

Thursday, April 19, 2018

A Classic Question: Does Government Empower or Stifle?

If you look at the high-income countries of the world--the US and Canada, much of Europe, Japan, Australia--all of them have governments that spend amounts equal to one-third or more of GDP (combining both central and regional or local government). Apparently, high-income countries have relatively large governments. Conversely, when you look at some of the world's most discouraging and dismal economic situations--say, Zimbabwe, North Korea, or Venezuela--it seems clear that the decisions of the government have played a large role in their travails. So arises a classic question: In what situations and with what rules does government empower its people and economy, and in what situations and with what rules does the government stifle them?

Like all classic questions, only those who haven't thought about it much will offer you an easy answer. Peter Boettke instead offers a thoughtful exploration of many of the complexities and tradeoffs in his Presidential Address to the Southern Economic Association, "Economics and Public Administration," available in the April 2018 issue of the Southern Economic Journal (84:4, pp. 938-959).

Boettke offers a reminder that a number of prominent economists have pondered the issue of how states can empower or become predatory. For example, here are reminders from a couple of Nobel laureates:
"Douglass North [Nobel '93] in Structure and Change in Economic History (1981) ... said that the state, with its ability to define and enforce property rights, can provide the greatest impetus for economic development and human betterment, but can also be the biggest threat to development and betterment through its predatory capacity. James Buchanan [Nobel '86] in Limits of Liberty (1975) stated the dilemma that must be confronted as follows—the constitutional contract must be designed in such a way that empowers the protective state (law and order) and the productive state (public goods) while constraining the predatory state (redistribution and rent-seeking). If the constitutional contract cannot be so constructed, then economic development and human betterment will not follow."
Although Boettke doesn't make the point here, the authors of the US Constitution struggled as well with the idea that government was an absolute necessity, but finding a way for government to be controlled was also a necessity. As James Madison wrote in Federalist #51:
"If men were angels, no government would be necessary. If angels were to govern men, neither external nor internal controls on government would be necessary. In framing a government which is to be administered by men over men, the great difficulty lies in this: you must first enable the government to control the governed; and in the next place oblige it to control itself. A dependence on the people is, no doubt, the primary control on the government; but experience has taught mankind the necessity of auxiliary precautions."
This challenge of building a government that is strong, but not too strong, and strong only in certain ways while remaining weak in others, is not just a matter of writing up a constitution or designing a government. Plenty of governments act oppressively at times, or even a majority of the time, while having the form of elections and constitutional rights. The heart of the issue, Boettke argues, runs deeper than the formal structures of government, down to the bedrock of the social institutions on which these forms of government are based. He writes:
"The observational genius of the 20th century Yogi Berra once captured the essence of this argument while watching a rookie ball player attempting to imitate the batting stance of Frank Robinson, the recent triple crown winner, when he advised, “if you can't imitate him, don't copy him.” ... The countries plagued by poverty cannot simply copy the governmental institutions of those that are not so plagued by poverty. They are constrained at any point in time by the existing institutional possibilities frontier, and thus must shift the institutional possibilities frontier as technology and human capital adjust to find the constitutional contract that can effectively empower the protective and productive state, while effectively constraining the predatory state."
Economists have often ducked or assumed this question of institution building. For example, most of the arguments that economists make about how markets function, or about how self-interested sellers and buyers may act as if ruled by an "invisible hand" to promote social welfare, are based on the assumption that a decently functioning government is hovering in the background. Boettke refers to an essay by Lionel Robbins and writes:
"Adam Smith and his contemporaries never argued that the individual pursuit of self-interest will always and everywhere result in the public interest, but rather that the individual pursuit of self-interest within a specific set of institutional arrangements— namely well-defined and enforced private property rights—would produce such a result. Though as Robbins (ibid, p. 12) writes, “You cannot understand their attitude to any important concrete measure of policy unless you understand their belief with regard to the nature and effects of the system of spontaneous-cooperation.” The system of spontaneous-cooperation, or economic freedom, does not come about absent a “firm framework of law and order.” The “invisible hand,” according to the classical economists, “is not the hand of some god or some natural agency independent of human effort; it is the hand of the lawgiver, the hand which withdraws from the sphere of the pursuit of self-interest those possibilities which do not harmonize with the public good” (Robbins 1965, p. 56).
"In other words, the market mechanism works as described in the theory of the “invisible hand” because an institutional configuration was provided for by a prior Non-Market Decision Making process. The correct institutions of governance must be in place for economic life to take place (within those institutions)."
When we move outside the realm of market transactions set against a backdrop of decently functioning government, social scientists find it harder to draw conclusions. "But what happens when we move outside the realm of the market economy? Public administration begins where the realm of rational economic calculation ends."

On one side, decisions made by public administration are unlikely to involve competitive producers, choices made by consumers between those producers, or a price mechanism. Nonetheless, public decisions still have tradeoffs, and still face questions of whether the marginal benefits of a certain action (or a change in spending) will outweigh the marginal costs.

Moreover, we know from sad experience that public administration is subject to special interest pressures and being captured by those who are supposedly the subjects of the regulation. We know that a number of politicians and government workers (no need to quibble over the exact proportion) put a high priority on pursuing their own personal career self-interest. We know that when a private sector firm fails to provide what customers want, it goes broke and is replaced by other firms, but that when a part of government fails badly in providing what citizens want, the part of government does not disappear and instead typically claims that failure is a reason for giving it more resources to do the job. 

One approach to all these issues is to take what Boettke calls "the God's-eye-view assumption," in which the all-seeing, all-wise, and all-beneficent economist can see the path that must be taken. But if you instead are skeptical of economists (and others involved in politics), then Boettke points out that some questions about public administration must be faced.
"Those who favor public administration over the market mechanism must at least acknowledge the question raised earlier—how is government going to accomplish the task of economic management? What alternative mechanisms in public administration will serve the role that property, prices and profit and loss serve within the market setting?
"Let us consider the following example—a vacant piece of land in a down-town area of a growing city. The plot of land could be used as a garage, which would complement efforts to develop commercial life downtown. Or, it could be used to build a park, encouraging city residents to enjoy green space and outdoor activities. Alternatively, it could be used to locate a school which would help stimulate investment in human capital. All three potential uses are worthy endeavors. If this was to be determined by the market, then the problem would be solved via the price mechanism and the willingness and the ability to pay. But if led by government, the use of this land will need to be determined by public deliberation and voting. We cannot just assume that the “right” decision on the use of this public space will be made in the public arena. In fact, due to a variety of problems associated with preference aggregation mechanisms, we might have serious doubts as to any claim of “efficiency” in such deliberations. ...

"More recently, Richard Wagner, in Politics as a Peculiar Business (2016, p. 146ff), uses the example of a marina surrounded by shops, hotels, and restaurants—think of Tampa, Florida. The marina, shops, hotels, and restaurants operate on market principles, but the maintenance of the roads and waterways are objects of collective decision making. Road maintenance and waterway dredging, for example, will be provided by government bureaus, but how well those decisions are made will have an impact on the operation of the commercial enterprises, and the viability of the commercial enterprises will no doubt have influence on the urgency and care of these bureaucratic efforts."
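The "problems associated with preference aggregation mechanisms" can be made concrete with the garage/park/school example. Here is a minimal sketch of a Condorcet cycle, with three hypothetical voter blocs whose sizes and rankings I chose purely for illustration:

```python
# Three hypothetical voter blocs, each with a ranked preference over the
# land uses in Boettke's example (rankings are mine, chosen to show the
# problem, not taken from the paper).
blocs = {
    ("garage", "park", "school"): 34,   # bloc size in votes
    ("park", "school", "garage"): 33,
    ("school", "garage", "park"): 33,
}
total_votes = sum(blocs.values())

def majority_prefers(a, b):
    """True if a strict majority ranks option a above option b."""
    votes_for_a = sum(n for ranking, n in blocs.items()
                      if ranking.index(a) < ranking.index(b))
    return votes_for_a > total_votes / 2

# Pairwise majority votes produce a cycle: no option beats both others.
print(majority_prefers("garage", "park"))    # True (67 of 100 votes)
print(majority_prefers("park", "school"))    # True (67 of 100 votes)
print(majority_prefers("school", "garage"))  # True (66 of 100 votes)
```

With these preferences, majority voting yields garage over park, park over school, and school over garage, so "the majority's choice" is not even well-defined, which is one reason to be skeptical of any easy claim of "efficiency" in public deliberation.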
Boettke argues that "the idea of a unitary state populated by omniscient and benevolent expert bureaucrats" should be rejected. He also argues that economists (and other social scientists) can be prone to casting themselves in the role of these omniscient and benevolent experts. He quotes from near the beginning of James Buchanan's 1986 Nobel lecture: “Economists should cease proffering policy advice as if they were employed by a benevolent despot, and they should look to the structure within which political decisions are made.”

We live in a complex world, and there absolutely is a need for expert advice in many areas. But there is also a crying need for experts to go beyond arguing with each other, or insulting the opposition, or attempting to get a grip on the levers of political power. There is a need for economists and other experts to participate in and to respect a broader process of institution-building and social consensus. (In a small way, this "Conversable Economist" blog is an attempt to broaden the social conversation in a way that includes expert insight without overly deferring to it.)

Boettke cites some comments from yet another Nobel laureate along these lines: 
"Elinor Ostrom concludes her 2009 Nobel lecture by summarizing the main lessons learned in her intellectual journey, and they are that we must “move away from the presumption that the government must” solve our problems, that “humans have a more complex motivational structure and more capability to solve social dilemmas” than traditional theory suggests, and that “a core goal of public policy should be to facilitate the development of institutions that bring out the best in humans ...” Self-governing democratic societies are fragile entities that require continual reaffirmation by fallible but capable human beings. “We need to ask,” Elinor Ostrom continued, “how diverse polycentric institutions help or hinder the innovativeness, learning, adapting, trustworthiness, levels of cooperation of participants, and the achievement of a more effective, equitable and sustainable outcomes at multiple scales."

Wednesday, April 18, 2018

Global Debt Hits All-Time High

"At $164 trillion—equivalent to 225 percent of global GDP—global debt continues to hit new record highs almost a decade after the collapse of Lehman Brothers. Compared with the previous peak in 2009, the world is now 12 percent of GDP deeper in debt, reflecting a pickup in both public and nonfinancial private sector debt after a short hiatus (Figure 1.1.1). All income groups have experienced increases in total debt but, by far, emerging market economies are in the lead. Only three countries (China, Japan, United States) account for more than half of global debt (Table 1.1.1)—significantly greater than their share of global output."

Thus notes the IMF in the April 2018 issue of the Fiscal Monitor (Chapter 1: "Saving for a Rainy Day," Box 1.1; as usual, citations are omitted from the quotation above for readability). Here's the figure and the table mentioned in the quotation.
The figure shows public debt in blue and private debt in red. In some ways, the recent increase doesn't stand out dramatically on the figure. But remember that the vertical axis is measured as a percentage of world GDP (roughly $73 trillion, given that the $164 trillion in debt amounts to 225 percent of GDP), so the rising percentage represents a considerable sum.

Here's an edited version of the table, where I cut a column for 2015. The underlying source is the same as for the figure above. As noted above, the US, Japan, and China together account for more than half of total global debt.

The rise in debt in China is clearly playing a substantial role here. Explicit central government debt in China is not especially high. But corporate debt in China has risen quickly: as the IMF notes of the period since 2009, "China alone explains almost three-quarters of the increase in global private debt."

In addition, China faces a surge of off-budget borrowing from financing vehicles used by local governments, which often feel themselves under pressure to boost their local economic growth. The IMF explains: 
 "The official debt concept [in China] points to a stable debt profile over the medium term at about 40 percent of GDP. However, a broader concept that includes borrowing by local governments and their financing vehicles (LGFVs) shows debt rising to more than 90 percent of GDP by 2023 primarily driven by rising off-budget borrowing. Rating agencies lowered China’s sovereign credit ratings in 2017, citing concerns with a prolonged period of rapid credit growth and large off-budget spending by LGFVs.
"The Chinese authorities are aware of the fiscal risks implied by rapidly rising off-budget borrowing and undertook reforms to constrain these risks. In 2014, the government recognized as government obligations two-thirds of legacy debt incurred by LGFVs (22 percent of GDP). In 2015, the budget law was revised to officially allow provincial governments to borrow only in the bond market, subject to an annual threshold. Since then, the government has reiterated the ban on off-budget borrowing by local governments, while more strictly regulating the role of the government in public-private partnerships and holding local officials accountable for improper borrowing. Given these measures, the authorities do not consider the LGFV off-budget borrowing as a government obligation under applicable laws.
"There is some uncertainty regarding the degree to which these measures will effectively curb off-budget borrowing. "
An underlying theme of the IMF report is that when an economy is in relatively good times, like the US economy today, it should be figuring out ways to put its borrowing on a downward trend for the next few years. A similar lesson applies to China, where there appears to be some danger that the high levels of borrowing from firms and from local governments are creating future risks.

One old lesson re-learned in the global financial crisis is that high levels of debt can be dangerous. If stock prices rise and then fall, investors will be unhappy that they lost their gains--but for many of them, the gains were only on paper, anyway. But debt is different. If circumstances arise in which debts are less likely to be repaid, then financial institutions may well find it hard to raise capital, and will be pressured to cut back on lending. If borrowing was helping to hold asset prices high (including housing, land, or stocks), then a decline in borrowing can cause those asset prices to drop. Lower asset prices make it harder to repay borrowed money, tightening the financial crunch and slowing an economy further.

When global debt as a share of GDP is hitting an all-time high, it's worth paying attention to the risks involved.

Tuesday, April 17, 2018

Some Economics for Tax Filing Day

U.S. tax returns and taxes owed for 2017 are due today, April 17. To commemorate, I offer some connections to five posts about federal income taxes from the last few years. Click on the links if you'd like additional discussion and sources for any of these topics.

1) Should Individual Income Tax Returns be Public Information? (March 30, 2015)
"My guess is that if you asked Americans if their income taxes should be public information, the answers would mostly run the spectrum from "absolutely not" to "hell, no." But the idea that tax returns should be confidential and not subject to disclosure was not a specific part of US law until 1976. At earlier periods of US history, tax returns were sometimes published in newspapers or posted in public places. Today, Sweden, Finland, Iceland and Norway have at least some disclosure of tax returns--and since 2001 in Norway, you can obtain information on income and taxes paid through public records available online."

2) How much does the federal tax code reduce income inequality, in comparison with  social insurance spending and means-tested transfers? 

"The Distribution and Redistribution of US Income" (March 20, 2018) is based on a report from the Congressional Budget Office, "The Distribution of Household Income, 2014" (March 2018).

From the post: "The vertical axis of the figure is a Gini coefficient, which is a common way of summarizing the extent of inequality in a single number. A coefficient of 1 would mean that one person owned everything. A coefficient of zero would mean complete equality of incomes.

"In this figure, the top line shows the Gini coefficient based on market income, rising over time.

"The green line shows the Gini coefficient when social insurance benefits are included: Social Security, the value of Medicare benefits, unemployment insurance, and worker's compensation. Inequality is lower with such benefits taken into account, but still rising. It's worth remembering that almost all of this change is due to Social Security and Medicare, which is to say that it is a reduction in inequality because of benefits aimed at the elderly.

"The dashed line then adds a reduction in inequality due to means-tested transfers. As the report notes, the largest of these programs are "Medicaid and the Children’s Health Insurance Program (measured as the average cost to the government of providing those benefits); the Supplemental Nutrition Assistance Program (formerly known as the Food Stamp program); and Supplemental Security Income." What many people think of as "welfare," which used to be called Aid to Families with Dependent Children (AFDC) but for some years now has been called Temporary Assistance to Needy Families (TANF), is included here, but it's smaller than the programs just named.

"Finally, the bottom purple line also includes the reduction in inequality due to federal taxes, which here includes not just income taxes, but also payroll taxes, corporate taxes, and excise taxes."

3) "How Raising the Top Tax Rate Won't Much Alter Inequality" (October 23, 2015)

"Would a significant increase in the top income tax rate substantially alter income inequality?" William G. Gale, Melissa S. Kearney, and Peter R. Orszag ask the question in a very short paper of this title published by the Economic Studies Group at the Brookings Institution. Their perhaps surprising answer is "no."

The Gale, Kearney, Orszag paper is really just a set of illustrative calculations, based on the well-respected microsimulation model of the tax code used by the Tax Policy Center. Here's one of the calculations. Say that we raised the top income tax bracket (that is, the statutory income tax rate paid on a marginal dollar of income earned by those at the highest levels of income) from the current level of 39.6% up to 50%. Such a tax increase looks substantial when expressed in absolute dollars. By their calculations, "A larger hike in the top income tax rate to 50 percent would result, not surprisingly, in larger tax increases for the highest income households: an additional $6,464, on average, for households in the 95-99th percentiles of income and an additional $110,968, on average, for households in the top 1 percent. Households in the top 0.1 percent would experience an average income tax increase of $568,617."

In political terms, at least, this would be a very large boost. How much would it affect inequality of incomes? To answer this question, we need a shorthand way to measure inequality, and a standard tool for this purpose is the Gini coefficient. This measure runs from 0 in an economy where all incomes are equal to 1 in an economy where one person receives all income (a more detailed explanation is available here). For some context, the Gini coefficient for the US distribution of pre-tax income is .610. After current tax rates are applied, the Gini coefficient for after-tax income is .575.

If the top tax bracket rose to 50%, then according to the first round of Gale, Kearney, Orszag calculations, the Gini coefficient for after-tax income would barely fall, dropping to .571. For comparison, the Gini coefficient for inequality of earnings back in 1979, before inequality had started rising, was .435. ... 

Raising the top income tax rate to 50% brings in less than $100 billion per year. Total federal spending in 2015 seems likely to run around $3.8 trillion. So it would be fair to say that raising the top income tax rate to 50% might increase federal revenues by an amount equal to about 2% of total federal spending.

4) The top marginal income tax rates used to be a lot higher, but what share of taxpayers actually faced those high rates, and how much revenue did those higher rates actually collect?

Compare "Top Marginal Tax Rates: 1958 vs. 2009" (March 16, 2012), which is based on a short report by Daniel Baneman and Jim Nunns,"Income Tax Paid at Each Tax Rate, 1958-2009," published by the Tax Policy Center. The top statutory tax rate in 2009 was 35%; back in 1958, it was about 90%. What share of taxpayer returns paid these high rates? Across this time period, roughly 20% of all tax returns owed no tax, and so faced a marginal tax rate of zero percent. Back in 1958, the most common marginal tax brackets faced by taxpayers were in the 16-28% category; since the mid-1980s, the most common marginal tax rate faced by taxpayers has been the 1-16% category. Clearly, a very small proportion of taxpayers actually faced the very highest marginal tax rates.

How much revenue was raised by the highest marginal tax rates? Although the highest marginal tax rates applied to a tiny share of taxpayers, marginal tax rates above 39.7% collected more than 10% of income tax revenue back in the late 1950s. It's interesting to note that the share of income tax revenue collected by those in the top brackets for 2009--that is, the 29-35% category--is larger than the share collected by all marginal tax brackets above 29% back in the 1960s.

5) Did you know "How Milton Friedman Helped to Invent Tax Withholding" (April 12, 2014)?

The great economist Milton Friedman--known for his pro-market, limited government views--helped to invent government withholding of income tax. It happened early in his career, when he was working for the U.S. government during World War II, and the top priority was to raise government revenues to support the war effort. Of course, the IRS opposed the idea at the time as impractical.

Monday, April 16, 2018

The Share of Itemizers and the Politics of Tax Reform

Those who fill out a US tax return always face a choice. On one hand, there is a "standard deduction," which is the amount you deduct from your income before calculating your taxes owed on the rest. On the other hand, there is a group of individual tax deductions: for mortgage interest, state and local taxes, high medical expenses, charitable contributions, and others. If the sum of all these deductions is larger than the standard deduction, then a taxpayer will "itemize" deductions--that is, fill out additional tax forms that list all the deductions individually. Conversely, if the sum of the individual deductions is not larger than the standard deduction, then the taxpayer just uses the standard deduction, and doesn't go through the time and bother of itemizing.
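The itemize-or-not decision is just a comparison of two numbers. A minimal sketch, with hypothetical dollar amounts:

```python
# Sketch of the itemize-or-not decision; all dollar figures are hypothetical.
def should_itemize(deductions, standard_deduction):
    """A taxpayer itemizes only when itemized deductions exceed the standard one."""
    return sum(deductions.values()) > standard_deduction

filer = {"mortgage_interest": 8_000, "state_local_taxes": 6_000, "charity": 1_500}
print(should_itemize(filer, standard_deduction=24_000))  # False: 15,500 < 24,000
```

Raising the standard deduction flips many filers from itemizing to taking the standard deduction, without changing their individual deductions at all.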

In the last 20 years or so, typically about 30-35% of federal tax returns found it worthwhile to itemize deductions.
But the Tax Cuts and Jobs Act, signed into law by President Trump in December 2017, will change this pattern substantially. The standard deduction increases substantially, while limits or caps are imposed on some prominent deductions. As a result, the number of taxpayers who will find it worthwhile to itemize will drop sharply.

Simulations from the Tax Policy Center, for example, suggest that the total number of itemizers will fall by almost 60%, from 46 million to 19 million -- which means that in next year's taxes, maybe only about 11% of all returns will find it worthwhile to itemize.

Set aside all the arguments over pros and cons and distributional effects of the changes in the standard deduction and the individual deductions, and focus on the political issue. It seems to me that this dramatic fall in the number of itemizers, especially if it is sustained for a few years, will realign the political arguments over future tax reform. If one-third or so of taxpayers are itemizing--and those who itemize are typically those with high incomes and high deductions who make a lot of noise--then reducing deductions will be politically tough. But if only one-ninth of taxpayers are itemizing, while eight-ninths are just taking the standard deduction, then future reductions in the value of tax deductions may be easier to carry out. It will be interesting to see if the political dynamics of tax reform shift along these lines in the next few years.

When Britain Repealed Its Income Tax in 1816

Great Britain first had an income tax in 1799, but then abolished it in 1816. In honor of US federal tax returns being due tomorrow, April 17, here's a quick synopsis of the story.

Great Britain was in an on-and-off war with France for much of the 1790s. The British government borrowed heavily and was short of funds. When Napoleon came to power in 1799, the government under Prime Minister William Pitt introduced a temporary income tax. Here's a description from the website of the British National Archives:
‘Certain duties upon income’ as outlined in the Act of 1799 were to be the (temporary) solution. It was a tax to beat Napoleon. Income tax was to be applied in Great Britain (but not Ireland) at a rate of 10% on the total income of the taxpayer from all sources above £60, with reductions on income up to £200. It was to be paid in six equal instalments from June 1799, with an expected return of £10 million in its first year. It actually realised less than £6 million, but the money was vital and a precedent had been set.
In 1802 Pitt resigned as Prime Minister over the question of the emancipation of Irish catholics, and was replaced by Henry Addington. A short-lived peace treaty with Napoleon allowed Addington to repeal income tax. However, renewed fighting led to Addington’s 1803 Act which set the pattern for income tax today. ...

Addington’s Act for a ‘contribution of the profits arising from property, professions, trades and offices’ (the words ‘income tax’ were deliberately avoided) introduced two significant changes:
  • Taxation at source - the Bank of England deducting income tax when paying interest to holders of gilts, for example
  • The division of income taxes into five ‘Schedules’ - A (income from land and buildings), B (farming profits), C (public annuities), D (self-employment and other items not covered by A, B, C or E) and E (salaries, annuities and pensions).
 Although Addington’s rate of tax was half that of Pitt’s, the changes ensured that revenue to the Exchequer rose by half and the number of taxpayers doubled. In 1806 the rate returned to the original 10%.
Pitt in opposition had argued against Addington’s innovations: he adopted them almost unchanged, however, on his return to office in 1805. Income tax changed little under various Chancellors, contributing to the war effort up to the Battle of Waterloo in 1815.
Perhaps unsurprisingly, Britain's government was not enthusiastic about repealing the income tax even after the defeat of Napoleon. But there was an uprising of taxpayers. The website of the UK Parliament described it this way:
"The centrepiece of the campaign was a petition from the City of London Corporation. In a piece of parliamentary theatre, the Sheriffs of London exercised their privilege to present the petition from the City of London Corporation in person. They entered the Commons chamber wearing their official robes holding the petition.
"The petition reflected the broad nature of the opposition to renewing the tax. Radicals had long complained that ordinary Britons (represented by John Bull in caricatures) had borne the brunt of wartime taxation. Radicals argued that the taxes were used to fund 'Old Corruption', the parasitic network of state officials who exploited an unrepresentative political system for their own interests.
"However, the petitions in 1816 came from very different groups, including farmers, businessmen and landowners, who were difficult for the government to dismiss. Petitioners, such as Durham farmers, claimed they had patriotically paid the tax during wartime with 'patience and cheerfulness', distancing themselves from radical critics of the government.
"In barely six weeks, 379 petitions against renewing the tax were sent to the House of Commons. MPs took the opportunity when presenting these petitions, to highlight the unpopularity of the tax with their constituents and the wider public. ... Ministers were accused of breaking the promise made in 1799 when the tax was introduced as a temporary, wartime measure and not as a permanent tax. The depressed state of industry and agriculture was blamed on heavy taxation.
"The tax was also presented as a foreign and un-British measure that allowed the state to snoop into people's finances. As the City of London petition complained, it was an 'odious, arbitrary, and detestable inquisition into the most private concerns and circumstances of individuals'."
Also unsurprisingly, the repeal of the income tax led the British government to raise other taxes instead. The BBC writes: "Forced to make up the shortfall in revenue, the Government increased indirect taxes, many of which, for example taxes on tea, tobacco, sugar and beer, were paid by the poor. Between 1811 and 1815 direct taxes - land tax, income tax, all assessed taxes - made up 29% of all government revenue. Between 1831 and 1835 it was just 10%."

There's a story that when Britain repealed its income tax in 1816, Parliament ordered that the records of tax be destroyed, so posterity would never learn about it and be tempted to try again. The BBC reports:
"Income tax records were then supposedly incinerated in the Old Palace Yard at Westminster. Whether this bonfire really took place we can't say. Several historians who have studied the period refer to the event as a story or legend that may have been true. Perhaps the most convincing evidence are reports that, in 1842, when Peel re-introduced income tax, albeit in a less contentious form, the records were no longer available. Another story is that those burning the records were unaware of the fact that duplicates had been sent for safe-keeping to the King's Remembrancer. They were then put into sacks and eventually surfaced in the Public Records Office."

Friday, April 13, 2018

The Global Rise of Internet Access and Digital Government

What happens if you mix government and the digital revolution? The answer is Chapter 2 of the April 2018 IMF publication Fiscal Monitor, called "Digital Government." The report offers some striking insights about access to digital technology in the global economy and how government may use this technology.

Access to digital services is rising fast in developing countries, especially in the form of mobile phones; indeed, mobile phone access appears to be on its way to outstripping access to water, electricity, and secondary schools.

Of course, there are substantial portions of the world population not connected as yet, especially in Asia and Africa.
The focus of the IMF chapter is on how digital access might improve the basic functions of government: taxes and spending. On the tax side, for example, taxes levied at the border on international trade, or value-added taxes, can function much more simply as records become digitized. Income taxes can be submitted electronically. The government can use electronic records to search for evidence of tax evasion and fraud.

On the spending side, many developing countries experience a situation in which those with the lowest income levels don't receive government benefits to which they are entitled by law, either because they are disconnected from the government or because there is a "leakage" of government spending to others.  The report cites evidence along these lines:
"[D]igitalizing government payments in developing countries could save roughly 1 percent of GDP, or about $220 billion to $320 billion in value each year. This is equivalent to 1.5 percent of the value of all government payment transactions. Of this total, roughly half would accrue directly to governments and help improve fiscal balances, reduce debt, or finance priority expenditures, and the remainder would benefit individuals and firms as government spending would reach its intended targets (Figure 2.3.1). These estimates may underestimate the value of going from cash to digital because they exclude potentially significant benefits from improvements in public service delivery, including more widespread use of digital finance in the private sector and the reduction of the informal sector."
I'll also add that the IMF is focused on potential gains from digitalization, which is  fair enough. But this chapter doesn't have much to say about potential dangers of overregulation, over-intervention, over-taxation, and even outright confiscation that can arise when certain governments gain extremely detailed access to information on sales and transactions. 

Thursday, April 12, 2018

State and Local Spending on Higher Education

"Everyone" knows that the future of the US economy depends on a well-educated workforce, and on a growing share of students achieving higher levels of education. But state spending patterns on higher education aren't backing up this belief. Here are some figures from the SHEF 2017: State Higher Education Finance report published last month by the State Higher Education Executive Officers Association.

The bars in this figure show per-student spending on public higher education by state and local government from all sources of funding, with the lower blue part of the bar showing government spending and the upper green part of the bar showing spending based on tuition revenue from students. The red line shows enrollments in public colleges, which have gone flat or even declined a little since the Great Recession.

This figure clarifies a pattern that is apparent from the green bars in the above figure: the share of spending on public higher education that comes from tuition has been rising. It was around 29-31% of total spending in the 1990s, up to about 35-36% in the middle of the first decade of the 2000s, and in recent years has been pushing 46-47%. That's a big shift in a couple of decades.
The reliance on tuition for state public education varies wildly across states, with less than 15% of total spending on public higher ed coming from tuition in Wyoming and California, and 70% or more of total spending on public higher education coming from tuition in Michigan, Colorado, Pennsylvania, Delaware, New Hampshire, and Vermont.

There are lots of issues in play here: competing priorities for state and local spending, rising costs of higher education, the returns from higher education that encourage students (and their families) to pay for it, and so on. For the moment, I'll just say that it doesn't seem like a coincidence that the tuition share of public higher education costs is rising at the same time that enrollment levels are flat or declining. 

Wednesday, April 11, 2018

US Mergers and Antitrust in 2017

Each year the Federal Trade Commission and the Department of Justice Antitrust Division publish the Hart-Scott-Rodino Annual Report, which offers an overview of merger and acquisition activity and antitrust enforcement during the previous year. The Hart-Scott-Rodino legislation requires that all mergers and acquisitions above a certain size--now set at $80.8 million--be reported to the antitrust authorities before they occur. The report thus offers an overview of recent merger and antitrust activity in the United States.

For example, here's a figure showing the total number of mergers and acquisitions reported. The total has been generally rising since the end of the Great Recession in 2009, and there was a substantial jump from 1,832 transactions in 2016 to 2,052 transactions in 2017. Just before the Great Recession, the number of merger transactions peaked at 2,201, so the current level is high but not unprecedented.

The report also provides a breakdown on the size of mergers. Here's what it looked like in 2017. As the figure shows, there were 255 mergers and acquisitions of more than $1 billion. 

After a proposed merger is reported, the FTC or the US Department of Justice can issue a "second request" if it perceives that the merger might raise some anticompetitive issues. In the last few years, about 3-4% of the reported mergers get this "second request." 

This percentage may seem low, but it's not clear what level is appropriate. After all, the US government isn't second-guessing whether mergers and acquisitions make sense from a business point of view. It's only asking whether the merger might reduce competition in a substantial way. If two companies that aren't directly competing with each other combine, or if two companies combine in a market with a number of other competitors, the merger/acquisition may turn out well or poorly from a business point of view, but it is less likely to raise competition issues.

Teachers of economics may find the report a useful place to come up with some recent examples of antitrust cases, and there are also links to some of the underlying case documents and analysis (which students can be assigned to read). Here are a few examples from 2017 cases of the Antitrust Division at the US Department of Justice and the Federal Trade Commission. In the first one, a merger was blocked because it would have reduced competition for disposal of low-level radioactive waste. In the second, a merger between two movie theater chains was allowed only after a number of conditions aimed at preserving competition in local markets were met. The third case involved a proposed merger between the two largest providers of paid daily fantasy sports contests, and the two firms decided to drop the merger after it was challenged.
In United States v. Energy Solutions, Inc., Rockwell Holdco, Inc., Andrews County Holdings, Inc. and Waste Control Specialists, LLC, the Division filed suit to enjoin Energy Solutions, Inc. (ES), a wholly-owned subsidiary of Rockwell Holdco, Inc., from acquiring Waste Control Specialists LLC (WCS), a wholly-owned subsidiary of Andrews County Holdings, Inc. The complaint alleged that the transaction would have combined the only two licensed commercial low-level radioactive waste (LLRW) disposal facilities for 36 states, Puerto Rico and the District of Columbia. There are only four licensed LLRW disposal facilities in the United States. Two of these facilities, however, did not accept LLRW from the relevant states. The complaint alleged that ES’s Clive facility in Utah and WCS’s Andrews facility in Texas were the only two significant disposal alternatives available in the relevant states for the commercial disposal of higher-activity and lower-activity LLRW. At trial, one of the defenses asserted by the defendants was that that WCS was a failing firm and, absent the transaction, its assets would imminently exit the market. The Division argued that the defendants did not show that WCS’s assets would in fact imminently exit the market given its failure to make good-faith efforts to elicit reasonable alternative offers that might be less anticompetitive than its transaction with ES. On June 21, 2017, after a 10-day trial, the U.S. District Court for the District of Delaware ruled in favor of the Division. ...

In United States v. AMC Entertainment Holdings, Inc. and Carmike Cinemas, Inc., the Division challenged AMC Entertainment Holdings, Inc.’s proposed acquisition of Carmike Cinemas, Inc. AMC and Carmike were the second-largest and fourth-largest movie theatre chains, respectively, in the United States. Additionally, AMC owned significant equity in National CineMedia, LLC (NCM) and Carmike owned significant equity in SV Holdco, LLC, a holding company that owns and operates Screenvision Exhibition, Inc. NCM and Screenvision are the country’s predominant preshow cinema advertising networks, covering over 80 percent of movie theatre screens in the United States. The complaint alleged that the proposed acquisition would have provided AMC with direct control of one of its most significant movie theatre competitors, and in some cases, its only competitor, in 15 local markets in nine states. As a result, moviegoers likely would have experienced higher ticket and concession prices and lower quality services in these local markets. The complaint further alleged that the acquisition would have allowed AMC to hold sizable interests in both NCM and Screenvision post-transaction, resulting in increased prices and reduced services for advertisers and theatre exhibitors seeking preshow services. On December 20, 2016, a proposed final judgment was filed simultaneously with the complaint settling the lawsuit. Under the terms of the decree, AMC agreed to (1) divest theatres in the 15 local markets; (2) reduce its equity stake in NCM to 4.99 percent; (3) relinquish its seats on NCM’s Board of Directors and all of its other governance rights in NCM; (4) transfer 24 theatres with a total of 384 screens to the Screenvision cinema advertising network; and (5) implement and maintain “firewalls” to inhibit the flow of competitively sensitive information between NCM and Screenvision. The court entered the final judgment on March 7, 2017. ...

In DraftKings/FanDuel, the Commission filed an administrative complaint challenging the merger of DraftKings and FanDuel, two providers of paid daily fantasy sports contests. The Commission's complaint alleged that the transaction would be anticompetitive because the merger would have combined the two largest daily fantasy sports websites, which controlled more than 90 percent of the U.S. market for paid daily fantasy sports contests. The Commission alleged that consumers of paid daily fantasy sports were unlikely to view season-long fantasy sports contests as a meaningful substitute for paid daily fantasy sports, due to the length of season-long contests, the limitations on number of entrants, and several other issues. Shortly after the Commission filed its complaint, the parties abandoned the merger on July 13, 2017, and the Commission dismissed its administrative complaint.

Tuesday, April 10, 2018

Should the 5% Convention for Statistical Significance be Dramatically Lower?

For the uninitiated, the idea of "statistical significance" may seem drier than desert sand. But it's how research in the social sciences and medicine decides which findings are worth paying attention to as plausibly true--and which are not. For that reason, it matters quite a bit. Here, I'll sketch a quick overview for beginners of what statistical significance means, and why there is controversy among statisticians and researchers over which research results should be regarded as meaningful.

To gain some intuition, consider an experiment to decide whether a coin is equally balanced, or whether it is weighted toward coming up "heads." You toss the coin once, and it comes up heads. Does this result prove, in a statistical sense, that the coin is unfair? Obviously not. Even a fair coin will come up heads half the time, after all.

You toss the coin again, and it comes up "heads" again. Do two heads in a row prove that the coin is unfair? Not really. After all, if you toss a fair coin twice in a row, there are four possibilities: HH, HT, TH, TT. Thus, two heads will happen one-fourth of the time with a fair coin, just by chance.

What about three heads in a row? Or four or five or six or more? You can never completely rule out the possibility that a string of heads, even a long string of heads, could happen entirely by chance. But as you get more and more heads in a row, a finding that is all heads, or mostly heads, becomes increasingly unlikely. At some point, it becomes very unlikely indeed.  

Thus, a researcher must make a decision. At what point are the results sufficiently unlikely to have happened by chance, so that we can declare that the results are meaningful? The conventional answer is that if the observed result had a 5% probability or less of happening by chance, then it is judged to be "statistically significant." Of course, real-world questions of whether a certain intervention in a school will raise test scores, or whether a certain drug will help treat a medical condition, are a lot more complicated to analyze than coin flips. Thus, practical researchers spend a lot of time trying to figure out whether a given result is "statistically significant" or not.
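The coin-flip logic above can be checked directly: each additional head halves the probability of an all-heads string, so we can ask how long a run it takes to fall under the conventional 5% threshold.

```python
# Probability of n straight heads from a fair coin is (1/2)**n.
# Find the first run length that falls below the conventional 5% line.
n = 1
while 0.5 ** n > 0.05:
    n += 1
print(n, 0.5 ** n)  # 5 0.03125 -- five heads in a row is the first result below 5%
```

So four heads in a row (probability 6.25%) narrowly fails the conventional test, while five in a row (3.125%) passes it.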

Several questions arise here.

1) Why 5%? Why not 10%? Or 1%? The short answer is "tradition." A couple of years ago, the American Statistical Association put together a panel to reconsider the 5% standard.

Ronald L. Wasserstein and Nicole A. Lazar wrote a short article, "The ASA's Statement on p-Values: Context, Process, and Purpose," in The American Statistician (2016, 70:2, pp. 129-132). (A p-value is an algebraic way of referring to the standard for statistical significance.) They started with this anecdote:
"In February 2014, George Cobb, Professor Emeritus of Mathematics and Statistics at Mount Holyoke College, posed these questions to an ASA discussion forum:
Q:Why do so many colleges and grad schools teach p = 0.05?
A: Because that’s still what the scientific community and journal editors use.
Q:Why do so many people still use p = 0.05?
A: Because that’s what they were taught in college or grad school.
Cobb’s concern was a long-worrisome circularity in the sociology of science based on the use of bright lines such as p<0.05: “We teach it because it’s what we do; we do it because it’s what
we teach.”

But that said, there's nothing magic about the 5% threshold. It's fairly common for academic papers to report results that are statistically significant at a threshold of 10%, or of 1%. Confidence in a statistical result isn't a binary, yes-or-no situation, but rather a continuum. 

2) There's a difference between statistical confidence in a result, and the size of the effect in the study.  As a hypothetical example, imagine a study which says that if math teachers used a certain curriculum, learning in math would rise by 40%. However, the study included only 20 students.

In a strict statistical sense, the result may not be statistically significant, in the sense that with a fairly small number of students, and the complexities of looking at other factors that might have affected the results, it could have happened by chance. (This is similar to the problem that if you flip a coin only two or three times, you don't have enough information to state with statistical confidence whether it is a fair coin or not.) But it would seem peculiar to ignore a result that shows a large effect. A more natural response might be to design a bigger study with more students, and see if the large effects hold up and are statistically significant in a bigger study.

Conversely, one can imagine a hypothetical study which uses results from 100,000 students, and finds that if math teachers use a certain curriculum, learning in math would rise by 4%. Let's say that the researcher can show that the effect is statistically significant at the 5% level--that is, there is less than a 5% chance that this rise in math performance happened by chance. It's still true that the rise is fairly small in size. 

In other words, it can sometimes be more encouraging to discover a large result in which you do not have full statistical confidence than to discover a small result in which you do have statistical confidence.
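The interplay between sample size and statistical confidence can be made concrete with the coin-flip analogy. In this sketch, the same observed effect (60% heads rather than the fair coin's 50%) is tested at two sample sizes; the classroom numbers in the text are hypothetical, so the coin stands in for them:

```python
# Same observed effect, two sample sizes: 60% heads tested against a fair coin.
from math import comb

def p_value_at_least(heads, n):
    """One-sided probability of seeing >= heads from n fair-coin flips."""
    return sum(comb(n, k) for k in range(heads, n + 1)) / 2 ** n

small = p_value_at_least(12, 20)      # 60% heads in 20 flips
large = p_value_at_least(600, 1000)   # 60% heads in 1,000 flips
print(round(small, 3))  # 0.252 -- not significant at the 5% level
print(large < 0.05)     # True -- far below the 5% level
```

A 10-percentage-point tilt is invisible in 20 flips but unmistakable in 1,000, which is why a large effect in a small study and a small effect in a large study call for such different responses.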

3) When a researcher knows that 5% is going to be the dividing line between a result being treated as meaningful or not meaningful, it becomes very tempting to fiddle around with the calculations (whether explicitly or implicitly) until you get a result that seems to be statistically significant.

As an example, imagine a study that considers whether early childhood education has positive effects on outcomes later in life. Any researcher doing such a study will be faced with a number of choices. Not all early childhood education programs are the same, so one may want to adjust for factors like the teacher-student ratio, training received by teachers, amount spent per student, whether the program included meals, home visits, and other factors. Not all children are the same, so one may want to look at factors like family structure, health, gender, siblings, neighborhood, and other factors. Not all later life outcomes are the same, so one may want to look at test scores, grades, high school graduation rates, college attendance, criminal behavior, teen pregnancy, and employment and wages later in life.

But a problem arises here. If a researcher hunts through all the possible factors, and all the possible combinations of all the possible factors, there are literally scores or hundreds of possible connections. Just by blind chance, some of these connections will appear to be statistically significant. It's similar to the situation where you do 1,000 repetitions of flipping a coin 10 times. In those 1,000 repetitions, heads is likely to come up 8 or 9 times out of 10 tosses at least a few times. But that doesn't prove the coin is unfair! It just proves you tried over and over until you got a specific result.
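That thought experiment is easy to simulate (the seed is arbitrary, chosen only so the run is reproducible):

```python
import random

random.seed(0)  # arbitrary seed for reproducibility

# 1,000 honest "studies", each flipping a fair coin 10 times.
lopsided = sum(
    1 for _ in range(1000)
    if sum(random.randint(0, 1) for _ in range(10)) >= 8
)
print(lopsided)  # several dozen runs hit 8+ heads purely by chance
```

Since the chance of 8 or more heads in 10 fair flips is about 5.5%, roughly 55 of the 1,000 runs will look "significant" even though every coin is fair.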

Modern researchers are very aware of the danger that when you hunt through lots of possibilities, then just by chance, a random scattering of the results will appear to be statistically significant. Nonetheless, there are some tell-tale signs that this research strategy of hunting for a result that looks statistically meaningful may be all too common. For example, one warning sign is when other researchers try to replicate the result using different data or statistical methods, but fail to do so. If a result appeared statistically significant only by random chance in the first place, it's likely not to appear at all in follow-up research.

Another warning sign is when you look at a bunch of published studies in a certain area (like how to improve test scores, how a minimum wage affects employment, or whether a drug helps with a certain medical condition), and you keep seeing that the finding is statistically significant at almost exactly the 5% level, or just a little less. In a large group of unbiased studies, one would expect to see the statistical significance of the results scattered all over the place: some at 1%, some at 2-3%, 5-6%, 7-8%, and higher levels. When all the published results are bunched right around 5%, it makes one suspicious that the researchers have put their thumbs on the scales in some way to get a result that magically meets the conventional 5% threshold.

The problem that arises is that research results are being reported as meaningful in the sense that they had a 5% or less probability of happening by chance, when in reality, that standard is being evaded by researchers. This problem is severe and common enough that a group of 72 researchers recently wrote: "Redefine statistical significance: We propose to change the default P-value threshold for statistical significance from 0.05 to 0.005 for claims of new discoveries," which appeared in Nature Human Behaviour (Daniel J. Benjamin et al., January 2018, pp. 6-10). One of the signatories, John P.A. Ioannidis, provides a readable overview in "Viewpoint: The Proposal to Lower P Value Thresholds to .005" (Journal of the American Medical Association, March 22, 2018, pp. E1-E2). Ioannidis writes:
"P values and accompanying methods of statistical significance testing are creating challenges in biomedical science and other disciplines. The vast majority (96%) of articles that report P values in the abstract, full text, or both include some values of .05 or less. However, many of the claims that these reports highlight are likely false. Recognizing the major importance of the statistical significance conundrum, the American Statistical Association (ASA) published3 a statement on P values in 2016. The status quo is widely believed to be problematic, but how exactly to fix the problem is far more contentious.  ... Another large coalition of 72 methodologists recently proposed4 a specific, simple move: lowering the routine P value threshold for claiming statistical significance from .05 to .005 for new discoveries. The proposal met with strong endorsement in some circles and concerns in others. P values are misinterpreted, overtrusted, and misused. ... Moving the P value threshold from .05 to .005 will shift about one-third of the statistically significant results of past biomedical literature to the category of just “suggestive.”
This essay is published in a medical journal, and is thus focused on biomedical research. The theme is that a result with 5% significance can be treated as "suggestive," but for a new finding to be accepted, the threshold level of statistical significance should be 0.5%--that is, the probability of the outcome happening by random chance should be 0.5% or less.

The hope of this proposal is that researchers will design their studies more carefully and use larger sample sizes. Ioannidis writes: "Adopting lower P value thresholds may help promote a reformed research agenda with fewer, larger, and more carefully conceived and designed studies with sufficient power to pass these more demanding thresholds." Ioannidis is quick to admit that this proposal is imperfect, but argues that it is practical and straightforward--and better than many of the alternatives.
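How much larger? Under a standard power calculation for a two-sided normal test, holding power at a conventional 80%, the required sample size scales with the square of the sum of the critical values. Tightening the threshold from .05 to .005 then works out to samples roughly 70% larger--the figure cited in the Benjamin et al. proposal. A sketch using only Python's standard library:

```python
from statistics import NormalDist

def sample_size_ratio(alpha_new, alpha_old, power=0.80):
    """Ratio of sample sizes needed to keep the same power when
    tightening a two-sided significance threshold (normal test)."""
    nd = NormalDist()
    z_beta = nd.inv_cdf(power)            # critical value for power
    z_old = nd.inv_cdf(1 - alpha_old / 2) # old significance threshold
    z_new = nd.inv_cdf(1 - alpha_new / 2) # new significance threshold
    return ((z_new + z_beta) / (z_old + z_beta)) ** 2

print(sample_size_ratio(0.005, 0.05))  # ~1.7: about 70% larger samples
```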

The official "ASA Statement on Statistical Significance and P-Values" which appears with the Wasserstein and Lazar article includes a number of principles worth considering. Here are three of them:
Scientific conclusions and business or policy decisions should not be based only on whether a p-value passes a specific threshold. ...
A p-value, or statistical significance, does not measure the size of an effect or the importance of a result. ...
By itself, a p-value does not provide a good measure of evidence regarding a model or hypothesis.
Whether you are doing the statistics yourself, or just a consumer of statistical studies produced by others, it's worth being hyper-aware of what "statistical significance" means, and doesn't mean.

For those who would like to dig a little deeper, some useful starting points might be the six-paper symposium on "Con out of Economics" in the Spring 2010 issue of the Journal of Economic Perspectives, or the six-paper symposium on "Recent Ideas in Econometrics" in the Spring 2017 issue. 

Monday, April 9, 2018

US Lagging in Labor Force Participation

Not all that long ago, in 1990, the share of "prime-age" workers from 25-54 who were participating in the labor force was basically the same in the United States, Germany, Canada, and Japan. But since then, labor force participation in this group has fallen in the United States while rising in the other countries. Mary C. Daly lays out the pattern in "Raising the Speed Limit on Future Growth" (Federal Reserve Bank of San Francisco Economic Letter, April 2, 2018).

Here's a figure showing the evolution of labor force participation in the 25-54 age bracket in these four economies. Economists often like to focus on this group because it avoids differences in rates of college attendance (which can strongly influence labor force participation at younger years) and differences in old-age pension systems and retirement patterns (which can strongly influence labor force participation in older years).

U.S. labor participation diverging from international trends

Daly writes (citations omitted):
"Which raises the question—why aren’t American workers working?
"The answer is not simple, and numerous factors have been offered to explain the decline in labor force participation. Research by a colleague from the San Francisco Fed and others suggests that some of the drop owes to wealthier families choosing to have only one person engaging in the paid labor market ...
"Another factor behind the decline is ongoing job polarization that favors workers at the high and low ends of the skill distribution but not those in the middle. ... Our economy is automating thousands of jobs in the middle-skill range, from call center workers, to paralegals, to grocery checkers.A growing body of research finds that these pressures on middle-skilled jobs leave a big swath of workers on the sidelines, wanting work but not having the skills to keep pace with the ever-changing economy.
"The final and perhaps most critical issue I want to highlight also relates to skills: We’re not adequately preparing a large fraction of our young people for the jobs of the future. Like in most advanced economies, job creation in the United States is being tilted toward jobs that require a college degree . Even if high school-educated workers can find jobs today, their future job security is in jeopardy. Indeed by 2020, for the first time in our history, more jobs will require a bachelor’s degree than a high school diploma.
"These statistics contrast with the trends for college completion. Although the share of young people with four-year college degrees is rising, in 2016 only 37% of 25- to 29-year-olds had a college diploma. This falls short of the progress in many of our international competitors, but also means that many of our young people are underprepared for the jobs in our economy."
On this last point, my own emphasis would differ from Daly's. Yes, steps that aim to increase college attendance over time are often worthwhile. But as she notes, only 37% of American 25-29 year-olds have a college degree. A dramatic rise in this number would take an extraordinary social effort. Among other things, it would require a dramatic expansion in the number of those leaving high school who are willing and ready to benefit from a college degree, together with a vast expansion of enrollments across the higher education sector. Even if the share of college graduates could be increased by one-third or one-half--which would be a very dramatic change--a very large share of the population would still not have a college degree.

It seems to me important to separate the ideas of "college" and "additional job-related training." It's true that the secure and decently-paid jobs of the future will typically require additional training past high school. At least in theory, that training could be provided in many ways: on-the-job training, apprenticeships, short courses focused on certifying competence in a certain area, and so on. For some students, a conventional college degree will be a way to build up these skills. However, there are also a substantial number of students who are unlikely to flourish in a conventional classroom-based college environment, and a substantial number of jobs where a traditional college classroom doesn't offer the right preparation. College isn't the right job prep for everyone. We need to build up other avenues for US workers to acquire the job-related skills they need, too.

Saturday, April 7, 2018

Misconceptions about Milton Friedman's 1968 Presidential Address

For macroeconomists, Milton Friedman's (1968) Presidential Address to the American Economic Association about "The Role of Monetary Policy" marks a central event (American Economic Review, March 1968, pp. 1-17). Friedman argued that monetary policy had limits. Actions by a central bank like the Federal Reserve could have short-run effects on an economy--either for better or for worse. But in the long run, he argued, monetary policy affected only the price level. Variables like unemployment or the real interest rate were determined by market forces, and tended to move toward what Friedman called the "natural rate"--which is a potentially confusing term for saying that they are determined by forces of supply and demand.

Here, I'll give a quick overview of the thrust of Friedman's address, a plug for the recent issue of the Journal of Economic Perspectives, which has a lot more, and point out a useful follow-up article that clears up some misconceptions about Friedman's 1968 speech.

In the Winter 2018 issue of the Journal of Economic Perspectives, where I work as Managing Editor, we published a three-paper symposium on "Friedman's Natural Rate Hypothesis After 50 Years." The papers are:
N. Gregory Mankiw and Ricardo Reis, "Friedman's Presidential Address in the Evolution of Macroeconomic Thought"
Robert E. Hall and Thomas J. Sargent, "Short-Run and Long-Run Effects of Milton Friedman's Presidential Address"
Olivier Blanchard, "Should We Reject the Natural Rate Hypothesis?"

I won't try to summarize the papers here, nor the many themes they offer on how Friedman's speech influenced the macroeconomics that followed or which aspects of Friedman's analysis have held up better than others. But to give a sense of what's at stake, here's an overview of Friedman's themes from the paper by Mankiw and Reis:

"Using these themes of the classical long run and the centrality of expectations, Friedman takes on policy questions with a simple bifurcation: what monetary policy cannot do and what monetary policy can do. It is a division that remains useful today (even though, as we discuss later, modern macroeconomists might include different items on each list). 
"Friedman begins with what monetary policy cannot do. He emphasizes that, except in the short run, the central bank cannot peg either interest rates or the unemployment rate. The argument regarding the unemployment rate is that the trade-off described by the Phillips curve is transitory and unemployment must eventually return to its natural rate, and so any attempt by the central bank to achieve otherwise will put inflation into an unstable spiral. The argument regarding interest rates is similar: because we can never know with much precision what the natural rate of interest is, any attempt to peg interest rates will also likely lead to inflation getting out of control. From a modern perspective, it is noteworthy that Friedman does not consider the possibility of feedback rules from unemployment and inflation as ways of setting interest rate policy, which today we call “Taylor rules” (Taylor 1993).

"When Friedman turns to what monetary policy can do, he says that the “first and most important lesson” is that “monetary policy can prevent money itself from being a major source of economic disturbance” (p. 12). Here we see the profound influence of his work with Anna Schwartz, especially their Monetary History of the United States. From their perspective, history is replete with examples of erroneous central bank actions and their consequences. The severity of the Great Depression is a case in point.

"It is significant that, while Friedman is often portrayed as an advocate for passive monetary policy, he is not dogmatic on this point. He notes that “monetary policy can contribute to offsetting major disturbances in the economic system arising from other sources” (p. 14). Fiscal policy, in particular, is mentioned as one of these other disturbances. Yet he cautions that this activist role should not be taken too far, in light of our limited ability to recognize shocks and gauge their magnitude in a timely fashion. The final section of Friedman’s presidential address concerns the conduct of monetary policy. He argues that the primary focus should be on something the central bank can control in the long run—that is, a nominal variable ... "

Edward Nelson offers a useful follow-up to these JEP papers in “Seven Fallacies Concerning Milton Friedman’s `The Role of Monetary Policy,'" Finance and Economics Discussion Series 2018-013, Board of Governors of the Federal Reserve System. Nelson summarizes at the start:
"[T]here has been widespread and lasting acceptance of the paper’s position that monetary policy can achieve a long-run target for inflation but not a target for the level of output (or for other real variables). For example, in the United States, the Federal Open Market Committee’s (2017) “Statement on Longer-Run Goals and Policy Strategy” included the observations that the “inflation rate over the longer run is primarily determined by monetary policy, and hence the Committee has the ability to specify a longer-run goal for inflation,” and that, in contrast, the “maximum level of employment is largely determined by nonmonetary factors,” so “it would not be appropriate to specify a fixed goal for employment.”
Nelson then lays out seven fallacies. The details are in his paper: here, I just list the fallacies with a few words of his explanations.
Fallacy 1: “The Role of Monetary Policy” was Friedman’s first public statement of the natural rate hypothesis
"Certainly, Friedman (1968) was his most extended articulation of the ideas (i) that an expansionary monetary policy that tended to raise the inflation rate would not permanently lower the unemployment rate, and (ii) that full employment and price stability were compatible objectives over long periods. But Friedman had outlined the same ideas in his writings and in other public outlets on several earlier occasions in the 1950s and 1960s."
Fallacy 2: The Friedman-Phelps Phillips curve was already presented in Samuelson and Solow’s (1960) analysis
"A key article on the Phillips curve that is often juxtaposed with Friedman (1968) is Samuelson  and Solow (1960). This paper is often (and correctly, in the present author’s view) characterized as advocating the position that there is a permanent tradeoff between the unemployment rate and  inflation in the United States."
Fallacy 3: Friedman’s specification of the Phillips curve was based on perfect competition and no nominal rigidities
"Modigliani (1977, p. 4) said of Friedman (1968) that “[i]ts basic message was that, despite appearances, wages were in reality perfectly flexible.” However, Friedman (1977, p. 13) took exception to this interpretation of his 1968 paper. Friedman pointed out that the definition of the natural rate of unemployment that he gave in 1968 had recognized the existence of imperfectly competitive elements in the setting of wages, including those arising from regulation of labor markets. Further support for Friedman’s contention that he had not assumed a perfectly competitive labor market is given by the material in his 1968 paper that noted the slow adjustment of nominal wages to demand and supply pressures. ... Consequently, that (1968 Friedman] framework is
consistent with prices being endogenous—both responding to, and serving as an impetus for, output movements—and the overall price level not being fully flexible in the short run."
Fallacy 4: Friedman’s (1968) account of monetary policy in the Great Depression contradicted the Monetary History’s version
"But the fact of a sharp decline in the monetary base during the prelude to, and early stages of, the 1929-1933 Great Contraction is not in dispute, and it is this decline to which Friedman (1968) was presumably referring." 
Fallacy 5: Friedman (1968) stated that a monetary expansion will keep the unemployment rate and the real interest rate below their natural rates for two decades
"[T]these statements are inferences from the following passage in Friedman (1968, p. 11): “But how long, you will say, is ‘temporary’? … I can at most venture a personal judgment, based on some examination of the historical evidence, that the initial effects of a higher and unanticipated rate of inflation last for something like two to five years; that this initial effect then begins to be reversed; and that a full adjustment to the new rate of inflation takes about as long for employment as for interest rates, say, a couple of decades.” The passage of Friedman (1968) just quoted does not, in fact, imply that a policy involving a shift to a new inflation rate involves twenty years of one-sided unemployment and real-interestrate gaps. Such prolonged gaps instead fall under the heading of Friedman’s “initial effects” of  the monetary policy change—effects that he explicitly associated with a two-to-five-year period, with the gaps receding beyond this period. Friedman described “full adjustment” as comprising decades, but such complete adjustment includes the lingering dynamics beyond the main  dynamics associated with the initial two-to-five year period. It is the two-to-five year period that would be associated with the bulk of the nonneutrality of the monetary policy change."
Fallacy 6: The zero lower bound on nominal interest rates invalidates the natural rate hypothesis
"A zero-bound situation undoubtedly makes the analysis of monetary policy more difficult. In addition, the central bank in a zero-bound situation has fewer tools that it can deploy to stimulate aggregate demand than it has in other circumstances. But, important as these complications are, neither of them implies that the long-run Phillips curve is not vertical."
Fallacy 7: Friedman’s (1968) treatment of an interest-rate peg was refuted by the rational expectations revolution
"The propositions that the liquidity effect fades over time and that real interest rates cannot be targeted in the long run by the central bank remain widely accepted today. These valid propositions underpinned Friedman’s critique of pegging of nominal interest rates."
Since the JEP published this symposium, I've run into some younger economists who have never read Friedman's talk and lack even a general familiarity with his argument. For academic economists of whatever vintage, it's an easily readable speech worth becoming acquainted with--or revisiting.