National Tax Journal, September 2010, 63 (3), 397–418

THE DISTRIBUTION OF INCOME TAX NONCOMPLIANCE
Andrew Johns and Joel Slemrod

This paper uses newly available data from the IRS to assess the distributional
consequences of U.S. federal income tax noncompliance for the tax year 2001. We
find that, when taxpayers are arrayed by their estimated “true” income, defined as
reported income adjusted for underreporting, the ratio of aggregate misreported
income to true income generally increases with income, although it peaks among
taxpayers with adjusted gross income in the 99.0 to 99.5 percentile. In sharp contrast,
the ratio of underreported tax to true tax is highest for lower-income taxpayers.
Keywords: tax evasion, tax gap, income distribution
JEL Codes: D31, D63, H26

I. MOTIVATION AND INTRODUCTION
In this paper we use the newly available data from the IRS’s most recent comprehensive study of individual income tax noncompliance, the National Research Program, to
assess the distributional consequences of income tax noncompliance in the U.S. federal
income tax for the tax year 2001. We find that, when taxpayers are arrayed by their
estimated “true” income, defined as reported income adjusted for the underreporting
estimated by the IRS tax gap methodology, the ratio of aggregate misreported income
to true income generally increases with income, although it peaks among taxpayers
with adjusted gross income in the 99.0 to 99.5 percentile. In sharp contrast, the ratio of
underreported tax to true tax is highest for the lowest-income taxpayers. This contrast
in results reflects the fact that a given percentage reduction in taxable income corresponds to a particularly high percentage reduction in tax liability for taxpayers with
taxable income just above the taxpaying threshold. Much of the distributional pattern
of noncompliance is associated with the fact that on average high-income taxpayers
receive their income in forms that have higher noncompliance rates. But this is not the
whole story because similar, although not identical, patterns apply to misreporting percentages of given income sources. The inequality of true adjusted gross income (AGI),
as measured by the Gini coefficient, is slightly below that of reported AGI, while the
inequality of true AGI minus reported income tax is slightly higher than that of reported
AGI minus reported income tax.

Andrew Johns: Office of Research, Analysis, and Statistics, Internal Revenue Service, Washington, DC,
USA (anjohns@vt.edu)
Joel Slemrod: University of Michigan, Ann Arbor, MI, USA (jslemrod@umich.edu)

II. DATA SOURCE AND METHODOLOGY
The estimates in this paper are based on data from the National Research Program
(NRP) Individual Income Tax Reporting Compliance Study for the 2001 tax year, supplemented with IRS-calculated estimates of unreported income that examiners were unable
to detect.1 The methodology for measuring the individual income tax underreporting gap
has three components: (1) errors detected by examiners during random audits, including over-reporting of deductions, offsets, and credits, (2) adjustments for unreported
income that the examiners were unable to detect during those audits, and (3) average
marginal tax rates applied to the total estimated underreporting of each type of income
and to the over-reporting of offsets to income. Adjustments for undetected income are
based on an econometric technique called “detection controlled estimation” (DCE).2
For tax year 2001, the NRP selected a stratified random sample of approximately
45,000 returns. Data exclusions, primarily due to data anomalies, resulted in a subset of
36,699 returns that was used for the tax gap analysis.3 Sample details are shown in Table
A1. Each case in the original sample was given a base weight equal to the inverse of the
probability of selection. These weights were then adjusted to account for the excluded
cases, so that estimates could be projected to the overall population.
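
The weighting step can be illustrated with a minimal Python sketch. It assumes, since the text does not spell out the adjustment, that retained returns are scaled up within each sampling stratum so that the stratum's total base weight is preserved. The field names (`id`, `stratum`, `p_select`, `excluded`) and the sample values are hypothetical.

```python
from collections import defaultdict


def adjusted_weights(sample):
    """Return {return_id: weight} for the retained returns.

    `sample` is a list of dicts with hypothetical keys:
    'id', 'stratum', 'p_select' (selection probability), 'excluded' (bool).
    """
    base_total = defaultdict(float)  # base weight of all sampled returns, per stratum
    kept_total = defaultdict(float)  # base weight of retained returns, per stratum
    for r in sample:
        w = 1.0 / r["p_select"]      # base weight: inverse of selection probability
        base_total[r["stratum"]] += w
        if not r["excluded"]:
            kept_total[r["stratum"]] += w

    weights = {}
    for r in sample:
        if r["excluded"]:
            continue
        s = r["stratum"]
        # Assumed adjustment: rescale so each stratum's weighted total is unchanged.
        weights[r["id"]] = (1.0 / r["p_select"]) * base_total[s] / kept_total[s]
    return weights


sample = [
    {"id": 1, "stratum": "A", "p_select": 0.01, "excluded": False},
    {"id": 2, "stratum": "A", "p_select": 0.01, "excluded": True},
    {"id": 3, "stratum": "B", "p_select": 0.10, "excluded": False},
]
weights = adjusted_weights(sample)
```

In this toy sample the retained return in stratum A absorbs the weight of the excluded one, so the weighted population total is the same before and after the exclusion.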
During an initial classification stage, case-building materials such as third-party
information returns, prior-year returns, and dependent information were collected by
NRP and then reviewed by experienced examiners referred to as classifiers. Based on
the results of these reviews, some returns were accepted as filed (i.e., were reasonably
believed to have no under-reporting) without any examination, while others were
assigned to either correspondence or face-to-face audits.4

1 For details, see U.S. Department of the Treasury (2005a, 2005b, 2005c) and Plumley (2005).
2 Also included is an estimate of unreported tip income based on typical industry tipping rates, which was
allocated proportionally to the amount of tip income actually reported.
3 An example would be if a taxpayer reported $20,000 of what should be Schedule C income as wage income.
Because the type of income may have employment tax consequences, the examiner may increase Schedule
C income by $20,000 and decrease wages by $20,000. Line-item compliance estimates generally exclude
cases like this example in which the taxpayer enters the income on the wrong line or schedule. Although
procedures had been put in place to identify these misclassification errors, initial results showed inconsistencies in how they were handled, and for this reason some returns were excluded from the analysis.
4 Correspondence audits were limited to returns with at most three compliance issues that could be addressed through documentation requests sent to the taxpayer. Of the 36,699 returns used for this analysis,
84 percent were subject to face-to-face audits, 9 percent were accepted as filed, and 6 percent were subject
to correspondence audits. In the remaining (less than 1 percent of) returns, the taxpayer did not respond
to the notice, did not show up for the examination, or mail addressed to the taxpayer was returned as
undeliverable.

If a return was assigned to be audited, then the classifier identified which issues, or
lines on the returns, were mandatory for the examiner to audit. It was at the examiner’s
discretion whether to extend the examination beyond those classified lines. It was also
at the discretion of the examiner to extend the examination to flow-through entities of
which the taxpayer was a partner or shareholder. If the examiner did audit the flow-through entity, e.g., a partnership or S corporation, those results are reflected in the tax
gap estimates. Although the detection-controlled estimation methodology, discussed
below, likely accounts for some portion of flow-through income that was not detected
during the examination, it is not known whether it accounts for the majority of underreported flow-through income.5
The IRS then applied DCE to those returns subject to audit, in order to adjust for
unreported income that examiners were unable to detect.6 The DCE methodology,
developed in Feinstein (1990, 1991, 2004), is based on a joint maximum likelihood
estimation of two equations: (1) a noncompliance equation that models the total
amount of underreported income, and (2) a detection equation that models the fraction
of noncompliance detected by the IRS examiner. The noncompliance equation models
underreported income using a censored regression model and assumes a displaced lognormal distribution. The log of the unobserved magnitude of noncompliance, with a
displacement parameter, is modeled as a tobit function of a set of return characteristics
as well as dummy variables for various ranges of positive income.
The detection equation allows for the possibility that the ability of IRS examiners to
detect noncompliance varies systematically across examiners and classifiers. The model
estimates the fraction of detected unreported income modeled as a linear combination of
a vector of return characteristics that proxy for the complexity of the return (the number
of issues examined and the type of audit) as well as characteristics of the examiner such
as the examiner’s pay scale grade and, for those examiners who perform a sufficient
number of audits in the sample, a fixed individual effect.

5 The IRS has recently completed an NRP study of S corporations that filed returns for tax years 2003 and
2004. The results from that study may be used to supplement future individual income tax underreporting
gap estimates.
6 In IRS tax gap studies prior to the tax year 2001, estimates of the amount of income not detected during
the random audits consisted of multipliers based on a comparison with tax year 1976 audit results from
the Taxpayer Compliance Measurement Program (TCMP), a precursor of the NRP, where examiners did
not have use of information reporting (IRP) documents with the income reported on those documents. The
results of the comparison showed that, for every $1 detected without the use of IRP documents, another
$2.28 went undetected. This resulted in the use of a 3.28 multiplier for prior tax gap estimates, with some
variations depending on type of income. Feinstein (1991) reports that aggregate tax gap estimates for tax
years 1982 and 1987 based on the DCE methodology are remarkably similar to those based on the previous
IRS methodology. For background on detection controlled estimation models, see Feinstein (1990, 1991,
2004) and U.S. Department of the Treasury (1996). The 2001 DCE methodology was developed by Brian
Erard and Jonathan Feinstein under contract with the IRS.

As Feinstein (1991) acknowledges, estimating the examiner detection rate is fraught
with identification problems, as that rate is never actually observed; what is observed
is the product of the true noncompliance rate and the detection rate. As Feinstein
(1991, p. 33) puts it, “… a given level of average detected violation may be due to a
high frequency of evasion and a low frequency of detection … or to the opposite.” An
intuition for how the DCE procedure resolves this fundamental identification problem
is provided in Feinstein (1991, p. 33) who notes, “… the DCE estimates may be seen as
tying down absolute detection rates by finding a set of ‘best’ examiners in the data and
assigning them the highest detection rates; all other examiner rates are then determined
by comparing their performance to these top examiners.”
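
The intuition in the quotation can be illustrated with a deliberately simplified Python sketch. It is not the DCE estimator, which is a joint maximum likelihood model; it only shows the normalization idea. The sketch assumes that examiners face comparable returns, that the best examiner is assigned a detection rate of 1.0, and that the examiner names and dollar amounts are hypothetical.

```python
def relative_detection_rates(avg_detected_by_examiner, top_rate=1.0):
    """Scale each examiner's average detected amount against the best examiner."""
    best = max(avg_detected_by_examiner.values())
    return {e: top_rate * d / best for e, d in avg_detected_by_examiner.items()}


# Hypothetical average detected underreporting per return, by examiner.
avg_detected = {"examiner_a": 3000.0, "examiner_b": 1500.0, "examiner_c": 750.0}

rates = relative_detection_rates(avg_detected)

# Detected = true noncompliance x detection rate, so the implied true amount
# is the detected amount divided by the examiner's assumed detection rate.
implied_true = {e: avg_detected[e] / rates[e] for e in avg_detected}

# The identification problem: the same detected amount is consistent with
# different (true amount, detection rate) pairs.
same_detected = (3000.0 * 0.5 == 1500.0 * 1.0)
```

Once the top examiner's rate is pinned down, every other examiner's rate follows from the comparison, and all three examiners imply the same underlying noncompliance in this toy example.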
The DCE analysis was done separately for two groups of returns. A return was
allocated to one of the following groups: (1) Returns without reported Schedule C
or Schedule F profit or loss, and with reported total positive income (TPI)7 less than
$100,000, or (2) Returns with reported Schedule C or Schedule F profit or loss, or with
reported total positive income greater than or equal to $100,000. Within each of these
two tax return groups, noncompliance equations were then estimated separately for
total income and for “low-visibility” income subject to little or no information reporting, which included farm or nonfarm proprietor income, income from a partnership or
S corporation, rental or royalty income, gains or losses reported on Form 4797, and
income reported on the Form 1040 “other income” line. “High-visibility” income had
at least some systematic information reporting and included wages and tips, interest and
dividends, state and local tax refunds, alimony, capital gains, pensions, unemployment
compensation, and Social Security income.
The noncompliance equations that resulted from the DCE analysis were used to
estimate the amount of total income underreporting (i.e., detected plus undetected) and
the amount of low-visibility income underreporting. Unreported high-visibility income
was then set to the difference between these two DCE estimates. Each DCE estimate
for total underreported income was divided by the amount of underreporting actually
detected. This procedure generates four separate “multipliers,” one for each type of
return and income-visibility category:
Non-business returns with reported TPI < $100,000
    Low-visibility income: 4.158
    High-visibility income: 2.009
Business returns or returns with reported TPI ≥ $100,000
    Low-visibility income: 3.358
    High-visibility income: 2.340.

7 Total positive income (TPI) is generally the sum of all positive income amounts reported on individual
income tax returns, and therefore excludes negative net income amounts.

The DCE multipliers were then used to calculate, on a return-by-return basis, line-item
net misreported amounts (NMAs) by multiplying the amount of underreported income
detected during the NRP audit by the appropriate one of the four DCE multipliers. The
multiplier was applied to the detected underreporting of a line item only if the sample
return was selected for face-to-face audit and the examiner detected some underreported
income. Note that this technique assumes that detection rates are similar across line items
within each type of return and income-visibility category. The use of the DCE multipliers will understate estimates of undetected income for some taxpayers, and almost
certainly will do so for the class of returns subject to correspondence audits and those
audited returns where no income underreporting was detected, because no adjustment
is made in these cases. Conversely, it may overstate estimates of undetected income
for other taxpayers. Note specifically that the use of the multipliers implicitly allocates
undetected income in proportion to the amount of income that was detected, within a
given income visibility category. To the extent that certain types of low-visibility income
are harder to detect than others, the use of the DCE multipliers may also overstate or
understate the amount of noncompliance for some income sources.8
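
As a rough illustration of how the four multipliers enter the return-level calculation, the following Python sketch applies them to a single line item. The multiplier values and the grouping rules come from the text; the function names, argument names, and category labels are hypothetical, and treating the "some underreporting detected" condition at the line-item level is an assumption.

```python
# Multipliers as reported in the text, keyed by (return group, income visibility).
DCE_MULTIPLIERS = {
    ("nonbusiness_under_100k", "low"): 4.158,
    ("nonbusiness_under_100k", "high"): 2.009,
    ("business_or_100k_plus", "low"): 3.358,
    ("business_or_100k_plus", "high"): 2.340,
}


def return_group(has_sched_c_or_f, reported_tpi):
    """Assign a return to one of the two DCE groups described in the text."""
    if has_sched_c_or_f or reported_tpi >= 100_000:
        return "business_or_100k_plus"
    return "nonbusiness_under_100k"


def net_misreported_amount(detected, visibility, has_sched_c_or_f,
                           reported_tpi, audit_type):
    """DCE-adjusted net misreported amount for one line item (a sketch)."""
    # The multiplier applies only to face-to-face audits with detected underreporting.
    if audit_type == "face_to_face" and detected > 0:
        group = return_group(has_sched_c_or_f, reported_tpi)
        return detected * DCE_MULTIPLIERS[(group, visibility)]
    # Otherwise the detected amount is left unadjusted.
    return detected


nma_low = net_misreported_amount(1000.0, "low", False, 50_000, "face_to_face")
nma_corr = net_misreported_amount(1000.0, "low", False, 50_000, "correspondence")
nma_bus = net_misreported_amount(1000.0, "high", True, 30_000, "face_to_face")
```

The second call shows why the procedure understates undetected income for correspondence audits: the detected amount passes through without any adjustment.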
Note finally that the individual underreporting gap estimates reported here focus only
on misreporting on returns filed on a timely basis, and therefore do not take into account
all noncompliance by individual taxpayers; the IRS estimates a separate tax gap for
individual nonfilers, which includes late-filed returns. Nor do the estimates explicitly
account for income derived from illegal activities. If the NRP examiner found income
from illegal activities during the audit, that income is included but, as this would have
been detected incidentally, it likely represents a very small portion of the whole.
III. NET MISREPORTING
A. Net Misreporting by Income Source
Table 1 presents the aggregate tax gap figures for 2001, by income source, based on
the NRP study (U.S. Department of the Treasury, 2006) for the individual income tax
and estimates extrapolated from earlier studies for other taxes.9 The overall gross tax
gap estimate is $345 billion, which amounts to 16.3 percent of estimated actual (paid
plus unpaid) tax liability.10 Of the $345 billion estimate, the IRS expects to recover $55
billion through late payments and enforcement actions, resulting in a “net tax gap” (that
is, the tax not collected) for tax year 2001 of $290 billion, which is 13.7 percent
of the tax that should have been paid.
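
The relationship between the gross and net figures is simple arithmetic, sketched here in Python using only the numbers quoted in the text. The implied total true liability is backed out from the rounded 16.3 percent rate, so it is approximate.

```python
gross_gap = 345.0    # $ billion, gross tax gap
gross_rate = 0.163   # gross gap as a share of true (paid plus unpaid) liability
recovered = 55.0     # $ billion, late payments and enforcement

# Implied true tax liability, backed out from the gross gap and its rate.
true_liability = gross_gap / gross_rate

# Net tax gap: what remains uncollected after late and enforced payments.
net_gap = gross_gap - recovered
net_rate = net_gap / true_liability

print(f"implied true liability: ${true_liability:,.0f} billion")
print(f"net tax gap: ${net_gap:.0f} billion ({100 * net_rate:.1f} percent)")
```

The implied liability is roughly $2.1 trillion, and the net rate rounds to the 13.7 percent reported in the text.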

8 The estimates based on the DCE-adjusted NRP subset do not come with standard errors, but we can infer
something about the confidence surrounding estimates by looking at Table A1, which shows the number
of tax returns, by income class, that comprise the sample.
9 The second column of Table 1 may refer to the percentage of the corresponding true amount of income,
offsets to income, credits, or tax depending on the row of the table.
10 This percentage is not much different than earlier estimates based on extrapolations from the tax gap studies
based on 1988 TCMP data (for example, U.S. Department of the Treasury, 1996). However, taking into
account changes in methodology and the uncertainty of the estimating procedures, one cannot conclude
that the noncompliance rate has remained steady, as opposed to trending up or down.

As discussed in Slemrod (2007), about two-thirds of all underreporting of income
happens on the individual income tax. For the individual income tax, understated
income, as opposed to overstating of exemptions, deductions, adjustments, and
credits, accounts for over 80 percent of individual underreporting of tax. Business
income, as opposed to wages or investment income, accounts for about two-thirds of
the understated individual income. Taxpayers who were required to file an individual
tax return, but did not, accounted for slightly less than 10 percent of the gap. While the
individual income tax comprises about two-thirds of the estimated underreporting, the
corporation income tax makes up slightly more than 10 percent and the employment
tax gap makes up about one-fifth of total underreporting.

Table 1
Components of the 2001 Individual Income Tax Underreporting Gap

                                                             Tax Gap   Percentage of the Corresponding
                                                          ($billion)   True Amount
Gross Tax Gap                                                    345        16.3
  Underreporting                                                 285
    Individual Income Tax                                        197          18
      Underreported Nonbusiness Income                            56           4
        Wages and salaries                                        10           1
        Net capital gains                                         11          12
        Taxable pension annuities, IRA distributions               4           4
        Taxable interest and dividends                             3           4
        Other                                                     28          38
      Underreported Business Income                              109          43
        Nonfarm proprietor income                                 68          57
        Partnership, S corporation, estate and net trust income   22          18
        Rent and royalty net income                               13          51
        Farm net income                                            6          72
      Overreported Offsets to Income                              15           4
        Deductions                                                14           5
        Exemptions                                                 4           5
        Statutory adjustments to income                           –3         –21
      Overreported Credits                                        17          26
    Employment Tax                                                54           7
      Self-employment tax                                         39         52*
      FICA and unemployment taxes                                 15          2*
    Corporation Income Tax                                        30         17*
      Large (>$10 million assets) corporations                    25         14*
      Small (<$10 million assets) corporations                     5         29*
    Estate and Excise Taxes                                        4          4*
  Nonfiling                                                       27
    Individual Income Tax                                         25
    Other                                                          2
  Underpayment                                                    34
    Individual Income Tax                                         23
    Corporation Income Tax                                         2
    Other                                                          9
Enforced and Other Late Payments                                  55          3*
Net Tax Gap (tax not collected)                                  290       13.7*

Source: Slemrod (2007), calculated from U.S. Department of the Treasury (2006).
Note: Only the figures for the individual income tax and the self-employment tax are based on IRS
National Research Program results; the rest are IRS extrapolations from earlier tax gap estimates.
The percentage column also contains six entries for the seven Nonfiling and Underpayment rows
(1*, 2*, 2*, 2*, 1*, 1*, in order); their row alignment is not recoverable here.
* Calculated by the authors.

Perhaps the most striking aspect of the aggregate tax gap estimates is the huge variation
in the rate of misreporting as a percentage of true income by type of income (or offset).
Only 1 percent of wages and salaries and 4 percent of taxable interest and dividends
are misreported, all of which must be reported to the IRS by those who pay them; in
addition, wages and salaries are subject to employer withholding. In sharp contrast,
self-employment business income, which is not subject to information reports, has a
sharply higher estimated net misreporting percentage (NMP): an estimated 57 percent
of nonfarm proprietor income is not reported, a total of $68 billion, which by itself
accounts for more than a third of the total estimated underreporting for the individual
income tax.11 Over half is attributable to the underreporting of business income, of
which nonfarm proprietor income is the largest component.
B. Net Misreporting Percentages by True Income Group
The published information about the 2001 tax gap study shown in Table 1 provides
no information about the distribution of income tax noncompliance across income
groups.12 To investigate this topic, we analyzed the micro data from the NRP along
with the DCE-based multipliers.13

11 The numerator of the net misreporting percentage is the sum of all misreporting and includes any over-reporting
of income. In order to account for sources of income that can take negative values, the denominator of the
net misreporting percentage is defined as the sum of the absolute values of the estimated true amounts.
12 The one published table that we know of that attempts something similar to our Table 2, in Christian
(1994), is based on the results of the Taxpayer Compliance Measurement Program (TCMP), the forerunner of the NRP, for tax year 1988; it is shown in Table A2. Note that Table A2 presents measures of
the voluntary compliance level, defined as reported tax liability divided by corrected tax liability, so it is
similar to, although the obverse of, what is reported here in column 2 of Table 2. However, the methodology was significantly different from the one used to create Table 2 and therefore the two tables are not
readily comparable. First, the Voluntary Compliance Levels (VCLs) reported in Table A2 are based on the
raw TCMP results (i.e., the results were not adjusted for undetected underreported income). Second, and
more important, the taxpayers are grouped by reported AGI rather than estimated true AGI. Nonetheless,
even with these caveats in mind, the results in Table A2 are somewhat similar to those in column 2 of
Table 2. Both tables indicate that the rate of misreported tax declines with income, but the effect is more
pronounced in Table A2 because it is arrayed by reported income. This amplifies the effect because, other
things equal, those who claim to have low income are on average more noncompliant than those who
report that they have high income.
13 Erard and Ho (2003) analyze the distribution of noncompliance by occupation, based on the tax year 1988
TCMP data.

The basic results are shown in Table 2, where taxpayers are grouped according to
what we call “true income,” that is, by percentiles of the adjusted gross income (AGI)
that, according to the tax gap methodology, they should have reported. In other words,
to calculate true AGI the estimated amount of DCE-adjusted noncompliance due to
unreported income was added back to the reported AGI. Grouping taxpayers by reported
AGI, rather than true AGI, would paint a misleading picture of the relationship between
noncompliance and the true income level as, other things equal, noncompliant taxpayers
would appear to have lower income than they really have. It is important to note that
Table 2 reports net misreporting percentages by true AGI group, where net misreporting
percentages are defined as the sum of estimated misreporting divided by the sum of the
absolute values of the corresponding true values, be it AGI in the first column or tax
after refundable credits in the second column.14

Table 2
Net Misreporting Percentages by True AGI, Tax Year 2001

True AGI         NMP for AGI    NMP for Tax after Refundable Credits
Bottom 10%            –1                        71
10%–20%                4                        56
20%–30%                5                        38
30%–40%                5                        27
40%–50%                6                        21
50%–60%                7                        20
60%–70%                7                        16
70%–80%                8                        16
80%–90%                8                        14
90%–95%               11                        17
95%–99%               18                        21
99.0%–99.5%           19                        20
Top 0.5%              15                        15
Total                 11                        18

Source: National Research Program data.

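
The net misreporting percentage as defined in the text can be written as a short Python sketch. The record layout (weight, reported amount, misreported amount) and the example values are hypothetical; the use of sample weights is an assumption consistent with the weighted NRP sample, and the true amount is taken to be the reported amount plus the misreported amount.

```python
def net_misreporting_percentage(records):
    """NMP = 100 * sum(misreported) / sum(|true|), with sample weights.

    `records` is a list of (weight, reported, misreported) tuples, where
    true amount = reported + misreported. Misreported amounts may be
    negative (over-reporting), and true amounts may be negative (losses).
    """
    numerator = sum(w * m for w, _, m in records)
    denominator = sum(w * abs(r + m) for w, r, m in records)
    return 100.0 * numerator / denominator


# Hypothetical returns: an underreporter, a fully compliant return with a
# loss, and an over-reporter carrying twice the sample weight.
records = [
    (1.0, 40_000.0, 10_000.0),
    (1.0, -5_000.0, 0.0),
    (2.0, 20_000.0, -2_500.0),
]
nmp = net_misreporting_percentage(records)
```

Using absolute values in the denominator keeps the loss return from shrinking the base, which is the reason the text gives for that definition.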

14 Tax after refundable credits as defined in this paper does not include self-employment tax.

The first column of Table 2 shows that the net misreporting percentage rises continually with true income, until it peaks at 19 percent for the group between the 99.0 and
99.5 percentiles of estimated true AGI, whereupon it declines in the highest percentile
group. However, the misreporting percentage for the highest true income class, with
true income above $2 million, is still above the NMP for any true income group below
the 95th percentile. Splitting taxpayers into two groups, above and below $100,000,
clearly reveals that the net misreporting percentage of income is much higher for the
higher-income taxpayers: 15.2 percent for those with true income above $100,000, and
7.0 percent for those with true income below $100,000.
Column 2 of Table 2 shows that there is a very different pattern for the net misreporting percentage of tax after refundable credits. It is highest for the low-income groups,
and lowest for the highest-income group. The pattern is not monotonic with income.
The net misreporting percentage for tax after refundable credits declines with true
income from the low-income groups until the 80th to 90th percentile group, then increases until the
95–99 percent group, after which it declines again until the highest-income group. The
stark difference between column 1 and column 2 of Table 2 in part reflects the graduated, step-function nature of the U.S. income tax rate schedule. To see the implications
of the graduated rate structure, consider individuals at different points of the income
distribution. For very high-income people, whose income far exceeds the top bracket
cutoff, marginal tax rates are only slightly higher than average tax rates, because the
benefit of the lower rates, exemptions, etc., becomes vanishingly small. Thus, for a
multimillionaire, understating income by 11 percent understates tax liability by about 11
percent.15 In contrast, consider a married couple filing jointly using the standard deduction with two dependents with $50,000 of AGI. Based on the 2007 tax rate schedule,
their tax liability if reporting accurately is $2,922 (implying an average tax rate of 5.84
percent). If, though, they understate their AGI by 10 percent, so that their reported AGI
is $45,000, their tax liability is $2,172, reflecting a drop of $750 in tax liability ($5,000
times the marginal tax rate of 15 percent). Thus, an income misreporting percentage of
10 percent corresponds to a tax misreporting percentage of 25.7 percent ($750 divided
by $2,922). In the extreme, a taxpayer whose income is just over the taxable income
threshold for having positive tax liability can, by understating their income by a small
percentage, completely wipe out their tax liability.16
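
The arithmetic of the married-couple example can be checked with a few lines of Python. All dollar figures and the 15 percent marginal rate are taken from the text rather than recomputed from a tax schedule. The variable names are illustrative, and the average rate here is measured relative to AGI, as in the text's 5.84 percent.

```python
true_agi = 50_000.0
true_tax = 2_922.0          # tax liability when reporting accurately (from the text)
marginal_rate = 0.15        # marginal tax rate (from the text)
income_understatement_pct = 10.0

# Dollar understatement of income and the resulting drop in tax.
understated_income = true_agi * income_understatement_pct / 100.0
tax_drop = marginal_rate * understated_income

# Tax misreporting percentage: share of true tax that goes unreported.
tax_misreporting_pct = 100.0 * tax_drop / true_tax

# Equivalent view: the income misreporting percentage is scaled up by the
# ratio of the marginal rate to the average rate.
average_rate = true_tax / true_agi
ratio_m_over_a = marginal_rate / average_rate

print(f"tax drop: ${tax_drop:,.0f}")
print(f"tax misreporting: {tax_misreporting_pct:.1f} percent")
print(f"marginal/average rate ratio: {ratio_m_over_a:.2f}")
```

Because the marginal rate is about 2.6 times the average rate for this couple, a 10 percent understatement of income removes roughly a quarter of their tax; for a taxpayer whose marginal and average rates nearly coincide, the two percentages would be about equal.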

15 If the understated income is disproportionately in the form of preferentially taxed capital gains, then it
could be that understating income by, say, 11 percent, reduces overall tax liability by less than 11 percent.
16 For a marginal change in taxable income, the ratio of the percentage change in tax liability with respect
to a percentage change in taxable income is equal to m/a, where m is the marginal tax rate and a is the
average tax rate. With a smooth tax function, m/a is decreasing in taxable income as long as ma′ > m′a,
where a prime denotes a derivative; this need not be true throughout the income distribution even under
a generally progressive tax system. The marginal tax rate does not, though, change smoothly in the U.S.
tax schedule, but rather jumps discretely across brackets. This results in an infinitely high value of m/a
just over the threshold for taxability followed by a gradual decline, and a discrete jump up at the taxable
income that corresponds to the next higher marginal tax rate; once within the top bracket, the value of m/a
declines asymptotically to one. This pattern can explain why in Table 2 the values in column 2 relative to
column 1 are the highest for the lower-income groups and are about equal for the highest-income groups.

C. Aggregate Underreporting by AGI Group
Table 3 shows the fraction of aggregate underreporting of AGI and of tax after
refundable credits, by true AGI and reported AGI group. Columns 1 and 3 of the table
reveal that, when taxpayers are arrayed by true AGI, the majority of underreporting is
associated with taxpayers in the top decile of true AGI: 63 percent when measured in
terms of AGI, and 61 percent in terms of tax.

Table 3
Fraction of Aggregate AGI Underreporting and Underreporting of Estimated Tax
after Refundable Credits, by Estimated True and Reported AGI, Tax Year 2001

Column (1): Underreporting of AGI, by Estimated True AGI
Column (2): Underreporting of AGI, by Reported AGI
Column (3): Underreporting of Tax after Refundable Credits, by Estimated True AGI
Column (4): Underreporting of Tax after Refundable Credits, by Reported AGI

AGI                (1)     (2)     (3)     (4)
Bottom 10%           #      13       1       8
10%–20%              1       8       2       6
20%–30%              1       8       3       8
30%–40%              2      10       3      10
40%–50%              3       9       3       9
50%–60%              5       7       4       7
60%–70%              6       8       5       8
70%–80%              9       8       7       9
80%–90%             12       8      11       9
90%–95%             12       5      10       7
95%–99%             24      10      23      13
99.0%–99.5%          7       2       7       2
Top 0.5%            20       3      21       4
Total              100     100     100     100

# Less than 0.5%.

Table 3 also shows how misleading it can be to draw conclusions about the distribution of tax noncompliance based on reported AGI. Comparing column 2 to column 1
or comparing column 4 to column 3 shows that using reported income as the grouping
concept misleadingly suggests that noncompliance is overwhelmingly a phenomenon
of the low and middle-income classes. According to column 2, 63 percent of underreporting is associated with tax returns in the bottom seven deciles. Column 1 reports
that the more appropriate percentage is 18. For tax after refundable credits, column 4
misleadingly suggests that 56 percent of underreporting is done by those in the bottom
seven deciles, while column 3 reports that a more accurate figure is 21 percent.
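
The shares quoted in this subsection can be reproduced directly from the columns of Table 3, as in the Python sketch below. The column values are transcribed from the table; the entry shown there as less than 0.5 percent is treated as zero, which is an assumption, and the variable names are illustrative.

```python
# Rows: bottom 10% through 80-90% (nine deciles), then 90-95%, 95-99%,
# 99.0-99.5%, and top 0.5%. Values are percentages of aggregate underreporting.
table3 = {
    "agi_by_true":     [0, 1, 1, 2, 3, 5, 6, 9, 12, 12, 24, 7, 20],
    "agi_by_reported": [13, 8, 8, 10, 9, 7, 8, 8, 8, 5, 10, 2, 3],
    "tax_by_true":     [1, 2, 3, 3, 3, 4, 5, 7, 11, 10, 23, 7, 21],
    "tax_by_reported": [8, 6, 8, 10, 9, 7, 8, 9, 9, 7, 13, 2, 4],
}

# Share of underreporting in the bottom seven deciles (first seven rows).
bottom_seven = {name: sum(col[:7]) for name, col in table3.items()}

# Share in the top decile (last four rows: 90-95%, 95-99%, 99.0-99.5%, top 0.5%).
top_decile = {name: sum(col[9:]) for name, col in table3.items()}

for name in table3:
    print(f"{name}: bottom seven deciles {bottom_seven[name]}%, "
          f"top decile {top_decile[name]}%")
```

Grouping by reported rather than true AGI moves the bottom-seven-decile share from 18 to 63 percent for AGI and from 21 to 56 percent for tax, which is the contrast the text emphasizes.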
D. Net Misreporting by Line Item
The pattern of noncompliance by true income group raises the question of whether
high-income taxpayers have generally higher income misreporting percentages because
they receive the types of income generally misreported, as Bloomquist (2003) suggests,
or whether certain types of income have higher misreporting percentages because they
are received more by high-income people. The analysis of this section suggests that
both factors are at play, but that the former predominates.
We first note that high-income taxpayers are much more likely to receive their income
in forms that, for reasons to be discussed later, have relatively high average misreporting percentages. We know from IRS Statistics of Income data on reported income
that wages and salaries, which are subject to very low misreporting rates, comprise a
much higher percentage of AGI for lower-income groups.17 The mirror image of this is
that the high-income groups receive a higher percentage of their income in the form of
partnership and Subchapter S business income and, especially, long-term capital gains
that have higher overall misreporting rates.18
To pursue this issue, we first present in Table 4 misreporting percentages by estimated
true AGI group for each of several income sources. Table 4 shows clearly that, within
categories of income that are generally subject to relatively high misreporting percentages (the last three columns), the misreporting percentage is higher for the high-income
groups. Note, though, that as with the overall misreporting percentage by estimated
true income group shown in Table 2, this percentage peaks in a high, but not the highest, income group. This phenomenon is most striking for capital gains, where the net
misreporting percentage for the highest income group is just 6 percent.
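
One way to see the composition channel in isolation is to hold each source's misreporting percentage fixed at its all-taxpayer value and vary only the income mix, as in the Python sketch below. The source-level percentages are the Total row of Table 4; the two income mixes are hypothetical and chosen only for illustration, and the function name is not from the paper.

```python
# Total-row net misreporting percentages from Table 4.
NMP_BY_SOURCE = {
    "wages": 1, "interest": 4, "dividends": 4,
    "sch_c": 57, "passthrough": 18, "capital_gains": 12,
}


def composition_only_nmp(shares, nmp_by_source=NMP_BY_SOURCE):
    """Overall NMP implied by an income mix when every source is
    misreported at its all-taxpayer rate (share-weighted average)."""
    total = sum(shares.values())
    return sum(shares[s] / total * nmp_by_source[s] for s in shares)


# Hypothetical shares of true income by source for two stylized groups.
wage_heavy = {"wages": 0.90, "interest": 0.03, "dividends": 0.02,
              "sch_c": 0.03, "passthrough": 0.01, "capital_gains": 0.01}
business_heavy = {"wages": 0.40, "interest": 0.05, "dividends": 0.05,
                  "sch_c": 0.10, "passthrough": 0.20, "capital_gains": 0.20}

nmp_wage_heavy = composition_only_nmp(wage_heavy)
nmp_business_heavy = composition_only_nmp(business_heavy)
```

Under these illustrative mixes the implied overall percentage is about 3 for the wage-heavy group and 12.5 for the business-heavy group, even though no source is misreported at a different rate across the two groups.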
IV. IMPLICATIONS FOR ESTIMATES OF INCOME DISTRIBUTION
AND TAX PROGRESSIVITY
Recognizing the distributional pattern of income tax noncompliance has implications
for our understanding of income inequality and the effective progressivity of the income
tax system. There are two distinct issues here. First, if estimates of income inequality
are based on incomes reported for tax purposes, then misreported taxable incomes will
cause errors in the measurement of income inequality and the relationship of income
to tax liability — i.e., tax progressivity. Second, to the extent that tax noncompliance
affects remitted tax liabilities, it affects the actual distribution of after-tax income and
tax liability, and the actual progressivity of the income tax system.
In this section we see to what extent estimates of each are affected by the DCEcorrected estimates of income tax noncompliance.
A. True versus Apparent Distribution of Adjusted Gross Income
We begin by addressing the effect of noncompliance on the measured distribution of
pre-tax income. Table 5 shows the distribution of AGI, as reported and as adjusted for

17 Source: U.S. Department of Treasury, Internal Revenue Service, Individual Complete Report (Publication
1304), Table 1.4, “All Returns: Sources of Income, Adjustments, and Tax Items, by Size of Adjusted Gross
Income, Tax Year 2007,” http://www.irs.gov/pub/irs-soi/07in14ar.xls.
18 See Table 1 of Campbell and Parisi (2003). Table A3 recalculates the shares of estimated true income based
on the NRP estimates of estimated true income, and Table A4 presents the shares of reported income based
on the NRP estimates of reported income.


Table 4
Net Misreporting Percentages of Selected Income Sources,
by Estimated True AGI, Tax Year 2001

Estimated True    Salaries                       Business  Part., S Corp,  Capital
AGI              and Wages  Interest  Dividends   (Sch C)  Estate & Trust    Gains
Bottom 10%               #         1          1       –12               2      –13
10%–20%                  4         3          4        15              *1      –14
20%–30%                  2         1          1        38              *3        7
30%–40%                  2         3          5        43               8       19
40%–50%                  2         2          2        47               6        2
50%–60%                  2         3          5        58              20       22
60%–70%                  1         2          3        58               7       16
70%–80%                  1         3          4        63              11       24
80%–90%                  1         7          2        61               8       17
90%–95%                  1         2          5        65              19       14
95%–99%                  1         3          5        59              22       24
99.0%–99.5%              1        15          5        50              19       20
Top 0.5%                 #         2          3        55              19        6
Total                    1         4          4        57              18       12

* Estimate based on fewer than 10 observations.
** Less than 0.5 percent.

estimated noncompliance. In each case the income groups are defined according to the
concept being measured; for example, true AGI percentages are calculated over all tax
returns in the appropriate group, and true AGI percentages are arrayed by estimated true
AGI groups. The second column, which displays reported AGI arrayed by reported AGI
groups, corresponds to what we would find in the aggregate statistics routinely published
by the Statistics of Income Division of the IRS. The first column shows the distribution
of estimated true AGI, that is, reported AGI adjusted by the estimated misreporting.
The two columns of Table 5 are not substantially different. To a fairly small degree,
the distribution of estimated true AGI is more concentrated among the top five percentiles than is reported AGI (32.7 percent compared to 32.2 percent). However, the two
Lorenz curves intersect, so that one cannot say unambiguously that the distribution of
estimated true income is more unequal than that of reported income.
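The top-five-percent shares and the crossing of the Lorenz curves can be checked from the grouped shares in Table 5. The following is a minimal Python sketch, assuming that the published, rounded shares are an adequate stand-in for the underlying microdata; the function and variable names are illustrative and do not come from the NRP files.

```python
from itertools import accumulate

# Percentage shares of AGI by group, transcribed from Table 5
# (Bottom 10% through Top 0.5%); values are rounded as published.
TRUE_AGI = [0.3, 1.6, 2.7, 3.9, 5.2, 6.7, 8.8, 11.5, 15.6, 10.9, 14.9, 3.8, 14.0]
REPORTED_AGI = [0.1, 1.6, 2.7, 3.9, 5.2, 6.8, 8.9, 11.7, 16.0, 11.0, 14.4, 3.7, 14.1]


def top_share(shares, n_groups):
    """Sum of the shares held by the top n_groups groups, to one decimal."""
    return round(sum(shares[-n_groups:]), 1)


def lorenz_ordinates(shares):
    """Cumulative shares at each group boundary, normalized to end at 1."""
    total = sum(shares)
    return [c / total for c in accumulate(shares)]


def curves_cross(shares_a, shares_b, tol=1e-9):
    """True if neither Lorenz curve lies weakly above the other everywhere."""
    pairs = zip(lorenz_ordinates(shares_a), lorenz_ordinates(shares_b))
    diffs = [a - b for a, b in pairs]
    return any(d > tol for d in diffs) and any(d < -tol for d in diffs)


# The top five percentiles are the last three groups:
# 95%-99%, 99.0%-99.5%, and Top 0.5%.
print(top_share(TRUE_AGI, 3), top_share(REPORTED_AGI, 3))
print(curves_cross(TRUE_AGI, REPORTED_AGI))
```

With the Table 5 values, the cumulative share of estimated true AGI lies above that of reported AGI at the bottom of the distribution and below it from the eighth decile upward, which is the crossing referred to in the text.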
B. True versus Apparent Distribution of Tax Liabilities
Table 6 shows how the distribution of individual income tax liability changes when
the reported figures are adjusted to reflect estimated noncompliance. As in Table 5,


Table 5
Distribution of Estimated True AGI and Reported AGI,
Tax Year 2001

AGI               Estimated True AGI  Reported AGI
Bottom 10%                       0.3           0.1
10%–20%                          1.6           1.6
20%–30%                          2.7           2.7
30%–40%                          3.9           3.9
40%–50%                          5.2           5.2
50%–60%                          6.7           6.8
60%–70%                          8.8           8.9
70%–80%                         11.5          11.7
80%–90%                         15.6          16.0
90%–95%                         10.9          11.0
95%–99%                         14.9          14.4
99.0%–99.5%                      3.8           3.7
Top 0.5%                        14.0          14.1
Total                          100.0         100.0

the second column shows the distribution of reported tax liability when taxpayers are
grouped by their reported AGI; this is similar to what could be learned from the published statistics based on tax returns as filed. In this case the distribution of reported
tax liability is unambiguously more unequal than the distribution of estimated true tax
liability, as the Lorenz curve of the former is always below that of the latter. This is
broadly consistent with the results shown in Table 2.
C. Changes in Inequality as Measured by Gini Coefficients
One way to summarize the implications of income tax noncompliance for both
measured and actual inequality is by computing Gini coefficients. The Gini coefficient
is based on the Lorenz curve, which plots on the y-axis the proportion of the total of some variable,
often income, that is cumulatively earned by the bottom x percent
of the population; it is computed as the ratio of the area that lies between the line of
equality (at 45 degrees) and the Lorenz curve, to the total area under the line of equality.
We report some relevant calculations in Table 7.
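To make the definition concrete, the sketch below computes a Gini coefficient from grouped data by approximating the area under the Lorenz curve with trapezoids. It is an illustration only, not the calculation behind Table 7: it uses the rounded group shares from Table 5 rather than the microdata, and because it treats income as equally distributed within each group it understates the coefficient. The function name and the group population weights (nine deciles, then 5, 4, 0.5, and 0.5 percent) are assumptions made for the example.

```python
# Population shares of the 13 income groups used in Table 5.
POP_SHARES = [0.10] * 9 + [0.05, 0.04, 0.005, 0.005]

# Percentage shares of AGI by group, transcribed from Table 5.
TRUE_AGI = [0.3, 1.6, 2.7, 3.9, 5.2, 6.7, 8.8, 11.5, 15.6, 10.9, 14.9, 3.8, 14.0]
REPORTED_AGI = [0.1, 1.6, 2.7, 3.9, 5.2, 6.8, 8.9, 11.7, 16.0, 11.0, 14.4, 3.7, 14.1]


def grouped_gini(pop_shares, value_shares):
    """Gini coefficient from grouped data.

    The area under the piecewise-linear Lorenz curve is summed with the
    trapezoid rule; the Gini is the area between the 45-degree line and the
    curve divided by the area under the 45-degree line (one half), which
    simplifies to 1 - 2 * area.
    """
    total_pop = sum(pop_shares)
    total_val = sum(value_shares)
    area = 0.0
    cum_val = 0.0
    for pop, val in zip(pop_shares, value_shares):
        next_val = cum_val + val / total_val
        area += (pop / total_pop) * (cum_val + next_val) / 2.0
        cum_val = next_val
    return 1.0 - 2.0 * area


gini_true = grouped_gini(POP_SHARES, TRUE_AGI)
gini_reported = grouped_gini(POP_SHARES, REPORTED_AGI)
print(f"estimated true AGI: {gini_true:.4f}")
print(f"reported AGI:       {gini_reported:.4f}")
```

The grouped approximation yields values somewhat below the microdata-based coefficients, as expected, while preserving the ordering in which reported AGI is slightly more unequal than estimated true AGI.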
The first column of Table 7 summarizes the impact of income tax misreporting on the
Gini coefficient of various concepts of pre-tax and after-tax income in tax year 2001.
The first two rows show that inequality of estimated true (pre-tax) AGI, as measured
by the Gini coefficient, is actually slightly lower than the inequality of reported AGI:
0.5697 versus 0.5727. The very small change is consistent with the small difference in


Table 6
Distribution of Estimated True Tax Liability and Reported Tax Liability,
Tax Year 2001

AGI               Estimated True Tax Liability      Reported Tax Liability
                    (After Refundable Credits)  (After Refundable Credits)
Bottom 10%                                  **                        –0.2
10%–20%                                   –0.3                        –0.8
20%–30%                                     **                        –1.0
30%–40%                                    1.0                         0.0
40%–50%                                    2.4                         1.8
50%–60%                                    3.9                         3.7
60%–70%                                    6.0                         5.8
70%–80%                                    8.6                         8.7
80%–90%                                   13.8                        14.1
90%–95%                                   11.5                        11.8
95%–99%                                   19.9                        19.9
99.0%–99.5%                                6.5                         6.7
Top 0.5%                                  26.9                        29.6
Total                                    100.0                       100.0

** Less than 0.5 percent.

Table 7
Gini Coefficients for Various Income Measures, Tax Years 2001 and 1988

Row #  Income Measure                                     2001 NRP  1988 TCMP¹
1.     Reported AGI                                         0.5727      0.5276
2.     Estimated True AGI                                   0.5697      0.5252
3.     Reported AGI - Reported Tax Liability                0.5322      0.5024
4.     Estimated True AGI - Reported Tax Liability          0.5372
5.     Estimated True AGI - Estimated True Tax Liability    0.5322      0.4999

¹ Source: Bishop, Formby, and Lambert (2000, Table 1, Row 13).

the distributions by percentile shown in Table 5. Recall, though, that the two Lorenz
curves intersect, so that the Gini coefficient is not an unambiguous measure of
inequality differences.
The remaining rows of Table 7 correspond to various measures of after-tax income.
The third row shows the Gini coefficient of reported income minus reported tax liability.


The reduction in the Gini coefficient between the first and third rows (=0.0405) is the
change due to income taxation one would measure based on data that are unadjusted for
noncompliance. The fourth row shows the Gini coefficient of estimated true income
minus reported tax; this is the appropriate concept of after-tax income assuming that
none of the misreported income is detected and none of the associated tax is ever paid. Not surprisingly, this concept has a higher Gini coefficient than either the third or the fifth row, because it adds
back in unreported income without any accompanying, and inequality-reducing, tax
liability.
The difference between the second and fourth rows (=0.0325) repeats that calculation
using estimated true AGI rather than reported AGI, and shows that the change in the
Gini coefficient is actually somewhat less than that obtained using unadjusted data.
Comparing the fourth and fifth rows provides information about the distributional
consequences of income tax noncompliance, as summarized by Gini coefficients. It
indicates that, if all noncompliance were to vanish so that everyone was subject to their
estimated true tax liability, then the Gini coefficient would decline by 0.0051. Comparing
the fifth row with the third row shows that full reporting (i.e., no noncompliance) would
make the Gini coefficient of after-tax income about the same as one would calculate if
using unadjusted data for true income and actual tax liability.
The second column of Table 7 shows the tax year 1988 results from Bishop, Formby,
and Lambert (2000), who analyze the micro data from the 1979, 1982, 1985, and 1988
TCMP studies to assess the effects of noncompliance and tax evasion on the vertical
(and horizontal) distribution of after-tax income and tax burden. They find, as we do
for tax year 2001, that including unreported income as measured by the TCMP studies19
has only a very small (negative) impact on pre-tax income inequality as measured either
by the standard Gini coefficient or the extended Gini coefficient developed by Yitzhaki
(1983) that can place more or less weight on the lower part of the income distribution.
Including both unreported income and additional taxes owed also has a small impact
on the Gini coefficient.
A comparison across columns for 2001 and 1988 suggests that income inequality rose
significantly over this period; this has been noted in scores of other studies. In addition,
if the effect of the tax system on inequality can be measured by the difference between
the Gini coefficient for reported income and the Gini coefficient for reported income
minus reported (actual) tax, the decline was larger in 2001 (0.0405) than it was in 1988
(0.0252). This suggests that the tax system in 2001 was more successful at reducing
what otherwise would be a higher level of pre-tax inequality. Note, though, that a better way to measure the change in the redistributional effect of the income tax system
would be to compare the change in the difference between the Gini coefficient of true
income and the Gini coefficient of true income minus reported tax, as in the fourth row
of Table 7, but Bishop, Formby, and Lambert (2000) do not report the latter statistic.
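The Gini differences quoted in this section follow directly from Table 7. As a small illustrative sketch (the dictionary keys and the function name are labels chosen for this example, not terminology from the sources), the arithmetic can be reproduced as follows; the 1988 entry for estimated true AGI minus reported tax is left as None because that statistic is not reported.

```python
# Gini coefficients transcribed from Table 7; None marks a statistic that
# Bishop, Formby, and Lambert (2000) do not report.
GINI = {
    2001: {
        "reported_agi": 0.5727,
        "true_agi": 0.5697,
        "reported_agi_less_reported_tax": 0.5322,
        "true_agi_less_reported_tax": 0.5372,
    },
    1988: {
        "reported_agi": 0.5276,
        "true_agi": 0.5252,
        "reported_agi_less_reported_tax": 0.5024,
        "true_agi_less_reported_tax": None,
    },
}


def gini_reduction(pre_tax, after_tax):
    """Pre-tax Gini minus after-tax Gini, or None if either input is missing."""
    if pre_tax is None or after_tax is None:
        return None
    return round(pre_tax - after_tax, 4)


for year, g in GINI.items():
    # Effect measured with unadjusted data (rows 1 and 3 of Table 7).
    measured = gini_reduction(g["reported_agi"], g["reported_agi_less_reported_tax"])
    # Effect measured with estimated true income (rows 2 and 4 of Table 7).
    adjusted = gini_reduction(g["true_agi"], g["true_agi_less_reported_tax"])
    print(year, measured, adjusted)
```

For 2001 this gives 0.0405 with unadjusted data and 0.0325 with estimated true income; for 1988 only the unadjusted figure, 0.0252, can be computed.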
19 Bishop, Formby, and Lambert (2000) appear to consider income taxes but not self-employment taxes,
the same procedure we employ here. There is no explicit statement about whether they make use of the
multiplier that adjusts for undetected income, although their results suggest that they do.


V. CONCLUSIONS
One of the key findings of this paper is that, when taxpayers are arrayed by their
estimated “true” income, the ratio of aggregate misreported income to true income
generally increases with income. What might explain this pattern of results? Part of the
story is that for non-tax reasons higher-income people are more likely to receive income
from sources that are more difficult for the tax authority to monitor. A model of rational
tax noncompliance, as first outlined by Allingham and Sandmo (1972), suggests that,
depending on the relationship of penalties to the amount and nature of noncompliance,
more noncompliance would be associated with lower risk aversion,20 higher marginal
tax rates,21 a lower perceived probability of detection, a lower perceived effect of the
level of noncompliance on the perceived probability of detection, and a lower penalty
for detected evasion. On average, of course, higher-income taxpayers do face higher
marginal tax rates. They also, though, face higher average audit rates.22 Note also, as
stressed in Yitzhaki (1987), that a higher marginal tax rate implies that less income need
be understated to achieve a given size gamble in after-tax income.
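For reference, the expected-utility problem behind these comparative statics can be written out. The notation is illustrative and is not taken from the NRP methodology: $W$ is true income, $X$ is reported income, $E = W - X$ is the understatement, $\theta$ is a constant tax rate, $p$ is the probability of detection, $\pi$ is a penalty rate applied to undeclared income, and $F > 1$ is a penalty expressed as a multiple of the evaded tax. This is a textbook sketch of the two penalty structures, not a model estimated in this paper.

```latex
% Allingham and Sandmo (1972): penalty proportional to undeclared income
\max_{X}\; (1-p)\,U\bigl(W-\theta X\bigr)
         + p\,U\bigl(W-\theta X-\pi\,(W-X)\bigr)

% Yitzhaki (1974): penalty proportional to evaded tax, with E = W - X
\max_{E}\; (1-p)\,U\bigl((1-\theta)\,W+\theta E\bigr)
         + p\,U\bigl((1-\theta)\,W-(F-1)\,\theta E\bigr)
```

In the second formulation the tax rate and the understatement enter the gamble only through the product $\theta E$: each dollar of evaded tax gains one dollar if undetected and loses $F-1$ dollars if detected, whatever the value of $\theta$. A higher tax rate therefore leaves the terms of the gamble unchanged, lowers the sure component $(1-\theta)W$, and means that a smaller $E$ delivers a gamble of a given size.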
Microeconometric analysis of the NRP data, along the lines of Clotfelter’s (1983)
analysis of the 1969 TCMP data, might be insightful, but this kind of exercise is hampered by the lack of extensive demographic information on tax returns, the limited
variability of marginal tax rates conditional on income, and extremely limited information on variations in perceived probability of detection (indeed limited to average audit
rates across broad classes of income, and the presence of business income). Controlled
experiments, for example as reported in Slemrod, Blumenthal, and Christian (2001),
have the promise of more compelling identification of the possible determinants of
noncompliance, but are rare.23
A few caveats must accompany the presentation of our results. The first, and most
obvious, is that the NRP estimates of noncompliance are just that — estimates. To
the extent that there is systematic error related to true income, the results we present
here misrepresent the reality of how noncompliance varies by income group. This is a
cause for substantial concern, given the plausible possibility of systematic differences
20 More noncompliance relative to income for higher-income returns would be consistent with declining
relative risk aversion.
21 It is, however, important to note the point made by Yitzhaki (1974) that, when the penalty for a given
amount of evasion is a fraction of the detected tax evasion, a higher tax rate automatically increases the
penalty for a given amount of taxable income understatement. In this case an increase in the tax rate does
not change the terms of a tax evasion gamble, and has only an income effect; under usual assumptions
about risk aversion, this implies that a tax rate increase would reduce, rather than increase, evasion.
22 The IRS reports that the audit coverage rate in fiscal year 2008 for returns with adjusted gross income less
than $200,000 was less than one percent but rose continuously for higher income groups, reaching 9.77
percent for returns with AGI exceeding $10,000,000 (U.S. Department of the Treasury, Internal Revenue
Service, 2009, Table 9b).
23 See Andreoni, Erard, and Feinstein (1998) or Slemrod and Yitzhaki (2002) for surveys of the empirical
literature on tax noncompliance.


in the ability of auditors to detect misreporting by type of income, the plausible possibility that the misreporting of upper-income taxpayers is more sophisticated and thus
harder to detect, and the inability of the Detection Controlled Estimation procedure to
completely correct for both of these factors. In addition, non-systematic errors would
cause an overestimate of the extent to which noncompliance is a phenomenon of truly
high-income taxpayers; this is true because an overestimate of noncompliance also
overstates true income, while an underestimate does the reverse.
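The mechanism behind the second point can be illustrated with a small simulation. The sketch below is purely hypothetical and is not calibrated to the NRP data: every simulated taxpayer understates the same share of income (true_rate), the estimate of that understatement carries mean-zero noise (noise_sd), and taxpayers are then arrayed by estimated true income. The lognormal income parameters, the noise level, and the function name are all assumptions made for the example.

```python
import random


def decile_misreporting_ratios(n=50_000, true_rate=0.10, noise_sd=0.20, seed=0):
    """Return the aggregate estimated misreporting ratio for the bottom and
    top deciles of ESTIMATED true income when estimation error is mean-zero."""
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        true_income = rng.lognormvariate(10.5, 1.0)      # hypothetical income distribution
        true_gap = true_rate * true_income                # same misreporting share for everyone
        reported = true_income - true_gap
        error = rng.gauss(0.0, noise_sd) * true_income    # non-systematic estimation error
        est_gap = true_gap + error                        # estimated misreporting
        est_true_income = reported + est_gap              # estimated "true" income
        records.append((est_true_income, est_gap))

    # Array taxpayers by estimated (not actual) true income.
    records.sort(key=lambda rec: rec[0])
    k = n // 10

    def ratio(rows):
        return sum(gap for _, gap in rows) / sum(income for income, _ in rows)

    return ratio(records[:k]), ratio(records[-k:])


bottom, top = decile_misreporting_ratios()
print(f"bottom decile: {bottom:.3f}  top decile: {top:.3f}  true rate: 0.100")
```

Although the true misreporting share is identical across taxpayers in this setup, the measured ratio is below that share in the bottom decile and above it in the top decile, because returns with overestimated noncompliance are sorted into higher estimated-income groups.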
Second, noncompliance has attendant costs that are not measured here.24 There is the
risk involved due to the uncertainty of ultimate remittance and penalty. There are often
real costs incurred to identify and implement certain noncompliance strategies, and to
camouflage them. Indeed, a model of rational tax noncompliance suggests that, at the
margin, the expected utility of tax savings will be exactly offset by the expected utility
of costs. Of course, this marginal condition does not imply that there is no private gain
from engaging in noncompliance. With assumptions about the nature of these offsetting
costs, one can quantify the adjustments needed to calculate the net-of-cost gain. For
example, if the marginal cost was linearly increasing in the amount of noncompliance
and was equal to zero at zero noncompliance, then the net-of-cost gain would be exactly
half of the gross-of-cost gain that we calculate in this paper. If the marginal cost were
instead convex in the amount of noncompliance (that is, increasing at an increasing rate), then the net-of-cost gain would exceed half
of the gross-of-cost gain. Rather than presenting net-of-cost figures based on arbitrary
assumptions about the cost-of-misreporting function, we present unadjusted figures
accompanied by this caveat.
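The one-half benchmark can be verified numerically. The sketch below assumes, for illustration only, a constant marginal gain per dollar of noncompliance and a marginal cost of the form marginal_gain * (x / x_star) ** k, which is zero at zero noncompliance and equals the marginal gain at the chosen level x_star; k = 1 is the linear case and k > 1 a convex case. The functional form, parameter values, and function name are hypothetical.

```python
def net_to_gross_ratio(k, marginal_gain=0.25, x_star=1000.0, steps=100_000):
    """Net-of-cost gain divided by gross-of-cost gain at the optimum x_star,
    when marginal cost is marginal_gain * (x / x_star) ** k."""
    dx = x_star / steps
    total_cost = 0.0
    for i in range(steps):
        x_mid = (i + 0.5) * dx                       # midpoint rule for the cost integral
        total_cost += marginal_gain * (x_mid / x_star) ** k * dx
    gross_gain = marginal_gain * x_star              # constant marginal gain times amount
    return (gross_gain - total_cost) / gross_gain


print(round(net_to_gross_ratio(1), 4))   # linear marginal cost
print(round(net_to_gross_ratio(2), 4))   # convex marginal cost
```

Under these assumptions the ratio equals k / (k + 1): one half in the linear case and two thirds when k = 2.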
Subject to these caveats and the others mentioned throughout the paper, we tentatively
conclude that, when taxpayers are arrayed by their “true” income, the ratio of aggregate
misreported income to true income generally increases with income, although it peaks
among taxpayers with adjusted gross income between $500,000 and $1,000,000, and
is lower than the peak ratio for individuals with income above $1,000,000. In sharp
contrast, the ratio of underreported tax to true tax is higher for lower-income taxpayers,
reflecting the fact that a given percentage reduction in taxable income corresponds to a
particularly high percentage reduction in tax liability for taxpayers with taxable income
just above the taxpaying threshold.
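A stylized calculation shows why the two ratios can move in opposite directions. The sketch below assumes a hypothetical schedule with a flat rate applied to income above a tax-free threshold, and a uniform 10 percent understatement of income; the threshold, rate, income levels, and function names are invented for the example and do not correspond to the 2001 tax schedule.

```python
def tax_liability(income, threshold=20_000.0, rate=0.15):
    """Stylized tax: a flat rate on income above a tax-free threshold."""
    return rate * max(0.0, income - threshold)


def pct_tax_reduction(income, understatement=0.10):
    """Percent of true tax that goes unreported when income is understated
    by the given share. Assumes income is above the threshold, so that
    true tax is positive."""
    true_tax = tax_liability(income)
    reported_tax = tax_liability(income * (1.0 - understatement))
    return 100.0 * (true_tax - reported_tax) / true_tax


for income in (22_000.0, 50_000.0, 200_000.0):
    print(f"income {income:>9,.0f}: tax understated by {pct_tax_reduction(income):.1f} percent")
```

In this example the same 10 percent understatement of income eliminates all of the tax owed by the taxpayer just above the threshold, but only about 11 percent of the tax owed by the taxpayer with income of 200,000.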
ACKNOWLEDGMENTS
The content of this article is the opinion of the authors and does not necessarily represent the position of the Internal Revenue Service. We are grateful to Ed Emblom, Mark
Mazur, and Alan Plumley for allowing access to the National Research Program data,
and to Alan for many helpful discussions about its methodology. We also would like to
acknowledge Kim Bloomquist for developing the Individual Underreporting Tax Gap
24 Note also that some of the noncompliance would have been detected in the ordinary course of enforcement,
upheld upon appeal and ultimately remitted, perhaps with attendant penalties added.


Model, and Brian Erard and Jonathan Feinstein for developing and implementing the
Detection Controlled Estimation methodology. We also thank Alan, Brian, Ed, Jonathan,
Kim, and Mark as well as two exceptionally incisive referees for helpful comments on
an earlier draft, but take sole responsibility for the content.
REFERENCES
Allingham, Michael G., and Agnar Sandmo, 1972. “Income Tax Evasion: A Theoretical Analysis.”
Journal of Public Economics 1 (3–4), 323–338.
Andreoni, James, Brian Erard, and Jonathan Feinstein, 1998. “Tax Compliance.” Journal of
Economic Literature 36 (2), 818–860.
Bishop, John A., John P. Formby, and Peter Lambert, 2000. “Redistribution through the Income
Tax: The Vertical and Horizontal Effects of Noncompliance and Tax Evasion.” Public Finance
Review 28 (4), 335–350.
Bloomquist, Kim M., 2003. “Trends as Change in Variance: The Case of Tax Noncompliance.”
Paper presented at the IRS Research Conference, June 2, Washington, DC. http://www.irs.gov/
pub/irs-soi/bloomquist.pdf.
Campbell, David, and Michael Parisi, 2003. “Individual Income Tax Returns, 2001.” Statistics
of Income Bulletin (Fall), 8–45.
Christian, Charles W., 1994. “Voluntary Compliance with the Individual Income Tax: Results
from the 1988 TCMP Study.” IRS Research Bulletin (1993/1994), 35–42.
Clotfelter, Charles, 1983. “Tax Evasion and Tax Rates: An Analysis of Individual Returns.”
Review of Economics and Statistics 65 (3), 363–373.
Erard, Brian, and Chih-Chin Ho, 2003. “Explaining the U.S. Income Tax Continuum.” eJournal
of Tax Research 1 (2), 93–109.
Feinstein, Jonathan S., 1990. “Detection Controlled Estimation.” Journal of Law and Economics
33 (1), 233–276.
Feinstein, Jonathan S., 1991. “An Econometric Analysis of Income Tax Evasion and Its Detection.” Rand Journal of Economics 22 (1), 14–35.
Feinstein, Jonathan S., 2004. “Statistical Analysis of Compliance Using the NRP Data: Detection
Controlled Models: Slides.” Paper presented at the IRS Research Conference, June 2, Washington,
DC. http://www.irs.gov/pub/irs-soi/1-2feinst.pdf
Plumley, Alan, 2005. “Preliminary Update of the Tax Year 2001 Individual Income Tax Underreporting Gap Estimates.” Paper presented at the IRS Research Conference, Washington, DC.
http://www.irs.gov/pub/irs-soi/05plumley.pdf
Slemrod, Joel, 2007. “Cheating Ourselves: The Economics of Tax Evasion.” Journal of Economic
Perspectives 21 (1), 25–48.


Slemrod, Joel, Marsha Blumenthal, and Charles Christian, 2001. “Taxpayer Response to an
Increased Probability of Audit: Evidence from a Controlled Experiment in Minnesota.” Journal
of Public Economics 79 (3), 455–483.
Slemrod, Joel, and Shlomo Yitzhaki, 2002. “Tax Avoidance, Evasion, and Administration.”
In Auerbach, Alan J., and Martin Feldstein (eds.), Handbook of Public Economics, Volume 3,
1423–1470. North-Holland, Amsterdam, The Netherlands.
U.S. Department of the Treasury, 1996. Federal Tax Compliance Research: Individual Income
Tax Gap Estimates for 1985, 1988, and 1992. Publication 1415. Internal Revenue Service,
Washington, DC.
U.S. Department of the Treasury, 2005a. New IRS Study Provides Preliminary Tax Gap Estimate.
IRS-2005-38. Internal Revenue Service, Washington, DC.
U.S. Department of the Treasury, 2005b. Tax Gap Facts and Figures. Internal Revenue Service,
Washington, DC.
U.S. Department of the Treasury, 2005c. Understanding the Tax Gap. FS-2005-14. Internal
Revenue Service, Washington, DC.
U.S. Department of the Treasury, 2006. IRS Updates Tax Gap Estimates. IRS-2006-28. Internal
Revenue Service, Washington, DC.
U.S. Department of the Treasury, 2007. Individual Income Tax Returns (Complete Report).
Publication 1304. Internal Revenue Service, Washington, DC.
U.S. Department of the Treasury, 2009. Internal Revenue Service Data Book, 2008. Publication
55B. Internal Revenue Service, Washington, DC.
Yitzhaki, Shlomo, 1974. “A Note on ‘Income Tax Evasion: A Theoretical Analysis.’” Journal of
Public Economics 3 (2), 201–202.
Yitzhaki, Shlomo, 1983. “On an Extension of the Gini Index.” International Economic Review
24 (3), 617–628.
Yitzhaki, Shlomo, 1987. “On the Excess Burden of Tax Evasion.” Public Finance Quarterly 15
(2), 123–137.


APPENDIX
This appendix contains supplementary tables described in the text.

Table A1
Sample Size and Weighted Number of Returns by Level of True AGI
Based on TY 2001 Tax Gap Model

True AGI          Number of Returns in Sample  Weighted Number of Returns
                                                              (Thousands)
Bottom 10%                              1,615                      12,698
10%–20%                                 1,758                      12,560
20%–30%                                 2,015                      12,562
30%–40%                                 2,019                      12,568
40%–50%                                 2,661                      12,576
50%–60%                                 3,024                      12,574
60%–70%                                 3,358                      12,563
70%–80%                                 4,010                      12,570
80%–90%                                 4,357                      12,569
90%–95%                                 3,178                       6,284
95%–99%                                 5,055                       5,028
99.0%–99.5%                             1,589                         629
Top 0.5%                                2,060                         629
Total                                  36,699                     125,808

Table A2
Voluntary Compliance Levels by AGI, 1988

AGI           Voluntary Compliance Level
$0–5K                               84.2
5K–10K                              78.7
10K–25K                             88.8
25K–50K                             92.4
50K–100K                            93.2
100K–250K                           91.3
250K–500K                           95.7
>500K                               97.1

Note: Voluntary compliance level is reported tax liability divided by corrected tax
liability.
Source: Christian (1994), based on 1988 TCMP.


Table A3
Composition of True Income by True AGI Based on TY 2001 Tax Gap Model

True AGI          Salaries                       Business  Part., S Corp,  Capital
                 and Wages  Interest  Dividends  (Sch. C)  Estate & Trust    Gains   Other
Bottom 10%           139.2      15.5        8.4       6.5           –32.6     11.5   –48.6
10%–20%               74.8       4.9        1.4       6.0             0.0      0.1    12.8
20%–30%               75.0       4.9        1.4       4.8             0.1      0.6    13.1
30%–40%               77.7       3.2        1.7       4.8             0.5      0.4    11.8
40%–50%               78.7       2.9        1.1       4.5             0.3      0.2    12.2
50%–60%               78.6       2.4        0.8       5.8             0.6      0.5    11.4
60%–70%               77.2       2.4        1.1       5.5             0.6      0.4    12.9
70%–80%               74.6       2.4        1.1       6.7             0.9      0.8    13.5
80%–90%               74.3       2.6        1.3       7.2             1.4      1.3    12.1
90%–95%               69.5       2.0        1.7      10.2             2.6      2.1    11.9
95%–99%               56.9       2.8        2.0      14.7             7.5      4.7    11.3
99.0%–99.5%           48.4       3.1        2.8      12.6            15.2      8.1     9.8
Top 0.5%              34.8       3.8        3.1       6.5            24.4     19.4     8.1
Total                 65.8       2.9        1.7       8.1             5.7      4.4    11.3


Table A4
Composition of Reported Income by Reported AGI Based
on TY 2001 Tax Gap Model

Reported AGI      Salaries                       Business  Part., S Corp,  Capital
                 and Wages  Interest  Dividends  (Sch. C)  Estate & Trust    Gains   Other
Bottom 10%           419.6      50.3       32.0      –1.2          –155.3     40.7  –286.1
10%–20%               75.3       4.7        1.3       9.6             0.0      0.0     9.2
20%–30%               74.4       5.3        1.5       5.9             0.0      0.6    12.3
30%–40%               77.3       3.9        1.9       3.9             0.1      0.4    12.4
40%–50%               83.6       2.6        1.1       3.4             0.4      0.4     8.6
50%–60%               81.4       3.0        0.9       2.6             0.7      0.2    11.3
60%–70%               83.4       2.0        0.9       2.8             0.5      0.2    10.3
70%–80%               79.6       2.8        1.1       2.4             1.0      0.7    12.4
80%–90%               79.9       2.6        1.3       2.7             1.3      0.9    11.2
90%–95%               78.6       2.2        1.6       3.5             1.9      1.7    10.5
95%–99%               69.8       3.1        2.3       6.0             6.2      3.9     8.7
99.0%–99.5%           59.8       3.4        3.1       7.5            13.9      6.6     5.8
Top 0.5%              41.9       4.2        3.6       3.0            23.0     20.7     3.5
Total                 72.8       3.1        1.8       3.7             5.1      4.3     9.1
