July 2008

IRS MIGRATION DATA

The decennial Census provides all sorts of interesting morsels that can be passed on to readers, viewers or listeners. But there is one item the census does not cover: where the people in your community are coming from and where they are moving to. With the IRS migration data, you can track movement in and out of counties. Moreover, financial information in the data allows you to gauge whether your community is gaining or losing wealth. Thus this data - which is processed through a cooperative agreement between the IRS and the Census Bureau - makes a perfect complement to other census data. There are, however, important limitations to the data that are outlined below.

NICAR's data covers filing years 1992-93 (calendar years 1991-92) to filing years 2006-07 (calendar years 2005-06).

The Census Bureau compiles the dataset by matching the Social Security number of the primary taxpayer from one year to the next. If the address for that taxpayer changes, the taxpayer is considered a migrant. Given the methodology, here are some limitations to keep in mind:

1. Some migrants may be missed because of a change in their status. For instance, a taxpayer could marry between filings (becoming a secondary taxpayer the next year) or their income level could change to the point where they don't need to file a return. Errors in the SSNs reported by the taxpayer or transcription mistakes at the IRS could also cause underreporting.

2. The IRS uses tax returns filed through late September, meaning that returns filed afterwards are not covered in the data. The IRS estimates 95 percent to 98 percent of all returns are included.

3. The data can be incorrect for the limited number of people who use an address other than their own on their tax form, e.g. their accountant's. If the accountant moves from one year to the next, the taxpayer will mistakenly be counted as a migrant.
The Census Bureau says this is not a very frequent problem, though it tracks counties where it thinks "spurious migration" has occurred. NICAR has included these lists from 1995-96 on with its documentation.

More detailed information about data limitations is available in the Word document "publicdocumentationr" included in the IRS-Census Documents folder.

For filing years 2006-07, NICAR received state-to-state migration flow information from the IRS. The tables are named STO0607.dbf and STI0607.dbf. Layouts of the tables can be found in the "NICAR documents" folder under reclayst0607.txt.

ABOUT THE DATA

The "exemps" field is a key column. The IRS uses the number of exemptions claimed on a tax form as a measure of the number of people who entered or exited the county.

Understanding how the IRS maintains the privacy of taxpayers is a big challenge in using this dataset. In instances where there are fewer than 10 returns filed, the taxpayers are grouped into a broader summary level - such as a state or regional category - a process the Census Bureau calls "suppression."

CHANGE FOR 2005-2006 dataset: Officials began using a "D" to represent suppressed data. In response, NICAR created three new fields to accommodate the character "D," while still allowing users to run calculations on the figures included in the original fields. The new fields "Returnsch," "Exempsch," and "Aagich" include the "D" entries and are placed next to the fields "Returns," "Exemps," and "Aagi," which convert the "D" to zero and hold the data in numeric format. Calculations should be done on the numeric fields. Prior to year "056," the new NICAR-created fields will be empty.

The Census Bureau designates records containing summarized figures (those that have absorbed suppressed data) by using different combinations of county and state FIPS codes. It further explains the records by using two-letter designations that are like postal codes and by summarizing the geography level in a separate field.
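
Because the numeric fields turn a suppressed "D" into zero, a total built from "Returns" alone cannot tell a suppressed flow from a true zero. The sketch below shows one way to keep track of that, using the "Returnsch" field as the flag. It is only an illustration: the sample rows are made up, the function names are hypothetical, and it assumes the character field holds either "D", the original figure as text, or nothing (for years before "056").

```python
def is_suppressed(char_value):
    """True when the character field holds the suppression flag "D"."""
    return str(char_value or "").strip().upper() == "D"


def total_returns(records):
    """Sum the numeric Returns field, skipping and counting suppressed rows."""
    total = 0
    suppressed_rows = 0
    for rec in records:
        if is_suppressed(rec.get("Returnsch")):
            suppressed_rows += 1  # numeric field is 0 here, not a real count
            continue
        total += int(rec["Returns"])
    return total, suppressed_rows


# Made-up sample rows, for illustration only (not taken from the tables).
sample = [
    {"Returns": 120, "Returnsch": "120"},
    {"Returns": 0, "Returnsch": "D"},   # suppressed flow
    {"Returns": 45, "Returnsch": ""},   # earlier year: character field empty
]

print(total_returns(sample))  # prints (165, 1)
```

Reporting the number of suppressed rows next to the total makes it clear how much of the picture is hidden behind "D" entries. The same pattern would apply to "Exemps"/"Exempsch" and "Aagi"/"Aagich".
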
The codes changed beginning with the 1995-96 tax years, so NICAR split the data into four tables to account for the changes.

In the years prior to 1995-96, for example, taxpayers who migrated to/from counties in the Northeast United States are grouped into a category that has a state FIPS of "63" and a county FIPS of "011." It has an "XX" for a state postal abbreviation and an explanation saying "Region 1: Northeast." In counties where there is very minimal movement, the migration is grouped into one record with the explanation of "Suppress All Flows."

The grouped records are similar for years 1995-96 and beyond, though there are some differences. One major difference is that these later years include a five-record summary of total migration in the entire U.S. NICAR has included code sheets explaining the different suppressed levels.

One thing to keep in mind regarding the various flows is that there won't necessarily be one-to-one matches between the incoming and outgoing tables. As an example, consider San Francisco County in California for 1992-1995. In 1992-93, there was enough in-migration to warrant a separate record for people moving from Eagle County, Colo. However, there must have been few -- if any -- people moving from San Francisco to Eagle County, since there is no corresponding record in the outflow table. This will complicate matters if you are trying to calculate net losses/gains for individual counties.

One other major difference between the two sets of years is the use of "Total Money Income" (TMI) in 1992-93 through 1994-95 and "Adjusted Gross Income" (AGI) from 1995-96 onward. The Aggregate Total Migrant Income (Atmi) and Median-Adjusted Gross Money Income (Mtmi) fields are listed in the 95-07 table, but do not contain data. Financial data is reflected in the AGI field. ATMI and MTMI data is included in the 92-95 table.
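
The mismatch between the inflow and outflow tables matters when computing net gains or losses county by county. One possible approach, sketched below, is to pair the two tables on the partner county and leave the net figure blank whenever one side has no record, rather than treating the missing side as zero. The function name, the dictionary layout and the numbers are all hypothetical; in practice the values would come from the exemptions field of the inflow and outflow tables.

```python
def net_flows(inflows, outflows):
    """Pair inflow and outflow exemptions by partner county FIPS.

    A partner missing from one side gets net=None: that flow was likely
    folded into a summary record, so it is unknown rather than zero.
    """
    paired = {}
    for fips in sorted(set(inflows) | set(outflows)):
        moved_in = inflows.get(fips)
        moved_out = outflows.get(fips)
        if moved_in is None or moved_out is None:
            net = None
        else:
            net = moved_in - moved_out
        paired[fips] = {"in": moved_in, "out": moved_out, "net": net}
    return paired


# Made-up exemption counts for one county's partners.
# Keys are placeholder state+county FIPS strings.
inflows = {"08037": 25, "06001": 900}
outflows = {"06001": 1100}

for fips, row in net_flows(inflows, outflows).items():
    print(fips, row)
```

Partners with a blank net figure can then be listed separately, and the county's overall net change is better taken from the summary records than from a sum of the individual pairs.
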
NICAR has included IRS documentation that explains the differences between Total Money Income and Adjusted Gross Income. The documentation also has other information about the dataset. We have also provided some correspondence with the Census Bureau that outlines the suppression process.

RETURNS, EXEMPTIONS & MORE

The explanation below gets complicated. Keep in mind, however, that the instances described are very few when compared with the total number of records in the database. In most instances, the records will be more straightforward.

In addition to the suppression methods described above, the Census Bureau uses a "-1" in certain instances of limited flow. For instance, in the 1992-93 to 1994-95 tables, a "-1" denotes instances where there were fewer than 10 county nonmigrants. In the years from 1995-96 to 2003-04, the "-1" shows up in "total migration - foreign" records. The "-1" values are usually spread across the whole record, which means you will also find them in the AGI and TMI fields.

Additionally, for years 1995-96 and 1996-97, the Census Bureau used a system in the median AGI field in which the specific negative numbers carry no meaning; any negative median value just means the value was less than zero. Similarly, any median value above $99,999 means the value was $100,000 or more.

The system changed from 1998-99 on. Under this system, if the median was less than $0, it was replaced with "-1". (So a median value of "-1" can now mean either a suppressed value or a value below $0.) If the value of the number of returns or exemptions is "-1", then the "-1" in the median means suppressed. Otherwise it means less than $0. A median value of more than $99,999 was changed to "1".

For more information, call the IRS Statistics of Income Division at 202-874-0410 or e-mail them at sis@soi.irs.gov. People there can help you or direct you to Census Bureau staff who can be of assistance.

ONE OTHER NOTE

The IRS issued revised data for 2000-01. This dataset contains the revised data.
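
Returning briefly to the median AGI coding described under RETURNS, EXEMPTIONS & MORE: the rules for 1998-99 and later can be written out as a small decision function. This is only a sketch of the rules as stated in this README; the function name and the status labels are hypothetical, and it does not cover the earlier years, which used a different scheme.

```python
def decode_median_agi(median, returns, exemptions):
    """Interpret a median AGI value under the 1998-99-and-later coding.

    Returns a (status, value) pair; value is None unless the median
    is an ordinary reported figure.
    """
    if median == -1:
        # "-1" is ambiguous: check the returns/exemptions fields to decide.
        if returns == -1 or exemptions == -1:
            return ("suppressed", None)
        return ("below_zero", None)
    if median == 1:
        # Medians above $99,999 were replaced with "1".
        return ("100000_or_more", None)
    return ("reported", median)


print(decode_median_agi(-1, -1, -1))      # suppressed record
print(decode_median_agi(-1, 40, 85))      # real median below $0
print(decode_median_agi(1, 40, 85))       # median of $100,000 or more
print(decode_median_agi(31250, 40, 85))   # ordinary reported median
```

Note that under this coding a genuine median of exactly $1 could not be told apart from the "$100,000 or more" marker, so any "1" is treated here as the marker.
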
The IRS notes in its ovrrid01: "The original 2000-2001 gross migration data review process showed that most counties had larger gross in-migration and gross out-migration than they had in prior years. We believe this was due to address changes occurring after the time of filing." In ovrrid03 IRS lists the counties that encountered problems with the Migration Data for 2003.

ON THIS CD

In addition to this README, this CD includes the following folders and files:

LEGAL.TXT - The legal agreement regarding the use of NICAR data.

***TABLES

I9295.DBF - County inflows from 1992-93 to 1994-95. Record count: 292,646.
O9295.DBF - County outflows from 1992-93 to 1994-95. Record count: 291,277.
I9507.DBF - County inflows from 1995-96 to 2006-07. Record count: 1,339,153.
O9507.DBF - County outflows from 1995-96 to 2006-07. Record count: 1,332,535.
FIPS.DBF - A lookup table with state and county FIPS codes
STI0607.DBF* - State inflows of 2006-07. Record count: 2,805.
STO0607.DBF* - State outflows of 2006-07. Record count: 2,805.

(*NICAR received the state migration flow data for 2006-07 filing year. The tables provide state level migration information in a very neat format.)

***DOCS/NICAR documents

README.TXT
RECLAY9295.TXT - Record layout for the 1992-93 to 1994-95 inflows/outflows.
RECLAY9507.TXT - Record layout for the 1995-96 to 2006-07 inflows/outflows.
SUMM9295.TXT - Code sheet for the suppressed/summary records in the 1992-93 to 1994-95 inflows/outflows.
SUMM9507.TXT - Code sheet for the suppressed/summary records in the 1995-96 to 2006-07 inflows/outflows.
FIPSLAYOUT.XLS - Record layout of FIPS.DBF
RECLAYST0607.TXT - Record layout for the 2006-07 state inflows/outflows.

***DOCS/IRS-Census documents

SUPPRESSEXPLAIN.TXT - An explanation of the suppression levels for 1992-93. This will help you understand suppression for the later years as well.
IRS9697.DOC - Documentation the IRS provided with the data, including information on TMI and AGI, for 1996-97.
It also contains general information helpful for other years.
publicdocumentation2004-2005.txt - IRS documentation provided with the 2005-2006 release, including: Definitions and explanations of terms; data content and comparability; code lists for states and counties and summary level information; and technical documentation for suppression procedures.
publicdocumentation2005-2006.txt - IRS documentation provided with the 2006-2007 release.
REV0304.XLS - Contains revisions to the 2003-04 data from the IRS, received by NICAR in June 2006.

***DOCS/IRS-Census documents/SPURIOUS

Contains eight text files that list "spurious migration" -- that is, counties with problems in the migration data -- for 1995-96 through 2002-03.

***DOCS/IRS-Census documents/0102

extuser.doc -- Supplemental documentation for the 2002-03 migration data.
co0102us.doc -- Technical documentation from the IRS for 2001-02 migration data. You can ignore section A, file characteristics, because it relates to the original data as received by NICAR. However, the other sections include helpful information.

***DOCS/IRS-Census documents/0001

Contains four text files with documentation and other technical information, including a list of contacts regarding questions on the IRS data.

***DOCS/IRS-Census documents/exceptionlist

Contains two text files with documentation of the instances of "-1" flow explained above.

RELATED STORIES

These stories, which have used the IRS Migration Data, are available from the IRE Resource Center. They can be ordered by calling (573) 882-3364.

Story #23271 Jan. 2007. The Charlotte Observer and charlotte.com published stories and interactive maps that show county-to-county migration in North Carolina and across the U.S. The report highlighted the trend of upstate New Yorkers moving to the Charlotte region. An accompanying map is based on the most recent five years of IRS county migration data.

Story #18993 February 2002.
The Census 2000 Supplementary Survey shows trends in migration by immigrants and domestic migrants (newcomers from other parts of the US). Regions not attracting either group have often experienced a prolonged economic decline or lack natural or cultural amenities that many migrants seek. California has the largest number of foreign-born residents, while Western and Southeastern states tend to attract many domestic migrants. States in the Midwest, Northeast and parts of the South have few migrants and tend to have older, less diverse populations.

Story #17357 July 2000. This USA TODAY series analyses economic growth and development and finds that "41 million people - more than one in seven Americans - live in a county on the Atlantic Ocean or the Gulf of Mexico." The analysis of population and demographic trends, and building permits, finds that "coastal counties are growing significantly faster than the rest of the country in population, employment and gross domestic product." The series reveals that in spite of multiple natural threats - like hurricanes, rising sea level, fragile sands and erosion - "growth pressure keeps building" and "all levels of government foster this amenity-driven middle class lifestyle..."

Story #16677 March 2000. The Syracuse Newspapers devoted five years to this eight-part series examining the migration trend out of Syracuse to other parts of the country in search of better employment opportunities. The series takes an in-depth look at the impact of the economic recovery of the early 1990s and the resulting population shift across the country.

RELATED TIPSHEETS

These tipsheets, available from the IRE Resource Center, may help in dealing with IRS Migration data or writing migration-related stories. IRE members can download these materials at http://www.ire.org/resourcecenter/

Tipsheet #1337 September 2000. "Ten Census 2000 stories any newspaper can do."
The story ideas, from Paul Overberg of USA Today, include not only migration, but also immigration, group living and diversity.

Tipsheet #1230 June 2000. "Investigating sprawl: Techniques and revelations." This tipsheet, by Greg Reeves of the Kansas City Star and Rose Ciotta of The Philadelphia Inquirer, offers a list of data sets to explore how urban sprawl affects your coverage area.

Tipsheet #659 March 2000. "Fresh demographic data for communities in flux." D'Vera Cohn of The Washington Post explains how to find data for cities, counties or states that are changing demographically.