Downloads
- Worldwide Airport Codes
- US Airport 3-letter codes in numerical code order (with latitude, longitude and DOT numerical codes).
- Stata dictionary for reading ticket-level DB1A/DB1B dataset
- Quarterly ticket-level DB1A/DB1B datasets in one zip file (151 individual zipped fixed format files)
- All 151 Quarterly ticket-level DB1A/DB1B datasets in one zip file (2.7GB)
- Aggregated market-carrier dataset for all quarters 1979q1-2016q3. (544mb .dta)
- Aggregated market-carrier dataset for all quarters, 1979q1-2016q3 zipped (212mb, zipped .dta)
- Aggregated market-carrier dataset 1979q1-2016q3 in csv format (1033mb .csv)
Important Information
The data available on this page are only for academic research. The use of these data for consulting or other for-profit activities is not permitted under the terms of use.
The domestic DB1A/DB1B data are now available without charge from the Department of Transportation website for 1993-present.
NOTE: The Data Bank 1A was replaced in 2003 by Data Bank 1B. These datasets are identical except for (1) DB1B identifies both the operating and the ticketing carrier while DB1A assumed they were the same and (2) DB1B has changed the codes for seat classes. The data sets here are based on the DB1A data through 2003Q4 and DB1B data starting with 2003Q1. Below they are referred to collectively as DB1A/DB1B. Both are a 10% sample of nearly all tickets sold, where selection of the sample is based on the last digit of the ticket identifier being a zero.
You can purchase the full DB1A/DB1B data directly from the DOT (if you are a U.S. citizen and after you receive authorization from DOT) for a cost of about $350 per CD. Many quarters of data fit on a CD. For more information on this, go to the DOT's airline data webpage. Also, for more information on the DB1A/DB1B, go to the DOT's website. The DB1A/DB1B is a quarterly dataset. The quarter in which a ticket appears in the sample is based on the first date of travel of the ticket. There is no further information in the dataset about dates of travel.
The data available here cover 1979Q1 through 2016Q3. However, there are data reliability issues for the first few years, the worst being 1980Q4, when EA and DL appear to have significantly under-reported. Through much of the 1980s, there are also airport reporting problems, with some airlines reporting city names rather than the particular airport (e.g., NYC rather than LGA) for some of their tickets. These problems are not corrected in these files.
There are two forms of the data available here:
1. Quarterly Ticket-Level Domestic DB1A/DB1B
This is a translation of the DOT DB1A/DB1B that drops the more unusual tickets (e.g., any ticket with more than 4 coupons, one-way ticket with more than 2 coupons) and all international tickets.
Size: About 100-200 MB per quarter uncompressed. About 10-30 MB per quarter zipped.
Each file is in fixed format with the following record layout:
- 1 = Point of Purchase (B=Base Airport, R=Reference Airport)
- 2-4 = 3-letter code for base airport
- 5-7 = 3-letter code for reference airport
- 8-10 = 3-letter code for change-of-plane airport if there is one
- 11-12 = 2-letter code for first-segment carrier
- 13-14 = 2-letter code for second-segment carrier
- 15-16 = 2-letter fare code for first segment
- 17-18 = 2-letter fare code for second segment
- 19-22 = Fare
- 23-26 = First-segment distance
- 27-30 = Second-segment distance
- 31-34 = Base-to-reference nonstop distance
- 35-40 = Number of passengers
- 41-42 = Reporting carrier (assumed to be first segment carrier in DB1A)
- 43-45 = 3-digit numerical code for base airport (see airports.lst)
- 46-48 = 3-digit numerical code for reference airport (see airports.lst)
- 49-51 = 3-digit numerical code for change-of-plane airport (see airports.lst)
- 52 = Ticket type code
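As an illustration of the record layout above, the following is a minimal Python sketch that slices one 52-character record into named fields. The field names, the `parse_record` helper, and the sample record (including the numerical code 005 and the fare and distance values) are hypothetical and exist only for this example. The sketch also assumes that numeric fields are integers padded with spaces or zeros; that padding convention is not stated here, so check it against a real file or against the supplied Stata dictionary.

```python
# Column ranges (1-indexed, inclusive) copied from the record layout above.
# Field names are hypothetical labels chosen for this sketch.
LAYOUT = [
    ("point_of_purchase", 1, 1, str),
    ("base_airport", 2, 4, str),
    ("ref_airport", 5, 7, str),
    ("cop_airport", 8, 10, str),
    ("carrier1", 11, 12, str),
    ("carrier2", 13, 14, str),
    ("fare_code1", 15, 16, str),
    ("fare_code2", 17, 18, str),
    ("fare", 19, 22, int),
    ("dist1", 23, 26, int),
    ("dist2", 27, 30, int),
    ("nonstop_dist", 31, 34, int),
    ("passengers", 35, 40, int),
    ("reporting_carrier", 41, 42, str),
    ("base_num", 43, 45, str),  # kept as text to preserve leading zeros
    ("ref_num", 46, 48, str),
    ("cop_num", 49, 51, str),
    ("ticket_type", 52, 52, str),
]


def parse_record(line):
    """Split one fixed-format record into a dict of named fields."""
    record = {}
    for name, start, end, kind in LAYOUT:
        raw = line[start - 1:end].strip()
        if kind is int:
            # Assumption: a blank numeric field means "missing".
            record[name] = int(raw) if raw else None
        else:
            record[name] = raw
    return record


# A made-up one-coupon record, built field by field for readability.
sample = (
    "B" "ORD" "LGA" "   " "UA" "  " "Y " "  "
    " 150" " 733" "   0" " 733" "     1" "UA" "001" "005" "   " "O"
)
rec = parse_record(sample)
print(rec["base_airport"], rec["ref_airport"], rec["fare"], rec["ticket_type"])
```

Keeping the layout as a table of (name, start, end, type) tuples makes it easy to compare against the column ranges listed above, one line per field.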
SCREENING - The following records are dropped from the original DB1A:
Any record that includes an airport outside the 50-state U.S.
Any record with more than 4 coupons
Any one-way ticket with more than 2 coupons and any 3 or 4 coupon ticket with more than two trip-break points
OTHER NOTES:
Round-trip and open-jaw tickets are broken into two records, one for each directional trip. For round-trip (closed-jaw) tickets, the fare in DB1A is divided in half for each of the directional trips. For open-jaw tickets, the fare is divided in proportion to the ticket miles in each of the directional trips.
Some records contain airport codes that are not included in the DOT's Database 5 or for which no location information is present. For these records, no distances are calculated. The records are still included with the 3-letter airport codes, but distances are set to 0. For these records, round-trip open-jaw fares are divided in half, rather than weighted by share of distance, since distance is not known.
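The fare-splitting rule in the two notes above can be sketched in Python as follows. This is an illustrative reading of the rule, not the code used to build the files: the function name `split_fare` and its arguments are hypothetical, and the sketch assumes that a distance of 0 in either direction means the distance is unknown and triggers the equal split.

```python
def split_fare(total_fare, miles_out, miles_back, open_jaw):
    """Return (outbound_fare, return_fare) for a two-direction ticket.

    Closed-jaw round trips are split in half. Open-jaw tickets are split
    in proportion to ticket miles, falling back to an equal split when a
    distance is unknown (assumed here to be recorded as 0).
    """
    if not open_jaw or miles_out <= 0 or miles_back <= 0:
        half = total_fare / 2
        return half, half
    fare_out = total_fare * miles_out / (miles_out + miles_back)
    return fare_out, total_fare - fare_out


# Closed-jaw round trip: always split in half.
print(split_fare(300, 1000, 1000, open_jaw=False))
# Open-jaw ticket with two thirds of the miles on the outbound trip.
print(split_fare(300, 1000, 500, open_jaw=True))
# Open-jaw ticket with an unknown distance: falls back to an equal split.
print(split_fare(300, 0, 500, open_jaw=True))
```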
No screening of data based on fare reasonableness has been done.
TRIP TYPE CODES
- O = One-way ticket.
- R = One direction of a round-trip ticket.
- U = One direction of an "unbalanced" ticket (i.e., a round-trip ticket with 2 coupons in one direction and 1 coupon in the other direction).
- I = Interline ticket. Change of carrier within at least one of the directional trips on the ticket. Used only on round-trip tickets, because interline on one-way tickets is evident from carrier listings. Note: Not used if the outbound trip is entirely on one carrier and the return is entirely on a different carrier. Such tickets are not distinguishable from one-carrier round-trip tickets in this dataset. [Supersedes U or R].
- J = Open Jaw ticket. Trip destination on second directional trip of the ticket is not equal to trip origination on first directional trip of the ticket. [Supersedes U, R or I].
In addition to the fixed format dataset, each zip file contains four more files:
- AIRPORTS.LST is a listing of U.S. domestic airports in the order that corresponds to the numerical codes for airports in the dataset (e.g., ORD is the first airport listed and has the numerical code 001). The ordering is based on a list of airports by airport size from the 1980s.
- AIRPORTCODES.DOCX is a listing of all airports worldwide and their codes as of 2019. This is taken from http://airportcodes.org .
- NBOE.DCT is a Stata dictionary file that reads the fixed format data set into Stata.
- NBOEYYQ.RPT is a file for quarter YYQ that reports various statistics on the production of this file from the full DB1A/DB1B, including the share of tickets that fall into various categories.
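When working with the ticket-level files, it can help to keep the precedence among the trip type codes listed earlier in mind: J supersedes U, R and I, and I supersedes U or R. A minimal Python sketch of that precedence follows. The function name and its boolean arguments are hypothetical; how each flag would be derived from the raw coupons is not specified here and is left to the caller.

```python
def trip_type_code(one_way, open_jaw=False, interline=False, unbalanced=False):
    """Pick a trip type code using the documented precedence.

    O applies to one-way tickets. Otherwise J supersedes I, which
    supersedes U and R.
    """
    if one_way:
        return "O"
    if open_jaw:
        return "J"
    if interline:
        return "I"
    if unbalanced:
        return "U"
    return "R"


# An unbalanced round trip that also changes carrier within one direction
# is coded I, because I supersedes U.
print(trip_type_code(one_way=False, interline=True, unbalanced=True))
```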
2. Aggregation of Domestic DB1A/DB1B into Market-Carrier Dataset
This is a relatively compact summary of the domestic DB1A/DB1B. It compresses all O&D information for a carrier on a route into one record, giving direct and change-of-plane passenger counts and average fares for the given carrier on the route. The Market Data file is available as a Stata dataset. It is created from the domestic airline ticket data from the DOT DB1A/DB1B, a 10% sample of all tickets collected by US carriers.
Size: About 500 MB uncompressed for all data, 1979Q1 to 2016Q3 (200 MB compressed).
OTHER NOTES
- Tickets with an international segment are excluded.
- First-class tickets are excluded.
- Tickets must be one-way or round-trip; open-jaw, circle trips, etc. are excluded.
- A ticket must have no more than 2 coupons for a one-way trip, no more than 4 coupons (and no more than 2 coupons each way) for a round-trip ticket.
- Tickets with fares below $20 or above $9998 are excluded.
- Tickets with fares more than 5 times USDOT's Standard Industry Fare Level for observed trip distance during observed quarter are excluded.
- Records are for one-way trips, so round-trip tickets are split into two one-way observations.
- route is a pair of airports without regard to direction.
- carrier-set is one carrier and a blank if the trip is one-coupon. If the trip is two-coupon, carrier-set is the pair of airlines with codes listed in alphabetical order. Information about the order of flights is not retained.
- dir-cop is a distinction between one-coupon (direct) and two-coupon (change-of-plane) tickets. On two-coupon tickets, the location of the change-of-plane is not retained, though the average total routing distance for all c-o-p tickets collapsed into a single record is reported.
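The route, carrier-set and dir-cop definitions above amount to a grouping key for collapsing one-way trips into market-carrier records. The Python sketch below illustrates one way to build that key and aggregate on it. The function names, the tuple layout of `trips`, and the sample values are hypothetical, and weighting the average fare by passengers is an assumption about how the averages are formed, not a description of the code that produced the dataset.

```python
from collections import defaultdict


def market_carrier_key(origin, dest, carriers):
    """Build a direction-free key: (ap1, ap2, cr1, cr2, cop)."""
    # Route: the airport pair in alphabetical order, ignoring direction.
    ap1, ap2 = sorted((origin, dest))
    if len(carriers) == 1:
        # One-coupon trip: one carrier and a blank, cop = 0.
        return ap1, ap2, carriers[0], "", 0
    # Two-coupon trip: carrier pair in alphabetical order, cop = 1.
    cr1, cr2 = sorted(carriers)
    return ap1, ap2, cr1, cr2, 1


def collapse(trips):
    """Sum passengers and compute a passenger-weighted average fare per key."""
    totals = defaultdict(lambda: [0, 0.0])
    for origin, dest, carriers, pax, fare in trips:
        key = market_carrier_key(origin, dest, carriers)
        totals[key][0] += pax
        totals[key][1] += pax * fare
    return {key: (pax, fare_sum / pax) for key, (pax, fare_sum) in totals.items()}


# Made-up one-way trips: (origin, destination, carriers, passengers, fare).
trips = [
    ("ORD", "LGA", ["UA"], 2, 100.0),
    ("LGA", "ORD", ["UA"], 1, 160.0),
    ("SFO", "BOS", ["UA", "AA"], 1, 250.0),
]
records = collapse(trips)
for key, (pax, avg_fare) in sorted(records.items()):
    print(key, pax, avg_fare)
```

In this example the two ORD and LGA trips land in the same record because direction is ignored, and the order of the carriers on the two-coupon trip is not retained.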
Record Layout:
- yr -- year
- qtr -- quarter
- ap1 -- 3-letter alpha code of the first airport (by alphabetical ordering)
- ap2 -- 3-letter alpha code of the second airport (by alphabetical ordering) - blank if one-coupon ticket
- cr1 -- 2-letter alpha code of the first carrier (by alphabetical ordering)
- cr2 -- 2-letter alpha code of the second carrier (by alphabetical ordering) - blank if one-coupon ticket
- pax -- number of passengers reported in record
- nsdst -- non-stop distance from airport ap1 to airport ap2
- avdst -- average total routing distance of passengers in this record - equal to nsdst if one-coupon ticket
- avprc -- average one-way equivalent price paid by passengers reported in this record
- cop -- 0 if one-coupon ticket, 1 if two-coupon ticket

Questions may be sent to data@nber.org