Data Critique

Airline Traffic Passenger Statistics

In our dataset, “Airline Traffic Passenger Statistics,” we were given a table of airlines and a series of statistics about those airlines dating back to July 2005. From this dataset, we were able to learn a great deal about airlines over the course of 11 years. Its columns include the date of the activity (year, month, day), the airline name, whether the flight is domestic or international, the price range of the flight, the passenger count, and the month and year the plane flew. Other information provided includes the airports the flight departed from and landed at, and the boarding area.

The purpose of this dataset is to exhibit patterns and trends in the passenger traffic of various airlines over a long period of time. With this information, we can look at notable incidents or accidents involving specific airlines and compare passenger statistics before and after each event to determine whether the accident had an effect. Unusual fluctuations in the numbers could indicate a loss of trust in the airline or a fear of traveling. Analyzing how people react to accidents can give insight into the types of incidents that make people avoid flying. If it’s a large accident, is the fear a valid response? Flying plays a key part in the world’s transportation, not only for travelers but for workers as well, so should airlines do more to reassure people after an accident, or should consumers just “write it off as a one-off”?

Beyond dips, we can also analyze the overarching trend in the data over a much longer period of time. If flight traffic has overwhelmingly increased in the last decade, is it because air travel has become accessible to more people, because it has simply become a necessity in today’s world, or because it is a symptom of time-space compression (larger distances seem shorter due to transportation advances)? In essence, not only are the micro trends insightful, but the macro trends can paint a narrative as well. Another category to consider is the country a flight is coming from and the country it is going to. Combining this with the flight traffic, we can see which countries are more popular with travelers, and we can contrast this with the traffic in the opposite direction to get a general idea of the migration patterns of people using air travel.
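
The before-and-after comparison described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the record layout (month, airline, passenger count), the airline names, the numbers, and the three-month window are all hypothetical and are not taken from the dataset.

```python
from datetime import date
from statistics import mean

# Hypothetical monthly records: (activity month, airline, passenger count).
# All values are invented for illustration; they are not from the dataset.
records = [
    (date(2013, 4, 1), "Airline A", 1000),
    (date(2013, 5, 1), "Airline A", 1100),
    (date(2013, 6, 1), "Airline A", 1200),
    (date(2013, 7, 1), "Airline A", 900),
    (date(2013, 8, 1), "Airline A", 800),
    (date(2013, 9, 1), "Airline A", 1000),
    (date(2013, 6, 1), "Airline B", 5000),
]


def months_between(start, end):
    """Number of whole calendar months from start to end (may be negative)."""
    return (end.year - start.year) * 12 + (end.month - start.month)


def before_after_change(records, airline, incident_month, window=3):
    """Relative change in mean monthly passengers for one airline.

    Compares the `window` months before the incident month with the
    `window` months starting at the incident month. Returns None when
    either side has no data.
    """
    before, after = [], []
    for month, name, count in records:
        if name != airline:
            continue
        offset = months_between(incident_month, month)
        if -window <= offset < 0:
            before.append(count)
        elif 0 <= offset < window:
            after.append(count)
    if not before or not after:
        return None
    return (mean(after) - mean(before)) / mean(before)


change = before_after_change(records, "Airline A", date(2013, 7, 1))
print(f"Relative change after incident: {change:.1%}")
```

A drop computed this way is only suggestive. Passenger counts are seasonal, so a fairer comparison would probably set the same months against the previous year or against other airlines over the same period.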

The dataset does, however, fail to provide some key information, such as the demographics of the passengers and the exact departure and arrival times of the flights. The raw numbers are important because they help us understand the magnitude or severity of the airline safety issue. However, it would have been meaningful to know more about the types of passengers who fly with certain airlines, and the flight times would help us work out when people are more or less likely to fly. Due to the nature of the dataset, it is hard to glean meaningful humanistic insights into airline safety. This dataset gives us minimal insight into how people themselves feel about air travel in general. Do these numbers match the fear of air travel? Are the more expensive seats in first or business class safer? What about the super-wealthy with private jets? Can they fly safely? How are outside factors, such as the news or media, influencing air travel? Without consulting other datasets, these questions cannot be answered.

NTSB Aviation Investigation Reports

The NTSB (National Transportation Safety Board) is a government agency that is responsible for investigating all civil aviation accidents in the United States and also assists with hundreds of accident investigations in foreign countries. Each accident report in the dataset includes the country, city, and state (if the accident was in the U.S.) as geographic data. It also records the highest level of injury (minor, serious, or fatal) along with a count for each injury level. Finally, it gives the probable cause of the accident as well as the date of the accident. From this dataset, we can find out how many accidents happen each year and the fatality count for each accident, and we can map all of that to geospatial data. This information is important because it supplements our airline traffic passenger dataset with reports of actual aviation accidents and provides some hard statistics on crashes and accidents.
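
As a rough sketch of how accidents per year and fatality counts might be tallied from such reports, consider the following. The field names (`date`, `country`, `fatal`, `serious`, `minor`) and the sample rows are assumptions made for illustration; the real dataset's column names and values may differ.

```python
from collections import defaultdict

# Hypothetical report rows; field names and values are assumptions,
# not the actual NTSB column names or data.
reports = [
    {"date": "2010-03-14", "country": "United States", "fatal": 0, "serious": 1, "minor": 2},
    {"date": "2010-09-02", "country": "United States", "fatal": 3, "serious": 0, "minor": 0},
    {"date": "2011-01-20", "country": "Canada", "fatal": 1, "serious": 2, "minor": 0},
]


def yearly_summary(reports):
    """Count accidents and total fatalities per calendar year."""
    summary = defaultdict(lambda: {"accidents": 0, "fatalities": 0})
    for report in reports:
        year = int(report["date"][:4])  # assumes ISO dates (YYYY-MM-DD)
        summary[year]["accidents"] += 1
        summary[year]["fatalities"] += report["fatal"]
    # Return a plain dict ordered by year for easier reading.
    return dict(sorted(summary.items()))


yearly = yearly_summary(reports)
for year, stats in yearly.items():
    print(year, stats["accidents"], "accidents,", stats["fatalities"], "fatalities")
```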

Of course, this dataset is not perfect. For one, it does not contain an exhaustive list of foreign aviation accidents, only the ones in which the NTSB participated. While this is a sizable number of accident reports, it is by no means comprehensive for the world. It is, however, exhaustive for accidents that happen in the U.S. Thus, when using this dataset, it is best to separate the reports into foreign and domestic groups, as the number of domestic cases vastly outnumbers the foreign ones. The data was generated from the first-hand accident reports that the NTSB helped investigate, so it is very reliable. Even so, it lacks information about flight details. For example, it states where the crash happened, but says little about where the flight came from, where it was going, what type of flight it was, or what the aircraft was. Supplementary information like that will need to be cross-referenced with another source. With the information present, some other questions naturally arise: What is the distinction between minor and serious injuries? Why do only some of the cases have a probable cause? With just this dataset, these questions are not really answerable.
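
The domestic/foreign split suggested above might look something like this minimal sketch. The `country` field and the label "United States" are assumptions; the actual dataset may encode the country differently, and the sample rows are invented.

```python
# Hypothetical report rows; the "country" field and its values are assumed.
reports = [
    {"date": "2010-03-14", "country": "United States"},
    {"date": "2010-09-02", "country": "United States"},
    {"date": "2011-01-20", "country": "Canada"},
]


def split_by_origin(reports, home_country="United States"):
    """Separate reports into (domestic, foreign) lists by country."""
    domestic = [r for r in reports if r["country"] == home_country]
    foreign = [r for r in reports if r["country"] != home_country]
    return domestic, foreign


domestic, foreign = split_by_origin(reports)
print(len(domestic), "domestic reports;", len(foreign), "foreign reports")
```

Keeping the two groups apart means that yearly counts or maps built from the domestic reports are not skewed by the partial foreign coverage.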