Final Project: Biodiversity in U.S. National Parks

Hi everyone!

For my final project, I plan to explore the 2016 National Park Dataset: https://www.kaggle.com/datasets/nationalparkservice/park-biodiversity/data

Problem Description:

Nonnative species have been identified as a major contributor to biodiversity decline because invasive species are often highly adaptable and can outcompete natives. This loss of native species is a particular issue for U.S. National Parks, which were created to be untouched and protected. According to the National Park Service, invasive species disturb ecological processes, harm ecosystem integrity, degrade natural resources, interfere with visitor experiences in parks, and exacerbate climate change and fragmentation from land use change (NPS, 2021). To understand these non-natives, it is important to understand their distribution, where they have been reducing biodiversity, and the potential reasons why.

Problem Objectives:
  • Visualize the largest National Parks by size and their locations on a map
  • Perform visual distribution analysis to understand each state's biodiversity density
  • Analyze the differences in native versus nonnative species in states with the highest and lowest biodiversity density
  • Utilize Part to Whole Analysis to visualize native and nonnative species categories for each park
  • Address the gap in current analysis of this data to include a discussion on nativity as a factor of biodiversity loss

Related Work:

I first developed my project idea from earlier analyses of this dataset. Regional Bird conducted a project asking which U.S. national parks had the least and most biodiversity density and how they were distributed geographically.

They found that smaller parks, such as Hot Springs National Park, had greater biodiversity density because these areas are more specialized. A high biodiversity density was found to be rarer than a high species count.

This inspired me to ask what the biodiversity density would be for each state as a whole. What can that tell us about the state's management of biodiversity?

Jonathan Bouchet conducted a state-focused project analyzing the number of species found in each state.

He found that the largest numbers of species were in the West. He also used bar graphs to analyze the conservation status of these species and show which categories of species were in danger.

In all of this existing work, I did not see a focus on native and nonnative species and their relation to biodiversity. My project works to address this gap through maps, pie charts, and stacked bar charts.

Solution:

(1) Background
Before beginning my analysis, I wanted to provide context on where the national parks I will be studying are located and to identify which of them are known for their large size, since larger parks are often assumed to have more biodiversity. I created a bubble chart and a map in Tableau to accomplish this.

In this visualization, it is clear that the greatest number of large national parks are in Alaska, followed by a few on the coasts of the contiguous U.S.

In this map, you can see the distribution of parks, and I have labeled the parks that the bubble chart identified as the largest. There is a large concentration of parks on the West Coast, which offers one reason why Jonathan found so many species in this area. The map has a legend with the colors for all 50+ parks, but it is not included here because it is not needed for our purpose.

(2) Visualize Biodiversity Density for each state

To begin, I imported the two datasets: parks.csv, with park locations and acres, and species.csv, with the species information. To condense the data, I cleaned and merged these files in R. A few parks span more than one state, so I created a function that assigned each of these parks to the state containing most of the park's area.
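The project's own code is in R, so the following is only a minimal Python sketch of the state-assignment step, not the function used in the project. It assumes a shared park's State value is a comma-separated list, and it relies on a hypothetical hand-built lookup (`DOMINANT_STATE`) naming the state that holds most of each shared park's area. The park names, states, and acreages below are placeholders, not values from parks.csv.

```python
# Hypothetical hand-built lookup: shared park -> state holding most of its area.
# The entry below is a placeholder, not a value from parks.csv.
DOMINANT_STATE = {
    "Example Park A": "WY",
}


def assign_state(park_name, state_field):
    """Return a single state code for a park.

    state_field is the raw State value, e.g. "OH" or "WY, MT, ID".
    """
    states = [s.strip() for s in state_field.split(",") if s.strip()]
    if len(states) == 1:
        return states[0]
    # Shared park: use the lookup, falling back to the first listed state.
    return DOMINANT_STATE.get(park_name, states[0])


# Placeholder rows standing in for parks.csv.
parks = [
    {"Park Name": "Example Park A", "State": "WY, MT, ID", "Acres": 1000},
    {"Park Name": "Example Park B", "State": "OH", "Acres": 200},
]
for park in parks:
    park["State"] = assign_state(park["Park Name"], park["State"])
```

The fallback to the first listed state is a design choice for the sketch; in practice every shared park would need an entry in the lookup, since the dataset itself does not record how a park's area is split between states.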

Biodiversity density calculation: the total number of present species counted in a state divided by the total acres of national park area in that state.
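As a rough illustration of this calculation, here is a minimal Python sketch (again, not the project's R code). It assumes each species row has `Park Name` and `Occurrence` fields with "Present" marking a confirmed record, that each park row already carries a single `State`, and that records rather than distinct species are counted; all three are assumptions, and the rows are made-up placeholders.

```python
from collections import defaultdict


def biodiversity_density(parks, species):
    """Present species records per acre of national park land, by state."""
    park_state = {p["Park Name"]: p["State"] for p in parks}

    # Total park acreage per state.
    acres = defaultdict(float)
    for p in parks:
        acres[p["State"]] += p["Acres"]

    # Count only records marked as present, attributed to the park's state.
    present = defaultdict(int)
    for row in species:
        if row["Occurrence"] == "Present" and row["Park Name"] in park_state:
            present[park_state[row["Park Name"]]] += 1

    return {
        state: present[state] / total
        for state, total in acres.items()
        if total > 0
    }


# Placeholder data: two parks in state "XX", one in state "YY".
parks = [
    {"Park Name": "Park A", "State": "XX", "Acres": 100.0},
    {"Park Name": "Park B", "State": "XX", "Acres": 300.0},
    {"Park Name": "Park C", "State": "YY", "Acres": 50.0},
]
species = [
    {"Park Name": "Park A", "Occurrence": "Present"},
    {"Park Name": "Park A", "Occurrence": "Present"},
    {"Park Name": "Park B", "Occurrence": "Present"},
    {"Park Name": "Park B", "Occurrence": "Not Confirmed"},
    {"Park Name": "Park C", "Occurrence": "Present"},
]
density = biodiversity_density(parks, species)
```

With these placeholder rows, the small single-park state ends up with the higher density even though it has fewer records, which is the same effect that lets small parks rank highly on this measure.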

Following the five principles of visualization, I decided that a choropleth map built in R would best show each state's biodiversity level, since contrast matters most here.

It is important to note that not every state has a national park, so those states are shown in white.

Surprisingly, the states with the highest biodiversity density are on the East Coast, even though previous work made it clear that the West has the most park area and species. This may be due to the specialized functions of small parks in places like Ohio. The Everglades in Florida has a great many species, but I wonder whether too many of them are of the same kind, such as nonnatives, which could explain such a low biodiversity score.

To explore this, I will look at the number of native and nonnative species found in Ohio (high biodiversity density state) and Florida (low biodiversity density state). 

(3) Stacked Bar Charts of Native vs Nonnative Species by State and by Park

I created subsets of the data that include only the Florida and Ohio parks, then counted the number of present species that are native and nonnative. These stacked bar charts were created using R.
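The subsetting and counting step might look roughly like the Python sketch below; it is an illustration under assumptions, not the R code behind the charts. It assumes species rows carry `Park Name`, `Occurrence`, and `Nativeness` fields, with "Native" and "Not Native" as the two labels of interest. The park names and rows are placeholders.

```python
from collections import Counter


def nativeness_counts(species, park_names):
    """Count present native / non-native records for the selected parks."""
    counts = Counter()
    for row in species:
        if (
            row["Park Name"] in park_names
            and row["Occurrence"] == "Present"
            and row["Nativeness"] in ("Native", "Not Native")
        ):
            counts[(row["Park Name"], row["Nativeness"])] += 1
    return counts


def nonnative_share(counts, park):
    """Fraction of a park's counted records that are non-native."""
    native = counts[(park, "Native")]
    nonnative = counts[(park, "Not Native")]
    total = native + nonnative
    return nonnative / total if total else 0.0


# Placeholder rows; real data would come from species.csv.
species = [
    {"Park Name": "Park A", "Occurrence": "Present", "Nativeness": "Native"},
    {"Park Name": "Park A", "Occurrence": "Present", "Nativeness": "Not Native"},
    {"Park Name": "Park A", "Occurrence": "Not Confirmed", "Nativeness": "Not Native"},
    {"Park Name": "Park B", "Occurrence": "Present", "Nativeness": "Native"},
    {"Park Name": "Park B", "Occurrence": "Present", "Nativeness": "Native"},
    {"Park Name": "Park B", "Occurrence": "Present", "Nativeness": "Not Native"},
    {"Park Name": "Park C", "Occurrence": "Present", "Nativeness": "Native"},
]
counts = nativeness_counts(species, {"Park A", "Park B"})
share_a = nonnative_share(counts, "Park A")
share_b = nonnative_share(counts, "Park B")
```

Keeping counts and shares separate matters for reading stacked bars: a park can have more non-native records in absolute terms while still having a lower non-native percentage.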



In this graph, we can see that Florida has more species overall and that its percentage of non-natives is lower than Ohio's. This conflicts with the idea that the more biodiverse area would have fewer nonnatives; however, a closer look at each state's parks may give us a better understanding.

First, it is important to note that Ohio has only one national park, Cuyahoga Valley. There is a high percentage of nonnatives in this Ohio park, but it is clearly matched by the Everglades, and nonnatives are even more dominant in Dry Tortugas and Biscayne. However, it might be easier to compare natives and nonnatives park by park.

This animation, made in R, shows the same counts of natives and nonnatives in each of these states' parks, but now we can see just how dominant nonnatives are in all of these parks, which makes the slightly higher number of natives in the Florida parks potentially less relevant.
    
From this analysis, we can see that nonnatives make up a high proportion of species at both extremes of state biodiversity density. It may be worth looking at the species types found in each of these parks to see which types dominate among non-natives and natives. Is it possible that Florida has more of one type of nonnative, reducing its overall biodiversity density? Let's find out!

(4) Part to Whole Analysis of Species Distribution in these 4 parks

I utilized R to format and clean my datasets to include only the categories of species, parks in FL and OH, and present native species/non-natives. These cleaned datasets were exported as csv files to be used in Tableau for my pie chart visualizations.
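For readers who want to reproduce the part-to-whole tables, here is a minimal Python sketch of the counting and CSV export (the project itself did this in R). It assumes species rows have `Park Name`, `Category`, `Occurrence`, and `Nativeness` fields; the output column names `Count` and `Share` are hypothetical, and the rows are placeholders.

```python
import csv
import io
from collections import Counter


def category_breakdown(species, nativeness):
    """Per-park counts and shares of each species category."""
    counts = Counter(
        (row["Park Name"], row["Category"])
        for row in species
        if row["Occurrence"] == "Present" and row["Nativeness"] == nativeness
    )
    # Park totals are the "whole" that each category is a part of.
    totals = Counter()
    for (park, _category), n in counts.items():
        totals[park] += n
    return [
        {"Park Name": park, "Category": category, "Count": n,
         "Share": round(n / totals[park], 3)}
        for (park, category), n in sorted(counts.items())
    ]


def to_csv(rows):
    """Serialize the breakdown to CSV text for import into a charting tool."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf,
        fieldnames=["Park Name", "Category", "Count", "Share"],
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Placeholder rows; real data would come from the cleaned species table.
species = (
    [{"Park Name": "Park A", "Category": "Vascular Plant",
      "Occurrence": "Present", "Nativeness": "Native"}] * 3
    + [{"Park Name": "Park A", "Category": "Bird",
        "Occurrence": "Present", "Nativeness": "Native"}]
    + [{"Park Name": "Park A", "Category": "Bird",
        "Occurrence": "Present", "Nativeness": "Not Native"}]
)
native_rows = category_breakdown(species, "Native")
csv_text = to_csv(native_rows)
```

Running the function once with "Native" and once with "Not Native" gives the two tables needed for separate native and non-native pie charts.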

From the natives, it is clear that vascular plants are the majority in Ohio's Cuyahoga Valley and in Florida's Biscayne and Everglades. The size of the pie charts indicates that Cuyahoga had the largest number of species, which is indicative of being highly biodiverse. An interesting finding is that Dry Tortugas has a high number of native bird species in an overall less diverse park. Is it possible that vascular plants contribute more to native biodiversity than other species types?

From the nonnatives, we see a similar trend: an overwhelming number of vascular plants, with birds second. Nonnative vascular plants make up a larger share in Ohio than in the Florida parks. However, non-native birds are significantly fewer in Ohio, while they account for a larger percentage in the Florida national parks. It seems non-native birds may be doing more damage than we would expect, while vascular plants appear in high percentages among both natives and non-natives.

Conclusion and Future Work

From this analysis, it is clear that the nativeness of species plays an integral role in the biodiversity density differences of U.S. National Parks. In the future, I would look into the conservation status of the natives and non-natives in both of these states to see what it tells us about their biodiversity levels. It might be cool to do a deeper dive into non-native birds in Florida to understand their role in decreasing biodiversity, as that was a key finding in this analysis.

It has been exciting and rewarding to take Visual Analytics this semester! I have learned numerous visualization methods and software tools.

Check out all the code for this project on GitHub!
Tableau Graphs:

Reference Links
https://www.nps.gov/articles/invasive-species.htm 
https://www.kaggle.com/code/regionalbird/national-parks
https://www.kaggle.com/code/jonathanbouchet/biodiversity-in-us-national-parks-2016

-Ramya's POV
