How to build a biodiverse city: environmental determinants of bird diversity within and among 1581 cities

Cities are novel environments compared with the evolutionary history of the species that reside within them. Collectively, cities and their fauna can be thought of as ecosystems, recognized as playing a critical role in supporting global biodiversity, but they are fundamentally a combination of ‘old species’ surviving or thriving in a new environment. We aimed to understand—at a broad macroecological scale—how biodiversity responds to urban ecosystems both among and within cities. We integrated > 5 million eBird citizen science observations with remotely sensed landcover products throughout 1581 cities within the continental United States. We first investigated the species-area relationship as it pertains to cities and compared the slope of this relationship to randomly sampled polygons (i.e., among cities). Second, we investigated how biodiversity responds to an urbanization gradient at the level of localized bird observations (i.e., within cities). We found strong support for the longstanding species-area relationship: geographically larger cities had greater species richness. Surprisingly, the species-area relationship was stronger (i.e., steeper slope) in cities when compared to the species-area relationship for randomly sampled polygons in the study region. Our findings suggest that diverse and heterogeneous cities play a significant role in supporting biodiversity. But we also found that there is a consistent threshold where the level of urbanization begins to profoundly and negatively affect biodiversity. Critically, urban planning at the city-scale and at a local-scale (e.g., neighborhood) should focus on preserving attributes of water-cover and tree-cover for increased biodiversity to keep as much of the city as possible above this threshold value.


Introduction
Cities first appeared on planet earth * 6000 years ago, while the evolutionary history of most bird species currently residing within them dates back 1-10 million years (McKinney 2002;Weir and Schluter 2007;Nemeth and Brumm 2009;McDonnell and Hahs 2015). Because of the relative lack of time for species to evolve and speciate within cites, it is unsurprising that cities generally have negative impacts on local biodiversity (McKinney 2006(McKinney , 2008Š álek et al. 2015). For example, species richness (Blair 1996;Chace and Walsh 2006;Concepción et al. 2016), species diversity (Blair and Launer 1997;Wang et al. 2001), phylogenetic diversity (Knapp et al. 2012(Knapp et al. , 2017Ricotta et al. 2012;La Sorte et al. 2018), and functional diversity (Pavao-Zuckerman and Coleman 2007;Pauw and Louw 2012;Nock et al. 2013;La Sorte et al. 2018) can all be negatively impacted by cities.
Despite the overall negative impacts of urbanization on biodiversity (McDonald et al. 2013), cities are increasingly recognized for their ability to support biodiversity (Kühn et al. 2004;Baldock et al. 2015;Goertzen and Suhling 2015;Lepczyk et al. 2017b;Kowarik and Lippe 2018). Urban biodiversity can also include surprisingly high levels of threatened species (e.g., Ives et al. 2016). Further, urban biodiversity also provides benefits for physical and psychological well-being for humans (Fuller et al. 2007;Hedblom et al. 2017).
One of the most generalized principles in ecology is the species-area relationship Wilson 1963, 2001), also applicable to the relationship between cities and biodiversity (Ferenc et al. 2014;Beninde et al. 2015;Lepczyk et al. 2017b). This positive relationship is probably a result of the scaling effects taking place in cities (Pautasso 2006), including a positive relationship between city size and amount of green area (Fuller and Gaston 2009;Lepczyk et al. 2017a). However, this ecological relationship can result in an oxymoronic relationship: larger geographic cities have more species. Large geographic cities, however, are not a solution to maintaining biodiversity (i.e., abundance, evenness, species richness) within urban environments. Even accounting for geographic area relationships, not all cities support biodiversity equally, and this varies regarding the measure of biodiversity used. Indeed, biodiversity responds to urbanization inconsistently among cities (Chamberlain et al. 2017); sometimes non-linearly (Lepczyk et al. 2008;Batáry et al. 2018) and sometimes with a peak at intermediate levels of urbanization (Callaghan et al. 2019c).
Understanding how biodiversity responds to urbanization processes-both within and among cities-can help influence conservation and policy decisions of local relevance (Evans et al. 2009;Fuller and Gaston 2009;Aronson et al. 2014). From a policy-relevant perspective (Sutherland et al. 2006;Puppim de Oliveira et al. 2011), a more nuanced understanding of the effects of city size on biodiversity is needed (Lepczyk et al. 2017a). Lepczyk et al. (2017a) highlighted that cities are not homogenized, support communities dominated by native species, and also called for a mechanistic understanding of effective conservation strategies in cities. Specifically, we need to fully understand: (1) how a given city compares with cities of similar size; and (2) whether urbanization thresholds (e.g., Garaffa et al. 2009) within cities exist, for example where biodiversity is particularly negatively impacted.
While cities may not be natural or semi-natural (Bradshaw 2003), they can be thought of as functioning ecosystems (Odum 1971;Parlange 1998;Andersson 2006;Taylor and Hochuli 2015) with unique ecological footprints (McDonnell et al. 2009). At the broadest, macroecological sense, cities generally have some proportion of each of three main landcover types-vegetation, water, and impervious surface (Dobbs et al. 2017). A large number of studies have investigated one or more of these habitats to understand intra-city characteristics which predict biodiversity (Dickman 1987;Cornelis and Hermy 2004;Parsons et al. 2006;Bickford et al. 2010;Hedblom and Söderström 2010;Bates et al. 2011;Fontana et al. 2011;Lizée et al. 2012;Concepción et al. 2016), but with largely inconclusive results. Wide variation in biodiversity responses exists among cities and associated studies. Some of the variation in results is likely derived from methodological differences among studies, including different landcover products, different definitions of habitats (e.g., how a 'city' is defined), and differences in scale of the studies. A recent metaanalysis (Beninde et al. 2015) found that patch area, corridors, and vegetation structure were the most significant drivers of biodiversity within cities. Beninde et al. (2015) also concluded that local habitat variables are more important than landscape variables. Similarly, a recent global analysis (Aronson et al. 2014) found that anthropogenic factors (i.e., landcover and city age) were major drivers of bird and plant diversity among cities. Together, these results suggest that local policies (e.g., protection of remnant habitats) within cities can influence biodiversity (Puppim de Oliveira et al. 2011).
Our overall objective was to assess how bird biodiversity responds to urban ecosystems. We had three specific objectives. Objective 1: we assessed the relationship between species richness and city size, asking whether the slope of the species-area relationship is greater for cities than that of randomly-sampled habitat patches. Objective 2: we identified cities that are under-or over-performing in regards to city size and investigated the relationship between this relative performance with a suite of macroecological habitat predictors (i.e., tree-cover, water-cover, the interaction between tree-cover and water-cover, proximity to the coast, and enhanced vegetation index). Objective 3: we investigated biodiversity patterns within cities, assessing biodiversity responses along a continuous urbanization gradient, asking whether there are thresholds along this gradient where biodiversity is particularly negatively impacted.

Study-sites: cities
We used the U.S. Census bureau's definition of urban areas and downloaded the U.S. Census Bureau's 2017 urban areas shapefile. This product defines urban areas based on population density and a suite of land-use characteristics (e.g., residential and commercial urban land uses) that help to define the urban footprint, and is the most widely recognized urban definition in the United States. Urbanized areas are defined as areas with [ 50,000 people and urban clusters as areas that contain between 2500 people and 50,000 people, and these areas are available as a shapefile with multiple polygons for free download (see more here: https://catalog.data.gov/dataset/tiger-line-shapefile-2017-2010-nation-u-s-2010-census-urban-area-national). We included both urban clusters and urbanized areas in our analysis. This product treats some of the larger conurbations in the United States as one 'urban area' (e.g., New York/New Jersey). Because this is contradictory to our definition of city (i.e., a geo-political boundary capable of local-level policy relevance), such large amalgamations of urban areas (i.e., when more than one recognizable city were combined) were excluded from analyses.
We then intersected these cities with a freely available United States Cities Database (available here: https://simplemaps.com/data/us-cities), which provides information on municipal population, city population, population density, latitude, and longitude. For a city to be further considered for potential analyses, it had to be included in both the cities database and the U.S. Census Bureau's shapefile-after large conurbations were removed. We also focused our analysis on the contiguous United States, removing Alaska and Hawaii from consideration, with the above process resulting in a potential suite of 2888 cities (Online Appendix 1).

Bird data
We used eBird data to estimate biodiversity among and within cities. eBird is a successful citizen science project (Sullivan et al. 2009(Sullivan et al. , 2014Wood et al. 2011;Callaghan and Gawlik 2015), initiated in 2002 by the Cornell Lab of Ornithology. The database hosts [ 800 million observations, freely accessible to researchers and practitioners (https://ebird.org/ data/download). eBird enlists volunteer birdwatchers who submit bird observations in the form of checklists-defined as a list of birds seen and/or heard in a particular area at a particular time. An extensive network of regional volunteers (Gilfedder et al. 2018) use their local expertise to create filters that flag potential submissions as unusual: either unexpected species or abundances of species. If an observation trips a filter, then it is reviewed before inclusion in the dataset. For our analysis, we applied an additional set of filters to the eBird data to remove potential outliers (e.g., checklists which departed from the 'average' eBird checklist) from the entire potential pool. Checklists were excluded if they were incomplete, did not follow the stationary, random, or travelling protocols, or if they did not meet standard duration and distance criteria: \ 5 min, [ 240 min, \ 5 km in distance, or \ 500 Ha in area searched (e.g., La Sorte et al. 2014;Johnston et al. 2015;Callaghan et al. 2017). Additionally, if multiple observers submitted an eBird checklist together (i.e., a 'shared' checklist), then we randomly sampled one of those checklists to include in analyses. All seabirds were eliminated from analyses because our question of interest applied to terrestrial bird diversity (i.e., seabirds rarely use urban areas). We downloaded the eBird basic dataset (version ebd_relDec-2018), and filtered for observations from January 1st, 2010 to December 31st, 2018, aligning with the period of richest data in the eBird dataset.

Joining cities and eBird data
For each of the 2888 potential cities, we extracted all eBird checklists which met our criteria above (Online Appendix 1). Cities which had a minimum of 50 checklists were included in analyses, leaving us with a total sample of 1581 cities ( Fig. 1a; Online Appendix 1). We set fifty checklists as an initial threshold because it has previously been shown that this can be sufficient to estimate species richness within urban greenspaces (Callaghan et al. 2017)-and because initial visual exploration of the data showed significant variation among cities which had \ 50 eBird checklists. We acknowledge the positive relationship between the number of eBird checklists and the estimated species richness in a city (Online Appendix 2), and thus accounted for this uneven sampling effort among cities-i.e., uneven confidence in the species richness estimation-throughout our models for objective 1 and objective 2 (see below for modelling details) using weights (Solon et al. 2015). This works by giving each data point (i.e., a city) in the regression analysis a differential amount of influence on the parameter estimation, whereby the level of influence a data point has on the regression is a function of the number of underlying eBird checklists in that city used to provide the species richness estimation. We also explored the generalizability of our results by re-analyzing our data with a varying cutoff of underlying eBird checklists used to include/exclude cities (see details in Online Appendix 3) and found our key findings to be robust regardless of the number of checklists used as a minimum cutoff.

Objective 1: species-area relationships
For each city, we calculated the total species richness, defined as the total number of unique species reported among all checklists. We modelled the species-area relationships among cities using generalized additive models (Wood 2017). The GAMs were fitted using a quadratically penalized likelihood approach, with the smoothing parameters estimated via Generalized Cross Validation, optimizing trade-off between model complexity and model fit. We relied on the mgcv package to fit these models (Wood 2003(Wood , 2004Wood et al. 2016). Because of underlying latitudinal and longitudinal gradients in species richness (Field et al. 2008), models were fitted using a two-dimensional thin plate spline on the sphere (Wahba 1981;Wood 2003) which helped to account for known two-dimensional species richness gradients throughout the United States in the model-fitting procedure.
We first explored the general relationship between total species richness in a city and the area of the city (log-transformed) using a GAM with a Poisson distribution, where and Gallup, New Mexico) showing the visualization from space, with the city measured by VIIRS nighttime lights. The max, mean, and median are shown respectively for each of these cities in parentheses after the city name. Online Appendix 10 further contextualizes city-wide night-time lights values weights were provided as the log-transformed total number of checklists in a city (see Online Appendix 2). Total species richness was treated as the response variable, while logtransformed city area was treated as a parametric predictor variable (i.e., non-smoothed term). We also assessed the species-area relationship from randomly sampled polygons throughout the continental United States, which could encompass all land use and land cover types. To do so, we assigned 25,000 randomly located points within the study area using the 'sf' package in R and the size (i.e., area in km 2 ) of each polygon was determined from a random sample of the city-size frequency distribution. Of the potential points assigned, 12,446 buffered polygons fell entirely within the United States and of this set 1284 met our criteria with a minimum of 50 eBird checklists. The randomly assigned points generally approximated the locations of the cities throughout the United States (Online Appendix 4). We then investigated the species-area relationship in randomly sampled polygons, using the same specified model as that for the species-area relationship in cities, described above.
Because we were interested in whether the slopes of the species-area relationships varied between cities and randomly sampled polygons, we specified an additional model which was the same as described above (i.e., Poisson distribution with a two-dimensional thin plate spline for latitude and longitude), but with an interaction term between two parametric predictor variables (i.e., non-smoothed term): log-transformed patch area and 'analysis' (i.e., cities or randomly sampled polygons). We were interested in overall species richness patterns among cities, but investigated intra-annual changes in the aforementioned models with smaller subsets of cities included (because not all cities met our initial cutoff of 50 eBird checklists in every season). We found generally comparable results intra-annually, compared with our overall results (Online Appendix 5) and thus present the results of the overall species richness patterns throughout the remainder of the manuscript.
Objective 2: relationships between city-specific habitat attributes and biodiversity among cities We extracted the residuals from the species-area relationship models described in Objective 1. The magnitude of this residual can be thought of as an estimate of either under-performance or over-performance in their species richness relative to other cities of similar size (Fig. 2a). To extract residuals, we fitted a model as previously described above, but with the smooth term for latitude and longitude removed (as this was accounted for in the next portion of the modelling process) because we wanted the residuals of the model to be independent of where a given city lies in two-dimensional space-i.e., we were solely interested in the residuals between species richness and log-transformed area.
We used Google Earth Engine (Gorelick et al. 2017) and the U.S. Census Bureau's delineations of urban areas (see details above) to extract habitat attributes for each city (sensu Callaghan et al. 2018). Using globally-derived datasets at a 30-m resolution, from multi-temporal time series of both MODIS and Landsat satellite imagery, we calculated the following for each city: (1) the mean tree-cover (Sexton et al. 2013); (2) the proportion of water-cover; and (3) the average annual composite (2014-2018) enhanced vegetation index (EVI)-an optimized version of NDVI which better accounts for the sensitivity in high biomass regions (Huete et al. 2002). Additionally, for each city, we calculated the distance from the coastline as we suspected that species richness may be higher in coastal regions due to high alignment with migratory corridors.
To assess the relationship between residual species richness and the city-specific habitat attributes, we used a GAM with a Gaussian distribution where residual species richness was the response variable, with parametric terms for the distance to coast, mean EVI, mean tree-cover, proportion of water, and an interaction between proportion of water and treecover (testing for any positive or negative additive effects of water-cover and tree-cover). All variables were standardized (i.e., scaled and centered) prior to modelling to ensure effect sizes were comparable. In addition, we accounted for spatial autocorrelation among cities by fitting the model with a smooth term including latitude and longitude as described above. We first tested for correlation among predictor variables (Online Appendix 6) but found weak evidence of correlation so all predictors were included.
Objective 3: biodiversity response to urbanization gradients within cities We calculated four response variables of biodiversity for any eBird checklist which met our criteria (see above). These response variables were: (1) the species richness-i.e., total number of species; (2) total abundance-i.e., the sum of all abundance estimates; (3) Shannon diversity-i.e., Shannon diversity index using the vegan package (Oksanen et al. 2010); and (4) phylogenetic diversity-a measure of biodiversity incorporating the phylogenetic difference among species (Faith 1992), using the picante package in R (Kembel Fig. 2 a In order to understand which cities over-performed and under-performed based on city size, we investigated the relationship between city area and total species richness within a city by modelling the residuals, which accounted for the significant relationship between city area and species richness. The residuals then represented cities which were 'overperforming' and 'underperforming' relative to city size. b There was a strong positive relationship between city area and total species richness in a city (red points and line in a). This relationship was significant after accounting for the strong relationship between species richness and the total number of eBird lists (Online Appendix 2). There was also strong evidence that the slope of the city speciesarea relationship was greater than that of random polygons chosen from across the US; highlighting the potential value of cities for biodiversity et al. 2010). Not all checklists provided all response variables. eBird participants can add an 'X' to signify presence of a species without an abundance estimate on an eBird checklist, and as such, checklists which included an 'X' were included only for the species richness and phylogenetic diversity analyses. For our analysis of species richness and phylogenetic diversity, we included a total of 5,420,748 checklists from 538,466 unique localities throughout the United States. For Shannon diversity and total abundance, we analyzed 5,002,534 checklists.
As predictor variables, we assigned each eBird checklist a relative value of urbanization, on a continuous scale (Fig. 1). We used VIIRS night-time lights (Elvidge et al. 2017) and assigned each eBird checklist the mean night-time lights value within a 5 km buffer of that checklist (sensu Callaghan et al. 2019a). The 5 km buffer was chosen to encompass potential spatial biases in selected sampling locations of the eBird checklists and the results of species-specific responses to urbanization is robust to buffer size (Callaghan et al. 2019b). Night-time light level is highly correlated with the level of urbanization and is commonly used in remote sensing studies (Pandey et al. 2013;Zhang and Seto 2013;Ma et al. 2015;Stathakis et al. 2015;Elvidge et al. 2019), thus making it a representative continuous proxy for urbanization levels. Each checklist was also assigned the mean treecover, proportion of water-cover, and the average annual enhanced vegetation index composite (2014-2018), using the methods described above for Objective 2, within a 5-km buffer surrounding each checklist. All predictor variables were assessed for collinearity prior to modelling.
Using the checklist-level urbanization classification, defined above, we again used GAMs to assess these relationships among all cities but fitted with the REML method. First, we employed a GAM for each of the four response variables (i.e., four separate models). These models consisted of the response variables regressed against a smooth term for our parameter of interest which was the level of urbanization. A smooth term was used as there is strong support for non-linear responses of biodiversity to urbanization gradients (e.g., Batáry et al. 2018) and we thus did not want to make assumptions about linear relationships. Other variables included in the models, to account for varying effort among checklists, included duration and distance-travelled for each checklist, fitted with a thinplate regression spline. There is significant variation in the temporal usage of urban areas, driven by migratory species in this system (e.g., La Sorte et al. 2014), but this was not of intrinsic interest in our analysis (see Online Appendix 5). Therefore, we accounted for temporal autocorrelation and non-independence of eBird checklists by assigning each eBird checklist a 'season' of the year (North American spring, summer, winter, or fall based on solstices and equinoxes), and this was included in the model as a smoothed term with a cyclical cubic regression spline. To account for possible spatial autocorrelation among cities, we included a smooth term with a thin-plate regression spline for latitude and longitude, estimating the spatial effect in the model as a smoothed 2-dimensional spline. City was treated as a random effect in these models. Species richness was modelled with a Poisson distribution, Shannon diversity was not transformed and modelled with a Gaussian distribution, and abundance and phylogenetic diversity were modelled with a Gaussian distribution after log-transforming the response variables. We fitted an additional suite of models, but also included the aforementioned macroecological predictor variables in addition to the level of urbanization (see Table 1 for details).

Statistical analyses and data availability
All analyses were performed within the R statistical environment (R Core Team 2018), and relied heavily on the tidyverse workflow (Wickham et al. 2019). Statistical significance (1) Tree-cover; (2) water-cover; (3) interaction between tree-cover and water-cover; (4) enhanced vegetation index (EVI); and (5) distance to the coast Question 1 How do macro-ecological predictor variables influence the residual species richness of a city (i.e., residuals between the species-area relationship among cities)?

Model
Generalized additive model with each city as a data where residual species richness was the response variable, with parametric terms for the distance to coast, mean EVI, mean tree-cover, proportion of water-cover, and an interaction between proportion of water-cover and tree-cover (1) VIIRS night-time lights measure of urbanization; (2) tree-cover; (3) water-cover; (4) interaction between tree-cover and water-cover; (5) enhanced vegetation index (EVI) Question 1 Are there thresholds where biodiversity negatively responds along a continuous urbanization gradient?
Model (N = 4) Generalized additive models for each response variable against a nonlinear term for urbanization Question 2 Is urbanization the driving force influencing local-level biodiversity on a checklistlevel within a city, confirming our question above?
Model (N = 4) Generalized additive models for each response variable against a nonlinear term for urbanization in addition to the other macro-ecological predictor variables was concluded at p \ 0.05. Relevant code and data necessary to reproduce these analyses are available here: https://zenodo.org/record/4288599.

Results
Objective 1: species-area relationships For the 1581 cities included in our analysis of total species richness among cities ( Fig. 1a; Online Appendix 7), the average number of checklists used was 3220 with a median of 334, ranging from 50-our minimum cut-off-to 171,466 (Seattle, Washington). Species richness in a city was strongly related to the area of the city ( Fig. 2b; Online Appendix 8; parameter estimate = 0.18, z-value = 355.1, p \ 0.001, deviance explained: 65.1%). The species-area relationship among random polygon patches was also significantly positive (Online Appendix 8; parameter estimate = 0.12, z-value = 195.5, p \ 0.001, deviance explained: 39.2%), albeit the slope (i.e., parameter estimate) was less than that for the species-area relationship among cities. We also found that the slope (i.e., the interaction between log-transformed patch area and 'analysis') for random polygons was significantly less than the slope for cities ( Fig. 2b; Online Appendix 8; parameter estimate = -0.07, z-value = -86.78, p \ 0.001, deviance explained: 54%). These relationships remained significantly positive as we increased the underlying cutoff used for city inclusion (Online Appendix 3). Fig. 3 The standardized parameter estimates (and 95% confidence intervals) showing the relationship between residual species richness and our macroecological predictor variables. A generalized additive model was used to model this relationship. Greater than the red line represents high species richness than predicted by city area alone, and less than the red line represents less species richness than predicted by city area alone Objective 2: relationships between city-specific habitat attributes and biodiversity among cities There was a significant relationship between the proportion of water-cover within a city (Online Appendix 9; p \ 0.001) and the residual species richness, and a non-significant relationship (Online Appendix 9; p = 0.055) between tree-cover and residual species richness (Fig. 3). Importantly, the standardized parameter estimate for proportion of watercover was 2.5 times that of tree-cover (Fig. 3). The distance to the coast and mean EVI had no noticeable effect on residual species richness, but the interaction between water-cover and tree-cover was negatively associated with residual species richness (Online Appendix 9; p = 0.027) indicating that there is a level where both water-cover and tree-cover can negatively impact species richness.
Objective 3: biodiversity response to urbanization gradients within cities Biodiversity of birds declined strongly with increased urbanization at the scale of individual bird observations (Fig. 4). For each of the response variables (species richness, Shannon diversity, phylogenetic diversity, and abundance), there was a relatively clear threshold (i.e., where the biodiversity response values consistently appeared to drop) at a certain urbanization level, which corresponded to a VIIRS night-time lights (i.e., radiance value) of approximately 80 nW cm -2 sr -1 (Fig. 4). However, the maximum VIIRS nighttime lights (Online Appendix 10) only reaches above this threshold for 363 (* 23%) cities. Further, the mean VIIRS night-time lights (Online Appendix 10) among all cities reaches a maximum of 45 nW cm -2 sr -1 (New Orleans, Louisiana). There was a positive relationship between the area of a city and the maximum VIIRS night-time lights (Online Appendix 10). The value of 80 nW cm -2 sr -1 roughly corresponds to an impervious surface level of * 50% and a human population density of * 10,000 people/km 2 (Online Appendix 10). When urbanization level was included as a parametric term, along with water-cover, tree-cover, and EVI, urbanization level generally had the most negative influence on biodiversity, and water-cover had the least effect on biodiversity, although most effects were in the negative direction (Online Appendix 11).

Discussion
We used [ 5 million eBird checklists in [ 1500 cities to provide a generalized understanding of city-level influences on biodiversity among and within cities. Cities-with their diverse and heterogeneous habitats (Callaghan et al. 2019c)-clearly play an important role in supporting avian diversity (Dearborn and Kark 2010;Ives et al. 2016;Soanes et al. 2019), even when compared with randomly sampled patches incorporating natural areas.
We also found strong evidence that at a city-level the proportion of water-cover, and to a lesser extent, tree-cover, influence residual species richness. Although cities can support significant levels of biodiversity, we did find evidence of a relatively clear threshold which appears to negatively impact biodiversity responses, consistent among cities. But further research should work to contextualize this threshold. Critically, urban planning at the cityscale and at a local-scale (e.g., neighborhood) should focus on preserving water attributes and tree-cover for increased biodiversity. This mechanistic understanding should underpin the effective conservation of birds in urban environments. The significant relationship between the number of species in a city and the size of a city ( Fig. 1b and Online Appendix 8) confirms previous studies (MacGregor-Fors et al. 2011;Ferenc et al. 2014;Beninde et al. 2015). Surprisingly, the slope of this species-area relationship appears stronger (i.e., steeper) than random polygon patches, highlighting the potential for cities to incorporate diverse and species-rich bird communities. We acknowledge, however, that this comparison was made with random polygon patches that could also include urban areas, and future work should investigate the comparison between purely natural areas with cities.
We found that the proportion of water-cover within a city was critical for over-performing cities, confirming the global importance of wetlands (Gibbs 2000;Dudgeon et al. 2006). The importance of wetlands in urban areas has also been recently recognized (Ehrenfeld 2000;Whited et al. 2000;Hettiarachchi et al. 2015;Palta et al. 2017). Even if remnant wetlands do not reside or are no longer present in a city, constructed wetlands are a plausible, and feasible achievement for cities (Ma et al. 2010;Blicharska and Johansson 2016). These often achieve many goals, including contact with nature, stormwater recycling, and benefits for biodiversity (Zedler and Leach 1998;Nassauer 2004;Hansson et al. 2005).
We used a continuous urbanization gradient (i.e., an explicit urbanization gradient) to characterize biodiversity responses to urbanization within cities. But the majority of other studies (Blair 1996;Blair and Launer 1997;Clergeau et al. 1998;Chace and Walsh 2006) rely on categorical characterization of habitats (e.g., an implicit urbanization gradient). By categorizing habitats, these studies assume that biodiversity responds similarly at similar levels of urbanization, and this may not be true. Importantly, though, our research differs from that of other research as we were only interested in investigating urbanization gradients within a city-the unit of potential management. Most research extends analyses to investigate the urban-rural gradients to include the 'rural' and/or 'natural' habitats which are usually outside of city-boundaries (Clergeau et al. 1998;Chamberlain et al. 2017). These results highlight the importance of understanding the local-level habitat influences and thus management within cities (Fernandez-Juricic and Jokimäki 2001;Melles et al. 2003;Chamberlain et al. 2004;Bryant 2006). Nevertheless, understanding local-level influences of biodiversity may only be applicable up to a certain extent: we found support for the notion that there is a distinct threshold-among all cities-at which biodiversity responds particularly negatively (* 80 radiance value from VIIRS night-time lights). Interestingly, though, this threshold is relatively rare-only 23% of cities have a maximum VIIRS night-time lights value greater than this threshold, and no cities have a mean VIIRS night-time lights above this threshold (Online Appendix 10). This suggests that even within cities, biodiversity can persist relatively well (Lepczyk et al. 2017b)-up to a certain point.
Our analysis incorporated more than 1500 cities throughout the continental United States-a much larger sample size than previous studies. For example, previous broadscale studies have investigated a total of 41 different cities in Europe (Ferenc et al. 2014), and a recent meta-analysis included 75 cities worldwide (Beninde et al. 2015). Our large sample size was made possible because of broad-scale empirical data collected by citizen scientists (Bonney et al. 2009). We used these citizen science data to look at broad-scales and found that our models were generally well-fit (i.e., deviance explained of models was generally [ 50%) relying on these data, although there may be issues of spatial-mismatch between the scale of eBird sampling and the macro-ecological predictors we used in our analysis. For example, because we aggregated all eBird checklists within a city, we may be missing important intra-city patterns of underlying eBird sampling-i.e., birders probably self-select the most biodiverse regions within a city to sample thus leaving the more urbanized portions of a city under-represented. However, we expect this bias to exist regardless of city size, and thus this systematic bias makes cities comparable. Our reliance on eBird citizen science data also is subject to birders preferentially chasing rare species within a city, potentially inflating species richness results (Callaghan et al. 2017), but we again expect that this pattern would be consistent regardless of city size thereby having little influence on our findings. Lastly, the timing of citizen science data submissions within a year is not equal, with a peak of birder coverage during migration (e.g., spring and autumn). Although we found generally robust results with regard to season (Online Appendix 5), future work should investigate the temporal understanding of our results, investigating intra-and inter-annual changes within urban areas (Dallimer et al. 2011;La Sorte et al. 2014).
We provide broad-scale patterns while also highlighting opportunities for smaller-scale research questions. First, we only investigated broad biodiversity responses, and future work should aim to understand how bird species guilds and functional groups respond among and within cities (Devictor et al. 2008;Flynn et al. 2009;Conole and Kirkpatrick 2011). For example, the negative association between the interaction of tree-cover and water-cover with residual species richness could be explained by specific groups of species (cf., waterbirds and passerines) relying on different habitat types. We also did not look at the habitat matrix surrounding a point-for instance, corridors could be a significant driver supporting biodiversity (Savard et al. 2000)-and future work should test our results with finer-scale mapping of habitat variables. Importantly, we conclude by highlighting that our workflow relies on open-access data and remotely-sensed landcover maps. As increasingly fine-scaled remote-sensing data are mapped (Pasetto et al. 2018) and quantity and quality of citizen science data improves (Wood et al. 2011), we believe our framework provides a way to understand the mechanistic patterns shaping biodiversity trends among and within cities, globally.

Conclusions
First, we found that there is a significantly positive species-area relationship among 1581 cities within the United States for birds. And surprisingly, this relationship is stronger (i.e., more positive) than species-area relationships within randomly-sampled polygon patches. Second, we found that the proportion of water-cover and tree-cover within a city are the strongest drivers of over-performing cities. Third, we identified a potentially significant threshold where bird diversity within cities is negatively impacted. Together, our results suggest that diverse and heterogeneous cities play a significant role in supporting bird diversity.