A lesbian couple I know in Hyattsville, Maryland have told me on several occasions that, according to a source they don’t remember, Hyattsville has the largest fraction of lesbian households of any city in the country. This claim makes me wonder what the actual distribution of LGBT people in the US is. While a few large cities (San Francisco, New York, Washington DC) and resort communities (Provincetown, MA; Fire Island, NY; Rehoboth Beach, DE) have reputations as gay enclaves, it’s not entirely clear to me how accurate these reputations are, especially as it has become safer and more common for LGBT people elsewhere in the country to come out of the closet.

Unfortunately, since the Census doesn’t include questions on sexual orientation or gender identity, there isn’t an easy source of data to measure the LGBT population in different parts of the country. However, the American Community Survey (ACS) recently added a question to distinguish same-sex and opposite-sex couples.

The results of this question are reported, starting with the 2019 ACS 5-Year Estimates, in data table B09019, “Household Type (Including Living Alone) By Relationship” as counts of people who are the “Opposite-Sex Spouse,” “Same-Sex Spouse,” “Opposite-Sex Unmarried Partner,” and “Same-Sex Unmarried Partner” of the householder. Since these counts don’t include whoever is considered the householder, the “Same-Sex Spouse” and “Same-Sex Unmarried Partner” counts are the numbers of married and of unmarried same-sex couples. Unfortunately, though, there is no way to distinguish lesbian couples from gay male couples, so this can’t be used to verify my friends’ claim about lesbian couples in Hyattsville.

Methods

Although the B09019 data doesn’t allow one to distinguish lesbian and gay male couples from each other, I decided it would be interesting to map the densities of same-sex couples across the US. The obvious geography for doing so might seem to be counties, but some ACS data is not reported for geographies containing fewer than 65,000 people, both to keep respondents from being personally identifiable and to ensure sufficiently large statistical samples, and a number of counties, especially in rural areas, are smaller than this.

Instead, I used an R script (shown below) to pull the data at the Public Use Microdata Area (PUMA) level. PUMAs are fully-covering, non-overlapping subsets of states with populations of at least 100,000 defined by the Census Bureau to allow the reporting of data at a sub-state level. I then mapped the number of same-sex couples (sum of same-sex married and unmarried partners of householders) as a fraction of all couples (sum of same-sex and opposite-sex married and unmarried partners of householders) using ArcMap.
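
The fraction described above can be sketched in a few lines of base R. This is only an illustration on made-up numbers, not the calculation as I actually ran it in ArcMap: the toy data frame and the helper name same_sex_share are hypothetical, and the column names assume that tidycensus, when asked for wide output, appends an "E" to each variable name for the estimate.

```r
# Toy stand-in for the wide-format tidycensus output; every number is made up.
# Assumption: with output = "wide", each estimate column is the variable name
# plus an "E" suffix (e.g. "SS_SpouseE").
pumas <- data.frame(
  NAME        = c("PUMA A", "PUMA B", "PUMA C"),
  OS_SpouseE  = c(20000, 15000, 0),
  OS_PartnerE = c(2500, 3000, 0),
  SS_SpouseE  = c(300, 1200, 0),
  SS_PartnerE = c(200, 800, 0)
)

# Hypothetical helper: same-sex couples as a share of all counted couples.
same_sex_share <- function(df) {
  same_sex <- df$SS_SpouseE + df$SS_PartnerE
  all_couples <- same_sex + df$OS_SpouseE + df$OS_PartnerE
  # Return NA rather than NaN where no couples were counted.
  ifelse(all_couples > 0, same_sex / all_couples, NA_real_)
}

pumas$SS_Share <- same_sex_share(pumas)
```

Because each partner count excludes the householder, each count stands for one couple, so the sum in the denominator is a count of couples rather than of people.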

It is worth noting that this approach does not actually capture all romantic couples, or even all cases of romantic partners living together: only couples in which one member was reported as the “householder” will be included. Specifically, this means that couples living with one member’s parents in the parents’ household, or in group households with a number of adults, will likely not be counted. However, I think it probably captures a large enough fraction of long-term couples to be of some interest.

Results

Naturally, my first maps were of the contiguous US as a whole. Unsurprisingly, major cities tended to show up as concentrations of same-sex couples, but my choice to show PUMA boundaries makes this hard to evaluate at this scale, since the black of the boundary lines overwhelms the fill color of the very small PUMAs in dense areas. Still, some interesting patterns emerge.

New England, the West Coast (except the San Joaquin Valley) and New Mexico have higher than average populations of same-sex couples, mostly in the 1-2% range rather than below 1%. Substantial portions of the Southeast are also in the 1-2% range, though. The lowest concentrations seem to be on the Great Plains (except in Western Oklahoma for some reason) and parts of the Mountain West.

Particularly notable in rural areas is a band of higher densities of same-sex couples in New York and New England: most of Vermont and the “Borscht Belt” of New York have between 2% and 3% same-sex couples, and the Pioneer Valley of Massachusetts has more than 3%. The high concentrations in Vermont (known for being unusually liberal for a rural state even by New England standards) and in the Pioneer Valley (home to a large number of colleges including the elite women’s colleges Smith and Mt. Holyoke) are not particularly surprising, but I can’t come up with a good explanation for the Borscht Belt having a lot of same-sex couples.

Close-Ups of Regions

A closer look at the Northeast specifically makes large concentrations of same-sex couples in cities show up: Boston, Providence, New York, Philadelphia, Baltimore, and DC are all visible, as are Portland, Maine; Worcester, Massachusetts; Wilmington, Delaware; Hartford, New Haven, and Bridgeport in Connecticut; Richmond and Hampton Roads in Virginia; and Albany, Syracuse, Rochester, and Buffalo in New York.

A number of college towns also show up, most notably Ithaca and Binghamton in New York, but also State College, Pennsylvania and Salisbury, Maryland. So do the traditional LGBT resort communities of Provincetown, Massachusetts (which shares its PUMA with all of Cape Cod, Nantucket, and Martha’s Vineyard), Fire Island, New York (the westernmost PUMA on Long Island), and Rehoboth Beach in Sussex County, Delaware. While it is not surprising that New England and parts of Upstate New York have relatively large LGBT populations, I have no real explanation for the relatively high densities of same-sex couples in the Alleghenies in Pennsylvania, Maryland, and West Virginia.

I also made regional maps of California, the Pacific Northwest, Texas, and the Southeast, all of which show similar patterns of large concentrations of same-sex couples in big cities, college towns, and some gay resort towns. Apparently Palm Springs, California falls into this last category. I hadn’t realized this and was quite surprised by it, but it explains why Palm Springs stands out so distinctly in the California map.

However, since urban areas both have the highest concentrations of same-sex couples and tend to have PUMAs too small to easily see on these maps, a different approach was needed for them.

Same-Sex Couples in Urban Areas

The PUMAs with the highest concentrations of same-sex couples in the US, in order of decreasing concentration, are Palm Springs, CA; The Castro, San Francisco, CA; Chelsea, New York, NY; West Hollywood, Los Angeles, CA; central Washington, DC; and the North Side, Chicago, IL, all of which have over 11% of couples as same-sex couples. Next, in the 7-9% range, are central Atlanta, GA; central San Diego, CA; Oakland Park, Fort Lauderdale, FL; South of Market, San Francisco, CA; central Fort Lauderdale, FL; Capitol Hill, Seattle, WA; Lower Manhattan, New York, NY; and Central Harlem, New York, NY. Finally, in the 6-7% range are Long Beach, CA; northern Washington, DC; central Columbus, OH; eastern Washington, DC; West Harlem, New York, NY; northeastern Washington, DC; Miami Beach, FL; and downtown Seattle, WA.

This list is heavy on south Florida PUMAs (three in the Miami metropolitan area) and on urban cores. In particular, most of the most heavily urban cities in the US, as identified in my Master’s thesis, are well represented. Notably, however, Philadelphia and Boston are missing from the list; Boston’s absence is particularly surprising given New England’s high concentrations of same-sex couples overall.

As Doug Newman pointed out on Twitter when I shared these results there, a likely cause of Boston’s omission, and of which PUMAs rank highest in general, is the particular manner in which PUMAs are drawn in each metro area, often following existing geographies used by local planning agencies. For example, he notes:

It also seems related to how concentrated same-sex couples are within cities and even just where PUMA boundaries are drawn. Boston doesn’t have such well-defined gay neighborhoods.

Within big cities, I suspect these boundaries might also be arbitrary in a way that obscures information. NYC’s are based on community districts, which don’t necessarily have strong demographic similarities, especially outside Manhattan. If there was one that ran along the waterfront in Brooklyn, it would probably show up here, but the CDs (and thus PUMAs) just aren’t drawn that way.

Doug Newman on Twitter, 13 June 2021

This, of course, is an example of the Modifiable Areal Unit Problem, a bane of human geographers everywhere: when point data is aggregated into arbitrary geographic units, the choice of sizes and boundaries for those units can have a substantial effect on apparent patterns in the aggregated data.
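
A toy example in base R makes the effect concrete. Everything here is invented for illustration: eight hypothetical city blocks in a row, made-up counts, and a hypothetical helper named zone_share. The same block-level data is aggregated under two different zonings, one that splits a cluster of same-sex couples down the middle and one that keeps it whole.

```r
# Eight hypothetical city blocks in a row, each with 100 couples.
# The same-sex couple counts are made up, with a cluster in blocks 4 and 5.
couples  <- rep(100, 8)
same_sex <- c(1, 1, 2, 12, 12, 2, 1, 1)

# Hypothetical helper: same-sex share of couples within each zone.
zone_share <- function(zone) {
  as.vector(tapply(same_sex, zone, sum) / tapply(couples, zone, sum))
}

# Zoning A: a boundary runs between blocks 4 and 5, splitting the cluster.
zones_a <- c(1, 1, 1, 1, 2, 2, 2, 2)
# Zoning B: the cluster sits inside a single, smaller central zone.
zones_b <- c(1, 1, 1, 2, 2, 3, 3, 3)

share_a <- zone_share(zones_a)
share_b <- zone_share(zones_b)
```

Under zoning A both zones come out at 4% and the cluster disappears, while under zoning B the central zone reaches 12% and its neighbors fall to about 1.3%, even though no household has moved.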

Sample Code

Below, I’ve included the R code I used to pull the Census data using the tigris and tidycensus R packages as an sf object that can be plotted in R or saved as a shapefile to be displayed in QGIS or ArcMap. (I made the above maps in ArcMap, largely as practice with it, as I have more mapping experience in QGIS.)

library(dplyr)
library(sf)
library(tigris)
library(tidycensus)

# Insert your Census API key here.
census_api_key("",install=FALSE) 
options(tigris_use_cache = TRUE)

# Create vector of data to be downloaded from American Community Survey.
# The B09019 rows count people by their relationship to the householder.
datavars_vec <-
  c("POPULATION"      = "B03002_001", # Total Population for Race
    "HOUSINGUNITS"    = "B25002_001", # Total Housing Units (includes vacant)
    "PERCAPITAINCOME" = "B19301_001", # Per-Capita Income
    "OS_Spouse"       = "B09019_010", # Opposite-Sex Spouse
    "OS_Partner"      = "B09019_012", # Opposite-Sex Unmarried Partner
    "SS_Spouse"       = "B09019_011", # Same-Sex Spouse
    "SS_Partner"      = "B09019_013") # Same-Sex Unmarried Partner

# Pull American Community Survey data using the tidycensus library.
# Use cartographic boundary shapefiles (cb) for display because they
# have water areas removed.
data_sf <- get_acs(geography="public use microdata area",
                   variables=datavars_vec,
                   year=2019,
                   survey="acs5",
                   geometry=TRUE,
                   output="wide",
                   keep_geo_vars = TRUE,
                   cb=TRUE)