On this page, I present how to exploit FINESS data for EHPAD (nursing home/retirement houses). Moreover, I present how to create a panel of EHPAD and various summary statistics. We first start with the 2019 data.
The FINESS dataset has the universe of all health care providers in France. It provides information on the type of health care (liberal professions, hospitals, etc.) and the location (either the postal adress or the georeference, depending on the dataset).
It is extracted from here. I accessed the database on February 29, 2024.
The way variables are coded changes from a year to another. So far, I open year 2019 (as I match with some census data, only avaible in 2019, in another project).
path = "/Users/mmoglia/Dropbox/research/raw_data/finess"
library(dplyr)
library(sf)
library(foreign)
library(haven)
library(ggplot2)
library(COGugaison)
library(viridis)
I then open the dataset.
# I. Open the dataset
finess = read.csv2(paste0(path,"/raw/etalab_stock_et_20191231.csv")) %>%
select(starts_with("date"),
libmft,libcategetab,
siret,cog,
nofinesset,nofinessej) %>%
rename("codgeo"="cog")
ehpad = finess %>%
filter(libcategetab=="EHPAD") %>%
select(-dateautor,-datemaj,-libcategetab)
count(ehpad)
As you notice, the libcategetab
directly provides
information on EHPAD, and I filter this specific category and remove
unwanted variables. It leaves us with 7520 EHPAD in 2019 in France.
To visualize the data, we can compute the density of EHPAD per 1,000 inhabitants across commuting zones (I choose CZ because cities are too numerous in France, and we would end up with many NAs).
The shapefile is downloaded here.
ze.shp = read_sf(paste0(path,"/../shp/ze2020/fond_ZE2020_geo20.shp")) %>%
rename("codgeo"="code")
From INSEE, we obtain the population by age class (5 years) and by census year. The latest being 2019, we choose this one.
pop = readxl::read_xls(paste0(path,"/../population/pop-sexe-age-quinquennal6819.xls"),
sheet="COM_2019",range="A14:AT38228",col_names=T)
# Aggregate age class by 0-20, 20-60, and 60+
pop = pop %>%
mutate(codgeo=paste0(DR,CR)) %>%
mutate(young = rowSums(select(.,starts_with("ageq_rec01"), starts_with("ageq_rec02"), starts_with("ageq_rec03"), starts_with("ageq_rec04")))) %>%
mutate(active = rowSums(select(.,starts_with("ageq_rec05"), starts_with("ageq_rec06"), starts_with("ageq_rec07"), starts_with("ageq_rec08"),
starts_with("age_rec09"),starts_with("age_rec10"),starts_with("age_rec11"),starts_with("age_rec12")))) %>%
mutate(old = rowSums(select(.,starts_with("ageq_rec13"), starts_with("ageq_rec14"), starts_with("ageq_rec15"), starts_with("ageq_rec16"),
starts_with("age_rec17"),starts_with("age_rec18"),starts_with("age_rec19"),starts_with("age_rec20")))) %>%
mutate(pop=young+active+old) %>%
select(codgeo,young,active,old,pop)
# Change the current zipcode to codgeo, and aggregate at the ZE level thanks to COGuaison package
pop = pop %>%
changement_COG_varNum(annees=c(2019:2020),codgeo_entree = "codgeo",donnees_insee = TRUE,agregation=TRUE) %>%
nivsupra(codgeo_entree="codgeo",COG=2020,agregation=F,nivsupra="ZE2020") %>%
group_by(ZE2020_codgeo) %>%
summarise_at(c("young","active","old","pop"),sum,na.rm=TRUE) %>%
mutate(year=2020) %>%
ungroup() %>%
rename("ze2020"="ZE2020_codgeo")
We now need to merge these population data with the
ehpad
database. Before that, we aggregate the ehpad data
base by city, before merging it to the ze.shp
database
(which is at the city level but contains the crosswalk between cities
and their 2020 ZE).
ehpad.city = ehpad %>%
group_by(codgeo) %>%
mutate(count = n())
We can now merge ehpad.city
with the shapefile and
aggregate at the CZ level!
ehpad.shp = ehpad.city %>%
right_join(ze.shp) %>%
st_as_sf() %>%
filter(!grepl("^(01|02|03|04|06)", ze2020)) %>%
group_by(ze2020) %>%
summarize(geometry = st_union(geometry),
count = sum(count,na.rm=T)) %>%
left_join(pop)
ggplot(ehpad.shp) +
geom_sf(aes(fill=log(count/old*1000)),color=NA) +
scale_fill_viridis(option="E",name=" ") +
labs(title="Number of Ehpad in 2019",
subtitle="Log ratio of EHPAD for 1000 inh. aged 60+, ZE level",
caption="Own computations. Source: FINESS, RP2019") +
theme_void()+
theme(plot.title = element_text(hjust = 0.5,family="serif",face="bold"),
plot.subtitle=element_text(hjust=0.5,family="serif"))