
Americans Have Eight Kinds of Days
The American Time Use Survey interviews a wide sample of Americans in great detail about what they did the day before. It has run every year since 2003. With a little discretion, I can group the activities into broader categories and produce the following graph:
There is a lot to unpack in this graph. I may come back to some of the trends here in future posts, but right now I want to take an entirely different approach to the data: cluster analysis.
Code for this article is available here.
Looking at average daily hours spent is enlightening, but it obscures the fundamental experience of time use: What kinds of days do Americans have? A one-hour-a-day average for socializing could mean that every American spends one hour each day socializing, or it could mean that 5% of Americans spend 20 hours a day socializing and the rest don’t socialize at all. It could also mean that half of Americans spend 7 hours a day socializing on weekends, and the rest don’t socialize at all. My point: Averages don’t tell us much about the ways in which individual Americans spend their days.
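To make the arithmetic concrete, here is a minimal Python sketch of the three scenarios above. The population size and the function names are made up for illustration; nothing here comes from the survey data itself.

```python
from statistics import mean

PEOPLE = 100   # hypothetical population size
WEEK = 7       # days 0-4 are weekdays, days 5-6 are the weekend


def everyone_one_hour():
    """Every person socializes one hour every day."""
    return [[1.0] * WEEK for _ in range(PEOPLE)]


def five_percent_twenty_hours():
    """5% of people socialize 20 hours every day; the rest never do."""
    heavy = PEOPLE * 5 // 100
    return [[20.0 if i < heavy else 0.0] * WEEK for i in range(PEOPLE)]


def half_seven_hours_on_weekends():
    """Half of people socialize 7 hours on each weekend day; the rest never do."""
    weekend_week = [0.0] * 5 + [7.0] * 2
    return [list(weekend_week) if i < PEOPLE // 2 else [0.0] * WEEK
            for i in range(PEOPLE)]


def average_hours(population):
    """Mean hours per person-day, the number an averages chart would show."""
    return mean(h for person in population for h in person)


def share_of_empty_days(population):
    """Fraction of person-days with no socializing at all."""
    days = [h for person in population for h in person]
    return sum(h == 0 for h in days) / len(days)


for scenario in (everyone_one_hour, five_percent_twenty_hours,
                 half_seven_hours_on_weekends):
    pop = scenario()
    print(scenario.__name__, average_hours(pop),
          round(share_of_empty_days(pop), 3))
```

All three populations average exactly one hour per day, yet the share of days with no socializing at all ranges from none to the large majority.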
Cluster analysis can solve this problem. Clustering is a family of machine learning techniques that ask the question: What types of cases are there? It can help us say things like “There are two kinds of people in the world…” or “There are three types of days that Americans have”.
I will be using the k-means clustering algorithm, probably the most popular clustering method. Given the number of clusters k, it finds a “center” for each cluster (the average of all cases in the cluster) such that each case is assigned to its nearest center. This makes it very easy to visualize the clusters, since each cluster’s center is the paradigmatic example for all of its members. But k-means does have one tricky aspect: The algorithm only figures out what the clusters are like, not how many of them there should be. The right number of clusters needs to be figured out in advance.
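For readers who want to see the mechanics, below is a minimal pure-Python sketch of the standard k-means iteration (Lloyd's algorithm). It is an illustration only, not the code used for this article. The six toy "days", described by hours of work and hours of leisure, are invented, and the function names are my own.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centers):
    """Index of the center closest to the point."""
    return min(range(len(centers)),
               key=lambda j: squared_distance(point, centers[j]))


def kmeans(points, k, iterations=100, seed=0):
    """Alternate between assigning points and re-averaging the centers."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]  # random starting centers
    labels = None
    for _ in range(iterations):
        new_labels = [nearest(p, centers) for p in points]
        if new_labels == labels:  # assignments stopped changing: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster ends up empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels


# Hypothetical days as (hours of work, hours of leisure).
days = [[8.0, 3.0], [9.0, 2.0], [8.5, 2.5],    # look like work days
        [0.0, 9.0], [0.5, 8.0], [0.0, 10.0]]   # look like leisure days

centers, labels = kmeans(days, k=2)
print(centers)
print(labels)
```

Each returned center is the average of its members, which is why a center can be read as the typical day of its cluster. Note that k has to be passed in; the algorithm never questions it.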
A popular, sophisticated method for finding the optimal number of clusters (the optimal k) is the gap statistic method, introduced by Tibshirani et al. (Stanford University, 2001). This metric compares the total within-cluster dispersion for each k (how spread out the cluster members are) with the dispersion that could be expected if the data were totally random and unclustered. The “gap” is the difference between these two numbers on a log scale: the expected log-dispersion of random data minus the observed log-dispersion. The chosen number of clusters is the lowest k whose gap is not significantly smaller than the gap at k + 1; in the original paper, “significantly” means by more than one standard error.
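The following is a rough pure-Python sketch of the gap statistic idea, under several assumptions of mine. Reference data are drawn uniformly over the bounding box of the features, dispersion is the within-cluster sum of squares, and the final rule picks the smallest k with Gap(k) ≥ Gap(k+1) − s(k+1). The two-blob toy data and all function names are hypothetical, and this is not the implementation used for the article.

```python
import math
import random


def sqdist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, rng, iterations=50):
    """Plain Lloyd's k-means; returns (centers, labels)."""
    centers = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iterations):
        new_labels = [min(range(k), key=lambda j: sqdist(p, centers[j]))
                      for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(c) / len(members) for c in zip(*members)]
    return centers, labels


def log_within_dispersion(points, k, rng, restarts=5):
    """log of the smallest within-cluster sum of squares over a few restarts."""
    best = math.inf
    for _ in range(restarts):
        centers, labels = kmeans(points, k, rng)
        w = sum(sqdist(p, centers[lab]) for p, lab in zip(points, labels))
        best = min(best, w)
    return math.log(max(best, 1e-12))  # guard against log(0)


def gap_statistic(points, k_max, n_refs=10, seed=0):
    """Return a list of (k, gap, standard error) for k = 1..k_max."""
    rng = random.Random(seed)
    lows = [min(c) for c in zip(*points)]
    highs = [max(c) for c in zip(*points)]
    results = []
    for k in range(1, k_max + 1):
        log_w = log_within_dispersion(points, k, rng)
        ref_logs = []
        for _ in range(n_refs):
            # Unclustered reference data: uniform over the bounding box.
            ref = [[rng.uniform(lo, hi) for lo, hi in zip(lows, highs)]
                   for _ in points]
            ref_logs.append(log_within_dispersion(ref, k, rng))
        expected = sum(ref_logs) / n_refs
        sd = math.sqrt(sum((v - expected) ** 2 for v in ref_logs) / n_refs)
        results.append((k, expected - log_w, sd * math.sqrt(1 + 1 / n_refs)))
    return results


def choose_k(results):
    """Smallest k whose gap is within one standard error of the next gap."""
    for (k, gap, _), (_, next_gap, next_se) in zip(results, results[1:]):
        if gap >= next_gap - next_se:
            return k
    return results[-1][0]  # rule never satisfied: fall back to the largest k


def two_blobs(seed=1, per_blob=20):
    """Hypothetical toy data: two well-separated 2-D clusters."""
    rng = random.Random(seed)
    return [[rng.gauss(cx, 1.0), rng.gauss(0.0, 1.0)]
            for cx in (0.0, 20.0) for _ in range(per_blob)]


results = gap_statistic(two_blobs(), k_max=4)
for k, gap, se in results:
    print(k, round(gap, 3), round(se, 3))
print("chosen k:", choose_k(results))
```

The fallback in `choose_k` matters in practice: when the gap keeps rising by more than one standard error at every step, the rule simply returns the largest k that was tested.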
Running this algorithm on a subset of the time use data results in the following graph:
[Figure: “Optimal number of clusters” (Gap Statistic). x-axis: Number of clusters k; y-axis: Gap statistic (k). Line and points with error bars, plus a vertical line marking the selected k.]

The values behind the graph:

| k | logW | E.logW | gap | SE.sim | ymin | ymax |
|---|------|--------|-----|--------|------|------|
| 1 | 13.50450 | 14.78657 | 1.282068 | 0.001480213 | 1.280588 | 1.283548 |
| 2 | 13.30652 | 14.74339 | 1.436868 | 0.001392369 | 1.435476 | 1.438261 |
| 3 | 13.21941 | 14.72478 | 1.505371 | 0.001388561 | 1.503982 | 1.506759 |
| 4 | 13.19028 | 14.70927 | 1.518987 | 0.001386684 | 1.517600 | 1.520373 |
| 5 | 13.16623 | 14.69873 | 1.532501 | 0.001375256 | 1.531126 | 1.533877 |
| 6 | 13.14326 | 14.68896 | 1.545699 | 0.001373610 | 1.544325 | 1.547073 |
| 7 | 13.12145 | 14.68063 | 1.559187 | 0.001364544 | 1.557822 | 1.560551 |
| 8 | 13.09773 | 14.67253 | 1.574803 | 0.001375587 | 1.573427 | 1.576178 |
| 9 | 13.07577 | 14.66646 | 1.590693 | 0.001347867 | 1.589345 | 1.592041 |
| 10 | 13.05627 | 14.66081 | 1.604540 | 0.001351497 | 1.603188 | 1.605891 |
| 11 | 13.03194 | 14.65612 | 1.624181 | 0.001339475 | 1.622842 | 1.625521 |
| 12 | 13.01210 | 14.65159 | 1.639487 | 0.001369429 | 1.638118 | 1.640857 |
| 13 | 12.99600 | 14.64733 | 1.651325 | 0.001376263 | 1.649949 | 1.652702 |
| 14 | 12.98763 | 14.64309 | 1.655458 | 0.001348115 | 1.654110 | 1.656806 |
| 15 | 12.97476 | 14.63887 | 1.664107 | 0.001363974 | 1.662743 | 1.665471 |
| 16 | 12.96563 | 14.63474 | 1.669111 | 0.001412159 | 1.667699 | 1.670523 |
| 17 | 12.95959 | 14.63177 | 1.672184 | 0.001374198 | 1.670809 | 1.673558 |
| 18 | 12.94847 | 14.62906 | 1.680590 | 0.001350035 | 1.679240 | 1.681940 |
| 19 | 12.94091 | 14.62653 | 1.685616 | 0.001341744 | 1.684274 | 1.686958 |
| 20 | 12.93651 | 14.62405 | 1.687544 | 0.001363016 | 1.686181 | 1.688907 |
The maximum gap statistic is identified here at k = 20, the maximum k tested. Nevertheless, the full graph shows the gap climbing steeply up to k = 3 and only gradually after that, which makes 3 the lowest reasonable number of clusters. A similar pattern, in which the gap statistic keeps rising after an initial bend, is discussed by Tibshirani et al. in the original paper:
In cases where there are smaller subclusters within larger well-separated clusters, it can exhibit non-monotone behaviour. Hence it is important to examine the entire gap curve rather than simply to find the position of its maximum.
In our case, this means that there are a lot of fine distinctions in the ways people spend their days, but k = 3 may be a good start for a very general breakdown. For now then, there are three types of days that Americans have!
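The shape of the curve can be checked directly from the numbers. The short Python sketch below copies the gap and SE.sim columns from the gap statistic output shown earlier, computes how much the gap grows at each step, and applies the one-standard-error rule (smallest k with Gap(k) ≥ Gap(k+1) − SE(k+1)). The function names are my own.

```python
# (k, gap, SE.sim) copied from the gap statistic output earlier in the post.
GAP_TABLE = [
    (1, 1.282068, 0.001480213), (2, 1.436868, 0.001392369),
    (3, 1.505371, 0.001388561), (4, 1.518987, 0.001386684),
    (5, 1.532501, 0.001375256), (6, 1.545699, 0.001373610),
    (7, 1.559187, 0.001364544), (8, 1.574803, 0.001375587),
    (9, 1.590693, 0.001347867), (10, 1.604540, 0.001351497),
    (11, 1.624181, 0.001339475), (12, 1.639487, 0.001369429),
    (13, 1.651325, 0.001376263), (14, 1.655458, 0.001348115),
    (15, 1.664107, 0.001363974), (16, 1.669111, 0.001412159),
    (17, 1.672184, 0.001374198), (18, 1.680590, 0.001350035),
    (19, 1.685616, 0.001341744), (20, 1.687544, 0.001363016),
]


def gap_increments(table):
    """Increase in the gap when moving from each k to k + 1."""
    return [nxt[1] - cur[1] for cur, nxt in zip(table, table[1:])]


def one_se_choice(table):
    """Smallest k with gap(k) >= gap(k+1) - SE(k+1), else the largest k tested."""
    for (k, gap, _), (_, next_gap, next_se) in zip(table, table[1:]):
        if gap >= next_gap - next_se:
            return k
    return table[-1][0]


for (k, _, _), inc in zip(GAP_TABLE, gap_increments(GAP_TABLE)):
    print(f"k = {k:2d} -> {k + 1:2d}: gap rises by {inc:.6f}")
print("one-standard-error choice:", one_se_choice(GAP_TABLE))
```

The first two steps raise the gap by roughly 0.15 and 0.07. Every later step adds less than 0.02 but still more than one standard error, so the rule is never satisfied and the choice falls through to k = 20.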

I’ve named the three clusters “Leisure With Responsibilities”, “Leisure Without Responsibilities”, and “Work Day.” A day of leisure with responsibilities has basically no work and a fair amount of relaxing, watching TV, or surfing the internet, but it also has time spent caring for children, preparing food, and doing housework. It also has a notable amount of socializing. A day of leisure without responsibilities is similar, but with none of the household responsibilities and none of the socializing. Almost all time not spent sleeping or eating is spent relaxing and in front of the TV or computer. A work day has about 8 hours of work, but otherwise looks more similar to the day of leisure with responsibilities, minus the housework and home maintenance.
How are the clusters distributed over the week?

Sure enough, workdays in America mostly happen from Monday through Friday. On weekends, Americans tend to have days of leisure with responsibilities. The proportion of leisure without responsibilities also goes up a bit on weekends, but not dramatically. I get the sense that the kind of people who have days of leisure without responsibilities are often not the kind of people who have day jobs. Can we see this with a breakdown by age group?
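The weekday breakdown described above amounts to a cross-tabulation: for each day of the week, what share of diary days fell into each cluster? Here is a small Python sketch of that calculation. The records are invented purely for illustration, the function name is my own, and a real analysis of survey data would likely also apply the survey's weights, which this sketch ignores.

```python
from collections import Counter, defaultdict

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def cluster_shares_by_weekday(records):
    """records: iterable of (weekday, cluster_name) pairs, one per diary day."""
    counts = defaultdict(Counter)
    for weekday, cluster in records:
        counts[weekday][cluster] += 1
    shares = {}
    for weekday in WEEKDAYS:
        total = sum(counts[weekday].values())
        if total:  # skip weekdays with no diary days at all
            shares[weekday] = {c: n / total
                               for c, n in counts[weekday].items()}
    return shares


# Hypothetical diary days, NOT survey data.
records = (
    [("Mon", "Work Day")] * 3
    + [("Mon", "Leisure With Responsibilities")]
    + [("Sat", "Leisure With Responsibilities")] * 2
    + [("Sat", "Leisure Without Responsibilities")]
    + [("Sat", "Work Day")]
)

for weekday, shares in cluster_shares_by_weekday(records).items():
    print(weekday, shares)
```

The same function works for any other grouping, such as age group or survey year, by swapping the first element of each pair.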

Sure enough, the most avid practitioners of leisure without responsibilities are senior citizens. Children under 18 also have a few more of those days than working-age adults, but not many more. This suggests to me that the algorithm has identified school days as leisure with responsibilities. To investigate this possibility further, it may be worthwhile to try out a breakdown into more clusters. Before I do that, though, I’d like to recreate the first graph in this post, but now with types of days instead of averages for individual activities.

If you squint, the 2008 financial crisis is recognizable as a drop in workdays and rise in days of leisure without responsibilities. COVID is even more dramatic. The 2020 fall in days of leisure with responsibilities among children under 18 confirms my suspicion that school days count.
Interestingly, there looks to be a steady rise in days of leisure without responsibilities, especially among the elderly. This is accompanied by a steady decline in days of leisure with responsibilities across all age groups.
To see some finer detail, I’d like to re-cluster with a higher k and see what it comes up with. I’m not sure I could make sense of 18 different categories, so I’ll choose k = 8.

Here is the school day in its full glory! Now we also have two types of leisure days without responsibilities, a parenting day, a non-parental household responsibilities day, a day of socializing, and a day with extra sleep.

We can now see that school days and work days both tend to happen on Monday through Friday. Days of extra sleeping happen most often on Sundays, likely reflecting a sleep deficit from the work week. Social days are most common on Saturdays.

Again we see the prominent 2020 drop in both school days (for children) and work days (for working-age people), corresponding with a rise in days of leisure of all sorts. It looks like the younger people in 2020 spiked more dramatically toward days of TV/Radio/Computer use, whereas older people found other ways to kick back. The slow trends are more interesting though. Days of household responsibilities are clearly on a long and slow decline, as are days of socializing. Days of sleeping in look to be on the rise. Again the slow rise in days of leisure without responsibilities is most evident among the elderly, but not for all days of leisure without responsibilities, only the ones spent entirely on the TV/Radio/Computer. The other sort of leisure days are on the decline.
I’m curious to see the same graph, but broken up by income rather than age.

It looks like all the trends observed above hold across all income brackets. Looking at these together with the individual-activity graph from the beginning of this post, the most dramatic long-term trends are:
- A fall in days of household responsibilities, corresponding to the falls in average time spent on home maintenance, shopping, and housework (though food prep is on the rise).
- A fall in days of socializing, corresponding to a fall in time spent socializing generally. We should be careful with this one though, since “socializing” here only includes time explicitly devoted to socializing. In practice of course, a fair bit of socializing happens at work, school, or during other activities. In particular, the internet has become an increasingly social place during the time period under investigation here.
- A rise in days of TV/computer use, especially for the elderly. This corresponds to a rise in average time spent watching TV, listening to the radio, or on the computer.
- Also, it looks like people with higher income sleep in less. Is this because their jobs are less exhausting? Or because their jobs are more regular, so they have no time to sleep in?
Finally, I wonder how these clusters are distributed between sexes.

Males have more workdays, and more days spent entirely on the TV/Radio/Computer. Females have more parenting days and more days of household responsibilities. Females look to have slightly more social days than males do, but not by much. In 2020, males increased their days doing household responsibilities more than females did, presumably since household responsibilities replaced their workdays. This is all basically what I expected. Interestingly, it does look like females are increasing their days sleeping in faster than males are. I don’t have a good explanation for why that might be.