Just the Basic Stacks
Time for another episode of Hashtag TheEmailThatBecameABlogPost. Today, it’s some funky labelling of a stacked column chart in Deneb/Vega-lite.
The request was for a stacked column chart with grand totals per column, and data labels per section showing the percentage of the total of each, centre-aligned within the section.
A stacked column chart is pretty easy in Vega-lite (there’s an example in the doco after all). But for this chart, we’re going to have to use a few transforms to:
- Calculate the percentages of each category and the overall total
- Calculate some values to help centre-align our data labels
Usually, you’d encode
a colour (or color) channel to automagically stack values (see here). We’re going to do this manually to allow us to calculate the data label values and positions.
Data Prep
The test data for the chart looks something like:
Date | Category | Value |
---|---|---|
2021-01-01 | A | 12 |
2021-02-01 | A | 5 |
… | ||
2021-01-01 | B | 1 |
2021-02-01 | B | 3 |
… | ||
2021-01-01 | C | 10 |
2021-02-01 | C | 8 |
… |
First, we’ll need to aggregage
our values into a single sum per Category per Date:
"transform": [
{
"aggregate": [
{
"op": "sum",
"field": "Value",
"as": "sum_value"
}
],
"groupby": [
"Category",
"Date"
]
},
...
We need the grand totals per Date for one set of labels, and also to calculate the percentage of each Category per Date. We’ll use a joinaggregate
:
{
"joinaggregate": [
{
"op": "sum",
"field": "sum_value",
"as": "total_value"
}
],
"groupby": [
"Date"
]
},
Now we can calculate
the values for the percentage labels:
{
"calculate": "datum.sum_value / datum.total_value",
"as": "percent_value"
},
To centre align these percentage label, we need to find the mid point of each group. We can stack
by Date and sort by Category:
{
"stack": "sum_value",
"as": [
"mid_start",
"mid_end"
],
"groupby": [
"Date"
],
"sort": [
{
"field": "Category",
"order": "descending"
}
]
},
Finally, we’ll calculate
the centre of each section to position the percentage label nicely:
{
"calculate": "datum.mid_start + (datum.mid_end - datum.mid_start) / 2",
"as": "mid"
}
Layout
We’ll need three layers, one for the columns, one for the percent labels and one for the overall totals.
"encoding": {
"y": {
"title": "Percent & Total"
},
"x": {
"field": "Date",
"type": "ordinal",
"timeUnit": "yearmonth",
"title": "Date"
}
},
"layer": [
{
"mark": {
"type": "bar"
},
"encoding": {
"y": {
"field": "sum_value",
"type": "quantitative"
},
"color": {
"field": "Category"
}
}
},
{
"mark": {
"type": "text",
"color": "white"
},
"encoding": {
"y": {
"field": "mid",
"type": "quantitative"
},
"text": {
"field": "percent_value",
"format": ".1%"
}
}
},
{
"mark": {
"type": "text",
"color": "black",
"dy": -10
},
"encoding": {
"y": {
"field": "total_value",
"type": "quantitative"
},
"text": {
"field": "total_value"
}
}
}
]
Putting it all together gives:
{
"$schema": "https://vega.github.io/schema/vega-lite/v5.json",
"data": {"name": "dataset"},
"transform": [
{
"aggregate": [
{
"op": "sum",
"field": "Value",
"as": "sum_value"
}
],
"groupby": [
"Category",
"Date"
]
},
{
"joinaggregate": [
{
"op": "sum",
"field": "sum_value",
"as": "total_value"
}
],
"groupby": [
"Date"
]
},
{
"calculate": "datum.sum_value / datum.total_value",
"as": "percent_value"
},
{
"stack": "sum_value",
"as": [
"mid_start",
"mid_end"
],
"groupby": [
"Date"
],
"sort": [
{
"field": "Category",
"order": "descending"
}
]
},
{
"calculate": "datum.mid_start + (datum.mid_end - datum.mid_start) / 2",
"as": "mid"
}
],
"encoding": {
"y": {
"title": "Percent & Total"
},
"x": {
"field": "Date",
"type": "ordinal",
"timeUnit": "yearmonth",
"title": "Date"
}
},
"layer": [
{
"mark": {
"type": "bar"
},
"encoding": {
"y": {
"field": "sum_value",
"type": "quantitative"
},
"color": {
"field": "Category"
}
}
},
{
"mark": {
"type": "text",
"color": "white"
},
"encoding": {
"y": {
"field": "mid",
"type": "quantitative"
},
"text": {
"field": "percent_value",
"format": ".1%"
}
}
},
{
"mark": {
"type": "text",
"color": "black",
"dy": -10
},
"encoding": {
"y": {
"field": "total_value",
"type": "quantitative"
},
"text": {
"field": "total_value"
}
}
}
]
}
The final result: