pivot_wider.tbl_lazy {dbplyr} | R Documentation |
pivot_wider()
"widens" data, increasing the number of columns and
decreasing the number of rows. The inverse transformation is
pivot_longer()
.
Learn more in vignette("pivot", "tidyr")
.
Note that pivot_wider()
is not and cannot be lazy because we need to look
at the data to figure out what the new column names will be.
pivot_wider.tbl_lazy(
data,
id_cols = NULL,
names_from = name,
names_prefix = "",
names_sep = "_",
names_glue = NULL,
names_sort = FALSE,
names_vary = "fastest",
names_expand = FALSE,
names_repair = "check_unique",
values_from = value,
values_fill = NULL,
values_fn = ~max(.x, na.rm = TRUE),
unused_fn = NULL,
...
)
data |
A lazy data frame backed by a database query. |
id_cols |
A set of columns that uniquely identifies each observation. |
names_from, values_from |
A pair of
arguments describing which column (or columns) to get the name of the
output column ( If |
names_prefix |
String added to the start of every variable name. |
names_sep |
If |
names_glue |
Instead of |
names_sort |
Should the column names be sorted? If |
names_vary |
When
|
names_expand |
Should the values in the |
names_repair |
What happens if the output has invalid column names? |
values_fill |
Optionally, a (scalar) value that specifies what each
|
values_fn |
A function, the default is |
unused_fn |
Optionally, a function applied to summarize the values from
the unused columns (i.e. columns not identified by The default drops all unused columns from the result. This can be a named list if you want to apply different aggregations to different unused columns.
This is similar to grouping by the |
... |
Unused; included for compatibility with generic. |
The big difference to pivot_wider()
for local data frames is that
values_fn
must not be NULL
. By default it is max()
which yields
the same results as for local data frames if the combination of id_cols
and value
column uniquely identify an observation.
Mind that you also do not get a warning if an observation is not uniquely
identified.
The translation to SQL code basically works as follows:
Get unique keys in names_from
column.
For each key value generate an expression of the form:
value_fn( CASE WHEN (`names from column` == `key value`) THEN (`value column`) END ) AS `output column`
Group data by id columns.
Summarise the grouped data with the expressions from step 2.
memdb_frame(
id = 1,
key = c("x", "y"),
value = 1:2
) %>%
tidyr::pivot_wider(
id_cols = id,
names_from = key,
values_from = value
)