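Problem description
My data frame has at least two columns: one holds the groups and the others are numeric, and one of the numeric columns is named "sum". When I group by a column and sum the others, I get this error: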
Error in `summarise()`:
! Problem while computing `..1 = across(col_first:col_last, sum)`.
Code and results
Normally, a grouped sum works without any problem:
library(dplyr)
df <- data.frame(
  grp = rep(c("a", "b"), 2),
  col_first = c(1:4),
  col_last = c(5:8)
)
df %>%
  group_by(grp) %>%
  summarise(across(col_first:col_last, sum)) %>%
  ungroup()
# Result:
# # A tibble: 2 × 3
#   grp   col_first col_last
#   <chr>     <int>    <int>
# 1 a             4       12
# 2 b             6       14
But when the data frame contains a column named "sum":
df <- data.frame(
  grp = rep(c("a", "b"), 2),
  col_first = c(1:4),
  sum = c(1:4),
  col_last = c(5:8)
)
df %>%
  group_by(grp) %>%
  summarise(across(col_first:col_last, sum)) %>%
  ungroup()
# This raises an error:
# Error in `summarise()`:
# ! Problem while computing `..1 = across(col_first:col_last, sum)`.
# The error occurred in group 0: character(0).
# Caused by error:
# ! attempt to select less than one element in integerOneIndex
# Run `rlang::last_error()` to see where the error occurred.
Following the hint to look for the cause of the error:
rlang::last_error()
# Result:
# <error/rlang_error>
# Error in `summarise()`:
# ! Problem while computing `..1 = across(col_first:col_last, sum)`.
# The error occurred in group 0: character(0).
# Caused by error:
# ! attempt to select less than one element in integerOneIndex
# ---
# Backtrace:
# 1. ... %>% ungroup()
# 10. `<fn>`()
# Run `rlang::last_trace()` to see the full context.
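One possible explanation, stated as an assumption because the traceback above does not confirm it: inside across(), the bare name sum may be looked up in the data first, so it finds the column named "sum" instead of the function. The sketch below is only an analogy in base R, using with() rather than dplyr's own data masking, to show how the same name resolves differently as a bare name and in call position.

```r
# Same example data as in the question, including a column named "sum".
df <- data.frame(
  grp = rep(c("a", "b"), 2),
  col_first = c(1:4),
  sum = c(1:4),
  col_last = c(5:8)
)

# As a bare name, `sum` is looked up in the data first and finds the column.
bare_name <- with(df, sum)
is.function(bare_name)

# In call position, R skips bindings that are not functions and finds base::sum.
called <- with(df, sum(col_first))
called
```

If this is what happens, it would also explain why sum(sum) works when written out by hand: the outer sum is in call position and the inner one is just a column.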
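Possible solutions and their problems
Of course, I can work around this by summing each column individually, but my real data frame has many columns, so writing them out one by one is not practical:
df %>%
  group_by(grp) %>%
  summarise(
    col_first = sum(col_first),
    sum = sum(sum),
    col_last = sum(col_last)
  ) %>%
  ungroup()
# Result:
# # A tibble: 2 × 4
#   grp   col_first   sum col_last
#   <chr>     <int> <int>    <int>
# 1 a             4     4       12
# 2 b             6     6       14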
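A workaround that keeps across() and avoids listing every column might be to refer to the function through its namespace, so that a column of the same name cannot shadow it. This is a minimal sketch on the example data; it assumes the name clash is the cause of the error, which the question itself does not establish.

```r
library(dplyr)

# Same example data as in the question, including a column named "sum".
df <- data.frame(
  grp = rep(c("a", "b"), 2),
  col_first = c(1:4),
  sum = c(1:4),
  col_last = c(5:8)
)

# base::sum always evaluates to the function, whatever the columns are called.
res <- df %>%
  group_by(grp) %>%
  summarise(across(col_first:col_last, base::sum)) %>%
  ungroup()

res
```

Renaming the column before summarising would be another option, at the cost of renaming it back afterwards.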
It can also be solved with another package, but that means loading one more package:
library(purrrlyr)
df %>%
  slice_rows("grp") %>%
  dmap(sum)
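If the concern is loading extra packages, base R's aggregate() might be enough for a plain grouped sum. The sketch below uses the example data; the name value_cols is only illustrative. FUN = sum is evaluated in the calling environment here, so the column named "sum" should not get in the way.

```r
# Same example data as in the question, including a column named "sum".
df <- data.frame(
  grp = rep(c("a", "b"), 2),
  col_first = c(1:4),
  sum = c(1:4),
  col_last = c(5:8)
)

# Every column except the grouping column is summed.
value_cols <- setdiff(names(df), "grp")

# Group by `grp` and apply sum() to each remaining column.
res_base <- aggregate(df[value_cols], by = df["grp"], FUN = sum)

res_base
```

The result is an ordinary data frame rather than a tibble, so it may need converting if the later steps rely on dplyr.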