Here is a data.table version.


library(data.table)

c1 <- c('a', 2015, 300)
c2 <- c('a', 2016, 400)
c3 <- c('b', 2015, 700)
c4 <- c('b', 2016, 600)
dt <- data.frame(rbind(c1, c2, c3, c4), stringsAsFactors = FALSE)
names(dt) <- c('name', 'date', 'amount')

setDT(dt)

dt
#>    name date amount
#> 1:    a 2015    300
#> 2:    a 2016    400
#> 3:    b 2015    700
#> 4:    b 2016    600

# All in one step: within each name, keep the row(s) with the largest amount
dt2 <- dt[, .SD[amount == max(amount), .(merged = paste0(date, "-", amount))], by = .(name)]

dt2
#>    name   merged
#> 1:    a 2016-400
#> 2:    b 2015-700

Created on 2020-01-02 by the reprex package (v0.3.0)
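
A side note on the example data, offered as a sketch rather than as part of the original answer: because each row is built with c('a', 2015, 300), every value is coerced to text, so amount is a character column and max(amount) compares strings. That happens to give the right answer here because all amounts have three digits, but it would not in general. The names col_classes, text_max and num_max below are illustrative only.

```r
# Rebuild the thread's example: rbind() on mixed vectors gives a character matrix
c1 <- c('a', 2015, 300)
c2 <- c('a', 2016, 400)
c3 <- c('b', 2015, 700)
c4 <- c('b', 2016, 600)
dt <- data.frame(rbind(c1, c2, c3, c4), stringsAsFactors = FALSE)
names(dt) <- c('name', 'date', 'amount')

# Every column is text at this point
col_classes <- sapply(dt, class)

# Text is compared character by character, so "300" beats "1000"
text_max <- max(c("300", "1000"))

# Convert before aggregating so that max() compares numbers
dt$date <- as.integer(dt$date)
dt$amount <- as.numeric(dt$amount)
num_max <- max(c(300, 1000))
```

After the conversion, the data.table and dplyr pipelines in this thread can be run unchanged.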

    dapengde

    The result seems to differ from mine.

    Your result is

    #>   name   merged
    #> a    a 2015-300
    #> b    b 2015-700

    a should be 2016-400

      Borrowing the example above:

      library(tidyverse)
      
      c1 <- c('a', 2015, 300)
      c2 <- c('a', 2016, 400)
      c3 <- c('b', 2015, 700)
      c4 <- c('b', 2016, 600)
      dt <- data.frame(rbind(c1, c2, c3, c4), stringsAsFactors = FALSE)
      names(dt) <- c('name', 'date', 'amount')
      
      new.dt <- dt %>%
          group_by(name) %>%
          filter(amount == max(amount)) %>%
          mutate(merged = paste(date, amount, sep = "-")) %>%
          ungroup()

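        dhfly A question: continuing with this example, how do I split one column into two?
        Example: split the id column into columns q and w, using

        sep = "."
        ex=data.frame(id=c("100.a","002.b","003.c"),amount=c(100,200,300))

        Why does separate(data = ex, col =id, into = c("q", "w"), sep = ".") raise an error?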

          dhfly

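          Remove sep = "."

          separate(data = ex, col =id, into = c("q", "w")) is enough.

          If you really want to specify it by hand:

          separate(data = ex, col =id, into = c("q", "w"),sep="\\.")

          The error occurs because the sep you pass in is treated as a regular expression, and . in a regular expression matches any character (except newline).

          The default sep matches any non-alphanumeric character, so in the vast majority of cases the default is all you need.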

          sep
          Separator between columns.

          If character, is interpreted as a regular expression. The default value is a regular expression that matches any sequence of non-alphanumeric values.
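
To make the regular-expression point concrete, here is a minimal sketch. It uses base R's strsplit() to show how the three spellings of the separator behave, then applies the escaped form with tidyr::separate(). The variable names (as_regex, escaped, literal, res) are illustrative, and ex is rebuilt with the column name amount.

```r
library(tidyr)

ex <- data.frame(id = c("100.a", "002.b", "003.c"),
                 amount = c(100, 200, 300))

# "." as a regex matches every character, so nothing is left between matches
as_regex <- strsplit("100.a", ".")[[1]]

# Escaping the dot, or asking for a fixed string, splits on the literal "."
escaped <- strsplit("100.a", "\\.")[[1]]
literal <- strsplit("100.a", ".", fixed = TRUE)[[1]]

# The same escaped pattern passed to separate()
res <- separate(data = ex, col = id, into = c("q", "w"), sep = "\\.")
```

Here q and w come back as character columns, so "002" keeps its leading zeros.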

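            frankzhang21 Thank you very much for your kind help. So that is what was going on; I feel like I am not cut out for learning programming languages. Thanks again from 大白!!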

            # Source data
            c1 <- c('a', 2015, 300)
            c2 <- c('a', 2016, 400)
            c3 <- c('b', 2015, 700)
            c4 <- c('b', 2016, 600)
            dt <- data.frame(rbind(c1, c2, c3, c4), stringsAsFactors = FALSE)
            names(dt) <- c('name', 'date', 'amount')
            
            library(dplyr)
            library(tidyr)
            dt %>% 
              group_by(name) %>% 
              filter(amount == max(amount)) %>% 
              unite("merged", date, amount, sep = "-") -> dt2
            dt2