Source data frame:

c1<-c("小明",2015,300)
c2<-c("小明",2016,400)
c3<-c("小刘",2015,700)
c4<-c("小刘",2016,600)
dt<-as.data.frame(rbind(c1,c2,c3,c4))
names(dt)<-c("用户","日期","金额")

Target data frame:

a1 = c("小明","2016-400")
a2 = c("小刘","2015-700")
df = as.data.frame(rbind(a1,a2))
names(df) <- c("用户","匹配合并")

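For each user, locate the row with the maximum amount, then merge that row's second and third columns into one. Can this be done with ddply? Is there a concise way to do it?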

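To avoid any trouble with Chinese character support, I replaced the Chinese in the code with English; the meaning is the same.

No additional packages are used.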

# Source data
c1 <- c('a', 2015, 300)
c2 <- c('a', 2016, 400)
c3 <- c('b', 2015, 700)
c4 <- c('b', 2016, 600)
dt <- data.frame(rbind(c1, c2, c3, c4), stringsAsFactors = FALSE)
names(dt) <- c('name', 'date', 'amount')
dt$merged <- paste0(dt$date, '-', dt$amount)

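# Compute: within each name, keep the merged string whose amount part is largest
# (as.numeric() makes the comparison explicitly numeric rather than relying on coercion)
tb <- tapply(dt$merged, dt$name, function(x) x[which.max(as.numeric(substr(x, 6, nchar(x))))])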
tb
dt2 <- data.frame(name = names(tb), merged = tb)
dt2
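
The same result can also be reached without parsing the number back out of the merged string. What follows is only a minimal base R sketch, not the method used above: it rebuilds the same dt, converts amount to numeric, and picks the row with the largest amount within each name. The helper pick_max and the object dt3 are names made up for illustration.

```r
# Source data (same values as above); rbind() on mixed vectors yields character columns
dt <- data.frame(rbind(c('a', 2015, 300), c('a', 2016, 400),
                       c('b', 2015, 700), c('b', 2016, 600)),
                 stringsAsFactors = FALSE)
names(dt) <- c('name', 'date', 'amount')

# Hypothetical helper: for one group, keep the row with the largest numeric amount
pick_max <- function(g) {
  i <- which.max(as.numeric(g$amount))
  data.frame(name = g$name[i],
             merged = paste0(g$date[i], '-', g$amount[i]),
             stringsAsFactors = FALSE)
}

# Split by name, apply the helper to each piece, and stack the results
dt3 <- do.call(rbind, lapply(split(dt, dt$name), pick_max))
rownames(dt3) <- NULL
dt3
```

Note that which.max() returns only the first maximum, so a tie within a group keeps a single row.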

    Here is a data.table version:

    
    library(data.table)
    
    c1 <- c('a', 2015, 300)
    c2 <- c('a', 2016, 400)
    c3 <- c('b', 2015, 700)
    c4 <- c('b', 2016, 600)
    dt <- data.frame(rbind(c1, c2, c3, c4), stringsAsFactors = FALSE)
    names(dt) <- c('name', 'date', 'amount')
    
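    setDT(dt)
    # amount is character after rbind(); convert it so max() compares numbers
    dt[, amount := as.numeric(amount)]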
    
    dt
    #>    name date amount
    #> 1:    a 2015    300
    #> 2:    a 2016    400
    #> 3:    b 2015    700
    #> 4:    b 2016    600
    
    # All in one step
    dt2 <- dt[,.SD[amount==max(amount),.(merged=paste0(date,"-",amount))],by=.(name)]
    
    dt2
    #>    name   merged
    #> 1:    a 2016-400
    #> 2:    b 2015-700

    Created on 2020-01-02 by the reprex package (v0.3.0)

      dapengde

      The result seems to differ from mine.

      Your result is

      #>   name   merged
      #> a    a 2015-300
      #> b    b 2015-700

      For a, it should be 2016-400.

        Borrowing the example above:

        library(tidyverse)
        
        c1 <- c('a', 2015, 300)
        c2 <- c('a', 2016, 400)
        c3 <- c('b', 2015, 700)
        c4 <- c('b', 2016, 600)
        dt <- data.frame(rbind(c1, c2, c3, c4), stringsAsFactors = FALSE)
        names(dt) <- c('name', 'date', 'amount')
        
        new.dt <- dt %>%
            mutate(amount = as.numeric(amount)) %>%  # character after rbind(); compare as numbers
            group_by(name) %>%
            filter(amount == max(amount)) %>%
            mutate(merged = paste(date, amount, sep = "-")) %>%
            ungroup()
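
In these examples rbind() on mixed vectors produces character columns, so it matters whether amount is compared as text or as a number. The amounts in this thread happen to give the same answer either way, but that is not guaranteed in general. A small base R sketch of the difference, using made-up values:

```r
# Made-up amounts stored as text, as they would be after rbind() of mixed vectors
amount_chr <- c("90", "100")

# Compared as strings, character by character: "9" sorts after "1"
chr_max <- max(amount_chr)

# Compared as numbers after explicit conversion
num_max <- max(as.numeric(amount_chr))

chr_max  # "90"
num_max  # 100
```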

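          dhfly A question: continuing this example, how do I split one column into two?
          For example, split the id column into columns q and w, using sep = ".":

          ex=data.frame(id=c("100.a","002.b","003.c"),aumount=c(100,200,300))

          Why does separate(data = ex, col =id, into = c("q", "w"), sep = ".") give an error?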

            dhfly

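            Remove sep = ".":

            separate(data = ex, col =id, into = c("q", "w"))

            is enough.

            If you do want to specify it manually:

            separate(data = ex, col =id, into = c("q", "w"),sep="\\.")

            The problem is that the sep you pass in is treated as a regular expression, and in a regular expression . matches any character (except a newline).

            The default sep matches any sequence of non-alphanumeric characters, so in most cases the default is all you need.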

            sep
            Separator between columns.

            If character, is interpreted as a regular expression. The default value is a regular expression that matches any sequence of non-alphanumeric values.
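
The regular-expression behaviour described above can be seen without tidyr. In the sketch below, base R's strsplit() stands in for separate() purely as an illustration (it is a different function with different arguments), and the variable names are made up:

```r
id <- "100.a"

# "." as a regular expression: every character counts as a separator,
# so only empty pieces are left
pieces_regex <- strsplit(id, ".")[[1]]

# Escaped dot: matches a literal "."
pieces_escaped <- strsplit(id, "\\.")[[1]]

# fixed = TRUE: the separator is taken literally, no regex involved
pieces_fixed <- strsplit(id, ".", fixed = TRUE)[[1]]

pieces_regex
pieces_escaped
pieces_fixed
```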

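              frankzhang21 Thank you very much for your generous help. So that is what was going on; I feel I'm not cut out for learning programming languages. Thanks again from a beginner!!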

              # Source data
              c1 <- c('a', 2015, 300)
              c2 <- c('a', 2016, 400)
              c3 <- c('b', 2015, 700)
              c4 <- c('b', 2016, 600)
              dt <- data.frame(rbind(c1, c2, c3, c4), stringsAsFactors = FALSE)
              names(dt) <- c('name', 'date', 'amount')
              
              library(dplyr)
              library(tidyr)
              dt %>% 
                mutate(amount = as.numeric(amount)) %>%  # character after rbind(); compare as numbers
                group_by(name) %>% 
                filter(amount == max(amount)) %>% 
                unite("merged", date, amount, sep = "-") -> dt2
              dt2
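
Since the original question asked about ddply, here is a minimal sketch of how it might look. It assumes the plyr package is installed and uses the same source data; the helper pick_max and the object dt_ply are names made up for illustration. Calling plyr::ddply without attaching plyr avoids masking dplyr functions in a session where dplyr is already loaded.

```r
# Source data (same values as above)
dt <- data.frame(rbind(c('a', 2015, 300), c('a', 2016, 400),
                       c('b', 2015, 700), c('b', 2016, 600)),
                 stringsAsFactors = FALSE)
names(dt) <- c('name', 'date', 'amount')

# Hypothetical helper: receives the rows of one name and returns one merged value
pick_max <- function(g) {
  i <- which.max(as.numeric(g$amount))  # compare amounts as numbers
  data.frame(merged = paste0(g$date[i], '-', g$amount[i]),
             stringsAsFactors = FALSE)
}

# ddply splits dt by name, applies pick_max, and adds the name column back
dt_ply <- plyr::ddply(dt, "name", pick_max)
dt_ply
```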