R: how to remove the first row of duplicate values from a big column


In R I have a file (df) consisting of 2 big columns, a and b (approx. 1,000,000 elements each). I know I have many duplicate values in a. I know how to remove duplicates (removing the second row of each duplicate):

df1 = df[!duplicated(df$a), ]  

but I want to remove the first row of each duplicate and keep the second row instead. For instance, in the following example, I want to remove 71 t and keep 71 c, not the other way around:

a   b
4
8   c
21  t
71  t
71  c
74  c
75  g
78  c
86  t

Thanks in advance.

Using dplyr, you can do this:

library(dplyr)

df %>%
  group_by(a) %>%
  slice(-1)
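A small runnable check of this approach (the toy data below is made up to mirror part of the example in the question):

```r
library(dplyr)

# Toy data modeled on the example in the question
df <- data.frame(a = c(4, 8, 21, 71, 71, 74),
                 b = c("c", "c", "t", "t", "c", "c"))

result <- df %>%
  group_by(a) %>%
  slice(-1) %>%
  ungroup()

# Only the second occurrence of the duplicated value 71 survives.
print(result)
```

One caveat worth knowing: slice(-1) removes the first row of every group, so values of a that appear only once are dropped entirely, not kept.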

If you need to arrange the column in a specific way first, you can incorporate arrange into the mix as follows:

library(dplyr)

df %>%
  arrange(a) %>%   # sorts in ascending order
  group_by(a) %>%
  slice(-1)
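If rows with unique values in a should be kept as well, a base-R alternative is to reuse the duplicated() call from the question with fromLast = TRUE, which marks earlier occurrences as duplicates so the last row of each value survives (toy data again made up for illustration):

```r
# Toy data modeled on the example in the question
df <- data.frame(a = c(4, 8, 21, 71, 71, 74),
                 b = c("c", "c", "t", "t", "c", "c"))

# fromLast = TRUE scans from the end, so for each duplicated value
# of a the last occurrence is kept; unique values are kept too.
df1 <- df[!duplicated(df$a, fromLast = TRUE), ]
print(df1)
```

For the duplicated value 71, this keeps the row with b == "c" and drops the earlier row with b == "t".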
