Removing duplicate entries is usually a fairly easy task. There are plenty of questions online about removing duplicated rows, duplicated columns, or both. However, suppose you want to de-duplicate a data frame so that each column keeps only those values that are not repeated anywhere in the entire data frame. For example, if column 1 has the value x, the requirement is that no other column may contain x; the values that survive are then packed to the top of each column. Let us start with a simple data frame and work on it.
The file looks something like this, and the user wants each column to keep only the values that are not found anywhere else in the data frame:
A B C D E F
12 15 18 55 27 13
15 25 10 21 23 20
20 18 14 25 15 25
25 27 30 35 25 10
35 15
The expected output is:
A  B  C  D  E  F
12 -  14 55 23 13
      30 21
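To make the requirement concrete, here is a minimal base R sketch that counts every value across the whole data frame and keeps only those that occur exactly once. The construction of `df1` is an assumption: the source does not say which columns hold the trailing 35 and 15, so they are placed in A and B here, with `NA` padding the remaining cells. The names `counts`, `singles` and `unique_by_col` are illustrative only.

```r
# Input data; NA marks the empty cells in the short last row.
# Assumption: the trailing values 35 and 15 sit in columns A and B.
df1 <- data.frame(
  A = c(12, 15, 20, 25, 35),
  B = c(15, 25, 18, 27, 15),
  C = c(18, 10, 14, 30, NA),
  D = c(55, 21, 25, 35, NA),
  E = c(27, 23, 15, 25, NA),
  F = c(13, 20, 25, 10, NA)
)

# Count each value across the whole data frame (table() ignores NA by default).
counts <- table(unlist(df1))

# Values that occur exactly once anywhere in the data frame.
singles <- as.numeric(names(counts)[counts == 1])

# Per column, keep only those values, in their original row order.
unique_by_col <- lapply(df1, function(col) col[col %in% singles])
```

The result is a list rather than a data frame because the columns end up with different lengths; column B, for instance, has nothing left.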
The code below comes close. The only difference is that the empty column B is left blank instead of showing "-". The output from the code is:
   A B  C  D  E  F
1 12   14 55 23 13
2      30 21
The code is as follows:
===================================
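library(dplyr)
library(tidyr)

df1 %>%
  gather(k, v) %>%                    # long format: column name k, value v
  mutate(k = as.factor(k)) %>%        # factor, so empty columns survive spread()
  na.omit() %>%                       # drop rows for the empty cells
  group_by(v) %>%
  filter(n() == 1) %>%                # keep values seen once in the whole frame
  group_by(k) %>%
  mutate(g = row_number()) %>%        # position of each value within its column
  spread(k, v, drop = FALSE, fill = "") %>%
  select(-g) %>%
  as.data.frame()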
=====================================
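The remaining gap, the "-" in the empty column B, can be closed with one extra step. The sketch below is one possible way to do it, not the only one. It swaps `gather()`/`spread()` for their successors `pivot_longer()`/`pivot_wider()` and assumes tidyr 1.2.0 or later for the `names_expand` argument. The `df1` built here is also an assumption: the trailing 35 and 15 are placed in columns A and B, which does not affect the result because both values already occur elsewhere.

```r
library(dplyr)
library(tidyr)

# Assumed layout of the input; NA pads the short last row.
df1 <- data.frame(
  A = c(12, 15, 20, 25, 35),
  B = c(15, 25, 18, 27, 15),
  C = c(18, 10, 14, 30, NA),
  D = c(55, 21, 25, 35, NA),
  E = c(27, 23, 15, 25, NA),
  F = c(13, 20, 25, 10, NA)
)

result <- df1 %>%
  pivot_longer(everything(), names_to = "k", values_to = "v") %>%
  filter(!is.na(v)) %>%                        # drop the padding cells
  mutate(k = factor(k, levels = names(df1))) %>%
  group_by(v) %>%
  filter(n() == 1) %>%                         # values seen once in the whole frame
  group_by(k) %>%
  mutate(g = row_number()) %>%                 # position within each column
  ungroup() %>%
  mutate(v = as.character(v)) %>%              # so blanks and "-" can share the column
  pivot_wider(names_from = k, values_from = v,
              values_fill = "", names_expand = TRUE) %>%
  arrange(g) %>%
  select(-g) %>%
  as.data.frame()

# Mark columns that ended up completely empty with "-" in the first row.
empty_cols <- vapply(result, function(col) all(col == ""), logical(1))
result[1, empty_cols] <- "-"
```

Converting `v` to character before widening is a deliberate choice: a numeric column cannot hold "" or "-", so the table is treated as display output from that point on.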