count - How do you summarize columns based on unique IDs without knowing the IDs in R?
I have been going through posts about summarizing data, but I don't think I have found what I'm looking for.

I would like to create an abstract "counting table" that shows how many times patients were given a specific medication. Some patients received several medications together, but that does not matter: I only want the total over all medications, and then to calculate what percentage each drug class makes up.

The issue is that I may not know the names of the drugs given; they are "hidden" somewhere in the data.frame. I expect this can be done with the plyr package, but I have not managed to get it to work yet.

My df is a data.frame with three columns, x, y and z. What I want to do now is create a list of unique characters that serves as my reference list, by which R can summarize the counts in each column:

    unique(x)
    unique(y)
    unique(z)

summary(df) summarizes the counts of each column, but without the ID of each value and without the percentage of all the unique counts.

I also tried the following, which goes in the right direction:

    library(plyr)
    ddply(df, .(x), summarize, count = length(unique(y)))

Here, however, I have to specify which column R should use. Ideally, I would already have a list of unique characters that I can feed to R, by which it can then summarize the columns.

How can I do this? Any ideas or help would be very much appreciated.

If you want a count for the entire data frame, you can use table(unlist(df)) (see also Guckler's answer), and if you also want the proportions:

    prop.table(table(unlist(df)))

When you want the counts for individual columns, it becomes more difficult.

    # Some reproducible data:
    set.seed(1)
    x <- sample(letters[1:4], 20, replace = TRUE)
    y

To compute the counts for each column and the total count, I wrote a function, func. It stores its result in the global environment with:

    assign("dat", x2, envir = .GlobalEnv)

Calling the function with func(df) gives a data frame dat in your global environment:

    > dat
      id x y z total
    1  a 4 4 3    11
    2  b 5 5 2    12
    3  c 5 4 4    13
    4  d 6 4 5    15
    5  e 0 3 5     8
    6  f 0 0 1     1

With the dplyr package you can add the percentages:

    library(dplyr)
    # Each percentage is relative to the grand total of all counts
    dat <- dat %>%
      mutate(xperc = round(100 * x / sum(total), 1),
             yperc = round(100 * y / sum(total), 1),
             zperc = round(100 * z / sum(total), 1),
             perc  = round(100 * total / sum(total), 1))

resulting in:

    > dat
      id x y z total xperc yperc zperc perc
    1  a 4 4 3    11   6.7   6.7   5.0 18.3
    2  b 5 5 2    12   8.3   8.3   3.3 20.0
    3  c 5 4 4    13   8.3   6.7   6.7 21.7
    4  d 6 4 5    15  10.0   6.7   8.3 25.0
    5  e 0 3 5     8   0.0   5.0   8.3 13.3
    6  f 0 0 1     1   0.0   0.0   1.7  1.7
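For readers who want to try the per-column counting step, here is a minimal base R sketch. It is not the func from the answer, whose body is not available here; it is one possible way to get a table of the same shape. The example data frame and the names ids and counts are hypothetical and chosen only for illustration.

```r
# Hypothetical example data: three medication columns (not the original df).
df <- data.frame(
  x = c("a", "b", "a", "c"),
  y = c("b", "b", "d", "a"),
  z = c("e", "a", "c", "c"),
  stringsAsFactors = FALSE
)

# Reference list of every value that occurs anywhere in the data frame,
# so the IDs do not have to be known in advance.
ids <- sort(unique(unlist(df)))

# Count each id per column; factor() with fixed levels keeps zero counts.
counts <- sapply(df, function(col) as.vector(table(factor(col, levels = ids))))

# Assemble the counting table with a row total.
dat <- data.frame(id = ids, counts, stringsAsFactors = FALSE)
dat$total <- rowSums(counts)

# Share of each id among all recorded values, in percent.
dat$perc <- round(100 * dat$total / sum(dat$total), 1)

print(dat)
```

Fixing the factor levels to the shared reference list is what makes the columns line up: an ID that never occurs in a column still gets a row with a count of 0 instead of being dropped.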