data.table assignment operator with lists in R
I have a data.table that contains a column of names, and I am extracting part of each name with a regular expression. The most obvious way to do this seems to be the := operator, since I am assigning the extracted string to a new column of the same table. When I do so, I find that the function is not applied the way I would expect. I'm not sure whether this is intentional, and I was wondering if there is a reason it behaves this way or if it's a bug.
    library(data.table)
    dt <- data.table(name = c('foo123', 'bar234'))

Searching for the desired expression in a simple character vector behaves as expected:

    name <- dt[1, name]
    pattern <- '(.*?)\\d+'
    regmatches(name, regexec(pattern, name))
    # [[1]]
    # [1] "foo123" "foo"

I can easily subset this to get what I want:

    regmatches(name, regexec(pattern, name))[[1]][2]
    # [1] "foo"
However, I run into problems when I try to apply this to the whole data.table:

    dt[, name_final := regmatches(name, regexec(pattern, name))[[1]][2]]
    dt
    #      name name_final
    # 1: foo123        foo
    # 2: bar234        foo
I don't know how data.table works internally, but I would guess that the regmatches function is applied to the entire name column first, and that the result is then somehow coerced into a vector and assigned to the new name_final column. What I expected, however, was row-by-row behavior. I can emulate that behavior by adding a dummy id column:

    dt[, id := seq_along(name)]
    dt[, name_final := regmatches(name, regexec(pattern, name))[[1]][2], by = list(id)]
    dt
    #      name name_final id
    # 1: foo123        foo  1
    # 2: bar234        bar  2

Is there a reason why this is not the default behavior? If so, I would guess it has something to do with columns, rather than rows, being the atomic unit of a data.table, but I'd like to understand what is going on there.
Pretty much nothing in R runs on a row-by-row basis. It is almost always better to work with whole columns of data at a time, so you can assume that the entire column vector of values will be passed as the argument to your function. In your call, regmatches() receives the whole name column and returns a list with one element per row; [[1]][2] then picks the capture group of the first row only, and that single value "foo" is recycled down the new column. Here is a way to extract the second element from each item in the list returned by regmatches:

    dt[, name_final := sapply(regmatches(name, regexec(pattern, name)), `[`, 2)]

Functions like sapply() or Vectorize() can "fake" a per-row call for functions that are not meant to be run on a whole vector or list of data at a time.
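To make the column-at-a-time behavior concrete, here is a minimal sketch in base R only, so it runs without data.table. The names names_vec, extract_one and extract_prefix are hypothetical and are not part of the original question or of any package. It shows the list that regmatches() returns for a whole vector, then two ways to get one result per element: vapply(), a stricter relative of sapply(), and Vectorize().

```r
# `names_vec` is a hypothetical stand-in for the `name` column of the table.
names_vec <- c("foo123", "bar234")
pattern <- "(.*?)\\d+"

# Called on the whole vector, regmatches() returns a list: one element per input.
matches <- regmatches(names_vec, regexec(pattern, names_vec))
length(matches)    # 2
matches[[1]][2]    # "foo": the capture group of the first input only

# vapply() takes the second element (the capture group) of every list item
# and verifies that each result is a single character string.
prefix_vapply <- vapply(matches, `[`, character(1), 2)

# Vectorize() wraps a helper written for one string so it accepts a vector.
# Both helpers are hypothetical names used only for illustration.
extract_one <- function(x) regmatches(x, regexec(pattern, x))[[1]][2]
extract_prefix <- Vectorize(extract_one, USE.NAMES = FALSE)
prefix_vectorized <- extract_prefix(names_vec)
```

Assuming the table from the question, either form should drop into the j expression unchanged, for example dt[, name_final := extract_prefix(name)]. Vectorize() still loops over the elements internally, so it is a convenience rather than a speed-up.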