data.table assignment operator with lists in R


I have a data.table with a name column, and I am extracting part of each name with a regular expression. The most obvious way to do this is with the := operator, since I am assigning the extracted string as the actual name of the data. When I do so, however, the function is not applied the way I would expect. I'm not sure whether this is intentional, and I was wondering if there is a reason it behaves this way or if it's a bug.

    library(data.table)
    dt <- data.table(name = c('foo123', 'bar234'))

Searching for the desired expression in a simple character vector behaves as expected:

    name <- dt[1, name]
    pattern <- '(.*?)\\d+'
    regmatches(name, regexec(pattern, name))
    # [[1]]
    # [1] "foo123" "foo"

I can easily subset this to get what I want:

    regmatches(name, regexec(pattern, name))[[1]][2]
    # [1] "foo"

However, I run into problems when I try to apply this to the whole data.table:

    dt[, name_final := regmatches(name, regexec(pattern, name))[[1]][2]]
    dt
    #      name name_final
    # 1: foo123        foo
    # 2: bar234        foo

I don't know how data.table works internally, but I would guess that the function is applied to the whole name column first, and the result is then somehow coerced into a vector and assigned to the new name_final column. The behavior I would expect here, however, is row by row. I can emulate that behavior by adding a dummy id column:

    dt[, id := seq_along(name)]
    dt[, name_final := regmatches(name, regexec(pattern, name))[[1]][2], by = list(id)]
    dt
    #      name name_final id
    # 1: foo123        foo  1
    # 2: bar234        bar  2

Is there a reason why this is not the default behavior? If so, I would guess it has to do with columns, rather than rows, being atomic to the data.table, but I'd like to understand what is happening there.
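The repeated "foo" can be reproduced without data.table, which may help separate the regex behavior from the assignment. The following is a minimal base-R sketch; `names_vec`, `m`, and `first_only` are illustrative names that do not appear in the original post, and `rep()` only imitates how a length-1 value would be stretched across two rows.

```r
# Hypothetical stand-in for the full name column (not the one-element `name` above)
names_vec <- c("foo123", "bar234")
pattern <- "(.*?)\\d+"

# regmatches() returns a list with one element per input string;
# each element holds the full match followed by the capture group
m <- regmatches(names_vec, regexec(pattern, names_vec))
length(m)

# [[1]][2] inspects only the first list element, so the result is one string
first_only <- m[[1]][2]
length(first_only)

# Imitation of recycling: a length-1 value repeated to fill both rows
recycled <- rep(first_only, length.out = length(names_vec))
recycled
```

Under these assumptions the second row never reaches the final expression at all: it is dropped by `[[1]]` before anything is assigned.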

Not much in R runs on a row-by-row basis. It is almost always better to work with whole columns of data at a time, so you can pretty much assume that the entire column vector of values will be passed as the argument to your function. In your expression, [[1]][2] picks a single string out of the first list element, and that one value is recycled to every row. Here is a way to extract the second element for each item in the regmatches list:

    dt[, name_final := sapply(regmatches(name, regexec(pattern, name)), `[`, 2)]

Functions like sapply() or Vectorize() can "fake" a per-row call for functions that are not meant to run on a whole vector or list of data at a time.
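As a rough illustration of both options, the sketch below uses vapply(), a stricter relative of sapply() that checks each result is a single string, and then wraps a scalar-style helper with Vectorize(). The helper names `first_group` and `first_group_v` and the column `name_vectorized` are hypothetical and chosen only for this example.

```r
library(data.table)

dt <- data.table(name = c("foo123", "bar234"))
pattern <- "(.*?)\\d+"

# vapply() iterates over the list of matches and insists on one string per item
dt[, name_final := vapply(
  regmatches(name, regexec(pattern, name)),
  function(m) m[2],
  character(1)
)]

# Hypothetical helper written as if it only ever receives a single string
first_group <- function(x, pattern) {
  regmatches(x, regexec(pattern, x))[[1]][2]
}

# Vectorize() wraps the helper so it is called once per element of x
first_group_v <- Vectorize(first_group, vectorize.args = "x", USE.NAMES = FALSE)
dt[, name_vectorized := first_group_v(name, pattern)]

dt
```

Either form avoids the dummy id column. Vectorize() still loops over the elements internally, so it is a convenience rather than a speed-up.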

