python - groupby to find row with max value is converting object to datetime -

May 15, 2012

I need to group by two variables, ['CIN', 'calendar'], and return the row of each group where the column MCelig is largest. Several rows in a group may share the maximum value, but I only need one row.

For example:

      AidCode CIN  MCelig   calendar
    0    None  1e       1 2014-03-08
    1      01  1e       2 2014-03-08
    2      01  1e       3 2014-05-08
    3    None  2e       4 2014-06-08
    4      01  2e       5 2014-06-08

Since the first two rows are a group, I want the row where MCelig = 2.
I came up with this line:

    test = dfx.groupby(['CIN', 'calendar'], group_keys=False).apply(lambda x: x.ix[x.MCelig.idxmax()])

It seemed to work, except that when a column has either 'None' or 'np.nan' for all values in a group, that column gets converted to a datetime! See the example below and watch AidCode go from an object to a date.
    import datetime as dt
    import pandas as pd
    import numpy as np

    d = {'CIN': pd.Series(['1e', '1e', '1e', '2e', '2e']),
         'AidCode': pd.Series([np.nan, '01', '01', np.nan, '01']),
         'calendar': pd.Series([dt.datetime(2014, 3, 8), dt.datetime(2014, 3, 8),
                                dt.datetime(2014, 5, 8), dt.datetime(2014, 6, 8),
                                dt.datetime(2014, 6, 8)]),
         'MCelig': pd.Series([1, 2, 3, 4, 5])}
    dfx = pd.DataFrame(d)
    # Testing whether it was just the np.nan that was the problem; it isn't
    # dfx = dfx.where((pd.notnull(dfx)), None)
    test = dfx.groupby(['CIN', 'calendar'], group_keys=False).apply(lambda x: x.ix[x.MCelig.idxmax()])

output

    Out[820]:
                      AidCode CIN  MCelig   calendar
    CIN calendar
    1e  2014-03-08 2015-01-01  1e       2 2014-03-08
        2014-05-08 2015-01-01  1e       3 2014-05-08
    2e  2014-06-08 2015-01-01  2e       5 2014-06-08

Update:

I just figured out this simple solution:

    x = dfx.sort(['CIN', 'calendar', 'MCelig']).groupby(['CIN', 'calendar'], as_index=False).last()

Since it works, I think I will choose it for simplicity's sake.
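For readers on a recent pandas release, here is a minimal sketch of the same sort-then-take-last idea. It assumes a pandas version where `DataFrame.sort` has been replaced by `sort_values`, and it deliberately swaps `.last()` for `.tail(1)`: `GroupBy.last()` returns the last non-null value of each column, so it can combine values from different rows when a column contains NaN, whereas `.tail(1)` keeps whole rows. The names `keys` and `top_rows` are illustrative only.

```python
import datetime as dt

import numpy as np
import pandas as pd

# Same sample data as in the question.
dfx = pd.DataFrame({
    "CIN": ["1e", "1e", "1e", "2e", "2e"],
    "AidCode": [np.nan, "01", "01", np.nan, "01"],
    "calendar": [dt.datetime(2014, 3, 8), dt.datetime(2014, 3, 8),
                 dt.datetime(2014, 5, 8), dt.datetime(2014, 6, 8),
                 dt.datetime(2014, 6, 8)],
    "MCelig": [1, 2, 3, 4, 5],
})

keys = ["CIN", "calendar"]

# Sort so the largest MCelig is last within each group,
# then keep that last row of every group as a complete row.
top_rows = dfx.sort_values(keys + ["MCelig"]).groupby(keys).tail(1)
print(top_rows)
```

Because no per-group DataFrames are glued back together by `apply`, the AidCode column is simply carried through as it was.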
Pandas is trying to be extra helpful by recognizing date-like columns and converting them to datetime64 dtype. It is being overly aggressive here.

A workaround is to use transform to generate a boolean mask for each group that selects the maximum row:
    def onemax(x):
        # Boolean mask that is True only at the first maximum of the group
        mask = np.zeros(len(x), dtype='bool')
        idx = np.argmax(x.values)
        mask[idx] = 1
        return mask

    dfx.loc[dfx.groupby(['CIN', 'calendar'])['MCelig'].transform(onemax).astype(bool)]

yields

      AidCode CIN  MCelig   calendar
    1      01  1e       2 2014-03-08
    2      01  1e       3 2014-05-08
    4      01  2e       5 2014-06-08
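A shorter variant of the same idea, offered as a sketch rather than as part of the original answer: instead of building a boolean mask, ask each group for the index label of its maximum with `idxmax` and pass those labels to `.loc`. This assumes the DataFrame has a unique index; the names `best_labels` and `result` are illustrative.

```python
import datetime as dt

import numpy as np
import pandas as pd

# Same sample data as in the question.
dfx = pd.DataFrame({
    "CIN": ["1e", "1e", "1e", "2e", "2e"],
    "AidCode": [np.nan, "01", "01", np.nan, "01"],
    "calendar": [dt.datetime(2014, 3, 8), dt.datetime(2014, 3, 8),
                 dt.datetime(2014, 5, 8), dt.datetime(2014, 6, 8),
                 dt.datetime(2014, 6, 8)],
    "MCelig": [1, 2, 3, 4, 5],
})

# For every (CIN, calendar) group, idxmax returns the index label
# of the first row holding the largest MCelig.
best_labels = dfx.groupby(["CIN", "calendar"])["MCelig"].idxmax()

# Select those rows from the original frame; the columns are not rebuilt.
result = dfx.loc[best_labels]
print(result)
```

When several rows tie for the maximum, `idxmax` reports the first one, which matches the requirement of returning only one row per group.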
 
Technical details: When groupby-apply is used and the individual DataFrames (returned by the applied function) are glued back together into one DataFrame, pandas tries to guess whether columns with object dtype hold date-like objects and, if so, converts the column to an actual date dtype. If the values are strings, it tries to parse them as dates using dateutil.parser.

For better or worse, dateutil.parser interprets '01' as a date:

    In [37]: import dateutil.parser as DP
    In [38]: DP.parse('01')
    Out[38]: datetime.datetime(2015, 1, 1, 0, 0)

This causes pandas to attempt to convert the entire AidCode column into dates. Since no error occurs, it thinks it is helping you out :)
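To see why '01' parses without complaint, the sketch below passes an explicit `default` to `dateutil.parser.parse`. A lone two-digit number is read as a day of the month, and every field that is missing from the string is filled in from `default` (the current date when it is omitted). The value of `fixed_default` is an arbitrary date chosen here so the result does not depend on when the code is run.

```python
from datetime import datetime

from dateutil import parser

# Arbitrary reference date (an assumption for this example) that
# supplies the year, month, and time the string does not contain.
fixed_default = datetime(2015, 6, 15)

# '01' is taken as the day; everything else comes from fixed_default.
parsed = parser.parse("01", default=fixed_default)
print(parsed)  # 2015-06-01 00:00:00
```

So a code such as '01' is a perfectly valid date as far as dateutil is concerned, which is why the guess in the concatenation step succeeded instead of raising an error.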
