dataframe - Re-structuring data based on time-stamps and unique IDs in R
I am working with a large dataset (10 million+ cases) where each case represents the monthly transactions for a given store selling a given product (there are 17 products). As such, each store is potentially represented across 204 cases (12 months * 17 product sales; note that not all stores sell all 17 products throughout the year).
I need to restructure the data so that there is one case for each product transaction. The result would be that each store is represented by only 17 cases.
Ideally, I would like the transaction values to be laid out across the 12 months.
To be more specific, there are currently 5 variables in the dataset:
- Store location - a unique 6-digit sequence
- Month - 2013_MM (
- Product type - 17 different product types (this is a string variable)
I am working in R. It would be ideal to save the restructured dataset as a data frame.
I am thinking that a for/if loop could work, but I am unsure how to make it work.
Any suggestions or ideas are highly appreciated. If you need more information, please ask!
Kind regards,
r
There was not much to work with here, but this is my interpretation: you want to summarize your data set, grouped by shop_location and product_type.
    # install.packages('dplyr')
    library(dplyr)

    your_data_set <- xxx  # placeholder for your data

    # One row per store and product, with total, count, and average profit
    your_data_set %>%
      group_by(shop_location, product_type) %>%
      summarise(profit = sum(total_profit),
                count = n(),
                avg_profit = profit / count)
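If what you are after is instead one row per store and product with a separate column for each month, a long-to-wide reshape may be closer. Below is a minimal sketch in base R using `reshape()`. The toy data and the column name `month` are my assumptions (only `shop_location`, `product_type` and `total_profit` appear above), so adjust them to your actual variables.

```r
# Toy data: 2 stores x 2 products x 3 months.
# Column names and values are assumed for illustration only.
toy <- expand.grid(
  shop_location = c("100001", "100002"),
  product_type  = c("A", "B"),
  month         = c("2013_01", "2013_02", "2013_03"),
  stringsAsFactors = FALSE
)
toy$total_profit <- seq_len(nrow(toy))

# Long to wide: one row per (store, product), one column per month.
wide <- reshape(
  toy,
  idvar     = c("shop_location", "product_type"),
  timevar   = "month",
  direction = "wide"
)

print(wide)
```

The month columns come out named `total_profit.2013_01`, `total_profit.2013_02`, and so on. Base `reshape()` can be slow on 10 million+ rows, so `tidyr::pivot_wider()` or `data.table::dcast()` are probably worth trying for the real data.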