A table contains values as below (download from link under figure):
Item | Category | Price | Profit | Actual Profit | Calories |
Beer | Beverages | $4.00 | 50% | $2.00 | 200 |
Soda | Beverages | $2.50 | 80% | $2.00 | 120 |
Chocolate Bar | Candy | $2.00 | 75% | $1.50 | 255 |
Ice Cream Sandwich | Frozen Treats | $3.00 | 67% | $2.00 | 240 |
Bottled Water | Beverages | $3.00 | 83% | $2.50 | 0 |
Gummy Bears | Candy | $2.00 | 50% | $1.00 | 300 |
Soda | Beverages | $2.50 | 80% | $2.00 | 120 |
Hamburger | Hot Food | $3.00 | 67% | $2.00 | 320 |
Popcorn | Hot Food | $5.00 | 80% | $4.00 | 500 |
Licorice Rope | Candy | $2.00 | 50% | $1.00 | 280 |
Hot Dog | Hot Food | $1.50 | 67% | $1.00 | 265 |
Licorice Rope | Candy | $2.00 | 50% | $1.00 | 280 |
Popcorn | Hot Food | $5.00 | 80% | $4.00 | 500 |
Popcorn | Hot Food | $5.00 | 80% | $4.00 | 500 |
It is very easy to summarize the data using the Excel PivotTable generator (even for multiple variables like "Profit" and "Actual Profit" simultaneously):
Example taken from the Data Smart book. |
It probably doesn't make a lot of sense to summarize the data by percentage, but this is just for illustration purposes.
A similar operation is the aggregate function in R (the Concessions.xlsx data frame is loaded as con):
> aggregate(cbind(Profit, Actual.Profit) ~ Category, data=con, FUN=sum)
Category Profit Actual.Profit
1 Beverages 31.23333 98.5
2 Candy 23.25000 46.5
3 Frozen Treats 23.00000 69.0
4 Hot Food 45.21667 142.0
The same is possible using the xtabs function:
> xtabs(cbind(Profit, Actual.Profit) ~ Category, data=con)
Category Profit Actual.Profit
Beverages 31.23333 98.50000
Candy 23.25000 46.50000
Frozen Treats 23.00000 69.00000
Hot Food 45.21667 142.00000
Note, no special operation FUN is specified in xtabs.
Occurence of items can be done when specifing "Count" in the PivotTable builder:
> count(con$Item)
x freq
1 Beer 20
2 Bottled Water 13
3 Chocolate Bar 13
4 Chocolate Dipped Cone 11
5 Gummy Bears 14
6 Hamburger 16
7 Hot Dog 15
8 Ice Cream Sandwich 10
9 Licorice Rope 13
10 Nachos 15
11 Pizza 17
12 Popcorn 16
13 Popsicle 13
14 Soda 13
This can also be done using the extremely powerful built-in functions of the data.table type:
> con[, .N, by=Item]
Item N
1: Beer 20
2: Bottled Water 13
3: Soda 13
4: Chocolate Bar 13
5: Gummy Bears 14
6: Licorice Rope 13
7: Popsicle 13
8: Ice Cream Sandwich 10
9: Chocolate Dipped Cone 11
10: Popcorn 16
11: Hamburger 16
12: Nachos 15
13: Pizza 17
14: Hot Dog 15
Here, .N is a built-in count function applied to the data.table object.
Or with a call to table where one can break down the items by profit:
> table(con$Item, con$Profit)
0.25 0.5 0.67 0.75 0.8 0.83
Beer 0 20 0 0 0 0
Bottled Water 0 0 0 0 0 13
Chocolate Bar 0 0 0 13 0 0
Chocolate Dipped Cone 0 11 0 0 0 0
Gummy Bears 0 14 0 0 0 0
Hamburger 0 0 16 0 0 0
Hot Dog 0 0 15 0 0 0
Ice Cream Sandwich 0 0 10 0 0 0
Licorice Rope 0 13 0 0 0 0
Nachos 0 15 0 0 0 0
Pizza 17 0 0 0 0 0
Popcorn 0 0 0 0 16 0
Popsicle 0 0 0 0 0 13
Soda 0 0 0 0 13 0
I.e. 13 items of "Bottled Water" giving a profit of 83% each have been sold.
Excel can then apply eg. summation when breaking down items by category:
The operation to carry out (sum, count, ...) is defined when clicking on the small "i" symbol in the Values field. So in total, beer earned 80 $.
Another plyr built-in function of the data.table object allows to quickly obtain the summarized prices of the category sales, broken down by category.
> con[, sum(Price), by=list(Category, Item)]
Category Item V1
1: Beverages Beer 80.0
2: Beverages Bottled Water 39.0
3: Beverages Soda 32.5
4: Candy Chocolate Bar 26.0
5: Candy Gummy Bears 28.0
6: Candy Licorice Rope 26.0
7: Frozen Treats Popsicle 39.0
8: Frozen Treats Ice Cream Sandwich 30.0
9: Frozen Treats Chocolate Dipped Cone 33.0
10: Hot Food Popcorn 80.0
11: Hot Food Hamburger 48.0
12: Hot Food Nachos 45.0
13: Hot Food Pizza 34.0
14: Hot Food Hot Dog 22.5