I have a list of key/value pairs and would like to convert it into a 2D matrix whose cells hold the count for each key/value combination. Here is a sample data frame:
doc_id,link
1,http://example.com
1,http://example.com
2,http://test1.net
2,http://test2.net
2,http://test5.net
3,http://test1.net
3,http://example.com
4,http://test5.net
At the moment, I am using R's plyr package and the following command for that kind of transformation:
link_matrix <- daply(link_list, .(doc_id, link), summarise, nrow(piece))
Here is the result matrix object:
doc_id http://example.com http://test1.net http://test2.net http://test5.net
1 List,1 NULL NULL NULL
2 NULL List,1 List,1 List,1
3 List,1 List,1 NULL NULL
4 NULL NULL NULL List,1
The resulting array entries do carry the key/value counts, but each one is wrapped in a list and missing combinations are NULL. What I actually need are plain numeric values in the result matrix, with 0 for missing combinations. It should look like this:
doc_id http://example.com http://test1.net http://test2.net http://test5.net
1 2 0 0 0
2 0 1 1 1
3 1 1 0 0
4 0 0 0 1
I could do this by iterating over the matrix elements and converting them one by one, but I am pretty sure there is a better solution that does this directly in the daply function. I just haven't figured out how, and I would appreciate any help on this.
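For reference, here is a minimal sketch of the result I am after, built with base R's table() rather than daply, so it is not the plyr solution I am asking about. The data frame below is rebuilt by hand from the sample above. The name link_counts is just an illustrative helper, and treating doc_id as numeric is an assumption.

```r
# Rebuild the sample data from the listing above (doc_id assumed numeric).
link_list <- data.frame(
  doc_id = c(1, 1, 2, 2, 2, 3, 3, 4),
  link = c("http://example.com", "http://example.com",
           "http://test1.net", "http://test2.net", "http://test5.net",
           "http://test1.net", "http://example.com", "http://test5.net"),
  stringsAsFactors = FALSE
)

# table() cross-tabulates the two columns; combinations that never
# occur are counted as 0 instead of being left as NULL.
link_counts <- table(doc_id = link_list$doc_id, link = link_list$link)

# Dropping the "table" class leaves a plain integer matrix with dimnames.
link_matrix <- unclass(link_counts)
print(link_matrix)
```

This gives an integer matrix indexed by doc_id and link, so something like link_matrix["2", "http://test1.net"] can be used directly in arithmetic.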