rank order analysis

'And the winner is..' The perils and pitfalls of rank order analysis

Folksonomy research has not developed a standardized toolbox of analytical strategies and generally relies on descriptive methods to investigate the structure, composition and evolution of tagging vocabularies. Rank order (RO) based on tag frequency has been used to study various aspects of folksonomies: how well tags categorize resources (Brooks & Montanez, 2005; Kipp & Campbell, 2006), tagging patterns (Munk & Mork, 2007), and trends in user interest (Ding et al., 2009). However, the results of simple RO analysis may be misleading. For example, a tag whose frequency of use increases across two years may actually account for a smaller percentage of total tags assigned in the second year, and a tag whose rank declines across two years may actually account for a greater percentage of tags in the second year than in the first. This research addresses the validity of the assumptions underlying RO analysis by investigating how RO correlates with both tag frequency and each tag's percentage of total tag frequency across various time periods. Effects of RO by frequency of tag occurrence, taggers and URLs were investigated in a targeted sample collected for 2004 through 2007. After all singleton tags were removed, the dataset consisted of 7,863 URLs to which 1,804,379 tags had been assigned by 186,075 taggers. Because tag frequency data are not normally distributed, non-parametric statistical tests were used to calculate correlations. The analysis found that the correlation between RO and frequency was high, but that the correlation between RO and percentage of actual use was problematic.
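The two cases described in the abstract can be illustrated with a minimal Python sketch. The tag names and counts below are hypothetical and chosen only for illustration; they are not taken from the study's dataset, and the helper functions `shares` and `ranks` are assumptions of this sketch rather than the authors' method.

```python
from collections import Counter

# Hypothetical tag counts for two consecutive years (illustrative only,
# not data from the study).
year1 = Counter({"web": 600, "design": 200, "python": 150, "music": 50})
year2 = Counter({"web": 660, "python": 616, "design": 550, "music": 374})


def shares(counts):
    """Return each tag's share of all tag assignments in the period."""
    total = sum(counts.values())
    return {tag: n / total for tag, n in counts.items()}


def ranks(counts):
    """Return rank order by frequency (1 = most frequent).

    Ties are broken alphabetically so the result is deterministic.
    """
    ordered = sorted(counts, key=lambda t: (-counts[t], t))
    return {tag: i + 1 for i, tag in enumerate(ordered)}


s1, s2 = shares(year1), shares(year2)
r1, r2 = ranks(year1), ranks(year2)

# Print frequency, share and rank side by side for each tag.
for tag in sorted(year1):
    print(
        f"{tag:8s} freq {year1[tag]:4d} -> {year2[tag]:4d}  "
        f"share {s1[tag]:.1%} -> {s2[tag]:.1%}  "
        f"rank {r1[tag]} -> {r2[tag]}"
    )
```

In this made-up sample, "web" is used more often in the second year (600 to 660 assignments) and keeps rank 1, yet its share of all assignments falls from 60% to 30%. Conversely, "design" drops from rank 2 to rank 3 while its share rises from 20% to 25%. Rank and raw frequency therefore say little about a tag's relative weight when the total volume of tagging changes between periods.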

This paper was not presented at the Conference due to illness of the authors.
