Monday, January 10, 2011

Find Top N : Solution 2

The problem is described here. Solution 1 is here. Server-side Javascript in XPages in IBM Lotus Domino offers a function sort on an array.

Let S strings be stored in an Array names. Calling names.sort() returns sortedNames. Let topNames stores the top N strings with highest occurrence count found so far in descending order of occurrence and topCount keeps track of respective occurrence count. Following is the rest of the algorithm.

For each string s in sortedNames
Begin
   Let start,end store index of s in sortedNames
   Advance end till sortedNames[end] becomes different from s
   Set uniqueCount = end - start
   For each count c in topCount
      if uniqueCount is greater than c
         Begin
            move all the elements in topNames and topCount back by one
            insert uniqueCount in topCount and s in topNames
         End
   Advance sortedNames by uniqueCount
End

Since I don't have complexity numbers for Array.sort, I take the worst case and assume it to be of $O(S^2)$. After that the above algorithm is executed which, if all the strings are unique, take $O(S \times N)$ time. So the time complexity of this algorithm is $O(S^2)$. Space complexity is $O(S + N)$. Just remember that in both the solutions, the comparison of strings are counted. So the algorithms depend on not just the number of strings but on their length too.

No comments:

Post a Comment