Hi,

I built a popularity cloud, but it doesn't work properly. The input text file looks like this:

1 Top Gear
3 Scrubs
3 The Office (US)
5 Heroes
5 How I Met Your Mother
5 Legend of the Seeker
5 Scrubs
.....

In my popularity cloud, each name is written as many times as its frequency. For example, Legend of the Seeker is written 5 times, and its size increases each time. Instead, every name should be written only once, and its size should reflect its popularity number (5 in this case). How can I fix this?
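One possible way to get each name written only once is to merge repeated entries before the cloud is rendered. The sketch below is only an illustration: `dedupe_tags` is a hypothetical helper that is not part of the posted program, and it assumes that when a name appears on several lines (as Scrubs does in the sample), the highest count is the one to display.

```python
def dedupe_tags(taglist):
    """Collapse repeated (tag, count) pairs to one entry per tag.

    Assumption: when a tag appears several times, the highest count is
    the popularity to display. Replace the comparison with a running
    sum if the lines are meant to be added together instead.
    """
    best = {}
    for tag, count in taglist:
        if tag not in best or count > best[tag]:
            best[tag] = count
    # Return pairs sorted by count, like getTagListSortedByFrequency does.
    return sorted(best.items(), key=lambda item: item[1])


# A few rows from the sample file, as (tag, count) pairs.
sample = [("Top Gear", 1), ("Scrubs", 3), ("The Office (US)", 3),
          ("Heroes", 5), ("Scrubs", 5)]
unique_tags = dedupe_tags(sample)
```

If this fits the data, the result could be passed to `getRanges` and `writeCloud` in place of the raw list, so each name would reach the output loop exactly once.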

My program should also satisfy this requirement:

Terms with the same frequency are typically displayed in the same colour e.g. Golf and Karate. Different frequencies are typically shown in different colours e.g. Basketball, Cricket and Hockey. At the bottom of each cloud output the frequency/count in the colour used to display the values in the cloud.
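The colour requirement could be approached by assigning one colour per distinct count and reusing that mapping for the legend at the bottom. This is a rough sketch, not a drop-in replacement for `writeCloud`: the `PALETTE` values and the `colour_by_count` and `render_cloud` names are made up for illustration, and font sizing and links are left out to keep it short.

```python
# Hypothetical palette; any list of CSS colours would do.
PALETTE = ["#1f77b4", "#d62728", "#2ca02c", "#ff7f0e", "#9467bd"]


def colour_by_count(taglist):
    """Map each distinct count to one colour.

    Colours are reused in a cycle if there are more distinct counts
    than palette entries, so a longer palette may be needed.
    """
    colours = {}
    for count in sorted(set(count for _, count in taglist)):
        colours[count] = PALETTE[len(colours) % len(PALETTE)]
    return colours


def render_cloud(taglist):
    """Return HTML: one coloured span per tag, then a legend of counts."""
    colours = colour_by_count(taglist)
    parts = []
    for tag, count in sorted(taglist):  # alphabetical by tag name
        parts.append('<span style="color: %s">%s</span>' % (colours[count], tag))
    legend = []
    for count in sorted(colours):
        # Each count is printed in the colour used for its tags.
        legend.append('<span style="color: %s">%d</span>' % (colours[count], count))
    return " ".join(parts) + "\n<p>" + " ".join(legend) + "</p>\n"


tags = [("Golf", 3), ("Karate", 3), ("Hockey", 5)]
html = render_cloud(tags)
```

Here Golf and Karate share a colour because they share a count, Hockey gets a different one, and the closing paragraph lists 3 and 5 in those same colours.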

My code is below. I would really appreciate any help. Thank you.

#!/usr/bin/python
import string

def main():
    # get the list of tags and their frequency from input file
    taglist = getTagListSortedByFrequency('tv.txt')
    # find max and min frequency
    ranges = getRanges(taglist)
    # write out results to output, tags are written out alphabetically
    # with size indicating the relative frequency of their occurrence
    writeCloud(taglist, ranges, 'tv.html')

def getTagListSortedByFrequency(inputfile):
    inputf = open(inputfile, 'r')
    taglist = []
    while (True):
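        # strip() removes the newline without eating the last character
        # of a final line that has no trailing newline
        line = inputf.readline().strip()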
        if (line == ''):
            break
        (count, tag) = line.split(None, 1)
        taglist.append((tag, int(count)))
    inputf.close()
    # sort taglist by count (key= also works on Python 3, where cmp is gone)
    taglist.sort(key=lambda x: x[1])
    return taglist

def getRanges(taglist):
    mincount = taglist[0][1]
    maxcount = taglist[len(taglist) - 1][1]
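    # bucket width; at least 1 so the loop below always advances
    distrib = max(1, (maxcount - mincount) // 4)
    index = mincount
    ranges = []
    while (index <= maxcount):
        # named bucket to avoid shadowing the built-in range
        bucket = (index, index + distrib - 1)
        index = index + distrib
        ranges.append(bucket)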
    return ranges

def writeCloud(taglist, ranges, outputfile):
    outputf = open(outputfile, 'w')
    outputf.write("<style type=\"text/css\">\n")
    outputf.write(".smallestTag {font-size: xx-small;}\n")
    outputf.write(".smallTag {font-size: small;}\n")
    outputf.write(".mediumTag {font-size: medium;}\n")
    outputf.write(".largeTag {font-size: large;}\n")
    outputf.write(".largestTag {font-size: xx-large;}\n")
    outputf.write("</style>\n")
    rangeStyle = ["smallestTag", "smallTag", "mediumTag", "largeTag", "largestTag"]
    # re-sort the tags alphabetically (key= also works on Python 3)
    taglist.sort(key=lambda x: x[0])
    for tag in taglist:
        # build the search URL once per tag rather than once per range
        url = "http://www.google.com/search?q=" + tag[0].replace(' ', '+') + "+site%3Asujitpal.blogspot.com"
        for rangeIndex, (low, high) in enumerate(ranges):
            if low <= tag[1] <= high:
                # clamp the index in case there are more ranges than styles
                style = rangeStyle[min(rangeIndex, len(rangeStyle) - 1)]
                outputf.write("<span class=\"" + style + "\"><a href=\"" + url + "\">" + tag[0] + "</a></span> ")
                break
    outputf.close()

if __name__ == "__main__":
    main()