I have a data frame with (to simplify) judges, movies, and ratings (ratings are on a 1 star to 5 star scale):

d = data.frame(judge=c("alice","bob","alice"), movie=c("toy story", "inception", "inception"), rating=c(1,3,5))

I want to create a bar chart where the x-axis is the number of stars and the height of each bar is the number of ratings with that star.

If I do

ggplot(d, aes(rating)) + geom_bar()

this works fine, except that the bars aren't centered over each rating and the width of each bar isn't ideal.

If I do

ggplot(d, aes(factor(rating))) + geom_bar()

the order of the number of stars gets messed up on the x-axis. (On my Mac, at least; for some reason, the default ordering works on a Windows machine.) Here's what it looks like: [image]

I tried

ggplot(d, aes(factor(rating, ordered=T, levels=-3:3))) + geom_bar()

but this doesn't seem to help.

How can I get my bar chart to look like the above picture, but with the correct ordering on the x-axis?

+3  A: 

I'm not sure your sample data frame is representative of the images you put up. You mentioned your ratings are on a 1-5 scale, but your images show a -3 to 3 scale. With that said, I think this should get you going in the right direction:

Sample data:

d = data.frame(judge=sample(c("alice","bob","tony"), 100, replace = TRUE)
    , movie=sample(c("toy story", "inception", "a league of their own"), 100, replace = TRUE)
    , rating =  sample(1:5, 100, replace = TRUE))

You were closest with this:

ggplot(d, aes(rating)) + geom_bar()

Treating rating as a factor centers each bar over its label, and adjusting the default binwidth in geom_bar makes the bar widths more appropriate:

ggplot(d, aes(x = factor(rating))) + geom_bar(binwidth = 1)
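As far as I know, newer ggplot2 releases no longer accept binwidth in geom_bar() and use width for the bar width instead. A minimal sketch of the same plot under that assumption (the seeded sample data and the 0.8 width are just illustrative choices):

```r
library(ggplot2)

# Hypothetical sample data, seeded so the counts are reproducible.
set.seed(1)
d <- data.frame(rating = sample(1:5, 100, replace = TRUE))

# factor() makes the x-axis discrete, so each bar sits over its label.
# width is the fraction of each category's slot that the bar fills.
p <- ggplot(d, aes(x = factor(rating))) + geom_bar(width = 0.8)
print(p)
```

A width of 1 makes neighbouring bars touch; smaller values leave gaps between them.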


If you wanted to incorporate one of the other variables in the chart such as the movie, you can use fill:

ggplot(d, aes(x = factor(rating), fill = factor(movie))) + geom_bar(binwidth = 1)


It may make more sense to put the movies on the x axis and fill with the rating if you have a small number of movies to compare:

ggplot(d, aes(x = factor(movie), fill = factor(rating))) + geom_bar(binwidth = 1)

If this doesn't get you on your way, put up a more representative example of your dataset. I wasn't able to recreate the ordering problems, but that could be due to a difference in the sample data you posted and the data you are analyzing.
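If the ordering does come out wrong on your machine, one thing to try is spelling out the levels when you build the factor, so the axis order doesn't depend on any defaults. This is only a sketch, assuming ratings on the 1-5 scale from the question; the rating_f column name is made up for the example. The levels have to match the values actually present in the data, because anything not listed becomes NA.

```r
library(ggplot2)

# Hypothetical ratings on the 1-5 scale described in the question.
d <- data.frame(rating = c(1, 3, 5, 3, 5, 5))

# Spell out the levels so the axis order is fixed explicitly.
d$rating_f <- factor(d$rating, levels = 1:5)

# drop = FALSE keeps the unused levels (2 and 4) on the axis.
p <- ggplot(d, aes(x = rating_f)) +
  geom_bar() +
  scale_x_discrete(drop = FALSE)
print(p)
```

For data on a -3 to 3 scale like in your images, levels = -3:3 would be the equivalent.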

The ggplot website is also a great reference: http://had.co.nz/ggplot2/geom_bar.html

Chase