I want to have a site that is a simple blog so I created a model:

   class Post < ActiveRecord::Base
     attr_accessible :title, :body
   end

I want to use Markdown but without HTML tags. I also want to keep the database clean at all times, so my idea is to use before_save()/before_update() callbacks to sanitise my input and escape HTML.

I don't care about caching or performance, so I want to render the post each time it is needed. My idea is to add the following to the model:

   def body_as_html 
     html_from_markdown(body)
   end

What do you think of this design? MVC and ActiveRecord are new to me, and I am not sure whether these callbacks are the right choice.

+2  A: 

I see nothing obviously wrong with that method. Caching is very simple to enable if performance becomes an issue. The important thing for making caching useful is to reduce or eliminate the dynamic content on the page, so that the cache doesn't constantly become obsolete. If you're just showing the blog post, then the cache only needs to be regenerated if the post changes, or perhaps if someone adds a comment (if you have comments).
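As a rough sketch of "regenerate only when the post changes", assuming each post has an id and an updated_at timestamp, a cache can be keyed on both. The RenderedPostCache class here is hypothetical, not something Rails provides, and it never evicts old entries:

```ruby
# Hypothetical in-memory cache, keyed by post id and last-update time.
# A changed updated_at produces a new key, so stale HTML is never served.
class RenderedPostCache
  def initialize
    @store = {}
  end

  # Returns the cached HTML for (id, updated_at); runs the block on a miss.
  def fetch(id, updated_at)
    @store[[id, updated_at]] ||= yield
  end
end

cache = RenderedPostCache.new
renders = 0
saved_at = Time.at(0)

2.times { cache.fetch(1, saved_at) { renders += 1; "<p>first version</p>" } }
cache.fetch(1, saved_at + 60) { renders += 1; "<p>edited version</p>" }

puts renders  # 2: one render per version, the repeat view hit the cache
```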

Myrddin Emrys
+1 for this answer. Additionally, +1 for not caring about performance until later - this will be fine. It's always best to store completely in raw form until you need to use the content.
aronchick
A: 

My general rule of thumb is to keep the data in your database as "pure" as possible, and do any sanitization, rendering, escaping or general munging as close to the user as possible - typically in a helper method or the view, in a Rails app.

This has served me well for several reasons:

  • Different representations of your data may have different display requirements - if you implement a console interface at some point, you won't want to have all that HTML sanitization.
  • Keeping all munging as far out from the database as possible makes it clear whose responsibility it is to sanitize. Many tools or new developers maintaining your code may not realize that strings are already sanitized, leading to double-escaping and other formatting ugliness. This also applies to the "different representations" problem, as things can end up escaped in multiple different ways.
  • When you look in your database by hand, which will end up happening from time to time, it's nice to see things in their un-munged form.
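To make the double-escaping point concrete, here is a small sketch using CGI.escapeHTML from Ruby's standard library. The sample string is made up, and the "callback" and "view" are only simulated by two plain calls:

```ruby
require "cgi"

raw = "Tom & Jerry <3"

# Escaped once before storage, e.g. in a before_save callback.
stored = CGI.escapeHTML(raw)

# Escaped again by a view that doesn't know the string is already escaped.
shown = CGI.escapeHTML(stored)

puts stored  # Tom &amp; Jerry &lt;3
puts shown   # Tom &amp;amp; Jerry &amp;lt;3
```

A browser would display the second string as the literal text "Tom &amp; Jerry &lt;3", which is the formatting ugliness described above.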

So, to address your specific project, I would suggest having your users enter their text as Markdown and storing it straight into the database, without the before_save hook. (As an aside, before_save is called on both creation and update, so you wouldn't also need a before_update hook unless there was something specific that you wanted on update but not on creation.) I would then create a helper method, maybe sanitize_markdown, to do your sanitization. You could call that helper on the raw Markdown and generate your body HTML from the sanitized Markdown. This could go in the view or in another helper method, according to your taste and how many different places you were doing it, but I probably wouldn't put it in the model since it's so display-specific.
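A minimal sketch of that helper might look like the following. It assumes "sanitization" just means escaping HTML with the standard library's CGI.escapeHTML, and it takes the Markdown renderer as a parameter, since the question doesn't say which library is in use. The PostsHelper module, the post_body_html method, and the toy renderer are all hypothetical names for illustration:

```ruby
require "cgi"

# Hypothetical helper module; in a Rails app it might live under app/helpers.
module PostsHelper
  # Escape HTML so any tags the user typed are displayed as text.
  # Caveat: this also escapes ">", so Markdown blockquotes would need
  # separate handling.
  def sanitize_markdown(raw)
    CGI.escapeHTML(raw.to_s)
  end

  # renderer is any object that responds to call(markdown) and returns HTML.
  def post_body_html(raw, renderer)
    renderer.call(sanitize_markdown(raw))
  end
end

# Stand-in renderer for the demo only: it handles *emphasis* and nothing else.
toy_renderer = ->(md) { "<p>" + md.gsub(/\*(.+?)\*/, '<em>\1</em>') + "</p>" }

helper = Object.new.extend(PostsHelper)
puts helper.post_body_html("<script>x</script> *hi*", toy_renderer)
# <p>&lt;script&gt;x&lt;/script&gt; <em>hi</em></p>
```

The database keeps the raw Markdown exactly as typed; escaping and rendering happen only at display time.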

John Hyland
I agree with the general approach re: Markdown. But I think that sanitization can and should be done before storage. Assuming you use a whitelist approach to input, there is some data you just don't want in your system at all. I tend to sanitize user input on create, altering the user data accordingly, but escaping and processing still occur in the view.
Toby Hede
Hm. It seems to me that splitting your input munging to do some pre-storage and some post-storage is adding complexity and the possibility of confusion as to what happens when. What sort of data do you find universally and urgently bad enough to warrant that kind of split?
John Hyland