views:

151

answers:

3

Web development isn't what it used to be. It used to consist of hacking together a few PHP scripts (I have nothing against PHP; it's currently my main programming language), uploading them via FTP to some web host, and that was that. Today, things are more complicated. Looking at a number of professional, modern websites (SO being the main one; I consider it a great example of good practice in web development, even if it's built with ASP.NET and hosted on Windows), I can see that developing a website involves much more than that:

  1. The website code is actually in a repository (that little svn revision in the footer makes my nerdy feelings tingle);
  2. Static files (CSS, JavaScript, images) are stored on a separate domain.

Ok, these were my observations. Now for my questions:

  1. What do you do with JavaScript and CSS files? Do you just not keep them under version control? That would seem stupid. Do you create a separate repository for them?
  2. How do you set up the repository? Do you just create one in the root of the web server? Or do you create some sort of post-commit trigger that copies the latest files to their appropriate destinations?
  3. What happens if you have multiple machines running the website and want to push some changes to all of them?
  4. Every such project has to have configuration files. These differ from the local repository to the remote one. For example, on my development machine I have no MySQL root password, while on the production server I certainly have a password. This password would be stored in a config file, amongst other such things, which would be completely different on my machine and on the server. Maybe they are different between production machines, too (like I said earlier, maybe the website runs on multiple machines for load balancing). How do I handle that?

I'm looking to start a new web project using:

  • Python + SQLAlchemy + Werkzeug + Jinja2
  • Apache httpd + modwsgi
  • MySQL
  • Mercurial

What I'd like is some best practice advice on using the aforementioned tools and answers to my questions above.

+2  A: 

While I have little experience working with the tools you've mentioned, except for MySQL, I can give you a few fairly standard answers for the questions you posted.

1) Depends on the details, but most often you keep them in the same repository, in a separate folder.

2) Just because something is committed to the repository doesn't mean that it's ready to go live; it's quite often an intermediary build that could be riddled with bugs. A publish is done manually, with an export from the repository. Serving the site from the same folder as an svn checkout is a huge no-no, as the .svn folder contains quite a bit of sensitive information, such as how to push changes to the svn server.

3) You use some sort of NAS or SAN solution, or simply a network share on one of the servers, and read all your data from there. That way, when you push files to one place, they're accessible to all servers. If your network is slow, you set up scripts that push the files out to all the servers automatically from a single location. If you use a multi-server environment in ASP.NET, don't forget to set the same machine key in the config files on every server, or your shared encrypted data, like the viewstate, won't work across servers. Having a session store in a database is also a good idea.

4) I've got a post-build step that triggers only on publish. It replaces my database connection strings with production ones and changes my Production app config value from false to true in the published web.config/app.config files. I can't see any case where you'd want different config files for different servers serving the same content.
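
For the Python stack in the question, the export-then-publish idea from points 2 and 4 could look roughly like the sketch below. It is only an illustration: the `publish` function, the `config.json` file name, and the setting names are all assumptions made up for this example, not part of any tool mentioned above. It copies a working copy to a release folder while leaving out version-control metadata, then writes production values over the development ones.

```python
import json
import shutil
import tempfile
from pathlib import Path

# Version-control metadata that should never reach the web root.
VCS_DIRS = (".svn", ".hg", ".git")


def publish(working_copy: Path, target: Path, overrides: dict) -> dict:
    """Copy working_copy to target without VCS metadata, then write the
    production settings over the development ones in config.json.

    config.json and its keys are hypothetical names for this sketch.
    """
    # copytree requires that target does not exist yet.
    shutil.copytree(working_copy, target,
                    ignore=shutil.ignore_patterns(*VCS_DIRS))
    config_path = target / "config.json"
    config = json.loads(config_path.read_text())
    config.update(overrides)  # shallow merge: production values win
    config_path.write_text(json.dumps(config, indent=2))
    return config


def demo() -> tuple:
    """Build a throwaway working copy and publish it."""
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "checkout"
        (src / ".hg").mkdir(parents=True)
        (src / ".hg" / "hgrc").write_text("[paths]\ndefault = ssh://example.invalid/repo\n")
        (src / "config.json").write_text(
            json.dumps({"db_password": "", "production": False}))
        dst = Path(tmp) / "release"
        config = publish(src, dst,
                         {"db_password": "from-a-secret-store", "production": True})
        return config, (dst / ".hg").exists(), (dst / "config.json").exists()
```

In practice the production overrides would come from somewhere outside the repository, so that the real password is never committed.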

If something is unclear, just comment and I'll try to clarify.

Good luck! // Eric Johansson

CERIQ
Felix
A: 

I think you are mixing two different aspects: source control and deployment. Just because you have all your files in a single repository doesn't mean they have to be deployed that way. It's also arguable whether you should deploy directly from source control or instead use a build/deploy script, which could handle any number of configurations.

Also, hosting static files on a separate domain only really becomes worthwhile on high-traffic websites. Are you sure you aren't prematurely optimising?

micmcg
When creating web applications with Python, there is no "document root" in your webserver configuration. All requests are handled by one application file (basically a Python script). This makes hosting static files on a separate domain even more useful, even if, at first, that separate domain is hosted on the same machine as the primary one.
Felix
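
To make the "no document root" point concrete, here is a minimal sketch of a WSGI application using only plain Python. The route table, the handler names, and the `static.example.invalid` host are assumptions invented for this example; a real project on the stack from the question would more likely use Werkzeug's routing. The only convention relied on is that mod_wsgi looks for a callable named `application` by default.

```python
# Hypothetical host that serves CSS, JavaScript and images.
STATIC_BASE = "http://static.example.invalid"


def home(environ):
    # Pages only link to static files; the app itself never serves them.
    return "200 OK", f'<link rel="stylesheet" href="{STATIC_BASE}/css/site.css">'


def about(environ):
    return "200 OK", "<p>About</p>"


# Every URL the application answers has to be listed here.
ROUTES = {"/": home, "/about": about}


def application(environ, start_response):
    """WSGI entry point: every request reaches this one callable."""
    handler = ROUTES.get(environ.get("PATH_INFO", "/"))
    if handler is None:
        status, body = "404 Not Found", "<p>Not found</p>"
    else:
        status, body = handler(environ)
    payload = body.encode("utf-8")
    start_response(status, [("Content-Type", "text/html; charset=utf-8"),
                            ("Content-Length", str(len(payload)))])
    return [payload]


def call(path):
    """Invoke the app the way a WSGI server would, for a quick check."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    environ = {"PATH_INFO": path, "REQUEST_METHOD": "GET"}
    body = b"".join(application(environ, start_response))
    return captured["status"], body.decode("utf-8")
```

A request for `/css/site.css` gets a 404 from this application because no file on disk is ever mapped to a URL, which is why the static files need a home of their own.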
+2  A: 

You're right, things can get complicated when trying to deploy a scalable website. Here are a few guidelines I've found to work well (disclaimer: I'm a Rails engineer):

  1. Most of the decisions regarding file structure for your code repository are based on the conventions of the language, framework and platform you choose. Many of the questions you brought up (JS, CSS, assets, production vs development) are handled by Rails. However, that may differ from PHP to Python to whichever other language you want to use. I've found you should do some research about the language you're choosing, and try to fit the conventions of that community. This will help when you're looking for help with an obstacle later: your code will be organized like their code, and you'll be able to get answers more easily.

  2. I would version control everything that isn't very substantial in size. The only problem I've found with VC is when your repo gets large. Apart from that I've never regretted keeping a version of previous code.

  3. For deployment to multiple servers, there are many scripts that can help you accomplish what you need to do. For Ruby/Rails, the most widely used tool is Capistrano. There are comparable resources for other languages as well. Basically you just need to configure what your server setup is like, and then write or look to open source for a set of scripts that can deploy/rollback/manipulate your codebase to the servers you've outlined in your config file.

  4. Development vs Production is an important distinction to make. While you can operate without that distinction, it becomes cumbersome quickly when you're having to patch up code all over your repository. If I were you, I'd write some code that is run at the beginning of every request that determines what environment you're running in. Then you have that knowledge available to you as you process that request. This information can be used when you specify which configuration you want to use when you connect to your db, all the way to showing debug information in the browser only on development. It comes in handy.

  5. Being RESTful often dictates much of your design with regard to how your site's pages are discovered. Trying to keep your code within the RESTful framework helps you remember where your code is located, keeps your routing predictable, keeps your code from becoming too coupled, and follows a convention that is becoming more and more accepted. There are obviously other conventions that can accomplish these same goals, but I've had a great experience using REST and it's improved my code substantially.
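
As a rough Python illustration of point 4, the environment check could be a small function that is called once at startup or at the top of each request. Everything named here is an assumption for the sake of the example: the `APP_ENV` variable, the `CONFIGS` table, and the connection URLs are hypothetical.

```python
import os

# Hypothetical per-environment settings; real passwords should not live here.
CONFIGS = {
    "development": {"db_url": "mysql://root@localhost/app_dev", "debug": True},
    "production": {"db_url": "mysql://app@db.example.invalid/app", "debug": False},
}


def current_environment(environ=None):
    """Return the environment name from APP_ENV (a made-up variable name).

    Falls back to "production" so debug output is never enabled by accident.
    """
    environ = os.environ if environ is None else environ
    name = environ.get("APP_ENV", "production")
    if name not in CONFIGS:
        raise ValueError(f"unknown environment: {name!r}")
    return name


def load_config(environ=None):
    """Return a copy of the settings for the current environment."""
    name = current_environment(environ)
    config = dict(CONFIGS[name])
    config["environment"] = name
    return config
```

Code that connects to the database or decides whether to show debug information can then ask `load_config()` instead of carrying its own development/production checks.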

All that being said, I've found that while you can have good intentions to make a pristine codebase that scales infinitely and is nice and clean, it rarely turns out that way. If I were you, I'd do a small amount of research on what you feel most comfortable with and what will make your life easier, and go with that.

Hopefully that helps!

Matt
Thanks for the great answer. However, what do you mean by being RESTful? I thought REST referred to how you design APIs, not how you code. Plus, REST mandates statelessness, and I really can't imagine a website that doesn't use sessions.
Felix
When strictly used with an API, yeah, you're not going to keep state. However, there is nothing wrong with using that same methodology in designing how your application layer routes URLs. This ideology is heavily entrenched in how Rails approaches routing. By considering each db table as a resource, using an ORM to access those resources, and making that the Model portion of your MVC layout, you end up with URLs that are predictable and code that is well organized, fits the convention, and is very DRY. And when the day comes that you're incorporating an API, it's a much easier transition.
Matt
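
The predictable URLs described in the comment above can be sketched in a few lines of Python. The `resource_routes` helper is hypothetical and the `<id>` placeholder syntax is just one possible choice; the seven action names follow the Rails convention for a resource (Rails also accepts PATCH for `update`).

```python
def resource_routes(name):
    """Return (method, path, action) triples for one resource,
    following the seven conventional Rails actions."""
    base = f"/{name}"
    member = f"{base}/<id>"
    return [
        ("GET", base, "index"),             # list all records
        ("GET", f"{base}/new", "new"),      # form for a new record
        ("POST", base, "create"),           # save a new record
        ("GET", member, "show"),            # display one record
        ("GET", f"{member}/edit", "edit"),  # form for an existing record
        ("PUT", member, "update"),          # save changes to a record
        ("DELETE", member, "destroy"),      # delete a record
    ]
```

Calling `resource_routes("posts")` gives the same seven routes for every table-backed resource, which is what makes the URLs easy to guess.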