views:

1582

answers:

9

I've decided to put some effort into building a web browser from scratch. Any recommendations on how to get started? Advice on design patterns and best practices, not only on coding, is highly appreciated.
PS: I'm using C# .NET

+4  A: 

You could start with well-formed and valid XHTML, which should be easier than the tag soup your browser will encounter in real "life".

Then you must find a way to bend the real HTML from the web to your needs.

But don't kid yourself: A browser isn't a small project.
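To make the XHTML-versus-tag-soup gap concrete, here is a minimal sketch (in Python for brevity rather than C#; the same idea applies to any strict XML parser). It assumes you start with a strict parser: well-formed XHTML turns into a tree directly, while tag soup is simply rejected. Both sample documents and the helper names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample inputs, invented for illustration.
WELL_FORMED = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>Hello</title></head>"
    "<body><p>First<br/>paragraph</p></body>"
    "</html>"
)
TAG_SOUP = "<html><body><p>Unclosed paragraph<br><b>bold</body></html>"

XHTML_NS = "{http://www.w3.org/1999/xhtml}"


def parse_strict(markup):
    """Return the root element, or None if the markup is not well-formed XML."""
    try:
        return ET.fromstring(markup)
    except ET.ParseError:
        return None


def element_names(root):
    """List tag names in document order, with the XHTML namespace stripped."""
    return [el.tag.replace(XHTML_NS, "") for el in root.iter()]


if __name__ == "__main__":
    print(element_names(parse_strict(WELL_FORMED)))
    print(parse_strict(TAG_SOUP))  # the strict parser gives up on tag soup
```

Everything a real browser does to recover from input like TAG_SOUP is extra work on top of this.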

stesch
+2  A: 

You mean as in writing your own rendering engine?

I can only say good luck. Many man-years have gone into the current generation of the various browsers. If you want to do better than any of them, you will need some serious skills. If you have to ask where to start, you probably have more than a few years of study to go before it would make any sense to attempt such a task.

That said, here are some (obvious) pointers:

  1. write lots of code that does small things, like solving all the Project Euler (projecteuler.net) problems
  2. learn everything you can about your toolkit and its community standards
  3. write lots more code
  4. get a real solid grasp of finite state machines
  5. write yet more code
  6. learn all about the TCP/IP stack and how it's used for HTTP
  7. learn all you can about HTTP
  8. learn the standards (HTML, XML, SGML, CSS)
  9. celebrate your 150th birthday.
  10. get started on the actual browser project.
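As a taste of why point 4 matters: markup tokenizers are usually written as state machines. Below is a minimal two-state sketch (in Python for brevity; it ports directly to C#). The state names and the `tokenize` function are made up for illustration, and a real HTML tokenizer needs many more states for attributes, comments, entities and so on.

```python
from enum import Enum, auto


class State(Enum):
    TEXT = auto()  # reading character data between tags
    TAG = auto()   # reading the inside of <...>


def tokenize(markup):
    """Split markup into ("text", s) and ("tag", s) tokens with a two-state machine."""
    state = State.TEXT
    buffer = []
    tokens = []

    for ch in markup:
        if state is State.TEXT:
            if ch == "<":
                # Leaving TEXT: emit what was collected, then switch state.
                if buffer:
                    tokens.append(("text", "".join(buffer)))
                buffer = []
                state = State.TAG
            else:
                buffer.append(ch)
        else:  # State.TAG
            if ch == ">":
                tokens.append(("tag", "".join(buffer)))
                buffer = []
                state = State.TEXT
            else:
                buffer.append(ch)

    # End of input: an unterminated "<..." is kept as plain text.
    leftover = "".join(buffer)
    if state is State.TAG:
        tokens.append(("text", "<" + leftover))
    elif leftover:
        tokens.append(("text", leftover))
    return tokens


if __name__ == "__main__":
    for token in tokenize("<p>Hi <b>there</b></p>"):
        print(token)
```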

edit below here

I didn't mean for it to be either motivating or demotivating, just an attempt to show you that a browser is a really big project and that really big projects require a whole lot of thought. Blunt honesty sprinkled with humour.

I've been programming for over two thirds of my life and I like to think that I am a pretty decent programmer, but it would be foolish of me to think that I'd stand half a chance at writing a decent web browser from scratch.

Of course, if this is what you want to do, don't let my comment stand in your way. You can probably do better than Internet Explorer.

Kris
I should've mentioned: If you want to create an apple-pie from scratch, you have to start by creating a universe.
Kris
A: 

...then start worrying about security

(non-functional and cross-cutting concerns should generally be considered up front, though :) )

Matt
+10  A: 

It's an insanely ambitious project (especially for a single developer) but something I'd love to do someday - you could learn so much from it.

I don't know a lot about how the protocols work (something that you definitely need to research) or much about what goes on in a browser, but a great place to start would be the source of the open-source browsers, primarily Chrome and Firefox. Chrome is an especially good project to look at, as it only does what I'd expect you to start with: the chrome and the backend of the browser. Forget creating a rendering engine at first; use WebKit or Gecko.

Ross
Bill: Inanely is a word, but I suppose they're roughly the same really: http://dictionary.reference.com/browse/inanely
Ross
+23  A: 

Well, break it down into pieces. What is a Web browser? What does it do? It:

  • Fetches external content. So you need an HTTP library or (not recommended) to write this yourself. There's a lot of complexity/subtlety to the HTTP protocol, e.g. handling of Expires headers, different versions (although it's mostly 1.1 these days), etc.;
  • Handles different content types. There's a Windows registry for this kind of thing that you can piggyback on. I'm talking about interpreting content based on MIME type here;
  • Parses HTML and XML: to create a DOM (Document Object Model);
  • Parses and applies CSS: this entails understanding all the properties, all the units of measure and all the ways values can be specified (e.g. "border: 1px solid black" vs the separate border-width, etc. properties);
  • Implements the W3C visual model (and this is the real kicker); and
  • Has a JavaScript engine.
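To give a feel for the CSS point, here is a minimal sketch of expanding the "border" shorthand into its separate properties (in Python for brevity rather than C#). The function name `expand_border` is made up for illustration, and the classification is deliberately naive: it splits on whitespace, so a color written as "rgb(0, 0, 0)" would break it. Omitted parts fall back to the CSS initial values.

```python
BORDER_STYLES = {
    "none", "hidden", "dotted", "dashed", "solid",
    "double", "groove", "ridge", "inset", "outset",
}
WIDTH_KEYWORDS = {"thin", "medium", "thick"}


def expand_border(value):
    """Expand a 'border' shorthand value into width, style and color longhands."""
    # Start from the initial values; the shorthand resets whatever it omits.
    result = {
        "border-width": "medium",
        "border-style": "none",
        "border-color": "currentcolor",
    }
    for part in value.split():
        lower = part.lower()
        if lower in BORDER_STYLES:
            result["border-style"] = lower
        elif lower in WIDTH_KEYWORDS or lower[0].isdigit() or lower[0] == ".":
            # Naive length check: anything starting with a digit or a dot.
            result["border-width"] = lower
        else:
            # Assume everything else is a color name or hex value.
            result["border-color"] = lower
    return result


if __name__ == "__main__":
    print(expand_border("1px solid black"))
```

Every shorthand property (margin, padding, font, background, ...) needs its own rules like this, which is part of why "parses and applies CSS" is a bigger job than it sounds.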

And that's basically a Web browser in a nutshell. Now some of these tasks are incredibly complex. Even the easy-sounding ones can be hard. Take fetching external content. You need to deal with use cases like:

  • How many concurrent connections to use?
  • Error reporting to the user;
  • Proxies;
  • User options;
  • etc.
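As a rough illustration of the first two points, here is a minimal sketch of fetching several resources with a cap on concurrent connections while collecting errors for the user (in Python for brevity rather than C#). The limit of 6 and the names `fetch`, `fetch_all` and `fetch_one` are assumptions chosen for the example, not recommendations.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

MAX_CONNECTIONS = 6  # assumed limit, chosen for illustration


def fetch(url, timeout=10):
    """Download one resource and return its body as bytes."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_all(urls, fetch_one=fetch, max_connections=MAX_CONNECTIONS):
    """Fetch every URL with at most max_connections requests in flight.

    Returns {url: ("ok", body)} or {url: ("error", message)} so that
    failures can be reported to the user instead of aborting the page.
    """
    def attempt(url):
        try:
            return url, ("ok", fetch_one(url))
        except Exception as exc:  # a sketch: real code would be more selective
            return url, ("error", str(exc))

    # The pool size is what caps the number of simultaneous requests.
    with ThreadPoolExecutor(max_workers=max_connections) as pool:
        return dict(pool.map(attempt, urls))
```

Passing the download function in as `fetch_one` keeps the scheduling logic testable without touching the network.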

The reason I and others are collectively raising our eyebrows is that the rendering engine is hard (and, as someone noted, man-years have gone into their development). The major rendering engines around are:

  • Trident: developed by Microsoft for Internet Explorer;
  • Gecko: used in Firefox;
  • WebKit: used in Safari and Chrome;
  • KHTML: used in the KDE desktop environment. WebKit forked from KHTML some years ago;
  • Presto: used in Opera since version 7;
  • Elektra: used in Opera 4-6.

The top three have to be considered the major rendering engines used today.

JavaScript engines are also hard. There are several of these that tend to be tied to the particular rendering engine:

  • SpiderMonkey: used in Gecko/Firefox;
  • TraceMonkey: will replace SpiderMonkey in Firefox 3.1 and introduces JIT (just-in-time) compilation;
  • KJS: used by Konqueror, tied to KHTML;
  • JScript: the JavaScript engine of Trident, used in Internet Explorer;
  • JavaScriptCore: used in WebKit by the Safari browser;
  • SquirrelFish: will be used in WebKit and adds JIT like TraceMonkey;
  • V8: Google's JavaScript engine used in Chrome;
  • Opera also uses its own.

And of course there's all the user interface stuff: navigation between pages, page history, clearing temporary files, typing in a URL, autocompleting URLs and so on.
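Even this "easy" part has some structure to it. Here is a minimal sketch of page history with back/forward navigation and prefix-based URL autocompletion (in Python for brevity rather than C#). The `History` class and its method names are made up for illustration.

```python
class History:
    """Back/forward navigation plus a record of visited URLs for autocompletion."""

    def __init__(self):
        self._entries = []   # the back/forward list
        self._index = -1     # position of the current page in _entries
        self._visited = []   # every distinct URL ever visited

    @property
    def current(self):
        return self._entries[self._index] if self._entries else None

    def visit(self, url):
        # Navigating to a new page discards any "forward" entries.
        del self._entries[self._index + 1:]
        self._entries.append(url)
        self._index += 1
        if url not in self._visited:
            self._visited.append(url)

    def back(self):
        if self._index > 0:
            self._index -= 1
        return self.current

    def forward(self):
        if self._index < len(self._entries) - 1:
            self._index += 1
        return self.current

    def complete(self, prefix):
        """Return visited URLs that start with prefix, oldest first."""
        return [url for url in self._visited if url.startswith(prefix)]


if __name__ == "__main__":
    history = History()
    for url in ("a.example", "b.example", "c.example"):
        history.visit(url)
    print(history.back())         # b.example
    history.visit("d.example")    # c.example is no longer reachable via forward
    print(history.forward())      # d.example (already at the newest entry)
    print(history.complete("c"))  # ['c.example']
```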

That's a lot of work.

cletus
* Gecko :) Also agree. Main parts are HTML renderer and JavaScript engine.
abatishchev
Opera created their own. Presto is the current one, and Elektra was their previous one.
Tim Sullivan
Great in-depth answer - I forgot about JavaScript parsing altogether!
Ross
+2  A: 
CMS
Oh did GHC carry on? Think I unsubscribed when they started doing that Ross's lair thing.
Ross
@Ross: Yeah, they still deliver the comic, that new guy is called Boris from Russia, and he is a "Super Hacker" LOL
CMS
+1  A: 

As everyone else has already said, a web browser is a huge project. You've got to worry about TCP/IP and sockets, rendering HTML, using CSS, creating a DOM, executing JavaScript, dealing with malformed markup and code, and handling all types of files before you can even think about all the things people expect from a browser (e.g. bookmarks, history, private browsing, security, etc.). It's a huge project.

That being said, it can be done. My suggestion would be to go look at the source of Firefox. I know that you said you want to build a browser from scratch, but it would be very helpful to learn from an open-source project first.

I would download the Firefox source, and slowly strip it down. In other words, I would take the source and remove all bookmarking functionality. Then, I'd remove the ability to handle addons. Then, I'd delete all code regarding saving files. I would continue this process until I got a very basic web browser. I'd look over that code.

Then, I'd start building my own. I'd take the knowledge I'd gained from taking apart Firefox, and I'd put it into building a new browser.

A whole lot of luck to you!

stalepretzel
A: 

A very ambitious project, but one developer can't do this alone: you need a team (project manager, testers, ...). Maybe you should also review your choice of language; C# works only on Windows (I know there's Mono on Linux, but it is not the same). Anyway, I wish you good luck, and I'll be happy to use your browser :D

Yassir
A: 

You really have a lot of free time on your hands, don't you? AFAIK, most browsers were written in C++. Also, not all users have the .NET framework installed on their computers, and if they do, it might not be the version you need.

This could take you years, but anyway, there are many open-source browsers out there (Firefox, Google Chrome, etc.), so you could start by having a look at the code. Good luck with that :)

Waleed Eissa