views:

952

answers:

8

What happens when you log into a website?

I know cookies are stored and some info (what info?) gets sent to the server...but maybe some more detail?

+5  A: 

This completely depends on the implementation of the website. Even the usage of cookies is not mandatory, but very common.

In most cases however, something like this happens:

  • You send in your username and password using an HTML form.
  • The server looks up the relevant user (using a database).
  • The server checks if the password matches the password that is stored in the database alongside the user.
  • If the password is correct, the server records in the session which user is currently active. The identifier of this session is stored in a cookie; the actual data of this session (the current user) is stored on the server under this identifier.

Now you are logged in. You will remain logged in during the remainder of the session:

  • When you request another page from the server, you will send the cookie with the session identifier.
  • The server loads the session using this identifier. In this session, the current user is stored, so the server knows what user is logged in.
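
The steps above can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular site does it: `USERS`, `SESSIONS`, `login` and `current_user` are made-up names, the "database" is a dict, and the password is compared in plaintext purely to keep the sketch short (a real site would store and compare a hash).

```python
import secrets

# Hypothetical in-memory stand-ins for the user database and the session store.
USERS = {"alice": "correct horse"}   # username -> password (plaintext only for brevity)
SESSIONS = {}                        # session id -> username


def login(username, password):
    """Return a new session id on success, or None if the credentials are wrong."""
    stored = USERS.get(username)
    if stored is None or not secrets.compare_digest(stored.encode(), password.encode()):
        return None
    session_id = secrets.token_urlsafe(32)   # unguessable identifier, sent to the browser as a cookie
    SESSIONS[session_id] = username          # the actual session data stays on the server
    return session_id


def current_user(session_id):
    """Look up which user, if any, a request's session cookie belongs to."""
    return SESSIONS.get(session_id)
```

On every later request the server would call something like `current_user()` with the identifier taken from the cookie; an unknown identifier simply means "not logged in".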
wvanbergen
On point 3: the server takes a hash of the password and checks whether it matches the hash stored in the database. Plaintext passwords should never be stored.
Tom
When requesting a page after having logged in, the server can perform additional checks, e.g. see if the request comes from the same IP address as the login.
Treb
+11  A: 

That's a pretty general question. What you're doing, overall, is establishing some kind of credentials with the site itself. If we take the simple version, you enter a user name and a password; that means you identify yourself to the website, and then show it a secret you and the website share that no one else knows (we hope). That establishes you as authentically the person with that user name, and so we say you have authenticated yourself.

Once you've done so, there are some design decisions the website designer has to make. Most people don't want to log in for every page, so the web site wants to store a little information, a credential, on your end. This means that it can tell it's still you. Often, as you say, that's a "cookie", which is nothing more than a tiny text file named with the web site's URL. This file is stored by the browser.

On many web sites, such as banking sites, you also want to guarantee that the data being exchanged can't be intercepted by a third party. If so, you establish a secure connection using a protocol known as SSL or TLS. What this adds to the basic connection is an exchange of information that establishes a session key. This session key is then used to encrypt the communications. This usually happens before you exchange the user name and password, so that your password is never visible to a malicious third party either.

Under the covers, when you establish a secure connection, the web site sends your browser a block of formatted data called an x509 certificate. This is another form of authentication; the certificate will have been signed by an issuer (the certificate authority or "CA") and the browser can use stored data about the CAs to ensure that the certificate is authentic.
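
To make the certificate check a bit more concrete, here is a rough sketch of what a client does, using Python's standard `ssl` module. The function names `make_verifying_context` and `fetch_certificate` are my own, and a browser does considerably more than this; treat it as an illustration only.

```python
import socket
import ssl


def make_verifying_context():
    # The default context loads the system's trusted CA certificates and
    # requires the server's certificate chain and hostname to check out.
    return ssl.create_default_context()


def fetch_certificate(host, port=443):
    """Connect over TLS and return the server's validated certificate as a dict."""
    context = make_verifying_context()
    with socket.create_connection((host, port), timeout=10) as sock:
        # The handshake (certificate check and session key setup) happens here;
        # it raises ssl.SSLCertVerificationError if the certificate is not trusted.
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()
```

Calling `fetch_certificate()` with a host name needs network access; the returned dict includes fields such as the issuer and the validity dates.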

Charlie Martin
+49  A: 
Paul Dixon
I love the dialogue.
Gamecat
Making this one a favorite so I'll have it ready when grandma (or anyone else!) asks.
slothbear
See http://stackoverflow.com/questions/549/the-definitive-guide-to-website-authentication-beta for more details on what might be happening.
Emil
Good link, have expanded on it in answer
Paul Dixon
+3  A: 

When you log into a web site, first your credentials are authenticated. If your credentials match, then something is put into the session (on the server) to keep track of who you are so you can access data that is yours without having to log in again. This is obviously useless on the web server unless the client can provide information about who it is on each request. Note that the "Session" is usually maintained entirely on the web server, with the client having only a key that allows access to the session.

Remember that HTTP itself is a stateless protocol. The HTTP standard contains no method for HTTP requests to keep or persist any state between individual HTTP requests. Thus, the state is usually kept entirely on the server and you just need a method for the client to identify which session the current HTTP request belongs to.

The two common ways this is done are:

  • Use a cookie (for example, Apache Tomcat uses the JSESSIONID cookie) to store some hashed authentication token that will successfully look up the web session, or
  • rewrite the URL so that every request has the session ID added to the end of the request. Still using Apache Tomcat as the example, if cookies are disabled then the URL will be rewritten to end with a string like ";jsessionid=....". Thus, every request, every HTTP GET and POST (and the rest) will end with this string.

Thus, on each request the client makes, the session ID is provided to the web server, allowing the persisted state for this client to be quickly looked up, allowing HTTP to act like a stateful protocol.
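
As a rough illustration of the cookie variant, the sketch below builds the `Set-Cookie` value a server might send after login and then reads the session ID back out of the `Cookie` header of a later request. It uses Python's standard `http.cookies` module; the helper names are made up, and only the cookie name `JSESSIONID` comes from the Tomcat example above.

```python
import secrets
from http.cookies import SimpleCookie


def make_session_cookie_header(session_id):
    """Build the value of a Set-Cookie response header carrying the session ID."""
    cookie = SimpleCookie()
    cookie["JSESSIONID"] = session_id
    cookie["JSESSIONID"]["path"] = "/"
    cookie["JSESSIONID"]["httponly"] = True   # hide the cookie from page scripts
    return cookie["JSESSIONID"].OutputString()


def read_session_id(cookie_header):
    """Extract the session ID from the Cookie header of a later request."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("JSESSIONID")
    return morsel.value if morsel is not None else None


session_id = secrets.token_hex(16)                           # random, unguessable ID
set_cookie_value = make_session_cookie_header(session_id)   # server -> browser
```

The browser stores the cookie and sends it back unchanged, so `read_session_id()` on the next request yields the same ID the server generated.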

What information is sent to the server when you log in? Whatever information you provided on the login form. Some web servers also track the TCP/IP address the request came from to avoid session hijacking attacks. This is usually all the information that is needed by the server.

If you don't allow your browser to save cookies, then you will have to log in to the web server each time you open your browser and initially open the server's web page. However, if you allow your browser to save cookies, then many servers allow you the option of saving the cookie (that is, not just using a session cookie) so that each time you go to a web page of the server, the persisted cookie will identify you so you don't need to log in again. Here, the cookie will save enough information -- often in an encrypted form that only the server can understand -- to identify you. In this case, the cookie is not a simple session ID.
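
One way such a self-contained cookie could work is sketched below. Note that this swaps in a simpler technique: the cookie is signed with an HMAC rather than encrypted, so the client can read the contents but cannot alter them without the server noticing. `SECRET_KEY`, the `username|expiry|signature` layout and the function names are all assumptions made for the example.

```python
import hashlib
import hmac
import time

SECRET_KEY = b"replace-with-a-long-random-server-secret"   # hypothetical; never sent to clients


def _sign(payload):
    return hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()


def make_persistent_cookie(username, lifetime_seconds, now=None):
    """Return a cookie value of the form 'username|expiry|signature'."""
    now = time.time() if now is None else now
    payload = f"{username}|{int(now + lifetime_seconds)}"
    return f"{payload}|{_sign(payload)}"


def read_persistent_cookie(value, now=None):
    """Return the username if the cookie is untampered and unexpired, else None."""
    now = time.time() if now is None else now
    try:
        username, expiry, signature = value.rsplit("|", 2)
    except ValueError:
        return None   # not in the expected three-part format
    expected = _sign(f"{username}|{expiry}")
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return None   # signature mismatch: the cookie was altered or forged
    if not expiry.isdigit() or int(expiry) < now:
        return None   # expired
    return username
```

Only the server knows `SECRET_KEY`, so only the server can produce a signature that passes the check.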

Eddie
A: 

As others have mentioned, login procedures vary depending on implementation, but the basic case (simple web app authentication) uses something like the following pseudocode:

function login(username, password) {
    user = db->get_user(username)

    if (user == false) {
        report_error("Unknown username")
        exit
    }

    if (user->password != hash(password)) {
        report_error("Incorrect password")
        exit
    }

    // User authenticated, set session cookie
    session->set_data('current_user', user->username)
}

Of course, in most cases, it gets a little more involved than that, but every login function starts its life looking essentially like the above. Now, if we add autologin ("remember me") to the mix, we get something like this:

function login(username, password, remember_me) {
    user = db->get_user(username)

    if (user == false) {
        report_error("Unknown username")
        exit
    }

    if (user->password != hash(password)) {
        report_error("Incorrect password")
        exit
    }

    // User authenticated, set session cookie
    session->set_data('current_user', user->username)

    if (remember_me == true) {
        cookie_token = random_string(50)
        set_cookie('autologin_cookie', cookie_token, ONE_MONTH)
        // Finally, save a hash of the random token in the user table
        db->update_user(user, 'autologin_token', hash(cookie_token))
    }
}

Plus the function to perform the automatic login if there is a cookie present:

function cookie_login() {
    cookie = get_cookie('autologin_cookie')

    if (cookie == false) {
        return false
    }

    // Only for demonstration; cookie should always include username as well
    // The user table holds a hash of the token, so hash the cookie before the lookup
    user = db->get_user_by_cookie(hash(cookie))

    if (user == false) {
        // Corrupt cookie data or deleted user
        return false
    }

    // User authenticated, set session cookie
    session->set_data('current_user', user->username)
    return true
}

NOTE: The above isn't a 'best practices' approach, and it's not very secure. In production code, you would always include a user identifier in the cookie data, use several levels of throttling, store data on failed and successful logins, etc. All of this has been stripped away to make the basic structure of authentication simple to follow.
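
For anyone who wants to run the "remember me" part, here is a rough Python rendering of the same idea, with the username included in the cookie as the note suggests. It is still a sketch: `AUTOLOGIN_HASHES` is a dict standing in for the user table, the `username:token` cookie layout is my own assumption (so usernames must not contain a colon), and throttling and logging are left out.

```python
import hashlib
import secrets

# Hypothetical stand-in for the user table: username -> stored autologin token hash.
AUTOLOGIN_HASHES = {}


def hash_token(token):
    # A plain SHA-256 is a reasonable choice here because the token is long and random.
    return hashlib.sha256(token.encode()).hexdigest()


def issue_autologin_cookie(username):
    """Create a token, remember only its hash, and return the cookie value."""
    token = secrets.token_urlsafe(32)
    AUTOLOGIN_HASHES[username] = hash_token(token)
    return f"{username}:{token}"


def cookie_login(cookie_value):
    """Return the username the cookie belongs to, or None if it is not valid."""
    username, sep, token = cookie_value.partition(":")
    stored = AUTOLOGIN_HASHES.get(username)
    if not sep or stored is None:
        return None   # malformed cookie or unknown user
    if not secrets.compare_digest(stored, hash_token(token)):
        return None   # token does not match the stored hash
    return username
```

Because only the hash is stored, someone who reads the user table cannot reconstruct a working cookie from it.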

Anyway, I hope this is what you were looking for, koldfyre. I don't know your background, but if you're unsure of how sessions and cookies work, you should read up on them separately, and if you need more elaborate details, just ask.

P.S.: You may also want to check the question "The Definitive Guide To Website Authentication" for best practice approaches

Jens Roland
You are not clearly differentiating between what is done clientside vs. serverside. E.g. your login function is mixing the server and clientside parts of the password authentication.
Brian
The code is all server side, but should be seen as pseudocode. Meaning function calls such as report_error() would send headers and an error page to the client.
Jens Roland
It gets an upvote, if only for the link to the other SO question. I don't suppose we can close a question with a bounty as 'exact duplicate'. :D
Jonathan Leffler
+1  A: 

Put very simply, this is what happens:

What goes in?

  • Username
  • Password

What happens inside?

  1. Password is converted to its hash
  2. Hash(password) is compared with the hash stored in the DB table or a directory service (unless someone is downright foolish, the site won't save your password in clear text).
  3. If authenticated, a status token is stored in the session and/or a cookie.
    • This token can just contain a status, login timestamps, your userId, userType (if any), et al.
    • This token is read and verified on every page you access if that page requires you to be logged in as a certain type of user.
  4. If authentication fails, you are redirected to a page that displays an error and asks you to log in again.
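
Steps 1 and 2 might look roughly like this in Python. It is a sketch under my own assumptions: PBKDF2 from the standard library with a per-user random salt is one common choice, not something the steps above prescribe, and `ITERATIONS` is an illustrative value.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000   # assumed work factor; tune for your hardware


def hash_password(password, salt=None):
    """Return (salt, digest) for storage; a fresh random salt is used per user."""
    salt = os.urandom(16) if salt is None else salt
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, stored_digest):
    """Re-hash the submitted password with the stored salt and compare."""
    _, candidate = hash_password(password, salt)
    # compare_digest avoids leaking how many leading bytes matched.
    return hmac.compare_digest(candidate, stored_digest)
```

At registration the site would store the salt and digest returned by `hash_password()`; at login it would call `verify_password()` with the submitted password and the stored values.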

What comes out

  1. You are redirected to your personal profile page or the page you were trying to access, which verifies you with the help of the token.
  2. Additionally, a digital certificate may come into the picture if you are accessing a banking site or other critically secure site.
Mohit Nanda
A: 

Look, it's a little hard to give you a lot more information than you already have here; I'm not sure why you want to set a bounty on it. A cookie is just a little bit of named information, and you can put anything you like in it. For a session, you'd want some kind of session ID. There are conventions for that, or you can do it yourself. Whatever you do, when you set the cookie, you leave a little data lying about on the person's browser that is more or less like this:

mydomain.com:
    mystuff: this is my stuff, by golly.

When you come back, you retrieve the cookie and get that back.

If you want to see all the details of that protocol, have a look at the Wikipedia article.

Charlie Martin
+1  A: 

There are two main ways of performing authentication on the web, and a few less popular ways that are also worth knowing about.

The first is HTTP authentication, as defined by RFC 2617. When you request a protected page, the server responds with a 401 status code, signalling that you must authenticate before you can access the resource. In addition to this, it also sends a WWW-Authenticate header, which tells the browser how it wants you to authenticate yourself. The browser sees this status code and the header, and prompts you for your authentication details. When you enter them, your browser prepares them according to the specific authentication scheme the server specified, and requests the page again, including an Authorization header with the prepared details. The server checks these details against its user database, and either responds with another 401 (wrong details), or the protected page with an accompanying 200 status code to indicate success.
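
For the simplest scheme in that RFC, Basic, "preparing the details" just means base64-encoding `username:password`. The sketch below shows both sides; the function names are my own, and the credentials are the example pair from the RFC. Keep in mind that base64 is an encoding, not encryption, so Basic is only reasonable over an SSL/TLS connection.

```python
import base64


def basic_auth_header(username, password):
    """Build the Authorization header value for the Basic scheme."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def parse_basic_auth(header_value):
    """Return (username, password) from a Basic header, or None if malformed."""
    scheme, _, encoded = header_value.partition(" ")
    if scheme.lower() != "basic":
        return None
    try:
        decoded = base64.b64decode(encoded, validate=True).decode("utf-8")
    except ValueError:
        return None   # not valid base64 or not valid UTF-8
    username, sep, password = decoded.partition(":")
    return (username, password) if sep else None


header = basic_auth_header("Aladdin", "open sesame")
# header == "Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ=="
```

The server would run something like `parse_basic_auth()` on the incoming header and then check the pair against its user database.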

HTTP authentication is one of those ancient features that browsers didn't implement well to begin with and have never really been improved. Because of this, it has become much more popular for web developers to implement authentication themselves using cookies to persist state. In this case, the user is presented with a standard HTML form. When the user enters their credentials into the fields and submits the form, the browser encodes it and sends it to the server in the same way it encodes any normal HTML form. The server checks the credentials, and if they are legitimate, sets a cookie with a randomly-generated ID number, along with a corresponding database/filesystem entry that recognises that ID number as belonging to a particular user.

From this point on, every request the browser makes to the server includes this ID number cookie as an HTTP header. The server recognises the cookie, looks up the ID number, and knows which user you are. When you choose to log out, the server sends a response asking your browser to forget the ID number, at which point you are just another anonymous user.

A less commonly-used option is the use of SSL client certificates. Many people are familiar with the idea of using SSL to identify a server. A cryptographic keypair is generated, signed by a trusted authority, and used to prove that the data being sent originated with the owner of the keypair. What many people aren't aware of though, is that the same can be used by a client to prove its identity to a server. This is less convenient, however, as you need to carry your certificate around with you if you want to use it on more than one machine.

There are variations and lesser-known options available of course, but these are the most prominent ones.

Jim