views:

32

answers:

1

I discovered ASP.NET profiles a few years back and found it quick and easy to implement some powerful functionality in my web site. I ended up exclusively using "anonymous" profile properties because I am NOT using ASP.NET membership and wanted to track information and provide personalization for my site's users even though they weren't logged in.

Fast forward 3 years and now I am struggling with keeping the size of my aspnetdb database within the 3 GB limit of my hosting provider. I decided a year ago that I would only keep 6 months of profile data and every month I now have to run the aspnet_Profile_DeleteInactiveProfiles procedure followed by a support request to my host to truncate the database file.

The real downside is now that I have this functionality implemented, the web site relies on this database and there is downtime every time I need to truncate. Today really took the cake - the site was down for more than 12 hours while doing maintenance on aspnetdb and I ended up creating a failover copy of the aspnetdb database just so I could get the web site online again (the original aspnetdb database is still down as I write this).

Last year I rewrote some of the functionality because it was storing a small collection of binary objects and I converted it to store a comma delimeted string of IDs, making it smaller for storage.

I have looked into getting more disk space, but the reality is it simply isn't worth it to have a database that grows by 4 GB every year just to do simple things like keep a list of each user's most recently viewed products, keep track of what category they were in for a "continue shopping" button to take them back, and store the total amount (currency only) that is in their shopping cart. Realistically, less than 5% of the data is reused by repeat visitors to the site, however there is no way to tell in advance which of these users will make it back to access their profile information.

So my question is whether there is a more efficient way to store the profile data so it doesn't take up so much space or if there are alternatives to creating profile database records for every single user for supporting this functionality?

I took a look at the table profile provider, but can someone vouch whether using it will store the data using less disk space than the default provider?

Update:

I did some research and it seems that the nvarchar(max) field is a direct replacement for the ntext fields that are used by the default implementation of the aspnetdb, but will only require half of the disk space according to this article and this one. This means I should be able to create a new copy of aspnetdb, modify the datatype throughout its schema and code, and transfer the data into the new database. That should fix both the way the data is stored and any fragmentation that occurred when the shrink operation was performed.

I have now completed testing and moved this solution into production. At first I thought there would be a problem because Microsoft hard-coded the NText datatype into the stored procedure call inside of the SqlProfileProvider. However, as it turns out SQL Server will implicitly convert an NText parameter to nvarchar(MAX). In short, I didn't have to change anything other than the types on the PropertyNames and PropertyValuesString columns of the aspnet_Profiles table.

The data had to be rewritten to disk to free up the space, but I ended up saving about 30% overall just by changing the datatype.

Another Update:

I also discovered that although I am getting around 700 unique visitors every day, the average number of daily "users" in the aspnet_Users table averages about 4000-5000. I theororized that this was due to the fact that some users are browsing without cookies. I did some testing and discovered that a user won't be created if the profile is not updated, but if you write to the profile a user (and profile) will be created on every request if cookies are not enabled.

I am working on a workaround for this. I am attempting to write the javascript for an AJAX call to a webmethod. In theory, the second request (the AJAX call) should have the .ASPXANONYMOUS cookie present if cookies are enabled, and therefore I can safely write to the profile. The javascript will only be injected into the page if this is the beginning of the session (determined by the Session.IsNewSession property). The user should never be aware of the AJAX call - it is only there to tell the page whether cookies are enabled so it can update the profile.

Of course, every subsquent request can simply check the cookies collection to ensure the .ASPXANONYMOUS cookie is present. Most likely I will create a shared function that can be called both by an AJAX called webmethod and by the Page_Load event directly, and use the IsNewSession property to determine which request to allow to run the cookie check and subsequent profile update.

According to Microsoft, using session to keep track of whether cookies are enabled defeats the purpose (because session depends on cookies). They suggest to persist your AreCookiesEnabled value to a database, but they don't make mention of how you are supposed to keep track of where you kept the value - if cookies are not enabled, how can you link a request to a database record?

+1  A: 

I implemented the workaround I came up with in my original post, however it turned out to be a bit different than what I originally described. The fix can actually be broken into 2 parts - one to fix the problem with the database being updated when cookies are disabled, and the second to detect when cookies are disabled without doing a redirect.

I have already posted the solution to the anonymous profiles creating records when cookies are disabled.

So now I will focus on the second part - getting information into the profile from the first page requested. This only needs to be done if you are doing analytics tracking or something similar - the first part will take care of protecting the database from filling up with totally useless data when 1) cookies are disabled and 2) anonymous profile properties are used and works from the second request (or first postback) onwards.

When I researched the issue of checking to see if cookies are enabled, most solutions used a redirect either to the same page or a different page and back again. Interestingly, MSDN was the one who came up with the 2-redirect solution.

While in certain circumstances a redirect is acceptable, I didn't want the extra performance impact to affect the majority of our users. Instead, I opted for another approach - use AJAX to run code on the server after the first request has completed. While this has the advantage of not causing a redirect, it has the disadvantage of not functioning when JavaScript is disabled. However, I opted for this approach because the percentage of data that is being lost on initial request is insignificant and the application itself does not depend on this data.

So, walking through from the beginning of the process to the end...

Protected Sub Page_Load(ByVal sender As Object, ByVal e As System.EventArgs) Handles Me.Load

    If Not Me.IsPostBack Then

        If Session.IsNewSession Then
            Me.InjectProfileJavaScript()
        ElseIf AnonymousProfile.IsAnonymousCookieStored Then
            'If cookies are supported, and this isn't the first request, update the
            'profile using the current page data.
            UpdateProfile(Request.RawUrl, Request.UrlReferrer.OriginalString, CurrentProductID.ToString)
        End If

    End If

End Sub

This is the Page_Load method placed in my custom PageBase class, which all of the pages in the project inherit from. The first thing we check is whether this is a new session by checking the Session.IsNewSession property. This property is always true if cookies are disabled or if this is the first request. In both cases, we don't want to write to the database.

The "else if" section runs if the client accepted the session cookie and this is not the first request to the server. The thing to note about this code snippet is that both sections cannot run in the same request, meaning the profile can only be updated 1 (or 0) times per request.

The AnonymousProfile class is included in my other post.

Private Sub InjectProfileJavaScript()

    Dim sb As New StringBuilder

    sb.AppendLine("$(document).ready(function() {")
    sb.AppendLine("  if (areCookiesSupported() == true) {")
    sb.AppendLine("    $.ajax({")
    sb.AppendLine("      type: 'POST',")
    sb.AppendLine("      url: 'HttpHandlers/UpdateProfile.ashx',")
    sb.AppendLine("      contentType: 'application/json; charset=utf-8',")
    sb.AppendFormat("      data: ""{3}'RawUrl':'{0}', 'ReferralUrl':'{1}', 'ProductID':{2}{4}"",", Request.RawUrl, Request.UrlReferrer, CurrentProductID.ToString, "{", "}")
    sb.AppendLine()
    sb.AppendLine("      dataType: 'json'")
    sb.AppendLine("    });")
    sb.AppendLine("  }")
    sb.AppendLine("});")

    Page.ClientScript.RegisterClientScriptBlock(GetType(Page), "UpdateProfile", sb.ToString, True)

End Sub

Public Shared Sub UpdateProfile(ByVal RawUrl As String, ByVal ReferralUrl As String, ByVal ProductID As Integer)
    Dim context As HttpContext = HttpContext.Current
    Dim profile As ProfileCommon = CType(context.Profile, ProfileCommon)

    Dim CurrentUrl As New System.Uri("http://www.test.com" & RawUrl)
    Dim query As NameValueCollection = HttpUtility.ParseQueryString(CurrentUrl.Query)
    Dim source As String = query.Item("source")
    Dim search As String = query.Item("search")
    Dim OVKEY As String = query.Item("OVKEY")

    'Update the profile
    profile.TestValue1 = source
    profile.TestValue2 = search

End Sub

Next, we have our method to inject an AJAX call into the page. Keep in mind, this is still the base class so regardless of the page that the user lands on this code will run on the first page request.

Inside the JavaScript, we first test to see if cookies are enabled and, if so, call a custom handler on the server using AJAX and JQuery. We are passing the parameters from the server into this code (although 2 of them could have just been supplied by the client, the extra bytes aren't that significant).

The second method updates the profile and will contain my custom logic to do so. I included a snippet on how to parse the querystring values from a partial URL. But the only thing that really needs to be known here is that this is the shared method that updates the profile.

Important: For the AJAX call to function, the following handler must be added to the system.web section of the web.config file:

<httpModules>
    <add name="ScriptModule" type="System.Web.Handlers.ScriptModule, System.Web.Extensions, Version=3.5.0.0, Culture=neutral, PublicKeyToken=31BF3856AD364E35"/>
</httpModules>

I decided it would be best to test for cookies on the client and not make the extra AJAX call if cookies are disabled. To test the cookies, use this code:

function areCookiesSupported() {
    var c='c';var ret = false;
    document.cookie = 'c=2;';
    if (document.cookie.indexOf(c,0) > -1) {
        ret = true;
    } else {
        ret = false;
    }
    deleteCookie(c);
    return ret
}
function deleteCookie(name) {
    var d = new Date();
    document.cookie = name + '=1;expires=' + d.toGMTString() + ';' + ';';
}

These are 2 JavaScript functions (in a custom .js file) that simply write a cookie and read it back to determine if cookies can be read. It then cleans up the cookie by setting an expiration date in the past.

<%@ WebHandler Language="VB" Class="Handlers.UpdateProfile" %>

Imports System
Imports System.Web
Imports System.Web.SessionState
Imports Newtonsoft.Json
Imports System.Collections.Generic
Imports System.IO

Namespace Handlers

    Public Class UpdateProfile : Implements IHttpHandler : Implements IRequiresSessionState

        Public Sub ProcessRequest(ByVal context As HttpContext) Implements IHttpHandler.ProcessRequest

            If AnonymousProfile.IsAnonymousCookieStored Then

                If context.Session.IsNewSession Then
                    'Writing to session state will reset the IsNewSession flag on the
                    'next request. This will fix a problem if there is no Session_Start
                    'defined in global.asax and no other session variables are written.
                    context.Session("ActivateSession") = ""
                End If

                Dim reader As New StreamReader(context.Request.InputStream)
                Dim params As Dictionary(Of String, String) = JsonConvert.DeserializeObject(Of Dictionary(Of String, String))(reader.ReadToEnd())

                Dim RawUrl As String = params("RawUrl")
                Dim ReferralUrl As String = params("ReferralUrl")
                Dim ProductID As Integer = params("ProductID")

                PageBase.UpdateProfile(RawUrl, ReferralUrl, ProductID)
            End If
        End Sub

        Public ReadOnly Property IsReusable() As Boolean Implements IHttpHandler.IsReusable
            Get
                Return False
            End Get
        End Property

    End Class

End Namespace

This is our Custom HttpHandler class that receives the AJAX request. The request is processed only if the .ASPXANONYMOUS cookie is passed in (checked once again by utilizing the AnonymousProfile class from my other post), which will prevent robots and other scripts from executing it.

Next we run some code to update the session object if it is required. For some strange reason, the IsNewSession value will stay true until the session is actually updated, but only if a handler for Session_Start doesn't exist in the Global.asax. So to make this code work both with and without a Global.asax file and without any other code that updates the session object, we run an update here.

The next bit of code I grabbed from this post and contains a dependency to the JSON.NET serializer. I was torn about using this approach because of the extra dependency, but ultimately decided that the JSON serializer will likely be valuable in the future as I continue adding AJAX and JQuery to the site.

Then we simply get the parameters and pass them to our shared UpdateProfile method in the PageBase class that was defined previously.

<!-- Required for anonymous profiles -->
<anonymousIdentification enabled="true"/>
<profile defaultProvider="SqlProvider" inherits="AnonymousProfile">
    <providers>
        <clear/>
        <add name="SqlProvider" type="System.Web.Profile.SqlProfileProvider" connectionStringName="SqlServices" applicationName="MyApp" description="SqlProfileProvider for profile test web site"/>
    </providers>
    <properties>
        <add name="TestValue1" allowAnonymous="true"/>
        <add name="TestValue2" allowAnonymous="true"/>
    </properties>
</profile>

Lastly, we have our configuration section for the profile properties, set up to be used anonymously (I purposely ommitted the connection string section, but a corresponding connection string and database are also required). The main thing to note here is the inclusion of the inherits attribute on the profile. This once again is for the AnonymousProfile class that is defined in my other post.

NightOwl888