Blog

A catalogue of my discoveries in software development and related subjects, that I think might be of use or interest to everyone else, or to me when I forget what I did!

Keyword Density Cloud

April 24, 2009

The 'keyword density cloud' was a nice little widget I worked on at work, to provide a graphical, tag-cloud-like analysis of the keyword density for a given web page. SEO-conscious writers can use it to quickly examine what a page of content may appear to be about (to search engines), based on the 'top words' and their relative strength, and it also gives end users a nice overview of the page content. For example:
SEO Keyword Density by WMG

You can use the widget on any page that has a publicly accessible URL by simply pasting in the following HTML and JavaScript:
<div id="wmgcumulus">
   <!--This link needs to be included for the keyword density tool to work properly.-->
   <a href="http://www.webmarketinggroup.co.uk/" target="_blank">SEO</a> Keyword Density by WMG
   <script language="javascript" type="text/javascript" src="http://cumulus.wmg.uk.com/wmgcumulus.js"></script>
   <script language="javascript" type="text/javascript">
       DrawTagCloud(400,200,'ffffff','0x000000','0xc0c0c0',30);
   </script>
</div>
You can modify the parameters for 'DrawTagCloud', which are as follows:

DrawTagCloud(width, height, bgColor, textColor1, textColor2, maxWords)

Yes, even I spell 'colour' as 'color' in code :\

As you may have guessed from the names used, the output is rendered using the Flash component from the WP-Cumulus WordPress plugin. Take care when pasting the code into blogging software, such as Blogger, that no '<br>' tags are inserted into the JavaScript.

Preventing Uppercase URLs in ASP.NET

March 13, 2009

According to RFC 3986, the path portion of a URI is case-sensitive. Google and other search engines follow this rule in their crawlers, so pages whose URLs differ only by case are treated as different pages. This can be problematic when working in a case-insensitive environment, such as IIS on Windows, because the rule is ignored and any variation in case will still return the same page. This leads to duplicate content issues and can also dilute the number of inbound links detected for a particular page.

As a general rule, I always keep my internal site links in lower case, but you cannot control how third parties link to you. If someone links to you using upper case and that link is spidered by a search engine, the page could be mistaken for a duplicate of its lower case alternative. E.g.

http://www.craigwardman.com/blog/
http://www.craigwardman.com/Blog/
http://www.craigwardman.com/bLoG/

These all serve up the same page, but as per the RFC should be treated as 3 different pages. We want to prevent these links from being mistaken for different pages, by redirecting requests for URLs containing uppercase letters to the lowercase equivalent.

In my particular case, I have two other problems to overcome because I am using a shared hosting environment. Firstly, I cannot use HttpHandlers on the server, so my fix had to be in the code-behind of any pages I want to apply it to (limiting it to aspx files). Secondly, the shared host uses 'Default.aspx' and not 'default.aspx' as the default page, which causes an uppercase character in the RawUrl when the root URL is requested. The following code should be placed in your page_load event:
'prevent wrong case urls
'('Default.aspx' is normalised first, because the shared host adds it to the root URL itself)
If Text.RegularExpressions.Regex.IsMatch(Request.Url.PathAndQuery.Replace("Default.aspx", "default.aspx"), "[A-Z]") Then
    'tell the client that the page has permanently moved
    Response.StatusCode = 301
    Response.Status = "301 Moved Permanently"

    'redirect to the lower case URL, without the default page name
    Dim loc As String = Request.Url.PathAndQuery.ToLower()
    If loc.EndsWith("default.aspx") Then loc = loc.Remove(loc.LastIndexOf("default.aspx"), "default.aspx".Length)
    Response.AddHeader("Location", loc)
End If
The code basically uses a 301 redirect to tell the client (spider/browser) to use the lower case equivalent of the URL. Another SEO measure is to not reveal 'default.aspx' as part of the URL (because that would be a duplicate of '/'), so the code also makes sure not to use this page name in the URL.
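To make the rules easier to test outside of a page, they can be sketched as a standalone function. This is only an illustration in C#: the 'UrlCase' class and 'GetRedirectTarget' method are made-up names, and unlike the page_load code it leaves the query string untouched, on the assumption that query string values may be case-sensitive.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of ASP.NET: works out where (if anywhere)
// a request should be redirected so that its path is all lower case.
public static class UrlCase
{
    private const string DefaultPage = "default.aspx";

    // Returns the redirect target, or null when no redirect is needed.
    public static string GetRedirectTarget(string pathAndQuery)
    {
        // Assumption: only the path is normalised, the query string is kept as-is.
        int queryStart = pathAndQuery.IndexOf('?');
        string path = queryStart >= 0 ? pathAndQuery.Substring(0, queryStart) : pathAndQuery;
        string query = queryStart >= 0 ? pathAndQuery.Substring(queryStart) : string.Empty;

        // The host's own 'Default.aspx' must not count as an uppercase URL.
        string toCheck = path.Replace("Default.aspx", DefaultPage);
        if (!Regex.IsMatch(toCheck, "[A-Z]"))
        {
            return null;
        }

        // Lower-case the path and hide the default page name.
        string target = path.ToLowerInvariant();
        if (target.EndsWith(DefaultPage, StringComparison.Ordinal))
        {
            target = target.Substring(0, target.Length - DefaultPage.Length);
        }
        return target + query;
    }
}
```

In a page_load you would pass in Request.Url.PathAndQuery and only send the 301 when the result is not null.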

Non-Databound DropDownList SelectedValue Problem

March 09, 2009

When you have an asp:DropDownList which is *not* databound (i.e. you add the items manually), there are a few things you have to do to get 'SelectedValue' to work. ASP.NET will ignore the 'SelectedValue' when you manually add items unless you make a call to 'DataBind' on the drop down list, so you need to add this line of code to force ASP.NET to select the correct item.

However, because the 'DropDownList.Items.Add()' function will accept a string, it's all too tempting to pass in the string you want to display when adding the items, but this will cause an error when you try to databind! e.g.
ddlYear.Items.Clear()
For yr As Integer = Now.Date.Year - 100 To Now.Date.Year - 18
   ddlYear.Items.Add(yr.ToString())
Next

ddlYear.DataBind() '<-- This part tries to load SelectedValue - but will error!
Error: 'DropDownList' has a SelectedValue which is invalid because it does not exist in the list of items.

What you need to do is pass 'New ListItem()' to the Items.Add() function and specify the string as both the 'text' and the 'value' for the list item. e.g.
ddlYear.Items.Clear()
For yr As Integer = Now.Date.Year - 100 To Now.Date.Year - 18
  ddlYear.Items.Add(New ListItem(yr.ToString(), yr.ToString()))
Next

ddlYear.DataBind()
You should find that the drop down list will now load with your manually added items and will display the correct 'SelectedValue'!

Automatic Unique Names for Session/ViewState Data

February 24, 2009

When you store information using a key-based storage mechanism, such as the 'Session' and 'ViewState' objects, you want to avoid referencing the data directly by key name. Doing so means that every reference to the data needs to be cast, leaving you liable to invalid cast exceptions. It also means it is up to the developer to remember the key names, which leaves you liable to typos, spelling mistakes and accidental re-use of the same key for different data.

The first step to solving this problem is to create a strongly typed representation of the data, using classes and properties, and then have those properties simply persist their data to/from the chosen data store. As a very basic example, let's say you want to ask a user for their name and age and store it in the Session.
Namespace Example.SessionObjects
    Public Class CurrentUserDetails
        Public Shared Property Name() As String
            Get
                Return CStr(HttpContext.Current.Session("CurrentUserName"))
            End Get
            Set(ByVal value As String)
                HttpContext.Current.Session("CurrentUserName") = value
            End Set
        End Property

        Public Shared Property Age() As Integer
            Get
                Return CInt(HttpContext.Current.Session("CurrentUserAge"))
            End Get
            Set(ByVal value As Integer)
                HttpContext.Current.Session("CurrentUserAge") = value
            End Set
        End Property
    End Class
End Namespace
If you are using ViewState, you will declare these properties as part of your UserControl, for example. Because you are wrapping access to the data as you would a private field, you gain all the benefits of that, such as validation, initialization, default values etc.

As you can see, the above example still relies on the developer coming up with a unique key for each object, which is the problem addressed by this article. By nature, all of your strongly typed properties will be unique, since each one lives at a unique combination of namespace, class name and property name (anything else would give you a compiler error). So you can use this unique name as your key name.

The following code shows how you can use the 'StackFrame' object to get a reference to the currently executing method (which will be get_PropertyName or set_PropertyName) and use it, along with the name of the declaring type, to generate a unique key for the data store:
Public Property Thing() As Integer
    Get
        'the current method is 'get_Thing', so the key becomes '<type name>Thing'
        Dim sfm As Reflection.MethodBase = New StackFrame().GetMethod()
        Return CType(HttpContext.Current.Session(sfm.DeclaringType.FullName & sfm.Name.Remove(0, 4)), Integer)
    End Get
    Set(ByVal value As Integer)
        'the current method is 'set_Thing', which produces the same key as the getter
        Dim sfm As Reflection.MethodBase = New StackFrame().GetMethod()
        HttpContext.Current.Session(sfm.DeclaringType.FullName & sfm.Name.Remove(0, 4)) = value
    End Set
End Property
The reason I have used '.Remove(0, 4)' is to get rid of the 'get_' and 'set_' prefixes, so that the getter and setter functions refer to the same key.
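To avoid repeating the StackFrame lines in every property, the key generation could be pulled into a small helper. The following is a rough sketch in C#, not a drop-in implementation: 'SessionKeys', 'FromAccessor' and 'ForCaller' are made-up names, and the '.' separator between the type name and the property name is my own addition. One caveat I am fairly sure of is that the JIT compiler is allowed to inline small methods such as property accessors, in which case the stack frame may no longer describe the accessor. The sketch marks the helper as NoInlining, and the calling accessors would probably need the same attribute.

```csharp
using System;
using System.Diagnostics;
using System.Reflection;
using System.Runtime.CompilerServices;

// Hypothetical helper for building data store keys from property accessors.
public static class SessionKeys
{
    // Builds "<declaring type>.<property name>" from an accessor name
    // such as "get_Age" or "set_Age", so both accessors share one key.
    public static string FromAccessor(string declaringTypeFullName, string accessorName)
    {
        string propertyName = accessorName;
        if (accessorName.StartsWith("get_", StringComparison.Ordinal) ||
            accessorName.StartsWith("set_", StringComparison.Ordinal))
        {
            propertyName = accessorName.Substring(4);
        }
        return declaringTypeFullName + "." + propertyName;
    }

    // Builds the key for whichever method called this one.
    // Frame 0 is ForCaller itself, frame 1 is its caller.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static string ForCaller()
    {
        MethodBase caller = new StackFrame(1).GetMethod();
        return FromAccessor(caller.DeclaringType.FullName, caller.Name);
    }
}
```

A getter would then read the Session using SessionKeys.ForCaller() as the key, and the setter would write using the same call.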

Textbox CrLf in Firefox using AJAX UpdatePanel

February 17, 2009

When using a MultiLine textbox inside an ASP.NET AJAX update panel, you may encounter problems with carriage return line feeds in your text on the server when using Firefox (and potentially other browsers). Internet Explorer uses the Windows-style CrLf (13 10) for newlines in a textarea, but Firefox uses the Unix-style Lf (10) only. On a synchronous postback it seems ASP.NET fixes this and you will get CrLf in your text on the server. However, when you are posting back asynchronously using AJAX, you only get Lf in your text when Firefox is used. In order to clean this up and have consistent data, I wrote a simple regex replace to make sure all Lf are preceded by a Cr.
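public static string CleanUnixCrLf(string textIn)
{
   //firefox only uses Lf and not CrLf, so add a Cr before any Lf that does not already have one
   //(the lookbehind also covers an Lf at the very start of the text and consecutive Lfs)
   return System.Text.RegularExpressions.Regex.Replace(textIn, "(?<!\r)\n", "\r\n");
}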

Regular Expression Tools

February 02, 2009

A useful thing to have as a developer is a tool for testing your regular expressions as you build them.

There are many out there, but I have two particular favourites.

  • For simple testing when building a regex, I like to use 'RegexBuilder'. It has a very simple and quick off the mark interface where you can start typing your regex with 2 testing texts to see if it is working properly:

    REB Screenshot

  • For more advanced testing data, or to check the output of replaces and splits, plus many other useful features, I like to use 'The Regulator'. This has a lot of useful functionality and will help you build regular expression syntax if you are unsure.

    Regulator Screenshot


I would recommend having both of these programs. They both have their uses, they are both easy to use, and between them they can cater for most (if not all) of your regular expression testing needs!

ASP.NET Multiple Page Load Problem

January 13, 2009

I've had this problem many a time, and every time I get it again I forget what the cause was. This time I'm going to blog it!

I have encountered three similar situations where ASP.NET fires the page_load event twice, or more.

  1. In one scenario (on a postback) your page load gets called once for the postback (IsPostback = true) and then gets called again with IsPostback = false, which can cause a world of mystery until you realize what is happening!

    This is normally caused by using a GridView with an asp:ButtonField that has ButtonType=Image. This is a bug in the .NET Framework, which can be solved by using an asp:TemplateField with an asp:ImageButton defined within it.

  2. Another scenario is that you get two or more page loads every time you load the page, whether it is a postback or not. This is usually caused by one of your HTML elements referencing "" (blank) or a value starting with "#" as the source for its data (e.g. image src="", or background="#ff0000"). The browser interprets this source string as a relative URL (which resolves to the current page) and therefore makes a request for the page for each HTML element that 'references' it.

    To work around this issue, make sure your images etc. reference at least something, even if it is '%20'. Also use CSS for any colouring, not HTML attributes.

  3. Another example I have found more recently is when using an asp:ImageButton within a template field of a databound control, where the ImageUrl (src) does not exist on the server (404). This causes two postbacks, both with IsPostback=true, but does not raise the click event. This problem does not exist in IE, but does in Firefox and potentially other browsers. To fix this, make sure that the image source of an image button exists on the server.


Visual Studio, Cassini, IIS and Debugging

January 07, 2009

If you develop ASP.NET websites, you are probably aware that instead of using the built-in Visual Studio Web Server (Cassini) for debugging, you can use IIS. If you have configured this for your project, Visual Studio creates a virtual directory in IIS, and pressing F5 to debug won't start the built-in server but will instead attach to the IIS website. Using IIS over Cassini has several advantages. However, it still doesn't allow you to edit the source while you browse the site, and virtual directories can be a pain when the live site will be running from root.

I prefer to leave these project settings alone and instead add an entirely new site in IIS, pointing at the codebase. This gives you more control over how the development site can be accessed, and you can configure each site to use different SSL certificates, filter on host header values etc. This allows easy browsing of the latest version of the development site, and allows you to quickly make changes to the source code, recompile and refresh. Since you are doing all of this outside of the debugger, you can edit source code while keeping an eye on how the changes are affecting the site.

The only problem with this is if you are in the middle of using the site and you find a bug that you need to trace using the debugger, which of course isn't running in the current context. You can overcome this by using the 'Debug > Attach to Process' menu and then selecting w3wp.exe. This will attach to the IIS process, and Visual Studio will load the debug symbols for the sites you are running, thus allowing you to set breakpoints in the code which will be hit by any browser triggering that line of code (also useful when you are testing across multiple browsers).

This is quite a few keystrokes to get the debugger up and running, but fortunately Visual Studio allows you to create macros and assign shortcuts to them. I found a tutorial on how to create a macro that will attach to IIS and how to set up the shortcut. I set up my shortcut key as CTRL+0, which is an easy sequence to remember. The macro listed in that tutorial did not work on my machine, because I am on a domain. I edited the script to tidy it up and to make it more robust, below:
Option Strict Off
Option Explicit Off
Imports System
Imports EnvDTE
Imports EnvDTE80
Imports EnvDTE90
Imports System.Diagnostics

Public Module AttachToIIS
    Sub Attach()
        Try
            Dim dbg2 As EnvDTE80.Debugger2 = DTE.Debugger
            Dim trans As EnvDTE80.Transport = dbg2.Transports.Item("Default")
            Dim dbgeng(3) As EnvDTE80.Engine
            dbgeng(0) = trans.Engines.Item("Managed")
            dbgeng(1) = trans.Engines.Item("Native")
            dbgeng(2) = trans.Engines.Item("T-SQL")

            Dim proc2 As EnvDTE80.Process2 = dbg2.GetProcesses(trans, Environment.MachineName).Item("w3wp.exe")
            proc2.Attach2(dbgeng)
        Catch ex As System.Exception
            MsgBox(ex.Message)
        End Try
    End Sub
End Module
If you get the error 'Invalid Index' it generally means that IIS has not loaded the application since a recompile, so you need to refresh the page in your browser first to reload it.

Data Binding to an Extension Method

December 10, 2008

If you have defined extension methods for your entities in your BLL (for example to format the output, or amalgamate some result), you may wish to display the return value of an extension method in a bound control, such as an ASP.NET GridView. In order to access an extension method, you have to import the containing namespace into the code file you are working with. The problem is that the bindings are evaluated by the DataBinder, so even if you import the namespace into the page using the @Import directive, the extension methods are only available to your code-infront code blocks, not to the ASP.NET DataBinder.

To get around this, import the namespace using @Import as above and, instead of using a BoundField, use a TemplateField. Don't use Eval or Bind; simply cast the Container.DataItem back to the extended type and call the extension method yourself.
<%@ Import Namespace="eCommerceFramework.BLL.EntityExtensions" %>

---------------

<asp:TemplateField HeaderText="Price">
        <ItemTemplate>
                <%# FormatCurrency(DirectCast(Container.DataItem, DDL.DTOs.ShopBundle).TotalPrice(),2) %>
        </ItemTemplate>
</asp:TemplateField>
Generally speaking, I wouldn't imagine you would be two-way data-binding on an extension method, but if you need to make the value retrievable, put it into a runat="server" control (label/textbox) and give it an ID you can use with FindControl.
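For context, the extension method being called in the template might look something like the following. This is only a guess at the shape of such a method, written in C#: the 'Example' namespaces, the 'ItemPrices' list and the summing logic are all invented for illustration, and only the 'ShopBundle' and 'TotalPrice' names come from the markup.

```csharp
using System.Collections.Generic;
using System.Linq;

namespace Example.DTOs
{
    // Hypothetical entity: a bundle made up of individually priced items.
    public class ShopBundle
    {
        public List<decimal> ItemPrices = new List<decimal>();
    }
}

namespace Example.EntityExtensions
{
    using Example.DTOs;

    // Extension methods must live in a static, non-nested class.
    // They only become visible where this namespace is imported.
    public static class ShopBundleExtensions
    {
        // Adds up the item prices to give the price of the whole bundle.
        public static decimal TotalPrice(this ShopBundle bundle)
        {
            return bundle.ItemPrices.Sum();
        }
    }
}
```

Because 'TotalPrice' is really just a static method on 'ShopBundleExtensions', the page can only resolve the bundle.TotalPrice() syntax once the 'Example.EntityExtensions' namespace has been imported, which is why the @Import directive is needed.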

Matching Tags using Regular Expressions Balancing Groups

November 28, 2008

The regular expression engine provided by the .NET Framework includes a feature known as 'balancing groups'. This feature allows you to increment/decrement the match count of a named capturing group by giving the group a positive and a negative match context. You can then check that you have an equal number of matches by testing whether the group still has a value (i.e. an effective zero result means the group was balanced). You can include this test in your match pattern, so that only the balanced result is considered a match.

Microsoft don't really go into this much and only show a small example of matching opening and closing parentheses. In my case, I wanted to match a specific chunk of HTML code in a file and then find the closing tag that matches the opening tag. For example:
<div>
  <div class="targetContent">
   Something in here
   <div> Something else in here</div>
  </div>
</div>
Using standard regular expressions, searching from <div class="targetContent"> to </div> can work in two ways. In non-greedy mode, it matches up to the </div> of the inner div. In greedy mode, it matches all the way to the end of the outer div. What I wanted to do is match on the last </div> that makes the tags balance, which can be done using balancing groups! C# Code:
pattern = "<div class=\"targetContent\">.*?((?<TAG><div).*?(?<-TAG></div>))?(?(TAG)(?!))</div>";
Effectively, what the expression does is:
  1. Start the match from the div with class="targetContent"
  2. Match any internal content
  3. Whenever it encounters another div tag, it increments the TAG count
  4. Match any nested content
  5. Whenever it encounters another closing div tag, it decrements the TAG count
  6. It only becomes a match when the TAG count is back to zero (i.e. the tags are balanced)
  7. Finally match on the closing tag of our outer div
This can be applied to any XML-style markup, where you have the notion of opening and closing tags.
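As a rough sketch of how the same idea can be extended to any depth of nesting, and to whitespace between the closing tags, the pattern below loops over the content one piece at a time: a nested opening tag pushes onto TAG, a closing tag pops from it, and anything else is consumed a character at a time. The 'BalancedDiv' class and 'Find' method are made-up names, and I have assumed RegexOptions.Singleline so that '.' also matches newlines.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper showing a looping variant of the balancing group pattern.
public static class BalancedDiv
{
    private static readonly Regex TargetDiv = new Regex(
        "<div class=\"targetContent\">" +
        "(?>" +
            "(?<TAG><div\\b[^>]*>)" +  // nested opening tag: push onto TAG
            "|(?<-TAG></div>)" +        // closing tag: pop from TAG (fails when TAG is empty)
            "|(?!</?div\\b)." +         // any character that does not start a div tag
        ")*" +
        "(?(TAG)(?!))" +                // fail if a nested div is still open
        "</div>",                       // the closing tag of the target div itself
        RegexOptions.Singleline);

    // Returns the target div including its balanced content, or null if not found.
    public static string Find(string html)
    {
        Match m = TargetDiv.Match(html);
        return m.Success ? m.Value : null;
    }
}
```

For the HTML example above, Find should return everything from the opening targetContent tag to its own closing </div>, including the inner div but not the closing tag of the outer div.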