Where I work, we rely heavily on URL rewriting for three purposes:
- SEO (Search Engine Optimization) - We use 301-redirects to correct URLs that are not in a format that we want indexed. For example, with my domain, http://obishawn.com would cause a 301-redirect to http://www.obishawn.com/ ensuring that the search engines do not index http://obishawn.com separately from http://www.obishawn.com/ (unfortunately, my blog software does not employ these techniques).
- Legacy URLs - Pages from the old version of a site need to be 301-redirected to a new page due to a site reorganization. From an SEO standpoint, you should never rename a page, but sometimes it's necessary. So we have a URL rewriter that redirects a certain list of URLs to new URLs.
- Virtual URLs - Dynamic pages that are loaded from a virtual URL. For example, http://www.mydomain.com/products/a12345/ would actually be served by http://www.mydomain.com/product.aspx?sku=a12345.
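To make the legacy URL case concrete, here is a minimal sketch of a lookup from old paths to new ones. It is not the framework described in this post: the class name LegacyRedirects, the Resolve method, and the example paths are all made up for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: maps retired paths to their replacements.
public static class LegacyRedirects
{
    // Example entries only; a real list would come from configuration or a database.
    private static readonly Dictionary<string, string> Map =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { "/about_us.aspx", "/company/about/" },
            { "/catalog/list.aspx", "/products/" }
        };

    // Returns the new path to 301-redirect to, or null when the path is not a legacy URL.
    public static string Resolve(string path)
    {
        string target;
        return Map.TryGetValue(path, out target) ? target : null;
    }
}
```

A rewriter built on this would issue a 301 only when Resolve returns a non-null path and let every other request pass through untouched.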
The framework that we wrote for the URL rewriters works great and is very efficient (well, as efficient as processing the URL of every single request can be), and we can turn individual rewriters on or off via the web.config when we need to do some testing or bug hunting. The problem lies with the virtual URLs, specifically when those pages post back to the server.
The rewriters for SEO and the legacy URLs perform 301-redirects. This means that when the client requests a URL, the server sends back a response saying that the resource has permanently moved and supplies the new URL. The client can then request the new URL. In your browser, this causes the URL in your address bar to change. The virtual URLs, however, are a different story. They are handled by a server-side transfer of the request to a page other than the one you would expect based on the URL.
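The decision behind the SEO redirect can be sketched in a few lines. This is only an illustration under my own assumptions: the names CanonicalHost and GetRedirectTarget are hypothetical, and the code only computes where to send the client; actually issuing the 301 status and Location header is left to the caller.

```csharp
using System;

// Hypothetical helper that decides whether a request needs a canonical-host redirect.
public static class CanonicalHost
{
    // Returns the URL to 301-redirect to, or null if the request already uses the canonical host.
    public static string GetRedirectTarget(Uri requested, string canonicalHost)
    {
        if (string.Equals(requested.Host, canonicalHost, StringComparison.OrdinalIgnoreCase))
            return null;

        // Keep the scheme, any non-default port, and the path and query; swap only the host.
        string port = requested.IsDefaultPort ? "" : ":" + requested.Port;
        return requested.Scheme + "://" + canonicalHost + port + requested.PathAndQuery;
    }
}
```

With the example from the list above, a request for http://obishawn.com yields http://www.obishawn.com/ as the target, while a request that already uses www.obishawn.com yields null and is served normally.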
Take my example from above (http://www.mydomain.com/products/a12345/). I would expect the page '/products/a12345/default.aspx' (or some other default page based on the type of web application - PHP, ASP, HTML, etc.) to process that request. Little do I know from a client machine that the directory 'products' doesn't even exist on the server, let alone the page '/products/a12345/default.aspx'. Instead, the server sees that request and realizes that it should be processed by a different URL, namely http://www.mydomain.com/product.aspx?sku=a12345. Product.aspx can then render the response that should be sent to the user, all without changing the URL in the client browser's address bar. The virtual URL looks a little nicer than the product.aspx URL with a query string.
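The mapping in that example can be sketched with a regular expression. Treat this as an assumption-laden illustration rather than our actual rewriter: the class name ProductUrlRewriter, the Rewrite method, and the idea that a SKU consists only of letters and digits are all mine.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper that maps a virtual product URL to the page that really serves it.
public static class ProductUrlRewriter
{
    // Matches /products/<sku>/ where the SKU is letters and digits (an assumed format).
    private static readonly Regex VirtualPath =
        new Regex(@"^/products/([A-Za-z0-9]+)/?$", RegexOptions.IgnoreCase);

    // Returns the physical path that should handle the request,
    // or null if the path is not a virtual product URL.
    public static string Rewrite(string path)
    {
        Match match = VirtualPath.Match(path);
        return match.Success ? "/product.aspx?sku=" + match.Groups[1].Value : null;
    }
}
```

In an ASP.NET application, a non-null result would typically be handed to HttpContext.RewritePath early in the request, which is what keeps the address bar unchanged.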
The issue arises when product.aspx has to post back to the server for whatever reason (maybe you have a pricing table that updates based on an option in a dropdown box). Because of how the request was transferred to product.aspx, the postback URL is 'product.aspx?sku=a12345'. So the URL looks really nice and can hide what programming language the server is using, but as soon as you post back, all that niceness and hiding goes out the window as the URL in the address bar changes to the product.aspx?sku=a12345 URL. The source of this problem is the 'action' attribute on the 'form' element of the rendered page. This attribute is automatically set to the URL servicing the request and not the URL in the address bar. So how do you get around this?
Easy! With the ActionlessForm!
The trick is to have a custom form control that does not render the 'action' attribute. If there's no 'action' attribute, the page just posts back to the URL in the address bar. Here's the ActionlessForm solution straight from Microsoft. There are many problems with this solution. The biggest one I ran into is that, in Microsoft's solution, the 'onsubmit' attribute never gets rendered. This attribute holds the JavaScript postback scripts that have to run in order for client-side validation to be done on a form. That's kind of a big problem. If you take a look at the source code for the HtmlForm control's RenderAttributes method, there's a lot that this solution is missing. I've come up with a better solution. Two classes are needed: MyHtmlTextWriter, which inherits from HtmlTextWriter, and ActionlessForm, which inherits from HtmlForm.
Simply put, the ActionlessForm creates its own custom HtmlTextWriter from the one passed into its RenderAttributes method. The custom HtmlTextWriter then drops the action attribute unless its value begins with default.aspx. The reason for the default.aspx exclusion is that I ran into issues when http://www.mydomain.com/ tried to post back to the server without an action attribute on the form element. This will have to be modified if you use other default pages such as index.aspx.
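using System.IO;
using System.Web.UI;
using System.Web.UI.HtmlControls;

// A form that omits the 'action' attribute so postbacks go to the URL in the address bar.
public class ActionlessForm : HtmlForm
{
    // Wraps the real writer and filters out the 'action' attribute.
    private class MyHtmlTextWriter : HtmlTextWriter
    {
        public MyHtmlTextWriter(TextWriter writer)
            : base(writer)
        {
        }

        // Write every attribute except 'action', unless the action starts with default.aspx.
        public override void WriteAttribute(string name, string value)
        {
            if (name != "action" || (value != null && value.StartsWith("default.aspx")))
                base.WriteAttribute(name, value);
        }

        public override void WriteAttribute(string name, string value, bool fEncode)
        {
            if (name != "action" || (value != null && value.StartsWith("default.aspx")))
                base.WriteAttribute(name, value, fEncode);
        }
    }

    protected override void RenderAttributes(HtmlTextWriter writer)
    {
        // Let HtmlForm render all of its attributes (including onsubmit) through the filtering writer.
        base.RenderAttributes(new MyHtmlTextWriter(writer));
    }
}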
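If your site uses a default page other than default.aspx, the check inside WriteAttribute has to change. One way to do that, sketched here as a standalone helper so it can be tried without System.Web, is to compare against a list of default pages. The ActionFilter class, its ShouldWrite method, and the list contents are my own assumptions for illustration, and the comparison is case-insensitive, which the listing above is not.

```csharp
using System;

// Hypothetical helper holding the "should this attribute be rendered?" decision.
public static class ActionFilter
{
    // Assumed list of default documents; adjust to match the site's configuration.
    private static readonly string[] DefaultPages = { "default.aspx", "index.aspx" };

    // Returns true when the attribute should be written to the form element.
    public static bool ShouldWrite(string name, string value)
    {
        // Anything other than 'action' is always rendered.
        if (!string.Equals(name, "action", StringComparison.OrdinalIgnoreCase))
            return true;

        // An 'action' with no value is never rendered.
        if (value == null)
            return false;

        // Keep the action only when it points at one of the default pages.
        foreach (string page in DefaultPages)
        {
            if (value.StartsWith(page, StringComparison.OrdinalIgnoreCase))
                return true;
        }
        return false;
    }
}
```

Both WriteAttribute overrides could then call ActionFilter.ShouldWrite(name, value) in place of the inline condition.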