July 2007 - Posts
According to a blog post by Scott Guthrie, unit testing will be available in the Professional edition of Visual Studio 2008 instead of only in the Team Suite editions.
Unit testing support is now much faster and included in VS Professional (and no longer just VSTS)
Read here for more info on Visual Studio 2008 Unit Testing.
This post continues from part one. In the previous post we created an Http Response Filter that translated content based on a token replacement technique.
In this article we'll discuss how to apply this technique when you are using an Ajax UpdatePanel. The problem with the update panel is that the server answers the XmlHttp request with partial html fragments. Nothing wrong with that in essence. However, our Http Response Filter from part one requires a full html document; more exactly, it looks for the end html tag (</html>) to know when all of the content has been received.
Ajax update panel Specifics
Requests made by Ajax for the UpdatePanel are a bit different from regular requests. More exactly:
- They are of content-type "text/plain"
- They are requested with a Request Header "x-microsoftajax"
The responses served by the Ajax server-side code are different as well:
- They are not wrapped in <html> tags because they are partial html fragments
- They contain a 'special header' (read here and here for very useful info).
- They contain two parts, one part with the content and another one containing things like the viewstate.
Mmh, so if we don't have any end html tag, how will we know when we have received all of our content? Well, the answer lies in the "special header", which looks like this:
Header: Length + ‘|’ + type + ‘|’ + id + ‘|’
Body: html content
Footer: ‘|’
The first part in this header contains the length of the body, so that's really interesting. As you might recall from the previous part, ASP.Net serves our data in chunks of roughly 28K by calling the Write method several times. So, we'll keep on collecting the data until we have received as many bytes as specified in the length.
If you intercept the response of an Update panel by using for example Fiddler, you'll notice that besides the content there's also a second part that contains data like the viewstate. This is surely not something you want to translate. So, our goal is to only process part one "the content".
So, after all of the content has been collected, we can translate the tokens. However, when changing the content, the length of it will change as well. So, we need to recalculate the new length and set it accordingly into the special header.
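The header handling described above can be sketched in isolation from the filter. This is only an illustration under a few assumptions: the class and method names (AjaxHeaderSketch, ReadLength, ReplaceBody) are hypothetical and not part of the filter shown below, the sample response is made up, and the length is counted in characters, which only equals the byte count for single-byte content.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only; not part of the original filter.
public static class AjaxHeaderSketch
{
    // Matches "length|type|id|" at the very start of a partial response.
    private static readonly Regex Header = new Regex(
        @"^(?<length>\d+)(?<rest>\|[^\|]*\|[^\|]*\|)", RegexOptions.Singleline);

    // Returns the body length declared in the header, or -1 if there is no header.
    public static int ReadLength(string response)
    {
        Match m = Header.Match(response);
        return m.Success
            ? int.Parse(m.Groups["length"].Value, CultureInfo.InvariantCulture)
            : -1;
    }

    // Translates the body of the first block and rewrites its length field.
    // Assumption: the length counts characters, not bytes.
    public static string ReplaceBody(string response, Func<string, string> translate)
    {
        Match m = Header.Match(response);
        if (!m.Success)
            throw new FormatException("No Ajax header found");

        int length = int.Parse(m.Groups["length"].Value, CultureInfo.InvariantCulture);
        int bodyStart = m.Length;
        if (bodyStart + length > response.Length)
            throw new FormatException("Body is not complete yet");

        string body = response.Substring(bodyStart, length);
        // Footer '|' plus everything after it (viewstate and so on) stays untouched.
        string tail = response.Substring(bodyStart + length);
        string newBody = translate(body);

        return newBody.Length.ToString(CultureInfo.InvariantCulture)
            + m.Groups["rest"].Value + newBody + tail;
    }
}
```

For a made-up response such as "4|updatePanel|p1|#hi#|rest", replacing the token #hi# with hello gives "5|updatePanel|p1|hello|rest": the body changes and the length field follows it, while the tail is passed through as-is.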
The filter
Knowing all of this, we can start by creating our filter.
/// <summary>
/// The <c>AjaxTranslationFilter</c> class
/// </summary>
public class AjaxTranslationFilter : HttpFilter
{
private StringBuilder _responseHtml;
private int _contentLength = 0;
private bool _partOne = true;
/// <summary>
/// Initializes a new instance of the <see cref="AjaxTranslationFilter"/> class.
/// </summary>
/// <param name="stream">The stream on which the filter will work.</param>
public AjaxTranslationFilter(Stream stream)
: base(stream)
{
}
/// <summary>
/// When overridden in a derived class, writes a sequence of bytes to the current stream and advances
/// the current position within this stream by the number of bytes written.
/// </summary>
/// <param name="buffer">An array of bytes. This method copies count bytes from buffer to the current stream.</param>
/// <param name="offset">The zero-based byte offset in buffer at which to begin copying bytes to the current stream.</param>
/// <param name="count">The number of bytes to be written to the current stream.</param>
public override void Write(byte[] buffer, int offset, int count)
{
string newContent = null;
// Note that this method is potentially called several times by ASP.NET
// The buffer is not written at once, but depending on the size, in blocks of 27~29 Kbytes
//
// Since this is an Ajax call, we'll find the total number of bytes in the 'special' Ajax header
//
// Info
// * http://weblogs.asp.net/leftslipper/archive/2007/02/26/sys-webforms-pagerequestmanagerparsererrorexception-what-it-is-and-how-to-avoid-it.aspx
// * http://www.manuelabadia.com/blog/SyndicationService.asmx/GetRssCategory?categoryName=Ajax
// get buffer content
string strBuffer = System.Text.UTF8Encoding.UTF8.GetString(buffer, offset, count);
// This filter is called in two parts but only the first part contains content that could be translated
if (_partOne)
{
// determine the content length during first run
if (_contentLength == 0)
{
// Check for a valid Ajax header
Regex regEx = new Regex(@"^(?<length>\d+)\|[^\|]*\|[^\|]*\|", RegexOptions.Singleline);
Match m = regEx.Match(strBuffer);
if (m.Success)
{
// Read the length
Group group = m.Groups["length"];
_contentLength = Convert.ToInt32(group.Value);
// initialise the StringBuilder (we assume that translations increase
// the size by 20%)
_responseHtml = new StringBuilder((int)(_contentLength * 1.2));
}
else
throw new SystemException("Unable to parse content length from Ajax header");
}
// Add buffer to total buffer
_responseHtml.Append(strBuffer);
// Is all data received?
if (_responseHtml.Length >= _contentLength)
{
//we have received all the data by now, so we can translate the content
string ajaxContent = _responseHtml.ToString();
// Translate the tokens in the html
string translatedContent = TranslateContent(ajaxContent);
// Calculate new content length
int newContentLength = translatedContent.Length - (_responseHtml.Length - _contentLength);
// Set new content length
Regex regEx2 = new Regex(@"^(?<length>\d+)(?<rest>\|[^\|]*\|[^\|]*\|)", RegexOptions.Singleline);
newContent = regEx2.Replace(translatedContent, newContentLength + "${rest}");
if (translatedContent != null)
{
byte[] data = System.Text.UTF8Encoding.UTF8.GetBytes(newContent);
// Write to the stream
BaseStream.Write(data, 0, data.Length);
}
_partOne = false;
}
}
else
{
// After the first part has been processed, just forward the other content to the browser.
// This can also happen in several calls if this 'rest' data is larger than roughly 28K in total
BaseStream.Write(buffer, offset, count);
}
}
}
When to register the Filter
We know that the content contains two parts. The first part provides the actual content; the second part carries things like viewstate and client-side event binding. As mentioned, for our translations we are only interested in the first part.
If we look at the points in time when these two parts are rendered by Ajax then we can see the following order:
PreRequestHandlerExecute
- - Part One
PostRequestHandlerExecute
ReleaseRequestState
- - Part two
So I figured I would just unregister the filter on the PostRequestHandlerExecute event, but I didn't find any way to do that. Because of this, I had to deal with the concept of part one and two in my filter code.
As opposed to the HtmlTranslationFilter of my previous article, we can't register the filter on ReleaseRequestState, since that would be too late to receive part one. However, when we register our filter before the Page/Control request handler (as explained here), we don't know our content type yet and we can't check for "text/plain" content types. To solve this dilemma, we decide whether to register our AjaxTranslationFilter based on the x-microsoftajax header. Since the header is part of the request, we can register our filter at any point in time before PreRequestHandlerExecute, in our case BeginRequest:
/// <summary>
/// Handles the BeginRequest event of the httpApplication.
/// </summary>
private void httpApplication_BeginRequest(object sender, EventArgs e)
{
// Check if we need an AjaxTranslationFilter.
// Note that the check only requires a request header, therefore we can register this filter early.
// If we registered it too late, we would miss out on
// some data that is handled by the Ajax ScriptModule HttpModule
if (!string.IsNullOrEmpty(_context.Request.Headers["x-microsoftajax"]))
{
// Create a new filter and insert it onto the page
// This filter will later on, translate all content
_context.Response.Filter = new AjaxTranslationFilter(_context.Response.Filter);
}
}
The execution chain then looks like this:
BeginRequest
- - Ajax Filter Registered
PreRequestHandlerExecute
- - Ajax Filter - Part One
PostRequestHandlerExecute
ReleaseRequestState
- - Ajax Filter - Part two
- Enjoy
Suppose you want to translate content by replacing tokens with new words that you maintain, for example, in a central location (db). For maximum flexibility, you decide to create an Http Response Filter.
Basically you write an Http Response Filter like the one below. Note that the filter is based on the base class HttpFilter from Ben Lowery, which you can download here.
public class HtmlTranslationFilter : HttpFilter
{
private Stream _stream;
private StringBuilder _responseHtml;
/// <summary>
/// Initializes a new instance of the <see cref="HtmlTranslationFilter"/> class.
/// </summary>
/// <param name="stream">The stream on which the filter will work.</param>
public HtmlTranslationFilter(Stream stream)
: base(stream)
{
_responseHtml = new StringBuilder();
_stream = stream;
}
/// <summary>
/// When overridden in a derived class, writes a sequence of bytes to the current stream and advances
/// the current position within this stream by the number of bytes written.
/// </summary>
/// <param name="buffer">An array of bytes. This method copies count bytes from buffer to the current stream.</param>
/// <param name="offset">The zero-based byte offset in buffer at which to begin copying bytes to the current stream.</param>
/// <param name="count">The number of bytes to be written to the current stream.</param>
public override void Write(byte[] buffer, int offset, int count)
{
// Note that this method is potentially called several times by ASP.NET
// The buffer is not written at once, but depending on the size, in blocks of 27~29 Kbytes
// So the big question is when do we know that we have received all the content?
//
// if the content is Html, then we scan for the </html> tag
// Get string from the buffer
string strBuffer = System.Text.UTF8Encoding.UTF8.GetString(buffer, offset, count);
// ---------------------------------
// Wait for the closing </html> tag
// ---------------------------------
Regex eof = new Regex("</html>", RegexOptions.IgnoreCase);
if (!eof.IsMatch(strBuffer))
{
_responseHtml.Append(strBuffer);
}
else
{
_responseHtml.Append(strBuffer);
string htmlContent = _responseHtml.ToString();
// Translate the html in the buffer
string translatedContent = TranslateContent(htmlContent);
if (translatedContent != null)
{
byte[] data = System.Text.UTF8Encoding.UTF8.GetBytes(translatedContent);
// Write to the stream
_stream.Write(data, 0, data.Length);
}
}
}
}
If the html sent to the browser is small, let's say 2K bytes, then ASP.Net will pass the complete HTML to our Write method. However, if the HTML is larger than roughly 28K, then ASP.Net will call our Write method several times. With every call, we'll get a chunk of the HTML content. So, if you look at the code, you'll notice that we scan for the end html tag (</html>). As long as we haven't seen an end html tag in the chunk of html, we save up the content into a StringBuilder object. At the end, we'll find our end html tag and then we can assume that all the content has been provided by ASP.Net.
I looked into other techniques to "know" when we have received all of the content, but I didn't find one immediately. I also looked for a setting somewhere where I could change the buffer size to a value larger than 28K, but I found nothing in that area either.
You might ask, why all the fuss to know when we have received the complete Html? Suppose you have a token somewhere like "#token#" and by coincidence, the first chunk served by ASP.Net ends with "#to" and the next chunk starts with "ken# ". Translating each chunk on its own would never see the complete token, which would mess up your translation system.
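To make the split-token problem concrete, here is a small sketch that compares translating chunk by chunk with translating the buffered whole. The class and method names are hypothetical, and a plain string.Replace stands in for the real translation step.

```csharp
using System;
using System.Text;

// Hypothetical demo class for illustration only.
public static class ChunkSplitSketch
{
    // Translates every chunk on its own: a token that is cut in two
    // by a chunk boundary is never seen as a whole and stays untranslated.
    public static string TranslatePerChunk(string[] chunks, string token, string value)
    {
        StringBuilder output = new StringBuilder();
        foreach (string chunk in chunks)
        {
            output.Append(chunk.Replace(token, value));
        }
        return output.ToString();
    }

    // Collects all chunks first and translates once, as the filter does.
    public static string TranslateBuffered(string[] chunks, string token, string value)
    {
        StringBuilder buffer = new StringBuilder();
        foreach (string chunk in chunks)
        {
            buffer.Append(chunk);
        }
        return buffer.ToString().Replace(token, value);
    }
}
```

With the chunks "<p>#to" and "ken#</p>", the per-chunk version returns "<p>#token#</p>" unchanged, while the buffered version returns "<p>word</p>" when #token# maps to word.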
Register the Filter
You register the filter in an HttpModule on the ReleaseRequestState event as described here in my previous post. The ReleaseRequestState event occurs after the Page has rendered the html content and just before the filters are executed. This way we can be sure that we can pick up the ContentType of the Response, which we'll use to filter out html pages only.
/// <summary>
/// Handles the ReleaseRequestState event
/// </summary>
/// <param name="sender">The source of the event.</param>
/// <param name="e">The <see cref="System.EventArgs"/> instance containing the event data.</param>
void httpApplication_ReleaseRequestState(object sender, EventArgs e)
{
// Check if we need an HtmlTranslationFilter.
// Note that the check requires the ContentType which is only known when the content was rendered
// therefore, we register the Html filter on the ReleaseRequestState event
// See also: http://www.dotnet6.com/blogs/erik_lenaerts/archive/2007/07/17/don-t-register-your-http-response-filter-too-soon.aspx
if (_context.Response.ContentType == "text/html")
{
// Create a new filter and insert it onto the page
// This filter will later on, translate all html content
_context.Response.Filter = new HtmlTranslationFilter(_context.Response.Filter);
}
}
In the next posting I'll explain how to write a filter that also works with the Ajax UpdatePanel.
For a while now we have been running into more and more problems with one of the websites I'm working on (www.morres.com). Somewhere in the first quarter of 2007, the site disappeared from Google's radar entirely.
The question is, what went wrong? We spent a great deal of the project budget on making the site search engine friendly, with things like:
- Url rewriting
- Inclusion of industry keywords
- Meta Tags
- Correct use of proper HTML (like H1, H2, etc)
- Full use of alt and title attributes for hyperlinks and images
- a google friendly sitemap
- HTTP 301 redirection of alternative domain names like www.morres.nl, etc
- ...
All of this effort resulted in no listing in Google... a frustrating period I must say :S.
I started with Google Webmaster tools and verified the site by uploading a verification file. After this verification process I got lots of information from the tools.
On the Diagnostics page, I saw that the last "successful" crawl was about 6 months ago and the stated reason was "We can't currently access your home page because of an unreachable error".
In the list of unreachable URLs, the home page www.morres.com showed up with an HTTP 500 code from a crawl a few days earlier. Strangely enough, in my browser, the page www.morres.com showed up just fine... strange, strange. I used Fiddler to check the HTTP result codes; no HTTP 500s?
So, I started digging around in newsgroups, blogs, etc.
Broken Links?
In my search I stumbled upon a tool, Xenu, that reports broken links on your site; a very useful tool indeed. Although I found some broken links, which we fixed of course, none of the links resulted in an HTTP 500.
Canonical server name issues?
In this thread, I learned that Google might access our site without the host name prefix, so as http://morres.com. Luckily, we can straighten out this problem by means of an HTTP 301 (redirection) from http://morres.com to http://www.morres.com on our web servers. Nonetheless, this didn't work either.
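As an illustration of the canonical host name idea only: the sketch below computes the target of such a 301 redirect. The class and method names are hypothetical, the bare host "morres.com" is hard-coded for the example, and the sketch assumes the default port. On the actual site the redirect was configured on the web servers, not in code like this.

```csharp
using System;

// Hypothetical helper for illustration only.
public static class CanonicalHostSketch
{
    // Returns the www URL to redirect to with an HTTP 301,
    // or null when the request already uses a different host.
    // Assumption: default port, so the port is not copied over.
    public static string GetRedirectTarget(Uri requested)
    {
        if (!string.Equals(requested.Host, "morres.com", StringComparison.OrdinalIgnoreCase))
        {
            return null;
        }
        return requested.Scheme + "://www." + requested.Host + requested.PathAndQuery;
    }
}
```

A request for http://morres.com/about?x=1 would be redirected to http://www.morres.com/about?x=1, while a request that already uses www.morres.com yields null and is served normally.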
Validators
After some additional surfing, I came to the idea to validate the HTML output from our web site.
I used the following validators:
- Markup validator: This is the W3C Markup Validation Service, a free service that checks Web documents in formats like HTML and XHTML for conformance to W3C Recommendations and other standards.
- Link Checker: Checks anchors (hyperlinks) in a HTML/XHTML document. Useful to find broken links, etc.
- CSS Validator: validates CSS stylesheets or documents using CSS stylesheets
We corrected some of the HTML and CSS issues (which were in my opinion very tiny little details, but hey, after a while you'll try everything). Again, we saw no notable change in Google's problems.
The link checker is in fact similar to the tool from Xenu. I ran it on the home page without following any second-level links, to keep the results limited. The only problem that was reported was this link:
javascript:history.go(-1)
We provide this back button for the user in our navigation bar, to the left of our breadcrumb. I searched the internet to see if this type of javascript used in an HREF could cause any trouble; however, no one has complained about this. To be on the safe side, we removed this "feature" temporarily. (I actually wonder if people ever use it at all.)
Spider simulator
I wanted to know how Googlebot (Google's spider/crawler) sees our pages, and therefore I ran this spider simulator. A very nice simulator and especially interesting for keyword analysis. However, I could not see any problems reported by the simulator.
ASP.Net 2.0 redirections
EUREKA It seems that ASP.Net 2.0 contains an error when it comes to its Url Redirection technique based on RewritePath. This method seems to work well for certain User Agents but not for all of them. Guess what, Googlebot was one of the User Agents where things went wrong.
You can read detailed information on this subject here.
I'm working on a Translation system for one of my customers and I used a Http response Filter. The filter basically looks for certain patterns and replaces these with new values. This way we translate some content of the pages.
The filter is registered using an HttpModule like shown in the code below. The goal of this filter is to only translate HTML pages and ignore the others like images, javascripts, css, ...
// Only apply the translation filter on html content, since content like javascript shouldn't be
// translated
if (_context.Response.ContentType == "text/html")
{
// Create a new filter and insert it onto the page
// This filter will later on, translate all content
_context.Response.Filter = new TranslationFilter(_context.Response.Filter);
}
Note that in the example code above, _context is a class-level variable initialized in the Init method of the IHttpModule interface:
/// <summary>
/// Inits the specified application.
/// </summary>
/// <param name="application">The current Http Application</param>
public void Init(System.Web.HttpApplication application)
{
_context = application;
//...
}
The point of this post is to underline WHEN you should register the filter. Look at the Application Life Cycle in the picture below. If you register the filter during BeginRequest, as I did initially, then you won't have any information on the content type via _context.Response.ContentType. At the BeginRequest point in time, the page hasn't been executed yet and hence no content type information is set. So, you need to register the filter later on, for example on ReleaseRequestState.
(Image referenced from this article)
Complete code:
using System;
using System.Collections.Generic;
using System.Text;
using System.Web;
using System.Text.RegularExpressions;
namespace Translation
{
/// <summary>
/// Translation HttpModule
/// </summary>
public class TranslationHttpModule : IHttpModule
{
private System.Web.HttpApplication _context;
/// <summary>
/// Disposes of the resources (other than memory) used by the module that implements <see cref="T:System.Web.IHttpModule"></see>.
/// </summary>
public void Dispose() { }
/// <summary>
/// Inits the specified application.
/// </summary>
/// <param name="application">The current Http Application</param>
public void Init(System.Web.HttpApplication application)
{
_context = application;
_context.ReleaseRequestState += new EventHandler(_context_ReleaseRequestState);
}
void _context_ReleaseRequestState(object sender, EventArgs e)
{
// Only apply the translation filter on html content, since content like javascript shouldn't be
// translated
if (_context.Response.ContentType == "text/html")
{
// Create a new filter and insert it onto the page
// This filter will later on, translate all content
_context.Response.Filter = new TranslationFilter(_context.Response.Filter);
}
}
}
}
On a regular basis we need the thickbox functionality on our ASP.Net web sites. I'm a fan of the thickbox implementation from Cody Lindley. We wanted to make this functionality easier to consume in ASP.Net, so we decided to package it into a Server Control as we did last time with the tooltip control.
Again, I'm not taking any credit here, just spreading the word :)
Thnx Wannes & Didr for helping me out here to develop this control.
Download the complete code (Visual Studio 2005 Solution) here
Download the control's Assembly here
Edit : Source code including button support from Madas
I recently finished a new website in my spare time. The web site is for a friend who owns an interior design store. You can take a look at the site here:
http://www.nieuwinckel.com
To be honest this posting is more to get some link popularity for that site :) anyway, hope it helps ...
Very handy if you always need to start the VS Help to look up those date or number formats....
http://john-sheehan.com/blog/wp-content/uploads/msnet-formatting-strings.pdf
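As a reminder of the kind of format strings such a cheat sheet covers, here is a minimal sketch. The class and method names are hypothetical, and the invariant culture is used so that the output does not depend on the machine's regional settings.

```csharp
using System;
using System.Globalization;

// Hypothetical demo class for illustration only.
public static class FormatStringSketch
{
    // Custom date format: four-digit year, two-digit month and day, 24-hour time.
    public static string FormatDate(DateTime value)
    {
        return value.ToString("yyyy-MM-dd HH:mm", CultureInfo.InvariantCulture);
    }

    // Standard numeric format "N2": thousands separators and two decimals.
    public static string FormatNumber(double value)
    {
        return value.ToString("N2", CultureInfo.InvariantCulture);
    }

    // Standard numeric format "D5": integer padded with zeros to five digits.
    public static string FormatPadded(int value)
    {
        return value.ToString("D5", CultureInfo.InvariantCulture);
    }
}
```

For example, FormatDate(new DateTime(2007, 7, 17, 9, 5, 0)) gives "2007-07-17 09:05", FormatNumber(1234.5) gives "1,234.50" and FormatPadded(42) gives "00042".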