Erik Lenaerts

Do, or do not. There is no try. - Yoda

Translate content using a HttpResponse filter

Suppose you want to translate content by replacing tokens with words that you maintain in a central location, for example a database. For maximum flexibility, you decide to do this in an HTTP response filter.

Basically, you write an HTTP response filter like the one below. Note that the filter derives from the HttpFilter base class by Ben Lowery, which you can download here.

    using System.IO;
    using System.Text;
    using System.Text.RegularExpressions;

    public class HtmlTranslationFilter : HttpFilter
    {
        private Stream _stream;
        private StringBuilder _responseHtml;

        /// <summary>
        /// Initializes a new instance of the <see cref="HtmlTranslationFilter"/> class.
        /// </summary>
        /// <param name="stream">The stream on which the filter will work.</param>
        public HtmlTranslationFilter(Stream stream)
            : base(stream)
        {
            _responseHtml = new StringBuilder();
            _stream = stream;
        }

        /// <summary>
        /// When overridden in a derived class, writes a sequence of bytes to the current stream and advances
        /// the current position within this stream by the number of bytes written.
        /// </summary>
        /// <param name="buffer">An array of bytes. This method copies count bytes from buffer to the current stream.</param>
        /// <param name="offset">The zero-based byte offset in buffer at which to begin copying bytes to the current stream.</param>
        /// <param name="count">The number of bytes to be written to the current stream.</param>
        public override void Write(byte[] buffer, int offset, int count)
        {
            // Note that this method is potentially called several times by ASP.NET.
            // The response is not written at once but, depending on its size, in blocks of 27~29 Kbytes.
            // So the big question is: when do we know that we have received all the content?
            //
            // If the content is HTML, we scan for the </html> tag.

            // Get string from the buffer
            string strBuffer = Encoding.UTF8.GetString(buffer, offset, count);

            // ---------------------------------
            // Wait for the closing </html> tag
            // ---------------------------------
            Regex eof = new Regex("</html>", RegexOptions.IgnoreCase);

            if (!eof.IsMatch(strBuffer))
            {
                // Not the last chunk yet: keep collecting
                _responseHtml.Append(strBuffer);
            }
            else
            {
                _responseHtml.Append(strBuffer);
                string htmlContent = _responseHtml.ToString();

                // Translate the html in the buffer
                // (the TranslateContent method is not shown in this post)
                string translatedContent = TranslateContent(htmlContent);

                if (translatedContent != null)
                {
                    byte[] data = Encoding.UTF8.GetBytes(translatedContent);

                    // Write to the stream
                    _stream.Write(data, 0, data.Length);
                }
            }
        }
    }

If the HTML sent to the browser is small, let's say 2 KB, ASP.NET passes the complete HTML to our Write method in one call. However, if the HTML is larger than roughly 28 KB, ASP.NET calls our Write method several times, and every call delivers one chunk of the HTML content. So, if you look at the code, you'll notice that we scan each chunk for the closing html tag (</html>). As long as we haven't seen it, we save up the content in a StringBuilder object. Once a chunk contains the closing tag, we assume that ASP.NET has provided all the content. Note that the check looks at one chunk at a time, so a closing tag that is itself split across two chunks would be missed.

I looked into other techniques to "know" when we have received all of the content, but I didn't find one right away. I also looked for a setting that would let me raise the buffer size above 28 KB, but found nothing there either.
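One alternative that may be worth testing is to collect the raw bytes in Write and do the work only when the filter is closed. The sketch below rests on an assumption that should be verified on your ASP.NET version: that ASP.NET calls Close on the installed filter once the response is complete. `BufferingFilter` and its `transform` parameter are hypothetical names for illustration; the class derives directly from Stream rather than from HttpFilter so that it can be tried outside ASP.NET.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical sketch, not the filter from this post: buffers the raw bytes
// and transforms the complete content once, when the stream is closed.
public class BufferingFilter : Stream
{
    private readonly Stream _inner;
    private readonly MemoryStream _buffer = new MemoryStream();
    private readonly Func<string, string> _transform;
    private bool _closed;

    public BufferingFilter(Stream inner, Func<string, string> transform)
    {
        _inner = inner;
        _transform = transform;
    }

    public override void Write(byte[] buffer, int offset, int count)
    {
        // Keep the bytes untouched; decoding each chunk separately could
        // split a multi-byte UTF-8 character in two.
        _buffer.Write(buffer, offset, count);
    }

    public override void Close()
    {
        if (!_closed)
        {
            _closed = true;

            // Decode, transform and forward the complete content in one go.
            string content = Encoding.UTF8.GetString(_buffer.ToArray());
            byte[] data = Encoding.UTF8.GetBytes(_transform(content));
            _inner.Write(data, 0, data.Length);
            _inner.Flush();
        }
        base.Close();
    }

    // Nothing is forwarded before Close, so Flush has nothing to do.
    public override void Flush() { }

    public override bool CanRead { get { return false; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return true; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }
    public override int Read(byte[] buffer, int offset, int count) { throw new NotSupportedException(); }
    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }
}
```

Wrapping a MemoryStream in a BufferingFilter, writing the content in several pieces and then calling Close should leave the transformed content in the MemoryStream. The trade-off is that the whole response is held in memory and nothing reaches the browser until the end, which is also the case for the StringBuilder approach.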

You might ask: why all the fuss about knowing when we have received the complete HTML? Suppose you have a token somewhere like "#token#" and, by coincidence, the first chunk served by ASP.NET ends with "#to" and the next chunk starts with "ken#". A translation that runs on each chunk separately never sees the complete token, so the token stays untranslated and messes up your translation system.
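The split-token problem is easy to reproduce outside ASP.NET. The sketch below is only an illustration: `TokenTranslator`, its in-memory dictionary (a stand-in for the central database) and the regular expression for the "#token#" syntax are assumptions, not the TranslateContent method that the filter above calls.

```csharp
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original filter: replaces #key# tokens
// with values from a dictionary that stands in for the database.
public static class TokenTranslator
{
    private static readonly Regex Token = new Regex("#(\\w+)#");

    public static string Translate(string html, IDictionary<string, string> words)
    {
        // Unknown tokens are left untouched so they stay visible in the output.
        return Token.Replace(html, delegate(Match m)
        {
            string value;
            return words.TryGetValue(m.Groups[1].Value, out value) ? value : m.Value;
        });
    }

    // Translates every chunk on its own, as a filter without buffering would.
    public static string TranslatePerChunk(IEnumerable<string> chunks, IDictionary<string, string> words)
    {
        StringBuilder result = new StringBuilder();
        foreach (string chunk in chunks)
        {
            result.Append(Translate(chunk, words));
        }
        return result.ToString();
    }

    // Collects all chunks first and translates the complete text once.
    public static string TranslateBuffered(IEnumerable<string> chunks, IDictionary<string, string> words)
    {
        StringBuilder buffer = new StringBuilder();
        foreach (string chunk in chunks)
        {
            buffer.Append(chunk);
        }
        return Translate(buffer.ToString(), words);
    }
}
```

With the chunks "&lt;p&gt;#to" and "ken#&lt;/p&gt;" written as plain strings `"<p>#to"` and `"ken#</p>"`, and a dictionary that maps "token" to "hello", TranslatePerChunk leaves "#token#" in the output, while TranslateBuffered replaces it with "hello".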

 

Register the Filter

You register the filter in an HttpModule on the ReleaseRequestState event, as described here in my previous post. The ReleaseRequestState event occurs after the page has rendered the HTML content and just before the filters are executed. This way we can be sure to pick up the ContentType of the response, which we use to select HTML pages only.

/// <summary>
/// Handles the ReleaseRequestState event
/// </summary>
/// <param name="sender">The source of the event.</param>
/// <param name="e">The <see cref="System.EventArgs"/> instance containing the event data.</param>
void httpApplication_ReleaseRequestState(object sender, EventArgs e)
{
    // Check if we need an HtmlTranslationFilter.
    // Note that the check requires the ContentType, which is only known once the content has been rendered;
    // therefore, we register the Html filter on the ReleaseRequestState event.
    // See also: http://www.dotnet6.com/blogs/erik_lenaerts/archive/2007/07/17/don-t-register-your-http-response-filter-too-soon.aspx
    if (_context.Response.ContentType == "text/html")
    {
        // Create a new filter and insert it onto the page.
        // This filter will, later on, translate all html content.
        _context.Response.Filter = new HtmlTranslationFilter(_context.Response.Filter);
    }
}

 

In the next post I'll explain how to write a filter that also works with the AJAX UpdatePanel.

Comments

dajo said:

Erik, thanks for these posts about using HttpResponse to handle translations.  I'm going to use this for a multilingual site I am working on.  

Because I'll be farming out the translation, I'm thinking about loading the filter's translation pairs from standard XLIFF documents when the application starts.  XLIFF is XML, so it can easily be updated.  I need to learn more about XLIFF from my translation people.

I'm also thinking about the best way to tag the text in the document that needs translating.  One way might be to surround text with pound signs.  Another might be to surround the text with a <span class="xlat">, or use a custom tag like <xlat>..</xlat>.  The pound signs would be easier to find and match than any tags, because there could be embedded <span> tags.

# December 27, 2007 11:16 AM

ErikL said:

hi dajo,

in my implementation we used translation content from a database, basically key/value pairs. When we retrieve these values, we take the culture into account so we can translate the placeholders into culture-specific content.

Our placeholders are based on the following syntax:

@@NameOfCode:[FormatSpecifier]:[length]:[Param1[&Param2][...]]#

-where NameOfCode is like the Key.

- A format specifier lets us format the content, for example in a short or a long format. Sometimes, when translating column headers for example, we want a short version of the label instead of the long one: something like "custnr." instead of "customer number".

- we use the length to truncate the content if it gets too long and add an ellipsis to it, very useful for tooltips.

- params work much like string.Format, where we use parameter substitution in the content.

happy to hear the posting was helpful :)

cheers

# January 9, 2008 5:05 AM

Fernando said:

Thanks man!!

# June 21, 2008 5:25 PM