The basic idea is to create a static "secondary site" showing EXACTLY the same content as the primary one. If a bot reaches the secondary site, it can index it well because the pages are static.
If a user lands on a specific page crawled by the bot, they are simply redirected to the dynamic version of the site, which shows the same content.
Here are some key tools I'm going to use:
- OpenSymphony SiteMesh: a good decorating filter. It would allow us to wrap the entire page in a template. For example, we could emit a navigation header/footer for bots and old browsers and a redirect statement for modern browsers, deciding on the User-Agent.
- Some Java-based CMS for generating/accessing content. Not decided yet. Using a wiki engine would be even better.
- Google Web Toolkit, surely.
- Tomcat as a servlet/JSP container.
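The User-Agent decision mentioned for SiteMesh could be sketched roughly like this. It is only a minimal illustration, not a finished detector: the class name `UserAgentClassifier`, the method `looksLikeBot`, and the marker list are all hypothetical, and the User-Agent header is easy to spoof, so a real setup would need something more robust.

```java
import java.util.List;
import java.util.Locale;

/** Hypothetical helper: guesses from the User-Agent header whether a request comes from a crawler. */
final class UserAgentClassifier {

    // Illustrative, incomplete list of substrings seen in crawler User-Agent strings (an assumption).
    private static final List<String> BOT_MARKERS =
            List.of("googlebot", "bingbot", "slurp", "crawler", "spider");

    private UserAgentClassifier() {
    }

    /** Returns true if the header is missing or contains one of the known crawler markers. */
    static boolean looksLikeBot(String userAgent) {
        if (userAgent == null || userAgent.trim().isEmpty()) {
            // No header at all: fall back to the static page, which works for any client.
            return true;
        }
        String ua = userAgent.toLowerCase(Locale.ROOT);
        for (String marker : BOT_MARKERS) {
            if (ua.contains(marker)) {
                return true;
            }
        }
        return false;
    }
}
```

Inside a servlet filter or decorator, the value passed in would typically come from `request.getHeader("User-Agent")`; when `looksLikeBot` returns false, the template would emit the redirect to the dynamic version instead of the static navigation.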
So, wait for a demo. I hope I'll have a few spare hours to get my hands on this interesting stuff.
By the way, there's one interesting problem I can see already: if I put static content inside an HTML panel, this will cause problems with navigation and links, especially for a wiki. Any displayed link would lead us out of the dynamic application, which is not desired.
An alternative solution is to put the content into an IFRAME element. I don't think I really like it, but...
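A third option, which I have not tried, would be to rewrite the links before the static content goes into the HTML panel, so that site-internal links become fragment (history token) links and stay inside the dynamic application. A rough sketch, assuming internal links are written as double-quoted, site-relative `href="/..."` attributes; the class `LinkRewriter` and the `#/path` token format are made up for illustration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: turns site-relative links into fragment links handled by the dynamic app. */
final class LinkRewriter {

    // Matches href="/path" but not protocol-relative href="//host/..." or links that already contain '#'.
    private static final Pattern INTERNAL_HREF =
            Pattern.compile("href=\"(/(?!/)[^\"#]*)\"");

    private LinkRewriter() {
    }

    /** Rewrites href="/wiki/Home" to href="#/wiki/Home"; absolute and external links are left untouched. */
    static String toHistoryLinks(String html) {
        Matcher matcher = INTERNAL_HREF.matcher(html);
        return matcher.replaceAll("href=\"#$1\"");
    }
}
```

The dynamic application would then have to react to the fragment change and load the matching content itself. A regular expression is fragile on arbitrary markup, so a real HTML parser would be safer for anything beyond a demo.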
 

 
 
 
3 comments:
Sometimes Googlebot verifies web content by sending a different header instead of the Googlebot one, for example a Mozilla one.
How will this work in that case?
I think I'll need to develop (or better, search for) a valid and easy automated Turing test.
:)
Decision on User-Agent should really say 'Decision of whether this user is human or bot.'
That is a good task.
Thank you for the idea, Dima!
Your post was really admirable; thank you so much for such an informative piece of content. Please check out some more information regarding what is serps.