Friday 17 February 2012

Search result URLs not correctly mapped to the current zone when the content is crawled in a different farm

This one is just a nice-to-know. I didn't know about this setting, so I'll briefly document it here for future reference.

Situation
The set-up consists of a typical enterprise services farm (http://technet.microsoft.com/en-us/library/cc560988.aspx#Enterprise):

  • Farm A hosts the collaboration web-applications, in which the users upload their documents and other content.
  • Farm B is a remote services farm and hosts the cross-farm service applications. This includes all of our search service application(s). These service applications are published and then consumed on Farm A. 
Farm B crawls farm A using the web-application's default zone url, e.g. http://sps2010. When a user performs a search in farm A on the default zone, all of his search results link back to the default zone. So far so good.

The problem
Now lets say we add a new Intranet AAM zone on farm A's web-applications, which points to http://sps2010intra. When a user performs a search from that zone, all of his search results will again refer to the default zone!

The explanation
This is actually pretty logical: the content is crawled on farm B, using the default zone's URL.
In a single farm setup, the search service application will have access to the crawled web-application's alternate access mappings and will take those into account when a query is done on the crawled content.
Here however, the search service application is hosted on a different farm (farm B) and that search service application doesn't know how the alternate access mapping is set up on farm A!

The solution
When you google this problem, you will get a lot of posts saying that you need to use the Server Name Mapping setting on your search service application. There's a good overview of this setting here.

Personally I would not recommend using that approach, as it only works for one AAM zone.  There's no way to map multiple zones (intranet AND internet for instance) using the server name mapping route.

A much more flexible solution is to mimic the crawled content farm's alternate access mappings on the remote services farm. If you did set-up server name mappings to fix this: delete them and recrawl! ;)
  1. Open the central admin site on your remote services farm (farm B).
  2. Click on Application Management and select Configure alternate access mappings (in the web applications section).
  3. Now click on Map to external resource in the toolbar. This little bugger allows you to create an alternate access mapping for resources that aren't in the farm.
  4. Give it a meaningful name, like Collab {Name of the crawled webApp}. Enter the crawled web-application's default zone URL in the URL protocol, host and port field.
  5. Click on Save to create the external resource AAM. Now you can set this resource up as if it were a web-application on your remote services farm!
  6. Use Edit Public URLs and Add Internal URLs to configure the external resource's AAM just like it is set up on your crawled farm (farm A).
That's it ... no need to recrawl or anything. If all was done well, search results on the content farm (Farm A) will now be correctly translated according to their zone, even while the content was indexed on another farm.





No comments:

Post a Comment