Monday, May 11, 2009

Google Search Appliance experience

Over the past few months I've gotten intimately acquainted with the Google Search Appliance (gsa), which is basically a version of Google you can purchase to index and search your organization's content. The Google search feature we currently offer our users is good, but it could be better, and the road here was filled with obstacles.

When we first got the appliance there was this sense that not only was it "plug and play" but that as soon as we plugged it into our network it was just going to go out and search everything, potentially choking our network. While the gsa can generate a good deal of traffic, the plug-and-play part couldn't be further from the truth. We actually spent months researching and, through trial and error, trying to get the gsa to index our content, including a legacy document management system.

Our system happens to rely on a combination of Windows network authentication and cookies, both of which, at first glance, the documentation suggests the gsa can support. What they don't seem to tell you is that these two authentication methods cannot be used in tandem. You have to pick one or the other. The other thing they don't tell you is that when they say the gsa supports cookies, what they really mean is that it doesn't support cookies, but if you really have to use them they will let you through a convoluted scheme. You direct the gsa to a login form, which may or may not exist on your system. The form generates a cookie, which the gsa stores and uses from then until some set point in the future.
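To make that cookie scheme a little more concrete, here is a rough Python sketch of the behavior as I understand it. This is not gsa code or configuration. The class, the `login` callback and the lifetime value are all made up for illustration, and the real appliance is set up through its admin console rather than anything like this.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class StoredCookie:
    value: str
    expires_at: float  # timestamp after which the cookie is thrown away


class FormLoginCrawler:
    """Hypothetical crawler: log in through a form once, reuse the cookie."""

    def __init__(self, login: Callable[[], str], lifetime_seconds: float,
                 clock: Callable[[], float] = time.time):
        self._login = login              # stands in for submitting the login form
        self._lifetime = lifetime_seconds
        self._clock = clock              # injectable so expiry can be tested
        self._cookie: Optional[StoredCookie] = None
        self.login_count = 0

    def cookie(self) -> str:
        now = self._clock()
        # Log in again only when there is no cookie or the stored one expired.
        if self._cookie is None or now >= self._cookie.expires_at:
            self.login_count += 1
            self._cookie = StoredCookie(self._login(), now + self._lifetime)
        return self._cookie.value

    def fetch(self, url: str) -> dict:
        # Describe the request that would be sent; no network access here.
        return {"url": url, "headers": {"Cookie": self.cookie()}}
```

The point of the sketch is that the cookie is obtained once from the login form and then replayed on every request until it expires, which is quite different from a browser negotiating cookies as it goes.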

What we ended up doing was giving the gsa a Windows network login and then coding around the system's use of cookies for that user. This was one of the many code changes we have had to make to accommodate the gsa. More on this later, but for now I'm off to the Silverstripe CMS talk and indyhall classic.