Hybrid search is not a new concept in the SharePoint world. It was introduced in SP 2013 to enable hybrid search scenarios between SharePoint Online and SharePoint on-premises environments.
Last year I worked on a SharePoint hybrid cloud solution for a customer in Western Europe. Configuring the search solution in SP 2013 with Office 365 (SharePoint Online) was complex. It requires two configurations: inbound search and outbound search.
Inbound search means users search from the SharePoint Online environment, whereas outbound search means users search from the SharePoint on-premises environment. Outbound search is relatively easy to configure, whereas inbound search requires additional work, such as setting up a reverse proxy.
Despite this complexity, the solution does not really solve the business scenario. It only federates search results from the other environment, which is called query federation. That gives seamless integration, but it does not provide unified search results. When I demoed it to the customer, some stakeholders asked how to get unified search results with the default ranking, so that end users do not have to look in more than one place, even though we had exposed both result sets through a single interface.
New Cloud Hybrid Search
This architecture has a number of advantages:
- All content, including on-premises content, is indexed in SharePoint Online. On-premises content is encrypted while it is transferred from SharePoint on-premises to SharePoint Online.
- Admins do not need to worry about the size of the search index. Only the crawl component needs to be planned on-premises; the content processing and query components are managed by SharePoint Online.
- Users get the newest SharePoint Online search experience without the on-premises environment needing an upgrade.
- When upgrading to the next version, there is no search index to migrate.
How does Hybrid Cloud Search work?
To enable hybrid cloud search in SP 2016, first configure the cloud Search service application (Cloud SSA). It is very similar to the Search service application in SP 2013, but it only crawls and parses the content. It does not do any content processing or indexing; SharePoint Online takes care of those.
What happens during crawling and indexing?
Once a crawl of the content source is started from Central Administration, the following happens:
- Cloud SSA – The crawl component goes to the content source and downloads the content.
- It parses the content based on the file type. In SP 2013, parsing happens during content processing; here it happens during crawling. All required IFilters must be installed in the crawl environment. Otherwise, files of unknown types are skipped.
- The crawler extracts the metadata and text of the content.
- The crawler encrypts the metadata and text and sends them to Office 365 in batches over a secure connection.
- SharePoint Online content processing is the same as in SP 2013, except for ACL mapping from on-premises identities to Azure AD (AAD) identities. The SIDs of on-premises users are mapped to AAD PUIDs, and group IDs are mapped to Object IDs.
Note: Everyone and Authenticated Users are mapped to Everyone except external users.
- The processed content, together with its mapped ACLs for security trimming, is put into the index.
- An acknowledgement is sent back to the on-premises crawl component.
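The crawl that kicks off the steps above can also be started from PowerShell instead of Central Administration. The following is a minimal sketch, assuming it runs in the SharePoint Management Shell on a server in the farm that hosts the Cloud SSA. The names "CloudSSA" and "Local SharePoint sites" are placeholders, not values from this setup.

```powershell
# Assumption: run on a SharePoint server in the farm that hosts the Cloud SSA.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Placeholder names; replace with your own Cloud SSA and content source.
$ssaName = "CloudSSA"
$contentSourceName = "Local SharePoint sites"

$ssa = Get-SPEnterpriseSearchServiceApplication -Identity $ssaName
$cs  = Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa -Identity $contentSourceName

# Start a full crawl of the content source.
$cs.StartFullCrawl()

# Poll the crawl status until the content source is idle again.
do {
    Start-Sleep -Seconds 30
    $cs = Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa -Identity $contentSourceName
    Write-Host "Crawl status: $($cs.CrawlStatus)"
} while ($cs.CrawlStatus -ne "Idle")
```

The loop only reports when the on-premises crawl has finished; the content can take a little longer to appear in SharePoint Online search results.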
What happens inside the query component?
The query component works the same way as in SP 2013 and SharePoint Online; nothing has changed. However, it is important to understand how the on-premises user account is mapped when search results are queried.
Scenario 1: User searching content in SharePoint Online.
This is a straightforward scenario: when the user searches, the SharePoint Online query component serves the results, which are displayed on the search results page. SharePoint Online honors all security.
Scenario 2: User searching on-premises content.
When a user issues a query for on-premises content, the query has to be executed against the SPO index while still respecting access to the content. Internally, when the query is issued, the user's claims are also sent to SPO. The user token is then rehydrated with online claims, because the user is authenticated against Office 365.
Configure the Hybrid Cloud SSA
Configuring the hybrid Cloud SSA is not as difficult as the previous hybrid search setup. The high-level steps are:
- Synchronize the on-premises user accounts with Office 365. Azure AD Connect is the latest tool for synchronizing on-premises users with Azure AD. The detailed steps are documented here.
- Create the Cloud SSA – As already explained, the Cloud SSA is a Search service application that pushes the crawled metadata properties and text to SPO over a secure connection. Microsoft provides a script for this; I ran it in my environment and it is very straightforward.
Note: When CloudIndex is true, the content is not indexed in the on-premises environment.
Example : $searchApp = New-SPEnterpriseSearchServiceApplication -Name $SearchServiceAppName -ApplicationPool $appPool -DatabaseServer $DatabaseServerName -CloudIndex $true
- Connect the Cloud SSA to Office 365
To establish the connection, run the script OnBoard-CloudHybridSearch.ps1 with the following parameters: Office 365 URL, Cloud SSA name, and tenant admin credentials.
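The last two steps can be sketched as follows: first confirm that the SSA really was created as a cloud SSA, then run the onboarding script. This is a hedged sketch, not the official procedure. The SSA name and tenant URL are placeholders, and the parameter names -PortalUrl and -CloudSsaId are assumptions based on the script version I have seen, so check them against the copy you downloaded (for example with Get-Help .\OnBoard-CloudHybridSearch.ps1).

```powershell
# Assumption: run in the SharePoint Management Shell, from the folder that
# contains the downloaded OnBoard-CloudHybridSearch.ps1 script.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Placeholder values; replace with your own.
$ssaName   = "CloudSSA"
$portalUrl = "https://contoso.sharepoint.com"

# Verify that the SSA was created with -CloudIndex $true before onboarding.
$ssa = Get-SPEnterpriseSearchServiceApplication -Identity $ssaName
if (-not $ssa.CloudIndex) {
    throw "Search service application '$ssaName' is not a cloud SSA (CloudIndex is not true)."
}

# Assumed parameter names (-PortalUrl, -CloudSsaId); verify against your script version.
# Depending on the version, the tenant admin credentials are either prompted for
# or passed through an additional credential parameter.
.\OnBoard-CloudHybridSearch.ps1 -PortalUrl $portalUrl -CloudSsaId $ssaName
```

Checking CloudIndex first is a cheap guard: if it is not true, the SSA has to be recreated with -CloudIndex $true rather than onboarded.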
About IsExternalContent Property
A common question is how to tell on-premises content from online content. The managed property IsExternalContent makes this easy.
For on-premises content only -> “{?{searchTerms} IsExternalContent:true}”.
For online content only -> “{?{searchTerms} NOT IsExternalContent:true}”.
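The same filter can be applied when querying the SharePoint Online search REST endpoint directly. The sketch below only builds the request URLs and sends nothing, so it needs no SharePoint cmdlets. The tenant URL, the search terms, and the helper function Get-SearchRestUrl are illustrative assumptions; authenticating the request is out of scope here.

```powershell
# Placeholder tenant URL and example search terms.
$tenantUrl   = "https://contoso.sharepoint.com"
$searchTerms = "quarterly report"

# Append the IsExternalContent filter to the user's search terms.
$onPremQuery = "$searchTerms IsExternalContent:true"
$onlineQuery = "$searchTerms NOT IsExternalContent:true"

# Hypothetical helper: builds a search REST URL for the given query text.
function Get-SearchRestUrl {
    param(
        [string]$BaseUrl,
        [string]$QueryText
    )
    # URL-encode the query text; the REST API expects it wrapped in single quotes.
    $encoded = [uri]::EscapeDataString($QueryText)
    return "$BaseUrl/_api/search/query?querytext='$encoded'"
}

$onPremUrl = Get-SearchRestUrl -BaseUrl $tenantUrl -QueryText $onPremQuery
$onlineUrl = Get-SearchRestUrl -BaseUrl $tenantUrl -QueryText $onlineQuery

Write-Output $onPremUrl
Write-Output $onlineUrl
```

For a search results page, putting the filter in the result source or web part query transform, as shown above, is usually the simpler option; the REST form is mainly useful for custom applications.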
With this approach, some customizations are supported and others are not.
- BCS connector – Many enterprise customers use the BCS connector to index LOB systems. This continues to be supported in this model.
- Custom IFilters – Yes. There are many third-party IFilters for crawling and extracting content.
- Partner connectors – BA Insight has many connectors to various LOB systems. These continue to be supported.
- Custom security trimming
- Custom Entity extraction
- Content enrichment web service.
The last three items (custom security trimming, custom entity extraction, and the content enrichment web service) are not supported in hybrid cloud search scenarios.