To create or edit a crawl rule
1. Verify that the user account that is performing this procedure is an administrator for the Search service application.
2. In Central Administration, in the Application Management section, click Manage Service Applications.
3. On the Manage Service Applications page, in the list of service applications, click the Search service application.
4. On the Search Administration page, in the Crawling section, click Crawl Rules. The Manage Crawl Rules page appears.
5. To create a new crawl rule, click New Crawl Rule. To edit an existing crawl rule, in the list of crawl rules, point to the name of the crawl rule that you want to edit, click the arrow that appears, and then click Edit.
6. On the Add Crawl Rule page, in the Path section:
- In the Path box, type the path to which the crawl rule will apply. You can use standard wildcard characters in the path.
- To use regular expressions instead of wildcard characters, select Use regular expression syntax for matching this rule.
7. In the Crawl Configuration section, select one of the following options:
- Exclude all items in this path. Select this option if you want to exclude all items in the specified path from crawls. If you select this option, you can refine the exclusion by selecting the following:
  - Exclude complex URLs (URLs that contain question marks - ?). Select this option if you want to exclude URLs that contain parameters that use the question mark (?) notation.
- Include all items in this path. Select this option if you want all items in the path to be crawled. If you select this option, you can further refine the inclusion by selecting any combination of the following:
  - Follow links on the URL without crawling the URL itself. Select this option if you want to crawl links contained within the URL, but not the starting URL itself.
  - Crawl complex URLs (URLs that contain a question mark (?)). Select this option if you want to crawl URLs that contain parameters that use the question mark (?) notation.
  - Crawl SharePoint content as http pages. Normally, SharePoint sites are crawled by using a special protocol. Select this option if you want SharePoint sites to be crawled as HTTP pages instead. When the content is crawled by using the HTTP protocol, item permissions are not stored.
8. In the Specify Authentication section, perform one of the following actions:
- To use the default content access account, select Use the default content access account.
- If you want to use a different account, select Specify a different content access account and then perform the following actions:
  1. In the Account box, type the user account name that can access the paths that are defined in this crawl rule.
  2. In the Password and Confirm Password boxes, type the password for this user account.
  3. To prevent basic authentication from being used, select the Do not allow Basic Authentication check box. The server first attempts to use NTLM authentication. If NTLM authentication fails, the server falls back to basic authentication unless this check box is selected.
- To use a client certificate for authentication, select Specify client certificate, expand the Certificate menu, and then select a certificate.
- To use form credentials for authentication, select Specify form credentials, type the form URL (the location of the page that accepts credentials information) in the Form URL box, and then click Enter Credentials. When the logon prompt from the remote server opens in a new window, type the form credentials with which you want to log on. You are prompted if the logon was successful. If the logon was successful, the credentials that are required for authentication are stored on the remote site.
- To use cookies, select Use cookie for crawling, and then select either of the following options:
  - Obtain cookie from a URL. Select this option to obtain a cookie from a website or server.
  - Specify cookie for crawling. Select this option to import a cookie from your local file system or a file share. You can optionally specify error pages in the Error pages (semi-colon delimited) box.
- To allow anonymous access, select Anonymous access.
9. Click OK.
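The Path options in this procedure (wildcard versus regular expression matching) and the "complex URL" options can be pictured with a short Python sketch. This is only an illustration, not SharePoint's actual matching code: it assumes that `*` is the only wildcard and that it matches any run of characters, and the function names and intranet URLs are hypothetical.

```python
import re


def path_matches(rule_path: str, url: str, use_regex: bool = False) -> bool:
    """Return True if url matches rule_path (illustrative approximation only)."""
    if use_regex:
        # Corresponds to "Use regular expression syntax for matching this rule".
        pattern = rule_path
    else:
        # Assumption: "*" means "any run of characters"; all else is literal.
        pattern = re.escape(rule_path).replace(r"\*", ".*")
    return re.fullmatch(pattern, url) is not None


def is_complex_url(url: str) -> bool:
    """A 'complex' URL, as described above, is one that contains a question mark."""
    return "?" in url


if __name__ == "__main__":
    # Hypothetical paths used purely for demonstration.
    print(path_matches("http://intranet/sites/hr/*",
                       "http://intranet/sites/hr/policies/leave.aspx"))
    print(is_complex_url("http://intranet/list.aspx?id=7"))
```

A rule that excludes complex URLs would, under these assumptions, skip any URL for which `is_complex_url` returns `True`, even when `path_matches` succeeds.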
To test a crawl rule on a URL
1. Verify that the user account that is performing this procedure is an administrator for the Search service application.
2. In Central Administration, in the Application Management section, click Manage Service Applications.
3. On the Manage Service Applications page, in the list of service applications, click the Search service application.
4. On the Search Administration page, in the Crawling section, click Crawl Rules.
5. On the Manage Crawl Rules page, in the Type a URL and click test to find out if it matches a rule box, type the URL that you want to test.
6. Click Test. The result of the test appears below the Type a URL and click test to find out if it matches a rule box.
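Conceptually, the Test button checks a URL against the list of crawl rules and reports which rule, if any, it matches. The Python sketch below models that idea under two stated assumptions: rules are checked in their listed order and the first match is the one reported, and `*` is the only wildcard. The `CrawlRule` class, `find_matching_rule` function, and the sample URLs are hypothetical and are not part of any SharePoint API.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CrawlRule:
    """Hypothetical stand-in for a crawl rule: a path plus include/exclude."""
    path: str
    include: bool


def _matches(rule: CrawlRule, url: str) -> bool:
    # Assumption: "*" matches any run of characters; everything else is literal.
    pattern = re.escape(rule.path).replace(r"\*", ".*")
    return re.fullmatch(pattern, url) is not None


def find_matching_rule(rules: List[CrawlRule], url: str) -> Optional[CrawlRule]:
    """Return the first rule (in list order) that matches url, or None."""
    for rule in rules:
        if _matches(rule, url):
            return rule
    return None


# Hypothetical rule list: the narrower exclusion is listed before the broad inclusion.
rules = [
    CrawlRule("http://intranet/sites/hr/private/*", include=False),
    CrawlRule("http://intranet/sites/hr/*", include=True),
]
hit = find_matching_rule(rules, "http://intranet/sites/hr/private/salaries.xlsx")
miss = find_matching_rule(rules, "http://extranet/home.aspx")
```

In this sketch `hit` is the exclusion rule, because it is listed first, and `miss` is `None`, because no rule applies to that URL.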
To delete a crawl rule
1. Verify that the user account that is performing this procedure is an administrator for the Search service application.
2. In Central Administration, in the Application Management section, click Manage Service Applications.
3. On the Manage Service Applications page, in the list of service applications, click the Search service application.
4. On the Search Administration page, in the Crawling section, click Crawl Rules.
5. On the Manage Crawl Rules page, in the list of crawl rules, point to the name of the crawl rule that you want to delete, click the arrow that appears, and then click Delete.
6. Click OK to confirm that you want to delete this crawl rule.
To reorder crawl rules
1. Verify that the user account that is performing this procedure is an administrator for the Search service application.
2. In Central Administration, in the Application Management section, click Manage Service Applications.
3. On the Manage Service Applications page, in the list of service applications, click the Search service application.
4. On the Search Administration page, in the Crawling section, click Crawl Rules.
5. On the Manage Crawl Rules page, in the list of crawl rules, in the Order column, specify the position that you want the rule to occupy. The positions of the other rules shift accordingly.
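The effect of changing a value in the Order column can be sketched in Python as moving one item to a new 1-based position while the remaining items keep their relative order. This is an illustration of the described behavior only; `move_rule` and the sample paths are hypothetical, and the sketch assumes each rule path appears once in the list.

```python
from typing import List


def move_rule(rules: List[str], path: str, new_position: int) -> List[str]:
    """Return a new list with `path` moved to the 1-based `new_position`.

    The other rules keep their relative order and shift to fill the gap.
    """
    if path not in rules:
        raise ValueError(f"no rule with path {path!r}")
    if not 1 <= new_position <= len(rules):
        raise ValueError("position must be between 1 and the number of rules")
    reordered = [r for r in rules if r != path]   # take the rule out
    reordered.insert(new_position - 1, path)      # put it back at the new slot
    return reordered


# Hypothetical rule paths, listed in their current order.
order = ["http://a/*", "http://b/*", "http://c/*"]
new_order = move_rule(order, "http://c/*", 1)
```

Here the third rule moves to position 1, and the two rules that were ahead of it each move down one place.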