The Indexing Service on Windows 2000 allows us to create a search engine for our site. Documentation on it, though, is scarce and scattered. There’s plenty out there on how to use it with IDQ/HTX templates, but as far as I can see, there are three fundamental problems with those:
- they are hard to use as they require you to learn a separate language syntax.
- they don’t allow you to use standard ASP code, as they pass through a special DLL filter and not the ASP.DLL. For example, you cannot use include files.
- they are old technology.
In this article, I will take you through what is necessary to get it working on your site. I will go through:
- installing the service,
- pointing it to your web site,
- tuning it for speed and efficiency,
- the ASP code which makes it work,
- and what to do if it fails.
The Indexing Service version 3 is the only one that will run on Windows 2000 at this time (April 19, 2002). It is installed by default. If it was not installed during Windows setup (or you are not sure and want to check), here’s how you would do it:
Start > Settings > Control Panel > Add/Remove Programs > Add/Remove Windows Components
If you see Indexing Service checked, then it’s already installed. Let’s assume it’s not installed yet. Check the Indexing Service and then press Next. Windows will install the necessary files and when it’s done, you need to click on Finish. You might be asked for the Windows 2000 Server installation CD to copy the necessary files.
The service should now be installed. To check on the service, go to:
Start > Programs > Administrative Tools > Computer Management
Then click on:
Services and Applications > Indexing Service
You should be able to expand the service and see the two catalogs which were created by default: a System catalog and a Web catalog (the latter only if setup found an instance of Internet Information Server (IIS) already running on the server).
Creating a new Catalog
A catalog is like a database where the service stores all the information after it is done indexing your files. My recommendation is to delete both catalogs which are created by default. The Web catalog points to C:\Inetpub (a potential security hole), and the System catalog is only useful if you are going to do local searches on the server. If you are using the service only through the internet then it’s safe to delete both of them.
So, let’s say you want to use the service to search against your web site. First, you have to create a web site through the IIS console. I will assume that you know how to do that already and you have a web site up and running. Once you have that running, then you have to tell IIS to Index the web site. You do that through the Home Directory tab of the site Properties. Make sure Index this Resource is checked. If not, check it.
All subfolders of your web site will be indexed if you do this. If you want to exclude a certain folder from being indexed (for example, your images folder), navigate to that folder in the console, go to its properties and uncheck the Index this resource. There is another way to remove folders from a catalog, which I will go through later, but this is the recommended way for websites. This is where a good design of your website is necessary. Put all your images into a folder called images for example, and turn off indexing for it. Do the same for all your content that you want indexed or not indexed. That way, you can just check/uncheck the Index this resource on that folder’s properties and it propagates to all the folders/files under it.
By default, the Indexing Service will index HTML files, text files, Office 95 and later files, internet mail and news, and any other document type for which a filter is provided. For example, Adobe makes its own IFilter which, once installed, helps the service index Acrobat (pdf) files.
The next step is to create a new catalog to house all the information. It’s probably a good idea to create a new folder to use exclusively for your catalog(s). Do not save your catalog under a folder that is being indexed by the same catalog or any other. Your English site could be under C:\catalogs\english, your French at C:\catalogs\french, etc. First create those folders. Then open your Computer Management Console, and right click on the Indexing Service, or click on Action on top, and go to New > Catalog.
Type a name for your catalog, and pick a location where you want to save the catalog (our English catalog would go under C:\catalogs\english).
After you create it, you need to specify what to include or not include in it so that the service will start indexing that content. Right click on the catalog you want to edit, click on Properties and move to the Tracking tab. In this case we want it to point to a web site, so you have to tell it what web server to associate it with. Pick one from the pull-down list.
Now when you start the service, it will start indexing your web site. Under the Generation tab, you can either inherit the parent attributes or uncheck that option and customize them. I chose only to Generate Abstracts and not to index files with unknown extensions. An abstract is the short summary stored for each document; for HTML pages it comes from the description meta tag, which goes in the HEAD of the document. When indexing an HTML page, the service looks for one in the HEAD. If there is none, it picks the first 320 characters from the body to create the abstract. The maximum length for this string is 500. To define your own abstract in an HTML document, add a DESCRIPTION meta tag in the head of your file, like this:
<head> <meta name="description" content="A short summary of this page"> </head>
As promised earlier, here is another way to add or delete folders to be indexed. Go to the Indexing Service Console and right click on Directories and go to New > Directory.
Doing so opens the Add Directory dialog box:
Choose the path of the folder you want to add to your catalog, and choose from the radio button whether you want it included or excluded from the index. You can add folders on remote computers as long as they are correctly mapped in your system. The Alias (UNC) is not necessary.
The “noise” file
Inside your C:\WINNT\system32 folder you should find a file called noise.eng. Open it with a text editor like Notepad. You will see that its contents are single words or numbers, each on its own line. This is the word exception file: when the Indexing Service indexes a document, it skips the words listed there. These are common words, such as “and” and “or”, or numbers. You can edit this file, adding or deleting your own words. If you edit these files, you will need to empty the catalogs and restart the Indexing Service so that the updated exception list can take effect.
There is a different noise file for every language: noise.enu is specifically for U.S. English, as opposed to noise.eng, which is for U.K. English. The French file is called noise.fra, the German noise.deu, and so on. You can see a list of all your files in the registry. Run regedit from Start and navigate to: HKEY_LOCAL_MACHINE > SYSTEM > CurrentControlSet > Control > ContentIndex > Language. You will see a listing of all the languages; under each one, the value named NoiseFile holds the file name.
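As a rough way to check whether a search term is being dropped by the exception list, a small WSH script can look the word up in a noise file. This is only a sketch: IsNoiseWord and ReadNoiseFile are hypothetical helper names, and the code assumes the one-word-per-line layout described above.

```vbscript
' Returns True when strWord appears on its own line in strListText.
' strListText is the full text of a noise file (one word per line is assumed).
Function IsNoiseWord(strListText, strWord)
    Dim arrLines, i
    IsNoiseWord = False
    ' Normalize line endings, then split the text into lines
    arrLines = Split(Replace(strListText, vbCrLf, vbLf), vbLf)
    For i = 0 To UBound(arrLines)
        ' Case-insensitive comparison, ignoring surrounding spaces
        If LCase(Trim(arrLines(i))) = LCase(Trim(strWord)) Then
            IsNoiseWord = True
            Exit Function
        End If
    Next
End Function

' Reads a whole text file into a string.
Function ReadNoiseFile(strPath)
    Dim objFSO, objFile
    Set objFSO = CreateObject("Scripting.FileSystemObject")
    Set objFile = objFSO.OpenTextFile(strPath, 1) ' 1 = ForReading
    ReadNoiseFile = objFile.ReadAll
    objFile.Close
End Function
```

Saved as a .vbs file, it could be called with something like WScript.Echo IsNoiseWord(ReadNoiseFile("C:\WINNT\system32\noise.eng"), "and"); adjust the path to the noise file for your language.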
First stop the Indexing Service. Once you do that, you can tune the performance of the engine. Go to All Tasks > Tune Performance:
You will see the following menu:
You can choose Dedicated Server if you want to make this catalog and this service immediately responsive to changes on the file system. You can also select Customize and then click on the button, which will give you this dialog:
Move the Indexing slider to Lazy for less immediate indexing or to Instant for immediate indexing of new and changed documents. Lazy indexing uses fewer resources; Instant indexing uses as much of the computer’s resources as it can. Move the Querying slider to Low load if you expect to process only a few queries at a time or to High load if you expect to process many queries at a time. Low load uses fewer resources; High load uses more. You can increase or decrease these settings as you see fit. Keep in mind that raising them will cause your server to use more resources for this activity. I have used the above setting for large sites with thousands of documents with success.
One thing about the Indexing Service’s resources you should know about: It is very demanding on the OS when it is first started as it tries to index everything in the catalog. It moves through pretty quickly, indexing thousands of html documents in just a few minutes. But once it finishes it just sits there, not really using many resources. It responds to file changes through the OS, so it knows to index a file once it’s changed/created/deleted. This way, you can keep the service running on a small computer and you still get good performance out of it.
The search input form (index.html)
This file is the form that accepts your search arguments.
<form action="runsearch.asp" method="post" name="form1">
<table cellpadding="2" cellspacing="0" border="0" align="left">
<tr>
  <td width="50"> </td>
  <td colspan="2">
    <b>Enter your query below:</b><br>
    <input type="text" name="Query" size="45" maxlength="100" value=""></td>
</tr>
...
I limit the query string that a user can input to 100 characters. That should be enough for most searches. The maxlength attribute is only enforced by the browser, so runsearch.asp checks the length again on the server. You can change this limit if you like.
...
<tr>
  <td> </td>
  <td align="right">
    <select name="Scope">
      <option value="/" selected>Entire Site</option>
      <option value="/products/">Products</option>
      <option value="/services/">Services</option>
      <option value="/news_events/">News &amp; Events</option>
      <option value="/about_us/">About us</option>
    </select></td>
  <td>Search where?</td>
</tr>
...
The Scope is another word for folder. It is used to tell the Indexing Service if it’s going to search everything (/), or just under a specific folder of the site(/products/). You can go as deep as you like and it will only search under that folder: for example /products/bicycles/electric/kids/ would search for documents only under the kids folder.
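To make the idea concrete, here is a minimal sketch of how a scope value from the form could be turned into the SCOPE() part of the query. ScopeClause is a hypothetical helper, not part of runsearch.asp. DEEP TRAVERSAL includes all subfolders; the SHALLOW TRAVERSAL variant, which as far as I know restricts the search to the named folder only, is shown as an option you may want to verify on your own server.

```vbscript
' Builds the SCOPE() clause for an Indexing Service SQL query.
' strScope: virtual path such as "/products/" ("" means the whole catalog)
' blnIncludeSubfolders: True for DEEP TRAVERSAL, False for SHALLOW TRAVERSAL
Function ScopeClause(strScope, blnIncludeSubfolders)
    Dim strTraversal
    If strScope = "" Then
        ' No scope given: search everything in the catalog
        ScopeClause = "SCOPE()"
        Exit Function
    End If
    If blnIncludeSubfolders Then
        strTraversal = "DEEP TRAVERSAL OF "
    Else
        strTraversal = "SHALLOW TRAVERSAL OF "
    End If
    ' Chr(34) is the double quote that wraps the path
    ScopeClause = "SCOPE('" & strTraversal & Chr(34) & strScope & Chr(34) & "')"
End Function
```

For example, ScopeClause("/products/", True) returns SCOPE('DEEP TRAVERSAL OF "/products/"').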
...
<tr>
  <td> </td>
  <td align="right">
    <select name="RecordsPerPage">
      <option value="10" selected>10</option>
      <option value="25">25</option>
      <option value="50">50</option>
      <option value="100">100</option>
    </select></td>
  <td>Number of results per page</td>
</tr>
...
Sometimes users want to set how many results they see on one page: fewer or maybe more. You can allow them to do this through a simple pulldown menu as shown above, or through a text box where they type the number themselves.
...
<tr>
  <td> </td>
  <td align="right">
    <select name="Order">
      <option value="Rank" selected>Ranking Result</option>
      <option value="Size">Size</option>
      <option value="Write">Last Date updated</option>
    </select></td>
  <td>Arrange in order</td>
</tr>
...
It’s common to list results in order of best match. However, it’s possible to order the result set by any of the properties in the catalog. You can therefore allow the user to choose the sort order. Above, I give them the option of ordering the results by Rank, Size or Date Last Updated.
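Because the chosen value ends up in the ORDER BY part of the query, it is worth accepting only the values the form actually offers. The sketch below is one way to do that; SafeOrder is a hypothetical helper that is not part of runsearch.asp, and it could be applied to the Order value before the query is built.

```vbscript
' Maps the submitted Order value onto one of the three properties
' offered by the form. Anything else falls back to Rank.
Function SafeOrder(strRequested)
    Select Case LCase(Trim(strRequested))
        Case "size"
            SafeOrder = "Size"
        Case "write"
            SafeOrder = "Write"
        Case Else
            ' Covers "rank", an empty value, and anything unexpected
            SafeOrder = "Rank"
    End Select
End Function
```

With this in place, a tampered value such as "Rank; something else" simply sorts by Rank instead of reaching the query text.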
...
<tr>
  <td> </td>
  <td align="right"><input type="SUBMIT" value="Search" name="SUBMIT"></td>
  <td> </td>
</tr>
<tr>
  <td> </td>
  <td> </td>
  <td> </td>
</tr>
</table>
</form>
And last, the submit button.
Finally, show me some code! (runsearch.asp)
The picture above shows what the returned results should look like. This file, runsearch.asp is responsible for issuing the search against the catalog and properly displaying the returned results. The code should work right out of the box for you, as long as you change a few variables.
It consists of:
|Global variables||Look for a section called EDIT THESE…END EDIT somewhere near the beginning of this file and change those parameters to fit your system. Those should be the only ones you need to change; the rest is up to you.|
|Sub RunSearch()||This is the main sub that gets called when the page loads and then calls everything else. It creates a connection to the Indexing Service, gets records through GetRows(), loops through, checks, validates and formats the output.|
|Function BuildQuery(strScope, strQuery)||This function returns a full SQL command to use against the Indexing Service with ADO. Out of the box it searches htm, html, asp, ppt, doc, xls, txt, and pdf files, and it does not support boolean searches.|
|Sub WriteNavigation(strNavigation, intTotalRecords, intTotalPages)||This sub creates the text for the top navigation links that you see in the picture, i.e. moving from page to page.|
|Function FileSize(intFileSize)||Formats the size output of a file to KB, MB, GB, etc.|
|Function myFixDate(datWrite)||Formats the date last modified output to an international date format.|
Let’s go through the code here in detail to help you understand what’s going on. I have added some error catching, to account for mistakes as well as for malicious users trying to break your site.
<%@Language="VBScript"%>
<%Option Explicit%>
<%Response.Buffer = True%>
<html>
<head>
<title>Search Results</title>
</head>
<body>
<%
On Error Goto 0

Dim strQuery            'user entered text for search
Dim intPage             'page number we are on
Dim intStartingRecord   'point to start selecting from the recordset
Dim intRecordsPerPage   'developer defined
Dim strOrder            'developer defined: what to order against
Dim strScope            'Scope to search against
Dim QUOT                'character 34 (double quote) for ease of coding
Dim strNavigation       'HTML string for navigation links/info
Dim starslocation       'folder path for search images
Dim strCatalog          'developer defined catalog name: query against this
Dim strCustomTitle      'starting string to remove from the title of html pages

'***** EDIT THESE **********************
starslocation = "images/"
strCatalog = "english"
strCustomTitle = "Xefteri - "
'***** END EDIT ************************
...
Between EDIT THESE…END EDIT is what you have to change to make it work for your site. One of these variables is called strCustomTitle. This is a little trick that I use to increase the ratings of my site, and you can do it too. Here’s how it works: when one of the public search engines visits your site to index it, one of the most important factors in rating your site is the <title> tags in your pages. You can increase your ranking by including the name of your site in your titles. Let’s say your site’s name is XYZ. Your titles could then all start with “XYZ – ” and then continue with a more descriptive title of the page.
<html> <head> <title>XYZ - Welcome to our site!</title> </head>
This accomplishes 2 things:
- It boosts your rankings when it comes to your name
- It improves the readability of someone’s bookmarks to your site.
However, when I display the results of the search I use the title of the page as a link to the actual page. At this point, we do not want to show all our titles beginning with the same thing, so we simply edit the title before displaying it. Edit that variable in the code if you are going to use this, and leave it blank (strCustomTitle = "") if you are not. If the string is not empty, the code checks for it at the beginning of each title in the displayed results and removes it. If the string is empty, the whole title is displayed as is.
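The prefix removal can be sketched as a small standalone function. StripTitlePrefix is a hypothetical name used here for illustration. The one detail to watch is that Mid() counts from 1, so the remainder of the title starts at Len(prefix) + 1.

```vbscript
' Removes strPrefix from the start of strTitle (case-insensitive).
' Returns the title unchanged when the prefix is empty or not present.
Function StripTitlePrefix(strTitle, strPrefix)
    StripTitlePrefix = strTitle
    If strPrefix = "" Then Exit Function
    If LCase(Left(strTitle, Len(strPrefix))) = LCase(strPrefix) Then
        ' Mid is 1-based: skip exactly Len(strPrefix) characters
        StripTitlePrefix = Mid(strTitle, Len(strPrefix) + 1)
    End If
End Function
```

For example, StripTitlePrefix("XYZ - Welcome to our site!", "XYZ - ") returns "Welcome to our site!".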
...
'-- collect values from request
'   leave request object open to account for both post and get
strQuery = Request("Query")
strQuery = Server.HTMLEncode(strQuery)
intPage = Request("PAGE")
intRecordsPerPage = Request("RecordsPerPage")
strOrder = Request("Order")
strScope = Request("Scope")

'-- define values
QUOT = Chr(34)
strNavigation = ""

'-- account for people trying to hack
'-- set max and min values for URL values
'   (a Select Case cannot test ranges written as "Case intPage > 32767",
'   so the limits are checked with If/ElseIf and IsNumeric instead)
If intPage = "" Then
    intPage = 1
ElseIf Not IsNumeric(intPage) Then
    Response.Write("Page number out of limit!")
    Response.End
ElseIf CDbl(intPage) > 32767 Or CDbl(intPage) < 1 Then
    Response.Write("Page number out of limit!")
    Response.End
Else
    intPage = CInt(intPage)
End If

If intRecordsPerPage = "" Then
    '-- nothing supplied: fall back to the default offered by the form
    intRecordsPerPage = 10
ElseIf Not IsNumeric(intRecordsPerPage) Then
    Response.Write("Records Per Page out of limit!")
    Response.End
ElseIf CDbl(intRecordsPerPage) > 1000 Or CDbl(intRecordsPerPage) < 1 Then
    Response.Write("Records Per Page out of limit!")
    Response.End
Else
    intRecordsPerPage = CInt(intRecordsPerPage)
End If

strOrder = Server.HTMLEncode(strOrder)
'-- an empty value would leave the ORDER BY clause incomplete
If strOrder = "" Then strOrder = "Rank"

strScope = Server.HTMLEncode(strScope)
If InStr(strScope, "..") Then
    Response.Write("Invalid Scope!")
    Response.End
End If

'-- if bad query string supplied (less than 2 characters), show message
If Len(strQuery) < 2 Then
    Response.Write("<p><b>Sorry, but the search text must be at least two characters long.</b></p>")
    Response.End
'-- if the user is trying to cause an overflow in the query string catch it
Elseif Len(strQuery) > 100 Then
    Response.Write("<p><b>Sorry, but the search text must be less than 100 characters long.</b></p>")
    Response.End
End If

'-- evaluate starting record in the recordset
intStartingRecord = ((intPage - 1) * intRecordsPerPage) + 1

'-- main sub that calls everything else
Call RunSearch()
...
The code above collects the user’s inputs and tries to make sure that they fall within certain limits. This also helps prevent hacker attacks. If everything is ok, it calls the main sub RunSearch() which does all the work.
...
Sub RunSearch()
    Dim strSearch           'function-returned SQL query
    Dim objConn             'Connection object
    Dim objRS               'Recordset object
    Dim intTotalRecords     'Recordset.RecordCount
    Dim intTotalPages       'objRS.PageCount
    Dim arrAllData          'Recordset.GetRows()
    Dim numrows             'UBound of arrAllData to get the total rows in objRS
    Dim rowcounter          'simple counter used in the loop
    Dim strDocTitle         'objRS("DocTitle")
    Dim lengthstrDocTitle   'Len(objRS("DocTitle"))
    Dim strFilename         'objRS("Filename")
    Dim strVPath            'objRS("VPath")
    Dim intSize             'objRS("Size")
    Dim datWrite            'objRS("Write")
    Dim strCharacterization 'objRS("Characterization")
    Dim numRank             'objRS("Rank")
    Dim NormRank            'Rank/10 = change to a percentage
    Dim stars               'image to display for Ranking

    '-- build up the query string by calling the BuildQuery function
    strSearch = BuildQuery(strScope, strQuery)

    '-- trap errors from here on so that the Err check below can run
    '   (with the page-level On Error Goto 0 the script would simply stop)
    On Error Resume Next

    '-- create a connection object to execute the query
    Set objConn = Server.CreateObject("ADODB.Connection")
    objConn.ConnectionString = "provider=msidxs; Data Source=" & strCatalog
    objConn.Open

    '-- create a recordset to hold the data
    Set objRS = Server.CreateObject("ADODB.RecordSet")
    objRS.CursorLocation = 3 'adUseClient
    objRS.Open strSearch, objConn, 0, 1 'adOpenForwardOnly, adLockReadOnly

    '-- if errors occurred
    If Err.Number <> 0 Then
        Response.Clear
        Response.Write("<p><b>There was an error processing your request.<br>Please go back and try again.</b></p>")
        '-- close all objects to free up resources
        objRS.Close
        Set objRS = Nothing
        objConn.Close
        Set objConn = Nothing
        Response.End
    Else
        '-- query succeeded: go back to normal error reporting
        On Error Goto 0

        '-- no errors but no records returned
        If objRS.EOF and objRS.BOF Then
            Response.Clear
            Response.Write("<p><b>No pages that matched your query </b>[<b>" & strQuery & "</b>]<b> were found.</b></p>")
            '-- close all objects to free up resources
            objRS.Close
            Set objRS = Nothing
            objConn.Close
            Set objConn = Nothing
            Response.End
        '-- or if there was no error and some records were successfully returned then
        Else
            '-- set the recordset starting position so that we can get the number
            '   of records we want from this point on using the GetRows() function
            objRS.AbsolutePosition = intStartingRecord
            '-- set the pagesize through the object so we can count # of pages returned
            objRS.PageSize = intRecordsPerPage
            '-- # of total records found
            intTotalRecords = objRS.RecordCount
            '-- # of total pages found
            intTotalPages = objRS.PageCount
            '-- create a 2 dimensional array of the records using GetRows()
            '   and only select how many records we want to see per page
            arrAllData = objRS.GetRows(intRecordsPerPage)

            '-- close all objects to free up resources
            objRS.Close
            Set objRS = Nothing
            objConn.Close
            Set objConn = Nothing

            '-- write table to wrap contents with a margin equal to the cellpadding
            Response.Write("<div align=""left""><table border=""0"" cellspacing=""0"" cellpadding=""10"" align=""left""><tr><td>")

            '-- write top/bottom navigation links/info
            '   by calling the WriteNavigation() sub
            Call WriteNavigation(strNavigation, intTotalRecords, intTotalPages)

            '-- table with contents of search inside
            Response.Write("<br><table border=""0"" cellspacing=""0"" cellpadding=""0"" width=""100%"">")

            '-- find out how many rows we have
            '   this should be the same as the intRecordsPerPage but not always
            '   an exception would be when the last page does not have enough left
            numrows = UBound(arrAllData,2)

            '-- now loop through the records
            For rowcounter= 0 To numrows
                '-- row values held in variables for ease of use
                strDocTitle = arrAllData(0, rowcounter)
                strFilename = arrAllData(1, rowcounter)
                strVPath = arrAllData(2, rowcounter)
                '-- keep the raw number: FileSize() does arithmetic on it,
                '   and a NULL size is caught by the check further down
                intSize = arrAllData(3, rowcounter)
                datWrite = arrAllData(4, rowcounter)
                strCharacterization = arrAllData(5, rowcounter)
                numRank = arrAllData(6, rowcounter)

                '-- create an empty space if the field is empty
                '   for proper display of the table <td></td>
                If IsNull(strCharacterization) Or Trim(strCharacterization) = "" Then
                    strCharacterization = "&nbsp;"
                End If

                Response.Write("<tr><td bgcolor=""#AACCEE"" align=""right"">" _
                    & intStartingRecord & ")</td>" _
                    & "<td bgcolor=""#AACCEE"" width=""5""> </td>" _
                    & "<td bgcolor=""#AACCEE""><a href=""" & strVPath & """>")

                '-- if title found in header is bigger than 2 characters
                '   it probably means that there is a <title> for this document
                If Len(strDocTitle) > 2 Then
                    '-- look for and get rid of custom title words used for search engines
                    '   only if your strCustomTitle string is not empty
                    '   and those words are in the beginning of the title
                    If strCustomTitle <> "" Then
                        If LCase(Left(strDocTitle, Len(strCustomTitle))) = LCase(strCustomTitle) Then
                            lengthstrDocTitle = Len(strDocTitle)
                            '-- Mid() is 1-based, so start one character past the prefix
                            strDocTitle = Mid(strDocTitle, Len(strCustomTitle) + 1, lengthstrDocTitle)
                        End If
                    End If
                    Response.Write(Server.HTMLEncode(strDocTitle))
                '-- no title found in header or could not pick it up
                '   write filename instead so users have something to click on
                Else
                    Response.Write(Server.HTMLEncode(strFilename))
                End If

                Response.Write("</a></td></tr>" _
                    & "<tr><td align=""left"" valign=""top"">")

                '-- show proper image for ranking
                NormRank = numRank/10
                If NormRank > 80 Then
                    stars = "rankbtn5.gif"
                ElseIf NormRank > 60 Then
                    stars = "rankbtn4.gif"
                ElseIf NormRank > 40 Then
                    stars = "rankbtn3.gif"
                ElseIf NormRank > 20 Then
                    stars = "rankbtn2.gif"
                Else
                    stars = "rankbtn1.gif"
                End If

                '-- Chr(37) = %
                '-- write correct image and percentage ranking
                Response.Write("<img src=""" & starslocation & stars & """><br>" _
                    & NormRank & Chr(37) & "</td><td> </td>" _
                    & "<td align=""left"" valign=""top"">")

                '-- write summary of the page
                Response.Write(strCharacterization & "<br><br><i>")

                '-- write file size or show error in case
                '   we have a NULL value returned
                If Trim(intSize) = "" Or IsNull(intSize) Then
                    Response.Write("(size unknown) - ")
                Else
                    Response.Write("size " & FileSize(intSize) & " - ")
                End If

                '-- write date last modified or show error in case
                '   we have a NULL value returned for DateLastModified
                If Trim(datWrite) = "" Or IsNull(datWrite) Then
                    Response.Write("(time unknown)")
                Else
                    Response.Write(myFixDate(datWrite) & " GMT")
                End If

                Response.Write("</i></td></tr>" _
                    & "<tr><td colspan=""3""> </td></tr>")

                '-- increment the number listing showing on the left by one
                intStartingRecord = intStartingRecord + 1
            Next 'rowcounter= 0 To numrows

            '-- end of table with search contents
            Response.Write("</table><hr width=""100%"" size=""2"" noshade>")

            '-- now write again the top navigation menu we generated
            '   we don't need to call the sub again because
            '   it's now stored in strNavigation
            Response.Write(strNavigation)

            '-- close wrapping table
            Response.Write("<br></td></tr></table></div>")
        End If 'objRS.EOF and objRS.BOF
    End If 'Err.Number <> 0
End Sub
...
Plainly, this sub calls everything else. It connects to the catalog, issues the query, returns a recordset, and then it formats it appropriately and writes it out. The rest of the functions are responsible for formatting the resultset.
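The paging arithmetic used by the page is easy to check by hand. The two functions below are a hypothetical, standalone restatement of it: the first mirrors the intStartingRecord calculation, and the second shows the page count that ADO's PageCount property is expected to report for a given PageSize.

```vbscript
' First record (1-based) shown on a given page.
Function FirstRecordOnPage(intPage, intRecordsPerPage)
    FirstRecordOnPage = ((intPage - 1) * intRecordsPerPage) + 1
End Function

' Number of pages needed to show intTotalRecords records.
Function PageCountFor(intTotalRecords, intRecordsPerPage)
    Dim intPages
    ' Whole pages first (integer division), then one more for any remainder
    intPages = intTotalRecords \ intRecordsPerPage
    If intTotalRecords Mod intRecordsPerPage > 0 Then
        intPages = intPages + 1
    End If
    PageCountFor = intPages
End Function
```

With 10 results per page, page 3 starts at record 21 (FirstRecordOnPage(3, 10)), and 45 matching documents need 5 pages (PageCountFor(45, 10)), the last one holding only 5 records. That short last page is the case the numrows comment in RunSearch() refers to.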
...
'-- build SQL query string for Index Server ADO query
Function BuildQuery(strScope, strQuery)
    Dim strPropertyName
    Dim SQL         'SQL string to search against
    Dim strQText
    Dim blnAddedQ
    Dim intQPos

    SQL = "SELECT DocTitle, Filename, Vpath, Size, Write, Characterization, Rank FROM "
    If strScope = "" Then
        SQL = SQL & "SCOPE() "
    Else
        SQL = SQL & "SCOPE('DEEP TRAVERSAL OF " & QUOT & strScope & QUOT & "')"
    End if

    '-- wrap multi-word queries in double quotes so they are treated as a phrase,
    '   replacing any quotes inside the text with spaces
    strQText = strQuery
    If InStr(strQText, " ") > 0 Or InStr(strQText, "'") > 0 Then
        blnAddedQ = False
        If Left(strQText, 1) <> QUOT Then
            strQText = QUOT & strQText
            blnAddedQ = True
        End If
        If Right(strQText, 1) <> QUOT Then
            strQText = strQText & QUOT
            blnAddedQ = True
        End If
        If blnAddedQ Then
            intQPos = Instr(2, strQText, QUOT)
            Do While intQPos > 0 And intQPos < Len(strQText)
                strQText = Left(strQText, intQPos - 1) & " " & Mid(strQText, intQPos + 1)
                intQPos = Instr(2, strQText, QUOT)
            Loop
        End If
    End If

    SQL = SQL & "WHERE CONTAINS ('" & strQText & "') > 0"
    '-- If you want to add your files here, like asp for example
    '   then add another line like this:
    '   SQL = SQL & " OR Filename LIKE '%.asp'"
    SQL = SQL & " AND (Filename LIKE '%.html'"
    '-- comment any of next lines to exclude certain files
    SQL = SQL & " OR Filename LIKE '%.asp'"
    SQL = SQL & " OR Filename LIKE '%.pdf'"
    SQL = SQL & " OR Filename LIKE '%.doc'"
    SQL = SQL & " OR Filename LIKE '%.xls'"
    SQL = SQL & " OR Filename LIKE '%.ppt'"
    SQL = SQL & " OR Filename LIKE '%.txt'"
    SQL = SQL & " OR Filename LIKE '%.htm')"
    SQL = SQL & " ORDER BY " & strOrder & " DESC"
    BuildQuery = SQL
End Function

'-- make HTML string for navigation links
'   on the top and bottom of the page
'   this sub first creates the navigation,
'   then stores it in strNavigation (the page-level variable,
'   which is passed in by reference)
'   so we can use it again without needing to call the sub,
'   and then writes it to the response
Sub WriteNavigation(strNavigation, intTotalRecords, intTotalPages)
    Dim strScriptName
    strScriptName = Request.ServerVariables("SCRIPT_NAME")

    '-- controls to scroll to next or previous pages
    strNavigation = "<center>" _
        & "<a href=""index.html"">New Query</a><br>" _
        & intTotalRecords & " total documents matching the query """ _
        & strQuery & """<br>" _
        & "Page " & intPage & " of " & intTotalPages & "<br>"

    '-- if we are on the first page then the First and Previous Page
    '   do not need to be active
    If intPage = 1 Then
        strNavigation = strNavigation & "First Page Previous Page "
    '-- else if we are not on the first page make those links active
    Else
        strNavigation = strNavigation & "<a href=""" & strScriptName _
            & "?Query=" & strQuery & "&PAGE=1" _
            & "&RecordsPerPage=" & intRecordsPerPage _
            & "&Order=" & strOrder & "&Scope=" & strScope & """>First Page</a> " _
            & " <a href=""" & strScriptName _
            & "?Query=" & strQuery & "&PAGE=" & intPage - 1 _
            & "&RecordsPerPage=" & intRecordsPerPage _
            & "&Order=" & strOrder & "&Scope=" & strScope & """>Previous Page</a> "
    End If

    '-- if we are on the last page then there is no need
    '   to make the Next and Last Page active
    If intPage = intTotalPages Then
        strNavigation = strNavigation & " Next Page Last Page"
    '-- else if we are not on the last page, then make them active
    Else
        strNavigation = strNavigation & " <a href=" & QUOT & strScriptName _
            & "?Query=" & strQuery & "&PAGE=" & intPage + 1 _
            & "&RecordsPerPage=" & intRecordsPerPage _
            & "&Order=" & strOrder & "&Scope=" & strScope & """>Next Page</a> " _
            & " <a href=" & QUOT & strScriptName _
            & "?Query=" & strQuery & "&PAGE=" & intTotalPages _
            & "&RecordsPerPage=" & intRecordsPerPage _
            & "&Order=" & strOrder & "&Scope=" & strScope & """>Last Page</a>"
    End If

    strNavigation = strNavigation & "</center>"
    Response.Write(strNavigation)
End Sub

'-- format filesize
Function FileSize(intFileSize)
    const DecimalPlaces = 1
    const FileSizeBytes = 1
    const FileSizeKiloByte = 1024
    const FileSizeMegaByte = 1048576
    const FileSizeGigaByte = 1073741824
    const FileSizeTeraByte = 1099511627776
    Dim strFileSize, newFilesize

    '-- test the largest unit first and work down to bytes
    If (Int(intFileSize / FileSizeTeraByte) <> 0) Then
        newFilesize = Round(intFileSize / FileSizeTeraByte, DecimalPlaces)
        strFileSize = newFilesize & " TB"
    ElseIf (Int(intFileSize / FileSizeGigaByte) <> 0) Then
        newFilesize = Round(intFileSize / FileSizeGigaByte, DecimalPlaces)
        strFileSize = newFilesize & " GB"
    ElseIf (Int(intFileSize / FileSizeMegaByte) <> 0) Then
        newFilesize = Round(intFileSize / FileSizeMegaByte, DecimalPlaces)
        strFileSize = newFilesize & " MB"
    ElseIf (Int(intFileSize / FileSizeKiloByte) <> 0) Then
        newFilesize = Round(intFileSize / FileSizeKiloByte, DecimalPlaces)
        strFileSize = newFilesize & " KB"
    ElseIf (Int(intFileSize / FileSizeBytes) <> 0) Then
        newFilesize = intFilesize
        strFileSize = newFilesize & " Bytes"
    ElseIf Int(intFileSize) = 0 Then
        strFilesize = 0 & " Bytes"
    End If
    FileSize = strFileSize
End Function

'-- format date properly for international viewing
Function myFixDate(datWrite)
    Dim strHTMLout
    '-- 1 = vbLongDate, 3 = vbLongTime
    strHTMLout = FormatDateTime((datWrite), 1) & " at " & FormatDateTime((datWrite), 3)
    myFixDate = strHTMLout
End Function
%>
</body>
</html>
The WHERE clause
The query that you create against the Indexing Service can be as complex as you need it to be. Here are some more things you can do with it:
|CONTAINS||The following line matches documents that contain toys or factories: WHERE CONTAINS("toys" OR "factories"). Toys is within 50 words or less of factories: WHERE CONTAINS("toys" NEAR "factories"). Note: this feature was cut from the RTM version of IS 3 at the last minute, and the documentation has not been updated to account for the cut. The NEAR syntax is therefore ignored, but there is no error message. The 50-word window is built into the FreeText ranking algorithm. .NET: the proximity operator works, but you still can’t specify the distance. To match toys, toy, toyed, etc.: WHERE CONTAINS('FORMSOF(INFLECTIONAL, "toy")')|
|FREETEXT||When you want to search for the best match for a word or a phrase: WHERE FREETEXT('toys for kids')|
|LIKE||Wildcards to perform matches: WHERE DocTitle LIKE '%toy%'|
|MATCHES||Uses regular expressions to perform matches. For example, all entries where DocAuthor starts with any character between a and e: WHERE MATCHES (DocAuthor, "[a-e]*")|
|NULL||Matching of null values: WHERE DocTitle IS NULL, or WHERE DocTitle IS NOT NULL|
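As an illustration of swapping one of these predicates into your own query, here is a minimal sketch that builds a FREETEXT clause from user input. EscapeSqlLiteral and FreeTextWhere are hypothetical helpers. Doubling single quotes is the usual SQL convention for embedding an apostrophe in a literal; I am assuming the Indexing Service provider follows it, so test this against your own catalog before relying on it.

```vbscript
' Doubles single quotes so the text can sit inside a '...' literal.
' Assumption: the provider accepts '' as an escaped apostrophe.
Function EscapeSqlLiteral(strText)
    EscapeSqlLiteral = Replace(strText, "'", "''")
End Function

' Builds a WHERE clause that asks for the best match for strText.
Function FreeTextWhere(strText)
    FreeTextWhere = "WHERE FREETEXT('" & EscapeSqlLiteral(strText) & "')"
End Function
```

FreeTextWhere("toys for kids") returns WHERE FREETEXT('toys for kids'), matching the table entry above, while an input such as kid's toys comes out as WHERE FREETEXT('kid''s toys') instead of ending the literal early.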
Well, you got the search engine working and everybody is happy. You are receiving kudos from everyone around. But if the search functionality on your site is vital, how can you ever know if something is wrong? Let’s talk about how we can take precautionary action to attempt to fix the service automatically, and be warned if something is wrong. Then you can really sit back and enjoy.
Go to Start > Programs > Administrative Tools > Services:
Double click, or right click and go to Properties, on the Indexing Service to open the Indexing Service Properties dialog. Click on the Recovery tab.
Here, you can define the actions to take once your service fails. You can try different scenarios to find the one that best fits your needs. I decided to try a restart on the First failure and then to Run a File on the Second failure. I created a folder called ServerScripts and placed my custom script files in there. The SendEmailOnServiceFail.vbs file first makes sure the service is down by attempting to shut it down again, and then tries to bring it back up. It then sends an email to notify someone that the service had to be restarted and may still not be working properly. This file uses WSH and CDONTS to send the email, and it needs to run with sufficient rights on the Windows system (for example, as an Administrator).
'-- Declare variables
Dim objSendMail
Dim objAdminIS

'-- The following stops and restarts Indexing Service
'   comment out the following 4 lines if you do not want this feature.
'   To use this object you need administrative access
Set objAdminIS = CreateObject("Microsoft.ISAdm")
objAdminIS.Stop()    'Make sure it's off first
objAdminIS.Start()   'And then restart it
Set objAdminIS = Nothing

'-- Send email when service fails
Set objSendMail = CreateObject("CDONTS.NewMail")
'change the FROM and TO below
objSendMail.From = "email@example.com"
objSendMail.To = "firstname.lastname@example.org"
objSendMail.Subject = "Indexing Service has failed!"
objSendMail.Body = "<H2><FONT COLOR=Red>" & Date() & " - " & Time() & "</FONT></H2>" & "The Indexing Service has failed. Please check your server!"
objSendMail.BodyFormat = 0   'Body property is HTML
objSendMail.MailFormat = 0   'MIME format
objSendMail.Importance = 2   'High Importance
objSendMail.Send
Set objSendMail = Nothing
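If you would rather not stop a service that is already running, a gentler variant is to start it only when it is down. The sketch below assumes the Microsoft.ISAdm object exposes an IsRunning method alongside Start, which is my understanding of that object but worth confirming on your server. EnsureRunning is a hypothetical helper name.

```vbscript
' Starts the Indexing Service only when it is not already running.
' objAdmin is expected to expose IsRunning and Start, as the
' Microsoft.ISAdm object is assumed to do.
' Returns True when a start was requested, False when nothing was done.
Function EnsureRunning(objAdmin)
    If objAdmin.IsRunning Then
        EnsureRunning = False
    Else
        objAdmin.Start
        EnsureRunning = True
    End If
End Function
```

In the recovery script this could replace the Stop/Start pair: create the object with CreateObject("Microsoft.ISAdm"), pass it to EnsureRunning, and mention in the email body whether a start was actually requested.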
The default timeout for vbs files like these is 10 seconds. If you want to change that, right click on the vbs file, and click on Properties. Go to the Script tab, and then click on Stop script after specified number of seconds. When you do that, it will allow you to change the 10 seconds default value to whatever you want. When you click OK, the dialog will create a file in the same folder as your vbs file, with the same name and the extension ".wsh". This is what a sample file would look like:
[ScriptFile]
Path=C:\ServerScripts\SendEmailOnServiceFail.vbs
[Options]
Timeout=20
DisplayLogo=1
You can test it by double clicking on the vbs file to run it. So, in the future, if the service fails, you will receive an email alerting you of the fact. The email should look something like this:
Putting it all together
To summarize, here are the steps you need to take to make this work for you:
- create your website
- turn indexing on for the site
- install the Indexing Service
- create a catalog
- associate the new catalog with your site
- add/remove folders from catalog
- tune performance
- edit noise files
- copy the search input file to your site (index.html)
- copy the images for ranking to your site
- copy runsearch.asp to your site
- make sure the index.html file is posting to runsearch.asp
- edit the 3 variables in the runsearch.asp file
- create recovery procedures
Running an ADO query may not be the most flexible way to work with the Indexing Service, but it is the simplest by far. For most cases this is good enough. You can do a lot more with this service. For example, you can have it index custom meta tags and then add those to your queries/results. Or you can export the catalog into a relational database like SQL Server, and then combine it with a content management system for a more advanced search. This article’s intention was to give you a quick way to get it up and running. Feel free to alter the code as much as you like, and give me feedback. Maybe you can make my code faster or more reliable, or simply expand on it.