Daboo Local Search Engine

Version 0.05
Copyright 1997-2004 David Dienhart All Rights Reserved.
Release Date: 04-04-2004
http://www.dienhart.com

License Agreement

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by the
Free Software Foundation; either version 2 of the License, or any later
version.
This program is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
See GNU-GPL_2.html for the complete license.

Files

- DLSearch.pl 0.04 (Search Script)
- DLBot.pl 0.04 (Spider Script)
- DLSearch.html (Documentation)
- searchdb.txt (Search Index, automatically generated)
- search_results.html (Search Results HTML Template)
- /cgi-bin/daboo/localanalysis/localsearch.txt (automatically generated
usage log file, requires Local Analysis)
- GNU-GPL_2.html (License Agreement)

Requirements

- Linux, UNIX, or Windows
- Perl 5.003_07
- HTML::HeadParser (Required by DLBot)
- HTML::LinkExtor (Required by DLBot)
- LWP::UserAgent (Required by DLBot)
- HTML::TokeParser (Required by DLBot)

Description

- DLBot.pl indexes the selected site and generates an index based on each
page's title, keywords and description tags, and page content.
- DLSearch.pl searches searchdb.txt for the keywords entered into the form
on your search page and returns the results in the HTML template. The
major benefit of this search utility is that it requires nothing more
than access to your cgi-bin. No additional cost is incurred for MySQL or
any other database utility. Not that I have anything against MySQL; I
like it very much. But when it comes to how much I pay for hosting my
site, if I don't absolutely need to pay extra for it, I won't.
- The following information is stored for further analysis with Local
Analysis: Date, Time, IP Address, Item, and (where applicable) Times
Found. It is logged to /cgi-bin/daboo/localanalysis/localsearch.txt, and
only if the /cgi-bin/daboo/localanalysis/ directory exists.
- The primary purpose of storing this data is to see what people are
searching for when they come to your site, and whether it was found.

Setup

- Copy all of the files to the /cgi-bin/daboo/DLSearch/ directory.
- chmod everything to 755.
- Create the following directory:
- /cgi-bin/daboo/localanalysis/ (This is the directory your log files
will be stored in.)
- Add the following to your HTML document:
rpp = number of results returned per page; the default value is 10.
- If you have not already downloaded Local Analysis, download and install
it so that you may analyze the log files generated by DLSearch.
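
As a rough illustration of the two parameters a search page sends to DLSearch.pl (find and rpp), the sketch below builds a query URL. It is written in Python purely for illustration and is not part of Daboo. The host yourdomain.com and the helper name build_search_url are assumptions, and the script path assumes the files were copied to /cgi-bin/daboo/DLSearch/ as in the first setup step.

```python
from urllib.parse import urlencode

# Hypothetical base URL: placeholder host plus the install directory
# named in the Setup steps.
BASE_URL = "http://yourdomain.com/cgi-bin/daboo/DLSearch/DLSearch.pl"


def build_search_url(find, rpp=10, base_url=BASE_URL):
    """Return a DLSearch.pl URL carrying the find and rpp parameters."""
    if rpp < 1:
        raise ValueError("rpp must be a positive number of results per page")
    # urlencode escapes spaces and other special characters in the query.
    return base_url + "?" + urlencode({"find": find, "rpp": rpp})


if __name__ == "__main__":
    print(build_search_url("Electronics Engineering", rpp=20))
```

Encoding the query this way keeps spaces and characters such as '&' from being misread as parameter separators.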

Usage

- Open http://yourdomain.com/cgi-bin/daboo/DLSearch/DLBot.pl in your
browser and fill in the form to generate the ASCII database. If your
browser times out, you may need to run the script from the command line
as follows:
perl DLBot.pl http://yourhost.com/ (be sure to include the trailing '/')
Note: do not append index.html or any other specific page; the generator
will not function correctly if you do.
- The ASCII database has now been generated.
- Create an HTML template using my template as an example.
- The following placeholders are used in the HTML template to return
the results:
$CurrentPage = Current Page
$TotalPages = Total Pages
$BackNext = Back and Next Buttons
$BackNext1 = Back and Next Text Links (Works with Opera 7.x Web Browser
next and back buttons)
$find = Item(s) searched for
$numberfound = Number of times the searched-for item was found
$ResultsTable = Search Results Table
$Attempt = Number of attempts to find a match (one to eight periods)
$TypeOfSearch = Type of search used to obtain the search results.
The search automatically tries 8 different methods to obtain results.
They are executed in the following order until results are obtained:
1. Exact Phrase Search
2. Exact Word(s) Search
3. Phrase Begins With Search
4. Word(s) Begin With Search
5. Phrase Ends With Search
6. Word(s) Ends With Search
7. Phrase Contains Search
8. Word(s) Contains Search
- SITE SEARCH = Value returned to Item Search in HTML document.
- $FindInfo = Total times each word searched for was found throughout
the web site.
- Be sure to save the template as search_results.html in the same directory
as DLSearch.pl.
- localsearch.txt shows the date, query, and number of matches found,
so you don't need to muddle through the server log files.
- Call DLSearch.pl?find=What you want to find&rpp=Number of results
per page
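
The eight-step fallback and the template placeholders described above can be sketched as follows. This is a Python illustration under stated assumptions, not the code in DLSearch.pl: the index is modeled as a plain list of page texts (the real searchdb.txt format is not documented here), "begins with" and "ends with" are interpreted at word boundaries, word searches require every word to match, and $numberfound is taken to be the number of matching pages. The names METHODS, cascade_search, render, and RESULT_LINE are hypothetical.

```python
import re
from string import Template

# The eight methods in the documented order: (name, phrase search?, pattern).
# Assumption: "begins/ends with" is anchored at a word boundary (\b).
METHODS = [
    ("Exact Phrase Search", True, r"\b{}\b"),
    ("Exact Word(s) Search", False, r"\b{}\b"),
    ("Phrase Begins With Search", True, r"\b{}"),
    ("Word(s) Begin With Search", False, r"\b{}"),
    ("Phrase Ends With Search", True, r"{}\b"),
    ("Word(s) Ends With Search", False, r"{}\b"),
    ("Phrase Contains Search", True, r"{}"),
    ("Word(s) Contains Search", False, r"{}"),
]

# Hypothetical one-line template using placeholder names from the list above.
RESULT_LINE = Template("$find: $numberfound found $Attempt $TypeOfSearch")


def cascade_search(query, pages):
    """Run each method in order and stop at the first one that finds pages.

    Returns (type_of_search, attempt_number, matching_pages).
    """
    words = query.split()
    if not words:
        return "", 0, []
    phrase = " ".join(words)
    attempt = 0
    for attempt, (name, is_phrase, pattern) in enumerate(METHODS, start=1):
        terms = [phrase] if is_phrase else words
        regexes = [re.compile(pattern.format(re.escape(t)), re.IGNORECASE)
                   for t in terms]
        # Assumption: a page matches only if every term is found in it.
        hits = [page for page in pages if all(r.search(page) for r in regexes)]
        if hits:
            return name, attempt, hits
    return "", attempt, []


def render(query, pages):
    """Fill the template placeholders from one cascade run."""
    name, attempt, hits = cascade_search(query, pages)
    return RESULT_LINE.substitute(
        find=query,
        numberfound=len(hits),   # assumption: count of matching pages
        Attempt="." * attempt,   # one to eight periods
        TypeOfSearch=name,
    )


if __name__ == "__main__":
    demo_pages = ["Electronics Engineering jobs", "Bioengineering"]
    print(render("engineer", demo_pages))
```

Stopping at the first method that returns anything means stricter matches always win, and the looser "contains" searches only run when nothing better was found.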

Notes

- For more accurate results, you may type a phrase and/or use
Boolean AND logic.
Phrase example: Electronics Engineering
Boolean AND Logic example: Electronics and Engineering and Technology
Combined example: Electronics Engineering and Technology
Boolean OR Logic example: Product Development or Design Engineer
All Combined Example: Product Development or Design Engineer
and Electronics
- Daboo Local Search Engine is case insensitive.
- Daboo Local Search Engine supports Boolean OR logic in combination
with phrases and AND logic.
- I recommend installing Local Analysis to analyze search engine queries.
- As I have not included any security to prevent just anyone from executing
DLBot.pl, I would highly recommend that you chmod 644 it after you have
indexed your site. I do plan to eventually have security built in, but
do not have a timeline for implementing it.
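
One way to read the phrase, AND, and OR examples above is sketched below in Python. It is an illustration only and not how DLSearch.pl is implemented. It assumes that AND binds more tightly than OR (the notes do not state the precedence), that the operators are the bare words "and" and "or" in any case, and that a phrase matches when it appears anywhere in the text. The names parse_query and matches are hypothetical.

```python
import re


def parse_query(query):
    """Split a query into OR alternatives, each a list of AND-ed phrases."""
    alternatives = re.split(r"\s+or\s+", query.strip(), flags=re.IGNORECASE)
    parsed = []
    for alternative in alternatives:
        phrases = re.split(r"\s+and\s+", alternative, flags=re.IGNORECASE)
        # Lowercase everything so matching is case insensitive.
        parsed.append([p.strip().lower() for p in phrases if p.strip()])
    return parsed


def matches(query, text):
    """True if any OR alternative has all of its phrases present in text."""
    text = text.lower()
    return any(
        bool(group) and all(phrase in text for phrase in group)
        for group in parse_query(query)
    )


if __name__ == "__main__":
    query = "Product Development or Design Engineer and Electronics"
    print(parse_query(query))
    print(matches(query, "Senior design engineer, electronics division"))
```

Under these assumptions, the "All Combined" example is read as "Product Development", or else both "Design Engineer" and "Electronics".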

History

0.01 (01-01-2003)
- Initial Release
- Many improvements have been made over the previously released engine/spider
(Local Search).
- Removed the option to include Anchors in the index, as it is pretty
much a useless function that will cause the same pages to be added to
the index numerous times.

0.02 (01-04-2003)
- DLSearch - Bug Fixed that caused the query result next page button
to reference a nonexistent document.
- DLBot - Made some modifications to improve the generated index.

0.04 (01-20-2003)
- DLSearch - Major rewrite: speed improvements, increased HTML template
versatility, better special character handling, and improved weighting
routines. Added bolding of found words/phrases in the results.

0.05 (04-04-2004)
- DLBot - Updated to revision 0.04.
- Made some major improvements to the way omitting duplicate entries
was handled, especially in the case where queries result in the
same destination even though the addresses are different.
- Improved resource handling.
- DLSearch - No Changes.