From: Kir Kolyshkin (no email)
Date: Wed Jul 03 2002 - 06:59:04 EDT
Well, the subject line says it all. This is really the most-awaited
ASPseek release. It has a number of changes; I hope the release notes
below will guide you through them. Please note the upgrade information
in the release notes.
I would like to thank Matt Sullivan for a number of valuable patches
he sent that are included in this release. Thanks also go to Jeff Watts
for implementing the \N functionality in the Replace command of
aspseek.conf, a feature that was requested by a number of users.
As release manager, I am taking the opportunity to dedicate
this release to my dearest wife Elena, who is
celebrating her birthday today. Happy Birthday to You!
Below are release notes and change log for 1.2.9:
Quite a lot of changes. Several bugs were fixed, including two rare memleaks
in searchd and several coredumps, which leads to improved stability. This
release should also compile cleanly on FreeBSD.
This release also contains several fixes from Matt Sullivan.
Below is a description of the patches from the author:
* Fixed non thread safe use of scanner typeTable which caused corruption
of the table in medium to high load situations (in particular this
permanently broke use of "-" in queries (until next searchd restart)
i.e. 'abc -xyz' would become 'abc AND xyz').
* Fixed a small bug in templates.cpp which caused newlines to be added
before ending font tag during cached page highlighting (effect was that
cached page would not appear exactly as original in some cases).
* Fixed rare segfault resulting from buffer overflow when creating query
key for query cache (*many* stemmed words could overflow buffer).
* Improved tag parsing to handle omitted quotes, fixes cases such as
<A HREF="http://www.server.com" TARGET=_new">. Side effect is that more
URLs are discovered. Previously the remainder of the document would be
ignored (so its URLs were not added).
* Fixed problem where script name has no suffix (exacerbated by addition
of host to script_name) also removes prepending of hostname to script_name
(not removed in mod_aspseek although it should be optional here also).
* Initial URL insertion (via "Server" config parameters or use of '-i') and
URL deletion ('-C'; was this by design?) did not use delmap; they now do.
* Fixes order of logging of "Adding URL" in single threaded mode to be
consistent with both realtime and threaded index modes, i.e. log before
call to HTTPGetUrlAndStore() rather than after (in the past I think this
has been a source of some user confusion when messages such as
"URL deleted" appeared before rather than after "Adding URL").
* Adds support for HTTP method POST to s.cgi.
* Adds feature which allows non-incrementing of hops value when redirects
are encountered. Adds two config options: IncrementHopsOnRedirect and
RedirectLoopLimit.
Great work, Matt!
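
To illustrate the tag parsing item in the list above, here is a minimal
Python sketch of attribute parsing that tolerates omitted or stray quotes.
It is not the ASPseek implementation (which is C++); the names TAG_RE,
ATTR_RE and extract_hrefs are hypothetical and only show the general idea
of continuing past a malformed attribute instead of dropping the rest of
the document.

```python
import re

# Hypothetical helper patterns, not taken from the ASPseek sources.
# TAG_RE finds <a ...> opening tags; ATTR_RE accepts double-quoted,
# single-quoted, or bare (unquoted) attribute values.
TAG_RE = re.compile(r"<\s*a\b([^>]*)>", re.IGNORECASE)
ATTR_RE = re.compile(
    r"""([A-Za-z][\w:.-]*)\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s>]+))"""
)


def extract_hrefs(html):
    """Return HREF values from anchor tags, tolerating sloppy quoting."""
    hrefs = []
    for tag in TAG_RE.finditer(html):
        for name, dq, sq, bare in ATTR_RE.findall(tag.group(1)):
            if name.lower() == "href":
                # A bare value may carry a stray quote, e.g. _new"
                hrefs.append(dq or sq or bare.strip("\"'"))
    return hrefs


page = (
    '<A HREF="http://www.server.com" TARGET=_new">one</A> '
    "<a href=http://other.example/>two</a>"
)
print(extract_hrefs(page))
```

The stray quote after _new is absorbed into the bare TARGET value, so the
second link further down the page is still found.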
Starting with this version, index implements a new strategy for indexing
"dead" sites (sites that do not respond to requests). The number of threads
processing such sites is now limited to a quarter of the total number of
threads, as long as there are enough non-dead sites to process.
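
The thread limit described above can be sketched roughly as follows. This
is only an illustration in Python under simplifying assumptions (one thread
per site, integer division for the quarter); the function name
max_dead_threads and its parameters are hypothetical and do not come from
the index sources.

```python
def max_dead_threads(total_threads, alive_sites_waiting):
    """Upper bound on threads that may work on dead sites right now.

    Assumption: each thread handles one site at a time.
    """
    dead_cap = total_threads // 4            # a quarter of all threads
    alive_quota = total_threads - dead_cap   # the rest is kept for alive sites
    if alive_sites_waiting >= alive_quota:
        # Enough alive sites to keep the quota busy: enforce the cap.
        return dead_cap
    # Not enough alive sites: threads that would idle may take dead sites.
    return total_threads - alive_sites_waiting


print(max_dead_threads(20, 100))  # cap applies
print(max_dead_threads(20, 10))   # spare threads go to dead sites
```

With 20 threads and plenty of alive sites, at most 5 threads are spent on
dead sites; when only 10 alive sites are waiting, the remaining 10 threads
are free to process dead ones.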
Also, a nice new feature was added. The "Replace" aspseek.conf directive
now works like sed's "s" command, so it accepts \( and \) constructions
in the search expression and \1 to \9 in the replacement. See the
aspseek.conf(5) man page for more details. Code was contributed by Jeff Watts.
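
For readers unfamiliar with sed-style back-references, the idea can be
sketched in Python. This is not aspseek.conf syntax (see the man page for
that); replace_url and the example URL are made up for illustration. Note
one difference: Python's re module writes groups as ( and ), whereas sed
basic regular expressions, and therefore Replace, use \( and \).

```python
import re


def replace_url(url, pattern, replacement):
    """Apply one sed-like substitution; \\1 to \\9 refer to captured groups."""
    return re.sub(pattern, replacement, url, count=1)


# Drop a leading "www." from the host, keeping the rest of the URL.
# In sed this would be written: s/^http:\/\/www\.\(.*\)$/http:\/\/\1/
print(replace_url("http://www.example.com/docs/",
                  r"^http://www\.(.*)$",
                  r"http://\1"))
# prints: http://example.com/docs/
```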
If you are upgrading from 1.2.8 or earlier versions, please note the
following:
1). If you had many sites and reindexed them, it is advisable to run
index -H to re-create citation index files. Versions of ASPseek prior to
this one had a bug that caused extra bytes to be written to those files
while merging the direct citation index.
2). If you have used the "Cache" feature, please rename the SQL table "cache"
to "rescache". This is done with "ALTER TABLE cache RENAME TO rescache".
03 Jul 2002: v.1.2.9 (stable)
* Implemented \1 to \9 sequences in Replace command in aspseek.conf
* Fixes to work on FreeBSD
* Improved /etc/init.d/aspseek script to return proper exit code
and shut down index while stopping
* Fixed s.cgi coredump while working with several search daemons
* Renamed SQL table 'cache' to 'rescache' to avoid collision with some DBMSes
* Fixed bug that caused -a -f file to mark all URLs for re-index (bug #8)
* Fixed bug in merging of direct citation index which resulted in incorrect
* Fixed rare memleak in searchd which occurred when one word was used more
than once in a query
* Fixed 2 bugs in merging of reverse citations which could potentially
lead to index coredump
* Fixed memleak in searchd when Cache was on and number of query page (np*ps)
was greater than number of cached results
* Fixed search of words with uppercased letters with two-byte charsets
* Man pages improvements and fixes
* Fixed limiting results by range of dates (db and de) in s.cgi
* Fixed connect problem when resolvers were used on big endian architecture
* Fixed installation of docs and configs on some platforms
* "site:" (site limit) is taken into account in results cache
* Fixed rare searchd coredump which occurred when "site:" was used
* Fixed incorrect search when more than one "-site:" was used
* Fixed non-threadsafeness of query parser in searchd (it broke use
of "-" in queries in high-load situations)
* Fixed bug of inserting newlines before ending font tag during
cached page highlighting
* Fixed rare coredump in searchd if results cache was used
* Improved tag parsing in index to handle omitted quotes, fixes cases
such as <A HREF="http://www.server.com" TARGET=_new">
* Initial URL insertion (via "Server" or '-i') and URL deletion procedures
in index now use delmap
* Fixed problem with s.cgi when it has no suffix
* s.cgi does not prepend hostname to script_name
* Added support for HTTP method POST to s.cgi
* Number of index threads processing alive hosts is guaranteed to be no less
than 3/4 of total number of threads
* Fixed rare coredump in searchd when cached page was queried and record
for that page existed in "urlword" but did not exist in "urlwordsNN"
* Added IncrementHopsOnRedirect and RedirectLoopLimit options to aspseek.conf
* Added stopword lists for Catalan and Hungarian languages
* Fixed stopwords.conf and manual pages to include charset parameter
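
As a rough illustration of the two redirect options in the change log
above, here is a Python sketch. The semantics are assumptions:
IncrementHopsOnRedirect is taken to decide whether following a redirect
counts as a hop (per the patch description), and RedirectLoopLimit is
guessed to cap the length of a redirect chain. The function and parameter
names are hypothetical; consult aspseek.conf(5) for the real behaviour.

```python
def next_hops(current_hops, is_redirect, increment_hops_on_redirect=True):
    """Hops value for a newly discovered URL (assumed semantics)."""
    if is_redirect and not increment_hops_on_redirect:
        return current_hops          # redirect does not count as a hop
    return current_hops + 1


def follow_redirects(start_url, redirects, redirect_loop_limit=8):
    """Follow a redirect chain given as a dict {url: target}.

    Returns the final URL, or None if the assumed limit is exceeded.
    """
    url, followed = start_url, 0
    while url in redirects:
        if followed >= redirect_loop_limit:
            return None              # give up: probable redirect loop
        url = redirects[url]
        followed += 1
    return url


print(next_hops(2, is_redirect=True, increment_hops_on_redirect=False))
print(follow_redirects("a", {"a": "b", "b": "a"}, redirect_loop_limit=4))
```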
-- ICQ UIN 7551596 Phone +7 903 6722750 --
Guinness a Day Keeps a Doctor Away (people's wisdom)