This is the README for HtmlIndex.0.13.N.bs.tar.gz
HtmlIndex (HtmlFilter) [V0.1 by Juergen Sell, js@euler.han.de]

Based on NewsIndex by Izumi Ohzawa, izumi@pinoko.berkeley.edu.
His work, my errors.

HTML filtering and description services for DL (DigitalLibrarian) indexing
of HTML articles.

The following two services are implemented.

[1] HtmlDescribe Service:
    Describes HTML articles, currently based on the TITLE and H1 tags.
    With this service, when you search in DigitalLibrarian, titles
    are listed in the format:

        title - header1

[2] HtmlFilter Service:
    Starting with Version 0.1, another service -HtmlFilter:... has been
    added. The purpose of this filter is to remove junk, such as html
    lines, before the article text is handed over to the indexing scanner.
    This should reduce the size of .index.store somewhat (up to 20%
    compared with Version 0.91).

The code is a quick hack, but it should serve as a starting point for
writing other description filter daemons for DL.

The advantage of this Listener daemon scheme over the Unix stdio filter
(invoked via the NXUNIXSTDIO port) is that the daemon-based filter is
invoked only once per DL indexing session, while a stdio filter is invoked
for every article indexed. Daemons can keep running indefinitely, but this
one quits after some duration of inactivity.

No Copyright is claimed. This program is hereby released into the
public domain.

Benoît Grangé [ben@fizz.fdn.org] distributed a similar daemon free of
charge, but no source code was included in the distribution. This version
has been developed based on NewsIndex by myself, and comes with sources.

BTW, this thing works with html articles.

js

--- To Build FAT binary: --------------------------------------------------------

Launch ProjectBuilder.app, do Project->Open Makefile, and open
the Makefile in this directory. Select target <Default>, and build!
If there is an error, first select target "clean", build, and then
select target <Default> and build.
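As an illustration of the "title - header1" listing format that HtmlDescribe
produces, here is a minimal shell sketch. It is not the daemon's actual code
(the sources in this distribution are the authority on that). The function
names html_title, html_h1 and html_describe are made up for this example, and
the sketch assumes that each TITLE and H1 element sits on a single line.

```shell
#!/bin/sh
# Sketch only: build a "title - header1" description line for an HTML file.
# Assumption: <TITLE>...</TITLE> and <H1>...</H1> each fit on one line.
# html_title, html_h1 and html_describe are hypothetical helper names.

# Print the text of the first TITLE element (tag name matched in any case).
html_title() {
    sed -n 's/.*<[Tt][Ii][Tt][Ll][Ee]>\(.*\)<\/[Tt][Ii][Tt][Ll][Ee]>.*/\1/p' "$1" | sed 1q
}

# Print the text of the first H1 element (attributes on the tag are allowed).
html_h1() {
    sed -n 's/.*<[Hh]1[^>]*>\(.*\)<\/[Hh]1>.*/\1/p' "$1" | sed 1q
}

# Combine both parts in the format used for DL listings.
html_describe() {
    printf '%s - %s\n' "$(html_title "$1")" "$(html_h1 "$1")"
}

# When called with a file argument, describe that file.
if [ $# -gt 0 ]; then
    html_describe "$1"
fi
```

Saved under a name of your choice and run with an HTML file as its argument,
the script prints one description line for that file.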
--- Installation Procedure ------------------------------------------------------

[1] Copy the "HtmlIndexing.service" folder into /LocalLibrary/Services
    or ~/Library/Services.

[2] Copy .index.ftype and .index.swords into the
    ~/Library/HtmlGrazer/HtmlFolders directory.
    (Enable Unix Expert mode in Preferences if you don't see these files.)
    Replace these files with the new ones, even if you used
    HtmlIndex 0.9 - 0.91 and already have them in the HtmlFolders directory.

[4] Run "make_services", or log out and log back in, or do whatever is
    necessary to make WorkSpace recreate its services cache.
    Try doing Command-u in WorkSpace while
    ~/Library/Services/HtmlIndexing.service (or /LocalLibrary...) is selected.

[5] Cd to ~/Library/HtmlGrazer/HtmlFolders, and do:

        rm .index.store
        ixbuild -gsv -LEnglish .

    This will create the first usable index for DL.

[6] Start DL and drag ~/Library/HtmlGrazer/HtmlFolders onto the shelf. Save.

[7] From this point on, you should be able to update the index
    from within DL via the inspector.

Have fun.
-Izumi

---
Izumi Ohzawa [ 大澤五住 ]
USMail: University of California, 360 Minor Hall, Berkeley, CA 94720
Telephone: (510) 642-6440   Fax: (510) 642-3323
Internet: izumi@pinoko.berkeley.edu (NeXTMail OK)
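For reference, installation steps [1] to [5] above could be scripted roughly
as follows. This is a sketch under assumptions, not a tested installer. The
names DRY_RUN, SERVICES_DIR, HTML_FOLDERS, run and install_htmlindex are
invented for this example. The script assumes it is started from the unpacked
distribution directory and installs into ~/Library/Services. It uses "rm -f"
instead of plain "rm" so that a missing .index.store does not stop the run.
By default it only prints the commands. Steps [6] and [7] are still done by
hand in DL.

```shell
#!/bin/sh
# Sketch of installation steps [1]-[5]. All variable and function names
# here are hypothetical; only the commands themselves come from the README.

DRY_RUN=${DRY_RUN:-1}                          # 1 = only print the commands
SERVICES_DIR=${SERVICES_DIR:-$HOME/Library/Services}
HTML_FOLDERS=${HTML_FOLDERS:-$HOME/Library/HtmlGrazer/HtmlFolders}

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

install_htmlindex() {
    # [1] install the service folder
    run cp -r HtmlIndexing.service "$SERVICES_DIR" &&
    # [2] install the index control files
    run cp .index.ftype .index.swords "$HTML_FOLDERS" &&
    # [4] make WorkSpace rebuild its services cache
    run make_services &&
    # [5] rebuild the index from scratch (subshell keeps the cd local)
    (
        run cd "$HTML_FOLDERS" &&
        run rm -f .index.store &&
        run ixbuild -gsv -LEnglish .
    )
}

install_htmlindex
```

Run it once as is to review the printed commands, then run it again with
DRY_RUN=0 in the environment to execute them.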
These are the contents of the former NiCE NeXT User Group NeXTSTEP/OpenStep software archive, currently hosted by Netfuture.ch.