ftp.nice.ch/pub/next/connectivity/infosystems/WAIStation.1.9.6.N.b.tar.gz#/WAIS/doc/source.txt

This is source.txt in view mode; [Download] [Up]


		        Source Description Structures

		       Brewster Kahle brewster@Think.com
		         Harry Morris morris@think.com
				  2/20/91


A "source description structure" is used by a client of the WAIS system to
contact a database on a server.  This document describes a text encoding of
the source description structure.  This specification is compatible with
the clients that are given out with the WAIS release (version 8 and higher)
and with the Directory of Servers maintained by Thinking Machines.  Example
C code to read and write these structures is available free of charge from
the authors.

Note that WAIS end users should never see this structure, since a typical
user will use the source structures retrieved from the directory of servers
or those created by waisindex, and interact with them through a client user
interface.

Philosophy:
-----------

Sources form the conceptual link between an information user and a database
vendor.  As such, they serve two functions.  First, they provide a means
for the database vendor to inform the user of database specific
information, in particular, how to contact the database, and how much it
costs to do so.  Second, they serve as repositories for the user's
customizations.

The fields which describe how to contact a server, what database name to
use, and how much the service costs are required.  The other fields are
strictly optional.  Servers may choose to use them to make suggestions
about how their service might be used, or clients may choose to let the
user modify certain options.  While not all clients may want or need all
the optional fields, it is in everyone's best interest to have any
particular function expressed in a single format that all can share.  The
optional fields presented here are ones we have found useful.  If you have
suggestions for fields that should be added to the specification, please
contact the authors.

Clients should always be prepared to ignore the non-required fields.

An example source structure is:

(:source 
  :version  		3 
  :ip-name 		"quake.think.com"
  :ip-address 		"192.31.181.1"
  :tcp-port 		210
  :maintainer 		"brewster@think.com"
  :database-name 	"directory-of-servers"
  :cost  		0.00
  :cost-unit 		:free
  :description 		
"The directory of servers is a white pages of servers maintained by many
others.  This one is maintained by Thinking Machines Corporation on the
internet.  To submit new entries to the directory of servers, mail to
wais-directory-of-servers@quake.think.com. -brewster"
  :update-time 		(:time-interval :interval :daily 
 			 	:day 0 :hour 1 :min 30)
  )

Syntax:
-------

The syntax is based on a subset the Common Lisp printer format (see Common
Lisp, the Manual, by Guy Steele).  We chose this format because is easy to
parse on many different platforms, and from many different languages.  The
format is summarized below.

A source is a structure but written as a list.  Every structure is written
in the form

	(<structure-type>
	 <keyword> <value>
	 <keyword> <value>
	 ...
	)

Structure types and keywords are always symbols.  Value may be a symbol,
decimal, integer, string, array, ANY, or another structure.  The Common
Lisp structure format was not used because many lisp implementations do not
handle unknown keywords well.

Symbols are preceded with a colon (in the keyword package), and consist of
all lower case letters with embedded hyphens. Strictly speaking, symbols
are not case sensitive, but some clients seem to be ignoring this.
Sticking with lower case is safe.  Example symbols :2foo and :bar-baz3.

Strings start and end with '"'; if a '"' is to appear in the string, then
it a '\' is inserted before it.  If a '\' is to appear then, '\\' is used.

Integers are written without a decimal point.  Floating point numbers are
written with a decimal point.

Arrays are written as #(value value value ...).  For example, a 1
dimensional array of numbers looks like #(115 101 120).

Lists are formed by simply enclosing enumerated items in parentheses.  The
items are separated by white space.  For example, a list of numbers looks
like (3 4 57).

Z39.50 byte arrays (ANYs) are written as structs with length and byte keys
or as a string.  For example, (:any :length 3 :bytes #(115 101 120)) or
"foo".  The string form may only used when every character is a printable
character; this is used to reduce the structure size and increase
readability, but is optional.
 


Required Fields (set by vendor):
-------------------------------

:version <integer>  
The number is the major version number of the source structure format.  The
current version is 3.  This must be the first slot.

:database-name <string> 
These are the names of the databases on the server.  It is a list of
databases, separated by <XXX> characters

at least one of :ip-name, :ip-address.  Both combination may be specified.
XXX what about modems

	:ip-address  <string> a legal internet address such as
	 "192.31.181.1"  

	:ip-name <string> a legal internet name such as "quake.think.com" 

:cost <float>
Cost of using this server. see :cost-unit for units

:cost-unit <cost-unit-type>
one of the following values.

	:free, 
	:dollars-per-session, 
	:dollars-per-minute, 
	:dollars-per-query, 
	:dollars-per-retrieval,
	:other


Optional Fields set by vendor:
------------------------------

:tcp-port <integer>
The tcp port to connect to.  The default is 210, the official Z39.50 port
number.

:configuration <string>
(XXX weird stuff, plus what contact method to use! XXX)

:script <string>
A string of commands to be interpreted after establishing the connection to
the server.  This may involve loging in, authentication, configuring the
connection etc.  The commands are those used by White Knight (yuk).  They
are currently only used in making modem connections (XXX it is possible
that tcp might want to use this too, and that we may want to run scripts
just before starting the connection, just after starting the connection,
and just before closing the connection).

:maintainer <string>
An english description of who to contact for information.

:description <string>
An english description of the database.

:update-time <time-interval>
The interval at which the server's data is updated.  This is a structure of
the form

    (:time-interval 
      	:interval  	<interval-type>
      	:day  		<integer>
      	:hour  		<integer>
      	:min  		<integer> 
      	) 
 
    where interval-type is one of 
	
	:continuous - day, hour, min are ignored
	:hourly - min is used
	:daily - hour and min are used
	:weekly - day is day of week, hour and min are used
	:monthly - day is day of month, hour and min are used
	:unschedualed - day, hour, min are ignored
	(XXX what about yearly)

Optional Fields set by user:
----------------------------

:font <string>
Font for displaying this source's documents.  This is machine specific,
clients should use their default if they don't understand the value.  Note
that a vendor may wish to suggest a font for use with its database.

:font-size <integer>	
The size to use for displaying the font.  The units are machine dependent.

:window-geometry  <rect>
The positioning of the source editing window (if there is one) in machine
dependent units.  It is of the form
	
	(:rect  
		:left	<integer>
		:top  	<integer>  
		:right  <integer>  
		:bottom <integer>
		) 

:confidence  <integer>
A subjective evaluation of how "good" the server is.  This may be used by
clients to help score documents or allocate funds.

:num-docs-to-request  <integer>
How many documents to ask for from this source.

:contact-at <time-interval (see :update-time)>
Clients which can automatically contact a server on behalf of their user
use this field to determine when.  Note, multitasking clients may have
system level utilities for this purpose.

:last-contacted <absolute-time>
A record of the last time the source was contacted.  The format is

	 (:absolute-time 
	      	:year  	<integer>
	      	:month  <integer>
	      	:mday  	<integer>
	      	:hour  	<integer>
	      	:minute <integer>
	      	:second <integer> (second = 61 means the server has not been
		 		   contacted)
	      	) 

:timeout <integer>
The number of seconds to give the this source's server to respond to a
query before disconnecting the communications circuit.

These are the contents of the former NiCE NeXT User Group NeXTSTEP/OpenStep software archive, currently hosted by Netfuture.ch.