⟦262740cc2⟧ TextFile

    Length: 5017 (0x1399)
    Types: TextFile
    Names: »README«

Derivation

└─⟦a0efdde77⟧ Bits:30001252 EUUGD11 Tape, 1987 Spring Conference Helsinki
    └─ ⟦this⟧ »EUUGD11/euug-87hel/sec1/sp/README« 


Here are a pair of programs that might be of some use to those who have
trouble with spelling.

The first program, sp, takes your tentative or approximate spelling of a
word as input and produces a list of candidate words.
If the correct spelling appears in one of the dictionaries used,
it is likely to appear in the output list.
Note that this is different from the UNIX 'spell' command, which
tells you which words in a document do not appear in the dictionary.

The second program, mksp, lets you maintain your own dictionary of troublesome
words.
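
This README does not describe how sp matches words.  The calcsoundex
program name and the Knuth reference (both mentioned below) suggest a
Soundex-style phonetic code, so here is a minimal sketch of that idea.
It is not taken from the sp sources: the names soundex() and
soundex_digit() are hypothetical, and the rules follow the common
textbook description (keep the first letter, code the remaining
consonants, drop adjacent repeats, pad to four characters).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Soundex digit for an upper-case letter; '0' marks letters that are
 * dropped (A, E, H, I, O, U, W, Y) and anything not in the table. */
static char soundex_digit(int c)
{
    static const char letters[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    static const char digits[]  = "01230120022455012623010202";
    const char *p = (c != '\0') ? strchr(letters, c) : NULL;

    return p ? digits[p - letters] : '0';
}

/* Write the 4-character code for word into out (plus a trailing NUL).
 * A word with no letters yields "0000". */
void soundex(const char *word, char out[5])
{
    size_t n = 0;
    char prev = '0';

    assert(word != NULL && out != NULL);
    for (; *word != '\0' && n < 4; word++) {
        int c = toupper((unsigned char)*word);
        char d;

        if (!isalpha(c))
            continue;               /* skip digits, punctuation, etc. */
        d = soundex_digit(c);
        if (n == 0)
            out[n++] = (char)c;     /* the first letter is kept as-is */
        else if (d != '0' && d != prev)
            out[n++] = d;           /* new consonant code */
        prev = d;                   /* a dropped letter resets the repeat check */
    }
    while (n < 4)
        out[n++] = '0';             /* pad short codes with zeros */
    out[4] = '\0';
}
```

With this sketch, the misspelling "recieve" and the correct "receive"
get the same code, so a dictionary indexed by such codes could offer
"receive" as a candidate for either spelling.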

=====
To run sp you'll need:
	- the Unix dbm routines, either the old version or the new one (4.3BSD)

Not required, but very useful:
	- the source to the old dbm routines if you don't have the new ones
	  or your dbm routines don't have dbmclose() (check your man page for
	  dbm(3X) to see if you've got dbmclose())
	- /usr/dict/words plus any other large list of words you might have

=====
I apologize for the complexity of the following guide.  It is due to the
possibility of four different dbm configurations:  4.3BSD style dbm, Sun
style dbm with the dbmclose() routine, "old" (4.2BSD/V7) dbm with source,
and "old" dbm without source.

1. The program assumes that a char is 8 bits and an int is at least 16 bits.
   I've avoided using shorts.

2. Note the following if you are using the old dbm routines that *don't* have
   dbmclose():
   These routines don't work properly if you call dbminit() more than once.
   If you have the source code, you can apply the diffs in dbm.diffs so that
   multiple dbminit() calls will work.  This lets sp use multiple
   dictionaries, although you can still only access one dbm at a time.  If
   you do not have the source, you can still use sp/mksp, but you must change
   MAXDICT (in sp.h) to 1 and edit Makefile.newdbm as indicated there, and
   you will only be able to use one dictionary.  I'm including a bug report
   for the old dbm routines that came off the net.  The fix from that report
   has been included in dbm.diffs but is surrounded by #ifdef BUGFIX.

   If you're applying the patches to the old dbm code, first make a backup
   copy of dbm.c and dbm.h.  Then apply the patches with:
	patch < dbm.diffs
   or by hand (Larry Wall's patch program is in the mod.sources archive).

3. Note the following if you are using the old dbm routines that *do* have
   dbmclose() (e.g., Sun 2 and Sun 3):
   Edit Makefile.newdbm and uncomment the two lines indicated.  Build using
   Makefile.newdbm (see item 6).

4. Check sp.h and adjust for local conditions.  You might also edit sp.1
   to reflect your local configuration.

5. I've tried to make it easy to change the key used for retrieving from
   the dbm.  The routines to make and disassemble a key are in misc.c.
   I want to keep the key as small as possible since dictionaries tend to
   be rather large.  I've used a vector of unsigned chars for the key because
   I didn't want to have to deal with various lengths of shorts and ints on
   different hardware.

6. If you are using the "new" dbm routines (e.g., those in 4.3BSD, which
   allow multiple dbm's to be open simultaneously), if you have dbmclose(),
   or if you have the old dbm routines without the dbm source, then do:
	make -f Makefile.newdbm
   otherwise do:
	make

   Then move sp, mksp, and calcsoundex to a public directory.  Copy sp.1 to
   where you keep man pages for such programs (you might also link mksp.1 and
   calcsoundex.1 to sp.1).

7. If you are using Gosling EMACS, copy sp.ml (the MLISP interface to sp) into
   a public EMACS library.  I haven't tried to convert sp.ml to work with
   GNU Emacs.  Put the documentation (sp.9) where appropriate on your system
   (you may need to edit the FILES section).

8. You should create a public dictionary from /usr/dict/words, e.g.:
	mksp -a -v /usr/public/lib/sp.dict < /usr/dict/words
   The path of this dictionary should appear in DEFAULT_SPPATH (sp.h).  Users
   should be made aware of the public version so they don't make their own
   copy.

9. dbm files don't seem to work between a Sun and a VAX across NFS.  Too bad.
   (They do work between Suns.)
   As a workaround, use rsh with the dictionary list on the command line.

10. The programs have been tested on a Sun 3/160 (4.2BSD 3.0) and a VAX 750
    (4.3BSD), using both the new and old dbm routines.

11. I have a dictionary of 35K words (350Kb) that do not appear in
    /usr/dict/words.  The only way I have of circulating it is on a
    double-sided Atari ST or Mac disk (single-sided if ARC'ed).  If you are
    interested send me a message.  Perhaps it could be archived somewhere
    (any volunteers?).

12. Reference: Knuth, D.E.  The Art of Computer Programming, Volume 3:
    Sorting and Searching.  Addison-Wesley, 1973, pp. 391-392.

13. If you find any bugs please notify me rather than posting to the net.
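
Items 1 and 5 above describe two portability choices: assuming 8-bit
chars, and building the dbm key as a vector of unsigned chars rather
than shorts or ints.  The following is a rough sketch of that approach,
not the code in misc.c.  The key layout (two 16-bit fields), KEY_LEN,
make_key() and split_key() are all hypothetical names chosen for
illustration.

```c
#include <assert.h>
#include <limits.h>

/* Item 1 assumes 8-bit chars; refuse to compile otherwise. */
#if CHAR_BIT != 8
#error "this sketch assumes 8-bit chars"
#endif

#define KEY_LEN 4   /* hypothetical key size, not taken from sp.h */

/* Pack two values in the range 0..65535 into a fixed vector of bytes,
 * most significant byte first.  The stored layout then does not depend
 * on the host's int size or byte order. */
void make_key(unsigned int code, unsigned int seq,
              unsigned char key[KEY_LEN])
{
    assert(code <= 0xFFFFu && seq <= 0xFFFFu);
    key[0] = (unsigned char)((code >> 8) & 0xFFu);
    key[1] = (unsigned char)(code & 0xFFu);
    key[2] = (unsigned char)((seq >> 8) & 0xFFu);
    key[3] = (unsigned char)(seq & 0xFFu);
}

/* Recover the two values from a key built by make_key(). */
void split_key(const unsigned char key[KEY_LEN],
               unsigned int *code, unsigned int *seq)
{
    *code = ((unsigned int)key[0] << 8) | (unsigned int)key[1];
    *seq  = ((unsigned int)key[2] << 8) | (unsigned int)key[3];
}
```

A key built this way would be handed to dbm through a datum whose dptr
points at the byte vector and whose dsize is KEY_LEN.  Because every
field is written byte by byte, only values up to 16 bits are needed,
which fits the "int is at least 16 bits" assumption.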

Enjoy!

-----
Barry Brachman
Dept. of Computer Science
Univ. of British Columbia
Vancouver, B.C. V6T 1W5

.. {ihnp4!alberta, uw-beaver}!ubc-vision!ubc-cs!brachman
brachman@cs.ubc.cdn
brachman%ubc.csnet@csnet-relay.arpa
brachman@ubc.csnet