|
DataMuseum.dkPresents historical artifacts from the history of: DKUUG/EUUG Conference tapes |
This is an automatic "excavation" of a thematic subset of
See our Wiki for more about DKUUG/EUUG Conference tapes Excavated with: AutoArchaeologist - Free & Open Source Software. |
top - downloadIndex: ┃ R T ┃
Length: 5017 (0x1399) Types: TextFile Names: »README«
└─⟦a0efdde77⟧ Bits:30001252 EUUGD11 Tape, 1987 Spring Conference Helsinki └─ ⟦this⟧ »EUUGD11/euug-87hel/sec1/sp/README«
Here are a pair of programs that might be of some use to those who have trouble with spelling. The first program, sp, accepts your tentative or approximate spelling of a word as input and produces a list of words. If the correct spelling of the word appears in one of the dictionaries used, it is likely that it appears in the output list. Note that this is different from the UNIX 'spell' command that tells you which words in a document do not appear in the dictionary. The second program, mksp, lets you maintain your own dictionary of troublesome words. ===== To run sp you'll need: - the Unix dbm routines, old or new (4.3BSD) Not required, but very useful: - the source to the old dbm routines if you don't have the new ones or your dbm routines don't have dbmclose() (check your man page for dbm(3X) to see if you've got dbmclose()) - /usr/dict/words plus any other large list of words you might have ===== I apologize for the complexity of the following guide. It is due to the possibility of 4 different dbm configurations: 4.3 style dbm, Sun style dbm with the dbmclose() routine, "old" (4.2BSD/V7) dbm with source and without source. 1. The program assumes that a char is 8 bits and an int is at least 16 bits. I've avoided using shorts. 2. Note the following if you are using the old dbm routines that *don't* have dbmclose(): The "old" dbm routines that don't have dbmclose() don't work properly if you do more than one dbminit(). If you have source code, you can apply the diffs so that multiple dbminit() calls will work, allowing multiple dictionaries to be used by sp, although you can still only access one dbm at a time. If you do not have source then you can still use sp/mksp except you must change MAXDICT (in sp.h) to 1 and edit Makefile.newdbm as indicated there. You will only be able to use one dictionary. I'm including a bug report that came off the net for the old dbm routines. This bug patch has been included in dbm.diffs but is surrounded by #ifdef BUGFIX. If you're applying the patches to the old dbm code, make a copy of dbm.c and dbm.h. Apply the patches by: patch < dbm.diffs or by hand (Larry Wall's patch program is in the mod.sources archive). 3. Note the following if you are using the old dbm routines that *do* have dbmclose() (e.g., Sun 2 and Sun 3): Edit Makefile.newdbm and uncomment the two lines indicated. Make using Makefile.newdbm (see below). 4. Check sp.h and adjust for local conditions. You might also edit sp.1 to reflect your local configuration. 5. I've tried to make it easy to change the key used for retrieving from the dbm. The routines to make and disassemble a key are in misc.c. I want to keep the key as small as possible since dictionaries tend to be rather large. I've used a vector of unsigned chars for the key because I didn't want to have to deal with various lengths of shorts and ints on different hardware. 6. If you are using the "new" dbm routines (e.g., those in 4.3BSD that allow multiple simultaneously open dbm's), if you have dbmclose(), or if you have the old dbm routines without the dbm source then do: make -f Makefile.newdbm otherwise do: make Then move sp, mksp, and calcsoundex to a public directory. Copy sp.1 to where you keep man pages for such programs (you might also link mksp.1 and calcsoundex.1 to sp.1). 7. If you are using Gosling EMACS, copy sp.ml (the MLISP interface to sp) into a public EMACS library. I haven't tried to convert sp.ml to work with gnuemacs. Put the documentation (sp.9) where appropriate on your system (you may need to edit the FILES section). 8. You should create a public library using /usr/dict/words, e.g.: mksp -a -v /usr/public/lib/sp.dict < /usr/dict/words The path of this dictionary should appear in DEFAULT_SPPATH (sp.h). Users should be made aware of the public version so they don't make their own copy. 9. dbm doesn't seem to work between a Sun and VAX across NFS. Too bad. (It does work between Sun's.) Use rsh with the dictionary list on the command line. 10. The programs have been tested on Sun 3/160 (4.2BSD 3.0), VAX 750 (4.3BSD), using both the new and old dbm routines. 11. I have a dictionary of 35K words (350Kb) that do not appear in /usr/dict/words. The only way I have of circulating it is on a double-sided Atari ST or Mac disk (single-sided if ARC'ed). If you are interested send me a message. Perhaps it could be archived somewhere (any volunteers?). 12. Reference: Knuth, D.E. The Art of Computer Programming, Volume 3/Sorting and Searching, 1973, pp.391-392. 13. If you find any bugs please notify me rather than posting to the net. Enjoy! ----- Barry Brachman Dept. of Computer Science Univ. of British Columbia Vancouver, B.C. V6T 1W5 .. {ihnp4!alberta, uw-beaver}!ubc-vision!ubc-cs!brachman brachman@cs.ubc.cdn brachman%ubc.csnet@csnet-relay.arpa brachman@ubc.csnet