|
DataMuseum.dkPresents historical artifacts from the history of: RC4000/8000/9000 |
This is an automatic "excavation" of a thematic subset of
See our Wiki for more about RC4000/8000/9000 Excavated with: AutoArchaeologist - Free & Open Source Software. |
top - metrics - download
Length: 75264 (0x12600) Types: TextFile Names: »mdsortpr6tx «
└─⟦01e83a425⟧ Bits:30008166 Bånd med SW8010 og SW8500 source code └─⟦75ff9bef3⟧ └─⟦this⟧ »mdsortpr6tx «
mdsortproc jw 1/964e 091271.17 corrected in many ways 23.11.1973 by jw changed with vnc and ordering of synonyms 18.08.1976 by jw much changed in claimcalculation and strategy 7.1.1977 by jw extended to handle sq files 17.07.1978 by ib. trap specially for spill in key comparison 1.1.1979 by jw assign af segm.no in headpart of sq-file 1.4.81 by eah erroneous assisgn of segm.no in headpart of sq-file 1.4.81 var. 'passes' unitialized in 'select_workfile (situation=1) fb.1982.08.18 'segno' in filehead (sq) sometimes one too great fb.1982.09.02 claim procedure made monitor 9 compatible fb.1983.03.01 integer exception in select_work_file with large no of free segs fgs.1991.01.03 backing storage release 15.1 january 1988. fb 1987.11.26 correction in claim procedure caused by the monitor procedures changed handling of non fieled array parameters. external procedure mdsortproc(param, keydescr, names, eof, noofrecs, result, explanation); value eof; real eof; integer array param, keydescr; real array names; integer noofrecs, result, explanation; comment \f Algol fortran standard procedure mdsortproc page 1 Purpose. Mdsortproc, merge_disc_sorting_procedure, is a procedure, intended for fast sorting of one backing storage area. The procedure can be called from a program coded in algol or fortran for RC 4000, RC 6000, and RC 8000. Function. The procedure sorts a backing storage file holding records of either fixed or variable length, using backing storage throughout. The basic sorting method, is the merge technique: Sorted strings, as long as possible, are generated by internal sor- ting during the first reading of the input file. After that, these strings are merged repeatedly until only one sor- ted string is left. The procedure will try to minimize the sorting time by variation of mergepowers, blocklengths, use of single- or double-buffering, and by utilization of two disc-stores, if available. The procedure needs in total a backing storage area of about twice the size of the data to be sorted. It can be specified that the input file shall be cleared, so its area can be used for the merge. The free core, when the procedure is called, must be more than about 10000 halfwords, depending on the blocklengths and record- lengths specified. The value of 10000 is valid for blocklengths up to 2 segments. The sort can use any amount of free core to speed up the sorting and room for a work file on two different discs will reduce the time for input output. \f Algol fortran standard procedure mdsortproc page 2 Call. mdsortproc (param, keydescr, names, eof, noofrecs, result, explanation) param(1:7) (call value, integer array) This array holds various parameters of type integer and type boolean, describing the files and the records. param(1) segsperinblock. Blocklength of the input file, given as a number of segments. 1 <= segsperinblock <= 40. Supplied by sq system if content = 21. param(2) clearinput. 1: The input file is cleared, and its area can be used for the merge. 0: The input file must not be cleared. param(3) segsperoutblock. Blocklength of the final output file, given as a number of segments. 1 <= segsperoutblock <= 40. If param(3) = 0 , segsperoutblock:= segsperinblock. param(4) fixedlength. 1: Fixed recordlength. Inrec6/outrec6 are used. 0: Variable recordlength. Invar/outvar are used. 2: variable recordlength but no checksum. param(5) maxlength. The maximum length of variable length records, and the length of fixed length records, measured in halfwords. Maxlength >= 2 and maxlengh <= segsperinblock * 512 and maxlength <= segsperoutblock * 512. Maxlength must be even. It is important for the efficiency of the sort that maxlength reflects the real maximum length of variable length records. param(6) noofkeys. The number of keyfields in the sorting key. 1 <= noofkeys <= maxlength and <= 169. param(7) concerns the reaction on resource troubles. 0: Resource troubles will not stop the execution, the procedure returns with result > 1. <>0: Resource troubles causes runtime alarm. \f Algol fortran standard procedure mdsortproc page 3 keydescr(1: (call value, integer array) noofkeys, 1:2) The description of the sorting key. Keyfield n is specified as: +/- type, position, in keydescr(n, 1:2). The type ranges from 1 to 5, indicating: signed halfword, integer, long, real, or abshalfword. The sign of the type specifies the sequencing: + for ascending, and - for descending order. The position of the keyfield is specified as the number of the last halfword in the field, as for algol field variables. The position may not excede 2047. The entire keyfield must be within a maximum length record. The length of variable length records must not be less than the position of keyfield 1, the highest priority keyfield. records having equal values in all keyfields are sorted according to their occurrence in the input file, i.e. their mutual order is not changed. In connection with very long records (maxlength= param(5) >= 2046) the facility is switched off. names(1:6) (call and return value, real array) Contains 3 file and disc names. names(1:2) inputfile. The name of a backing storage area. The procedure asumes that the size of the area re- flects the amount of data to be sorted. names(3:4) outputfile. If names(3) = real<::> then the name of the output- file is returned in names(3:4), otherwise the name given is used for the final output file. An existing file of this name on scope temp is cleared without warning, just before the end. The sort is not able to use the resources of such a file. names(5:6) outdisc. If names(5) = real<::> then the output disc is selected according to the most efficient strategy, otherwise the disc specified is used. \f Algol fortran standard procedure mdsortproc page 4 eof (call value, real) If the parameter noofrecs is negative, then the end of the input file is indicated by a record holding the bitpattern given by eof in the first 4 halfwords of the userpart. Halfword 1 to 4 in case of fixed length and halfword 5 to 8 in case of variable length records. The final output file is terminated by an end of file record of maximum length in this case. noofrecs (call and return value, integer) If noofrecs is non negative, then the number of records in the input file is given by the value of noofrecs and an eof record is not created. The number of sorted records is returned in any case in this parameter. With an sq file noofrecs <= 0 means that noofrecs is supplied by the sq system, noofrecs > 0 means that only this number of records are to be sorted. result (return value, integer) The value of result specifies the result of the call of the procedure. In general, resource problems will yield a result different from 1, whereas errors concerning the parameters or hard errors will stop the execution by a runtime alarm. If param(7) <> 0 only result = 1 will occur, the other results are transformed to alarms. explanation (return value, integer) The value of this parameter should give a further explanation of result. See the next section. Sq system files. If the content field of the catalog entry of the input file is equal to 21, it is supposed that the file conforms to the conventions of the sq file system. The output file will be created as an sq file as well. Results. result explanation comment 1 segments output the sort was ok 2 -lacking core halfwords not sufficient core 3 see alarm disc not sufficient backing storage 4 see alarm out disc backing store specified by names (5:6) does not exist or has too few resources. Results > 1 are only given if param(7) = 0. The parameter noofrecs will contain the number of sorted records if result = 1 , otherwise it is unchanged. The output file will, provided result = 1, be cut to the minimum size, and tail(6) of the catalog entry will contain the same value as noofrecs, if not sq file. \f Algol fortran standard procedure mdsortproc page 5 Requirements. The available amount of core storage before the call of the proce- dure must satisfy the condition: free_half_words > 7000 + 512*(segsperinblock + segsperoutblock) + 4*maxlength + 24*noofkeys. The procedure requires as working areas two disc files of the size of the input file. This means in the case when the input file is removed that the procedure must be able to create one work file of that size. If the input file is not removed the procedure must be able to create two work files of the size of the input file. If an output disc is specified this disc must be able to hold the final output file. already at the beginning of the procedure it is checked that the output disc is capable of holding a file of the size of the input file. (this is of course the case if the inputfile is placed on the disc specified and has to be removed). the blocking of records may be changed by the sorting, so the output file may have a greater size than the input file. Work files are kept at minimum size by concatenation of the records without regards to block limits. The procedure needs 2 catalog entries, 2 area processes, and 1 message buffer. So, the job process should at least be the owner of 4 area processes, and 2 message buffers. But it is recommended to have a greater number of message buffers, (10 to 20), especially in the case of a sort of small records, (about 2 to 20 halfwords), with great core size. Variable length records. The sum check facility of invar is used during the reading of the input file if param(4) = 0 (invar with checksum control). The record length must not excede maxlength. The minimum record length is given by the greatest of the two val- ues: 4 (if noofrecs < 0 then 8) and keydescr(1, 2). Thus some of the keyfields of a short record may in fact be situ- ated outside the record. Such a record is sorted as if all the bits of keyfields outside the record were equal to zero. \f Algol fortran standard procedure mdsortproc page 6 Alarms. Parameter errors and hard file errors will stop the run with a run time alarm. alarmtext integer comment param param number error at param(param number) keyfield keyfield illegal position or type of keyfield <integer>. create monitor result abnormal result in call of the mon- itor procedure create entry. lookup monitor result abnormal result from lookup entry. This alarm will normally indicate that names(1:2) does not specify a catalog entry. change monitor result abnormal result in call of change entry. Should not occur. rename monitor result it is impossible to rename the final output file to names(3:4). remove monitor result abnormal result in call of remove entry. The alarm will normally con- cern the original input file. infile tail(1) names(1:2) does not point to a catalog entry describing an area. r.length record length variable length record of a length greater than maxlength, less than key- descr(1, 2), or less than 8 if eof used. passes 20 the sort could not be done in 20 mer- ging passes. This alarm should never occur. reccount record count this is a hard error or a programming error. The counts of records in the first pass and the last, are not equal, the last count is shown. size -lacking halfwords not enough core. disc -lacking entries too few entries in main catalog. segments not enough segments, the value of segments is the size of one workfile. out disc -1 the wanted output disc is not mounted. segments not room for one workfile on the wanted output disc. trap passnumber normally spill in key comparison. nrecs sq noofrecs param noofrecs > records in sq file. The alarms: size, disc and out disc (resources) will only be given in case param(7) <> 0, otherwise the corresponding results, 2 to 4, with explanation will be given. In addition, index alarms may occur, if the parameter arrays are incorrectly declared, and alarms from opensq or stderror may occur, if file or record formats are illegal or in case of hard errors. Alarms, with the exception of index alarms, are preceded by the text, ***mdsortproc alarm:. The alarm r.length, and stderror alarms occurring during the rea- ding of the input file are also preceded by a line, specifying the number of input records accepted before the error was detected. \f ; begin boolean fixedlength; integer bytes_per_inblock, bytes_per_outblock, content, discstores, freebytes, i, maxkeyposition, maxlength, maxlength_plus_four, messbufsforinp, min_segs_per_block, noofkeys, outbytes, outsegment, pass, passes, recs, recsout, segsin, segsoutzero, segsperinblock, segsperoutblock, shares, shortclock, strings1; real time; integer array in_segs, merge_power, segsout(1:20); real array sortfiles(1:4); comment -test-< procedure printkey(i1, i2, r) -test-< value i1, i2 -test-< integer i1, i2 -test-< real array r -test-< begin integer i, pos -test-< boolean field bfld -test-< integer field ifld -test-< long field lfld -test-< real field rfld -test-< pos:= write(out, <:<10>key :>, << ddddd>, i1, i2, <:::>) -test-< for i:= 3 step 2 until noofkeys*2+1 do begin if pos > 100 then pos:= write(out, <:<10> :>) -test-< bfld:= ifld:= lfld:= rfld:= keydescr(i+1) -test-< pos:= pos + (case abs keydescr(i) of ( write(out, << -dddd>, r.bfld extract 12), write(out, << -ddddddd>, r.ifld), write(out, <<-dddddddddddd>, r.lfld), write(out, <<-ddddddddddd>, r.rfld), write(out, << -dddd>, r.bfld extract 12))) -test-< end -test-< write(out, <:<10>:>) -test-< end printkey ; \f procedure print_rec(recno, rec); value recno; integer recno; real array rec; comment prints recno and the keyfields of rec; begin integer i, pos; boolean field bfld; integer field ifld; long field lfld; real field rfld; pos:= if pass = 0 then write(out, <:<10>no::>, <<ddddddd>, recno, <: key::>) else write(out, <:<10>key::>); for i:= 3 step 2 until noofkeys*2 + 1 do begin bfld:= ifld:= lfld:= rfld:= keydescr(i+1); if pos > 62 then pos:= write(out, <:<10> :>); pos:= pos + (case (abs keydescr(i)) extract 2 + 1 of ( <* 4 *> write(out, <<-dddddddd.dddd'-dd>, rec.rfld), <* 1,5 *> write(out, << dddd>, rec.bfld extract 12), <* 2 *> write(out, << -ddddddd>, rec.ifld), <* 3 *> write(out, << -ddddddddddddddd>, rec.lfld))); end print keyfields; write(out, <:<10>:>); end print_rec; procedure print_zsort(zsort); zone zsort; comment prints the two last compared records in zsort. the comparison must have been done by outsort or lifesort; begin integer i; integer field recnumber; integer array ia(1:20), bases(1:2); recnumber:= maxlength + 2; getzone6(zsort, ia); system(5, ia(2) - 2, bases); for i:= 1, 2 do begin ia(14):= bases(i); <* recordbase *> setzone6(zsort, ia); print_rec(zsort.recnumber, zsort); end print the two records; end print_zsort; \f procedure alarm(text, int); value text, int; integer text, int; comment this is the common alarm and error procedure. it removes the workfiles. the integer text selects the alarmtext, and the integer int is the alarm integer. if text >= 100 return is performed to docerror. if text >= 12 and <= 14 return may be performed with result and explanation by a jump to return_from_discsort, otherwise system is called by a jump to the bottom of procedure discsort in order to get a simple alarm address. the string value of the text is transmitted in sortfiles(1), and the integer in segsin. ; begin integer i; for i:= 1, 3 do clear_file(sortfiles, i); if text >= 12 and text <= 14 then begin comment return with result <> 1 depending on param(7); if param(7) = 0 then begin result:= text - 10; explanation:= int; goto return_from_discsort; end continuation wanted; end text 12 to 14; write(out, <:<10><10>***mdsortproc alarm:<10>:>); if text = 9 or text = 100 then write(out, <:<10> accepted records: :>, recs, <:<10>:>); if text < 100 then begin comment not called from docerror; sortfiles(1):= real (case text of( <:<10>param :>, <:<10>keyfield:>, <:<10>create :>, <:<10>lookup :>, <:<10>change :>, <:<10>rename :>, <:<10>remove :>, <:<10>infile :>, <:<10>r.length:>, <:<10>passes :>, <:<10>reccount:>, <:<10>size :>, <:<10>disc :>, <:<10>out disc:>, <:<10>trap :>, <:<10>nrecs sq:>)); segsin:= int; comment -test-< system(9, segsin, string sortfiles(1)); goto alarmcall; end not called from docerror; end alarm; \f procedure clear_file(name, i); value i; integer i; real array name; begin zone z(1, 1, stderror); open(z, 0, string name(increase(i)), 0); monitor(48, z, 0, in_segs); <* remove entry *> end clear_file; procedure docerror(z, s, b); zone z; integer s, b; comment the procedure calls alarm to have the sortfiles removed and the alarm headline printed, and calls stderror; begin alarm(100 + pass, 0); stderror(z, s, b); end docerror; \f procedure endoffilerec(zout); zone zout; comment note that this procedure finishes the sort by a jump to the label return_from_disc_sort. if the parameter noofrecs < 0, the procedure creates a record of the maximum length holding the bitpattern given by eof in the first 4 bytes of the userpart. if content is not zero, the remaining part of the block, or per- haps a new block is filled with integers of the value -1. remember that the final output is super blocked if made in the mergephase; begin integer i, maxlength, remaining, remaining_blocks, segments; integer array ia(1:20); integer field ifld; real array field raf; if recs <> recsout then alarm(11, recsout); if noofrecs < 0 then begin comment create a maximum length eof record; maxlength:= param(5); <* use the original maxlength *> ifld:= outrec6(zout, 0) mod bytes_per_out_block; if ifld < maxlength then begin outrec6(zout, ifld); for ifld:= ifld step -2 until 2 do zout.ifld:= 0; end not room for maxlength in current block; outrec6(zout, maxlength); for ifld:= 2 step 2 until maxlength do zout.ifld:= 0; if fixedlength then zout(1):= eof else begin ifld:= 2; zout.ifld:= maxlength; zout(2):= eof; checkvar(zout); end variable length; end noofrecs < 0; \f comment find the size of the file; for i:= 1 step 1 until 10 do ia(i):= 0; <* tail for changeentry *> remaining:= outrec6(zout, 0); getposition(zout, 0, segments); if pass = 0 then segments:= segments + segs_per_outblock else begin comment superblocking is used; remaining_blocks:= (remaining - 2)//bytes_per_out_block; remaining:= remaining - remaining_blocks * bytes_per_out_block; segments:= segments + segs_out(pass) - remaining_blocks * segs_per_out_block; end after merge; if content <> 0 then begin comment fill the remaining part according to bs and sq system; outrec6(zout, remaining); for ifld:= 2 step 2 until remaining do zout.ifld:= -8388608; ia(7):= segments // segs_per_out_block - 1; <* blocknumber *> ia(8):= bytes_per_out_block - remaining; ia(9):= content shift 12 add segs_per_out_block; remaining:= 0; if content = 21 then begin integer array field headpart; outrec(zout, 0); getposition (zout, 0, ia(7)); ia(7) := ia(7)-segs_per_out_block; setposition(zout, 0, 0); headpart:= 0; invar(zout); i:= zout.headpart(1); setposition(zout, 0, 0); swoprec6(zout, i); headpart:= zout.headpart(3); headpart:= zout.headpart(1); resultsq:= 1; ia(6):= shortclock:= systime(7, 0, time); for i:= 2 step 1 until 7 do zout.headpart(i):= case i -1 of (1, recs, ia(7), ia(8), 0, shortclock ); checkvar(zout); ia(7):= ia(8):= 0; end sq file; end bs system; close(zout, true); ia(1):= explanation:= segments - remaining//512; ia(if content = 21 then 10 else 6):= noofrecs:= recs; i:= monitor(44, zout, 0, ia); <* change entry *> if i <> 0 then alarm(5, i); comment rename the file or return workname; if names(3) <> real<::> then begin raf:= -2*4; for i:= 3 step 1 until 4 do ia.raf(i):= names(i); i:= monitor(46, zout, 0, ia); <* rename entry *> if i = 3 <* the name exists already *> then begin clear_file(names, 3); i:= monitor(46, zout, 0, ia); <* repeat rename *> end i = 3; if i <> 0 then alarm(6, i); end rename else begin comment return workname; getzone6(zout, ia); raf:= 2 - 2*4; for i:= 3 step 1 until 4 do names(i):= ia.raf(i); end return name; result:= 1; goto return_from_discsort; end endoffilerec; \f procedure endoutstring(zout, out_base, block_size); value block_size; zone zout; real array field out_base; integer block_size; comment the procedure outrecs a string chaining record of 6 bytes, pointing to the start of the current outstring. the first 4 bytes contain the values of the variables outsegment and outbytes, specifying the position of the preceding chain record. the last two bytes hold the value of recsout, specifying the number of records in the current outstring. outsegment and outbytes are updated to point to the new chain record, and recsout is set to zero. global quantities: outsegment the segment number of the block holding the last created string chaining record. outbytes the number of bytes preceding the chain record in this block. recsout the number of records in the current outstring. ; begin integer i, segment; integer field new_base; getposition(zout, 0, segment); new_base:= out_base; for i:= outsegment, outbytes, recsout do begin if new_base >= block_size then begin outrec6(zout, block_size); new_base:= 2; end change output block else new_base:= new_base + 2; zout.new_base:= i; end for i; comment -test-< write(out, <:<10>test 14 , outsegment, outbytes,:>, <: recsout: :>, outsegment, outbytes, recsout); comment set outsegment, outbytes, recsout for the new string; outsegment:= out_base // 512 + segment; outbytes := out_base mod 512; recsout:= 0; out_base:= new_base; end endoutstring; \f procedure errorinoutfile(z, s, b); zone z; integer s, b; comment this procedure is used as the blockprocedure of the output zones. if not end of document then docerror is called. else alarm is called; begin if s shift (-18) extract 1 <> 1 then docerror(z, s, b); alarm(13, segsin); <* may give result 3 *> end errorinoutfile; \f procedure select_merge_strategy; comment this procedure selects the most efficient strategy for the merge under the constraints chosen. the quantities, which can be altered, are the number of passes, the mergepowers and the blocklengths corresponding to each pass, and the number of shares to be used for input/output. important global quantities: discstores if discstores=2 then the transfertime can be cut to the half by the use of doublebuffering. freebytes the amount of core available for buffers. a reasonable amount of room for programsegments and variables has been subtracted in advance. passes at entry, the value of passes defines the pos- sible numbers of passes: 0: any number of passes is possible. 1: only an odd number of passes is possible. 2: only an even number of passes is possible. at return, passes contains the selected number of merging passes. recs the total number of records. segsin the number of segments of the input file. segsperoutblock blocklength of the final output file. shares the selected number of shares for input/output. mergepower(1:20) the selected mergepowers for the merge passes. insegs(1:20) input blocklengths for merging passes. segsout(1:20) output blocklengths for merging passes. strings1 the number of strings generated in pass 0. ; begin integer test_passes, test_shares, bytes_per_in_seg, bytes_per_power, bytes_per_out_seg, first_pass, pass_step, last_maxpower, maxpower; real great, min_time, time, time_best_passes; integer array test_mergepower, test_in_segs, test_segs_out(1:20); \f boolean procedure find_powers; comment given a specific number of testpasses, the string generation pass not counted, this procedure will, if sensible, return a series of test_mergepowers. ; begin integer i, last_maxstrings, maxstrings, p; maxstrings:= p:= if test_passes > 1 then 1 else if strings1 < last_maxpower then strings1 else last_maxpower; for i:= 1 step 1 until test_passes do test_mergepower(i):= p; find_powers:= true; if maxstrings >= strings1 then goto return; last_maxstrings:= 0; for p:= p + 1 while last_maxstrings < maxstrings do begin last_maxstrings:= maxstrings; for i:= 1 step 1 until test_passes do if p <= (if i < test_passes then maxpower else last_maxpower) then begin test_merge_power(i):= p; maxstrings:= maxstrings // (p-1) * p; if maxstrings >= strings1 then goto return; end mergepower can be increased to p; end for p; find_powers:= false; return: end find_powers; \f real procedure segs_and_time; comment given a set of test_mergepowers, the procedure will find the best blocklengths. the procedure calculates the time needed for the merge, and if this time is less than mintime, it is stored in mintime, and the relevant values are saved in the global variables used to control the merge. the time calculation is based upon empirical data for the disc, as well as for the program. ; begin integer i, max_buffer_bytes, p, p_times_bytes_in, segments_in, segments_out, test_segments_in, test_segments_out; real time, disc_for_pass, disc_for_in, disc_for_out, test_time, accesstime, transfertime, cpu_for_pass; \f comment define empirical time constants in milliseconds: for the disc store the following approximation is used: disctimepersegment = accesstime/segsperblock + transfertime. if only 1 discstore or only 1 share is used, the times for input and output are added, otherwise the maximum value is used; accesstime := 30; transfertime:= 25/12 <* 12 = some mean track length *>; comment if several disc store types are used, the time constants should be set according to a device type in the external process for the disc storage device. it is very difficult to calculate the time used for the disc in a realistic manner. it should f.ex. be taken into account that the disc-controller may allow parallel head positioning, but not parallel data transfers, also the transfertime may be greater if shorter disc slices chained together are physically spread over the disc. the cpu time is estimated as a linear function of the number of records and the number of segments. the following time constants should not be quite unrealistic: but the machinetype should be fetched from the monitor and the timeconstants set according to the speed of the cpu. cpu_per_record : 2.0 cpu_per_segment : 3.2 ; cpu_for_pass := recs * 2.0 + segsin * 3.2; comment in addition to the times mentioned above, 2 seconds are added per merging pass, and for 2 shares 10 percent is added to total time to cover a persumed interference between the discs and the cpu. another reason for this punishment of double buffering has been the fact that the total load of the machinery, i.e. the sum of the individual working times of discs and cpu, is greater with double buffering due to smaller blocklengths or more passes; \f time:= 0; for i:= 1 step 1 until test_passes do begin comment set some usefull variables; p:= test_merge_power(i); p_times_bytes_in:= bytes_per_in_seg * p; max_buffer_bytes:= free_bytes - bytes_per_power * p; disc_for_pass:= great; test_segments_in:= min_segs_per_block; calc_segments_out: test_segments_out:= (max_buffer_bytes - test_segments_in * p_times_bytes_in) // bytes_per_out_seg; if i = test_passes <* the last pass *> then testsegmentsout:= (if content = 21 and testsegmentsout > segsperoutblock then segsperoutblock else testsegmentsout) // segs_per_out_block * segs_per_out_block; if test_segments_out >= min_segs_per_block then begin comment calculate the disctime for 1 segment; disc_for_in := access_time/test_segments_in + transfer_time; disc_for_out:= access_time/test_segments_out + transfer_time; test_time:= if test_shares = 1 or discstores = 1 then disc_for_in + disc_for_out else if disc_for_in > disc_for_out then disc_for_in else disc_for_out; if disc_for_pass > test_time then begin disc_for_pass:= test_time; segments_in := test_segments_in; segments_out := test_segments_out; test_segments_in:= test_segments_in + 1; goto calc_segments_out; end a better result; end test_segments_out >= min_segs_per_block; disc_for_pass:= disc_for_pass * segsin; time:= (if test_shares = 1 then disc_for_pass + cpu_for_pass else (if disc_for_pass > cpu_for_pass then disc_for_pass else cpu_for_pass) * 1.10) + 2000.0 + time; test_in_segs (i):= segments_in; test_segs_out(i):= segments_out; end for i; \f segs_and_time:= time; if time < mintime then begin comment return the hitherto best values; mintime:= time; passes:= test_passes; shares:= test_shares; for i:= 1 step 1 until test_passes do begin merge_power(i):= test_merge_power(i); in_segs (i):= test_in_segs (i); segs_out (i):= test_segs_out (i); end; end a better result; end segs_and_time; \f comment before pass 1. strings1 contains the number of strings generated during pass 0, recs the real number of records, segsin the minimum size of the output file from the string generation; bytes_per_power:= maxlength + 106; <* 1 record + 8 bytes + zonedescriptor and 2 sharedescriptors *> first_pass:= if passes = 2 then 2 else 1; pass_step := if passes = 0 then 1 else 2; min_time:= great:= '10; for test_shares:= 1 step 1 until 2 do begin bytes_per_in_seg:= bytes_per_out_seg:= test_shares * 512; i:= minsegsperblock * bytes_per_in_seg + bytes_per_power; max_power := (free_bytes - minsegsperblock * bytes_per_out_seg) // i; last_maxpower:= (free_bytes - segs_per_out_block * bytes_per_out_seg) // i; if test_shares = 2 then begin if maxpower > messbufsforinp then maxpower:= messbufsforinp; if last_maxpower > messbufsforinp then last_maxpower:= messbufsforinp; end doublebuffering; time_best_passes:= great; for test_passes:= first_pass step pass_step until 20 do begin if find_powers then begin comment ok; time:= segs_and_time; if time < time_best_passes then time_best_passes:= time else goto change_shares; end find_powers; end for testpasses; change_shares: end for shares; if mintime >= great then alarm(10, 20); comment this situation, that the sort can not be done in 20 passes, after the checks performed in checkparam, must never occur, and should be considered a program error; comment -test-< write(out, <:<10> mintime: :>, <<ddd ddd.dd>, mintime/1000); end select_merge_strategy; \f procedure select_work_file(situation, infile); value situation; integer situation; real array infile; comment the procedure creates the workfile needed in the situation: 1. before pass 0, the stringgeneration. the original inputfile and the final outputdisc is checked. passes = 0 at calltime signals that the workfile should be the final outputfile. if that is impossible passes is set to 1 . discstores is set to 1 or 2 depending on whether the input- file and the workfile is placed on the same disc or not. 2. before pass 1, the first merging pass. the return value of passes specifies the possible numbers of merging passes: 0: any number is allowed (1, 2, 3, ... ) 1: only odd numbers (1, 3, 5, ... ) 2: only even numbers (2, 4, 6, ... ) discstores as for situation 1. ; begin boolean small_input_disc; integer best_bsno, best_segments, best_weight, bsno, entries, free_segs_on_input_disc, i, input_bsno, main_bsno, max_loss_per_block, output_bsno, safe_segments, segments, segs_on_input_disc, slice_length, weight; integer array ia(1:20), tail(1:10); integer array field headpart; long array bestbsname, bsname(1:2); real array field areaname, discname; zone z(128, 1, stderror); \f boolean procedure claim (keyno,bsno,bsname,entries,segm,slicelength); value keyno; integer keyno,bsno,entries,segm,slicelength; long array bsname; begin own boolean init,monold; own integer bsdevices,firstbs,ownadr,mainbs; integer i; long array field name; integer array core(1:18); if -,init then begin init:=true; system(5,64,core); monold:= core(1) < 9 shift 12 + 0; system(5,92,core); bsdevices:=(core(3)-core(1))//2; firstbs:=core(1); begin integer i; integer array ntable (0:bsdevices); integer array field iaf; iaf:= -2; system(5,firstbs,ntable.iaf); while ntable(mainbs)<>core(4) do mainbs:=mainbs+1; end; ownadr:=system(6,i,bsname); end; if bsno=-1 then bsno:=mainbs else if bsno<0 or bsno>=bsdevices or keyno<>0 and keyno<>2 and keyno<>3 then begin claim:=false; goto exitclaim end; claim:=true; begin integer array nametable(1:bsdevices); name:=18; system(5,firstbs,nametable); system(5,nametable(bsno+1)-36,core); if core(10) = 0 then goto exitclaim; bsname(1):=core.name(1); bsname(2):=core.name(2); slicelength:=core(15); system(5,ownadr+core(1),core); if monold then begin comment monitor version older than 9; entries:=core(keyno+1) shift (-12); segm:=core(keyno+1) extract 12 * slicelength; end else begin entries:=core(keyno*2+1); segm:= core(keyno*2+2)*slicelength; end; end; if false then begin exitclaim: entries:=segm:=slicelength:=0; bsname(1):=bsname(2):=0; end; end claim; \f comment lookup the input file; areaname:= discname:= 2; getzone6(z, ia); for i:= 1 step 1 until 2 do ia.areaname(i):= infile(i); setzone6(z, ia); i:= monitor(42, z, 0, tail); <* lookup entry *> if i <> 0 then alarm(4, i); if situation=1 then begin segsin := tail(1); if segsin<0 then alarm(8, segsin); <* no area*> safe_segments:= 2*segsin; <* the safe workfile size *> end situation=1 else if situation=2 then begin comment calculate the safe size of the workfiles; i:= bytesperoutblock // 2; max_loss_per_block:= if fixedlength then bytesperoutblock mod maxlength else ((if maxlength > i then i else maxlength) - 2); safe_segments:= round (segsin / (bytesperoutblock - max_loss_per_block) * bytesperoutblock) + 1 + segsperoutblock; end situation = 2; \f comment comment now find the best disc for the workfile. the selection is based upon the following priorities: 1: the final outputdisc if necessary or wanted. 2: a disc with safe_segments free segments. 3: the disc with most free segments. 4: the disc with the greatest slicelength. 5: a disc giving discstores = 2. 6: the final outputdisc. 7: the disc with the highest bsno. ; main_bsno:= bsno:= best_weight:= input_bsno:= output_bsno:= -1; <* pseudo call to assign main_bsno. i.e. bs device of maincatalog *> claim(0, main_bsno, bsname, entries, segments, slicelength); for bsno:= bsno + 1 while claim(0, bsno, bsname, entries, segments, slicelength) do if slicelength > 0 <* kit mounted *> then begin if bsno = main_bsno then begin comment check the number of entries in main catalog; i:= (if situation = 1 then param(2) else 1) + entries; if i < 2 then alarm(13, i - 2); end bsno = 0; weight:= if slicelength >= 80 then 16 else slicelength//5; comment check inputdisc and final outputdisc; if real bsname(1) = tail.discname(1) and real bsname(2) = tail.discname(2) then begin comment this is the disc of the input file; input_bsno:= bsno; segs_on_input_disc:= (tail(1) + slicelength - 1)//slicelength*slicelength + segments; small_input_disc:= segs_on_input_disc//2//slicelength*slicelength < safe_segments; free_segs_on_input_disc:= segments; segments:= (segs_on_input_disc - segsin)//slicelength*slicelength; <* segments = max unused segments on input disc *> end inputdisc else weight:= weight + 2; comment include segments in the weight; weight := (if segments >= safe_segments or extend segments * 20 + weight >= 8688000 then 8000000 else segments * 20) + weight; \f if real bsname(1) = names(5) and real bsname(2) = names(6) then begin comment this is the disc of the final outputfile; output_bsno:= bsno; weight:= weight + 1; <* after all the file shall end there *> if situation = 1 then begin comment check that there is room; if input_bsno <> output_bsno or param(2) = 0 <* the inputfile is not removed *> then begin comment there must be room now; if segments < segsin then alarm(14, segsin); if passes = 0 <* sort without merge wanted *> then weight:= 8100000; end room must be there now else <* input_bsno = output_bsno and param(2) <> 0 *> begin if passes = 0 and segments >= safe_segments <* safe *> then weight:= 8100000 <* sort without merge possible *> else passes:= 1; end perhaps room on outputdisc; end situation = 1 else <* situation = 2 *> if input_bsno <> output_bsno <* it must be selected *> then weight:= 8100000; end disc for final output; if best_weight <= weight then begin best_weight:= weight; best_bsno := bsno; best_segments:= segments; for i:= 1 step 1 until 2 do best_bsname(i):= bsname(i); end better or equal; end for bsno; \f comment now the best disc has been selected; comment -test-< write(out, <:<10>test 1, inpbs, outbs, bestbs, bestw, :>, <:bests, safe, soninp, sin::>, input_bsno, output_bsno, best_bsno, best_weight, bestsegments, safe_segments, segs_on_input_disc, segsin); if output_bsno = -1 then begin comment check that special outputdisc is not wanted; if names(5) <> real<::> then alarm(14, -1); <* does not exist *> end; if best_segments < segsin then alarm(13, segsin); discstores:= if input_bsno = best_bsno then 1 else 2; if situation = 1 then begin best_segments:= segsin; <* a reasonable filesize *> end else <* situation = 2 *> begin if discstores = 1 then begin passes:= 0; <* any number of merging passes *> best_segments:= free_segs_on_input_disc; <* the free segments *> if small_input_disc or best_segments < safe_segments then begin comment cut down the input file to a more safe size; tail(1):= best_segments:= if small_input_disc then segsin else safe_segments; i:= monitor(44, z, 0, tail); if i <> 0 then alarm(5, i); end little room or biased distribution; end discstores = 1 else <* discstores = 2 *> begin passes:= if output_bsno = -1 then <* passes is used to select the safest disc for final out *> (if segs_on_input_disc >= safe_segments and best_segments >= safe_segments then 0 else if segs_on_inputdisc < bestsegments then 1 else 2) else <* special outputdisc *> if best_bsno = output_bsno then 1 else 2; end discstores = 2; end situation = 2; \f comment now, at last, create the workfile; tail(1):= if best_segments > safe_segments then safe_segments else best_segments; for i:= 1 step 1 until 2 do begin ia.areaname(i):= real<::>; tail.discname(i):= real best_bsname(i); end; setzone6(z, ia); i:= monitor(40, z, 0, tail); <* create entry *> if i <> 0 then alarm(3, i); getzone6(z, ia); for i:= 1 step 1 until 2 do sortfiles(2 * situation - 2 + i):= ia.areaname(i); end select_work_file; \f comment the sortfiles are initialized with empty names, because they are removed in case of alarms, see procedure alarm; for segsin:= 1 step 1 until 4 do sortfiles(segsin):= real<::>; comment normalize the names in paramenter names; for i:= 1 step 2 until 5 do if names(i) extract 8 = 0 then names(i+1):= real <::>; begin integer array field headpart; integer array tail(1:10); integer i; zone z (128, 1, stderror); open(z, 4, names, 0); i:=monitor(42,z,0,tail); if i<>0 then alarm(4,i); content:=tail(9) shift (-12); if content<>20 and content<>21 then content:=0; <* not bs- nor sq-system. *> if content=21 then begin invar(z); headpart:= 0; headpart:= z.headpart(3); param(1):= z.headpart(10) extract 12; headpart:= z.headpart(1); i:= z.headpart(3); if i<noofrecs then alarm(16, noofrecs); if noofrecs<=0 then noofrecs:=i; close (z, false); end end; comment check the parameters; segsperinblock:= param(1); if segsperinblock < 1 or segsperinblock > 40 then alarm(1, 1); comment clear input; if param(2) <> 0 and param(2) <> 1 then alarm(1, 2); segsperoutblock:= if param(3) = 0 then segsperinblock else param(3); if segsperoutblock < 1 or segsperoutblock > 40 then alarm(1, 3); bytes_perinblock := segsperinblock * 512; bytes_peroutblock:= segsperoutblock * 512; i:= param(4); fixedlength:= i = 1; if i < 0 or i > 3 then alarm(1, 4); maxlength:= param(5); if maxlength < 2 or maxlength > bytes_perinblock or maxlength > bytes_peroutblock or maxlength mod 2 <> 0 then alarm(1, 5); min_segs_per_block:= (maxlength + 510)//512; maxlength_plus_four:= maxlength + 4; comment these two values are often used in the planning; noofkeys:= param(6); if noofkeys < 1 or noofkeys > maxlength or noofkeys > 169 then alarm(1, 6); begin comment block for check of keydescr, coresize and messagebuffers; integer i, abstype, neededcorebytes, position; real array rarr(1:2); comment check the contents of integer array keydescr; maxkeyposition:= 0; for i:= 1*2+1 step 2 until noofkeys*2+1 do begin abstype:= abs keydescr(i); if abstype = 5 <* absbyte *> then abstype:= 1; position:= keydescr(i+1); if position < abstype or position > maxlength or position mod 2 <> 0 and abstype <> 1 or abstype < 1 or abstype > 4 then alarm(2, i//2); if position > maxkeyposition then maxkeyposition:= (position + 1)//2*2; end checkkeydescr; if fixedlength then maxkeyposition:= 2; comment see declaration of array shortrec in block for pass 0; \f comment check the available amount of core; system(2, freebytes, rarr); neededcorebytes:= 6650 <* 350 bytes below the 7000 bytes *> + bytesperinblock + bytesperoutblock + 4*maxlength + 24*noofkeys; i:= free_bytes - needed_core_bytes; if i < 0 then alarm(12, i); comment subtract from the value of freebytes: 1: room for about 10 program segments + up to 12 segments extra. 2: room for 3 zone-descriptors, and 5 share-descriptors. 3: room for local variables. 4: room for keycode and 1 record in sort-zone; freebytes:= freebytes + (-6650 + 510 + 2*106) - maxlength - 24*noofkeys - ( if i > 3*12*512 then 12*512 else i//3); comment as i >= 0 freebytes will now satisfy the relation: freebytes >= 510 + 2*106 + 3*maxlength + 512*(segsperinblock+segsperoutblock). this is a very important fact since the remaining algorithm of the program will ensure that the sorting can be done if freebytes is not less than any of the following two expressions: 1. string generation: 512*(segs_per_in_block + min_segs_per_block) + 2*(maxlength + 4) + max_keyposition. 2. merging passes: 512*3*min_segs_per_block + 2*(maxlength + 106). where min_segs_per_block = (maxlength + 510)//512. find out the number of free message buffers from the job process; i:= system(6, 0, rarr); system(5, i+26, rarr); messbufsforinp:= rarr(1) shift (-36) extract 12 - 2; comment the buffer claim is byte 26 of the job process description. the output zone requires up to 2 message buffers during a block change, one of these can also be utilized for write(out,...; end local block; \f comment now the parameters and the available core storage have been checked, and various global variables have been initialized; comment -test-< write(out, <:<10>test 2, freebytes, messbufsforinp: :>, freebytes, messbufsforinp); comment test of procedure strategy, mintime is printed at the end of procedure strategy *test str.* write(out, <:<10>test 10 <10>:>) *test str.* for freebytes:= 80000, 40000, 20000, 7000 + 512*(segsperinblock + segsperoutblock) + 4*maxlength - 300, freebytes - 12*512 - maxlength - 400 do begin for segsin:= 8, 30, 125, 500, 2000, 8000 do begin for discstores:= 1, 2 do begin for outbytes:= 0 step 1 until discstores*2 - 2 do begin write(out, <:<10>freebytes, segsin, discstores, passes: :>, << dddd>, freebytes, segsin, discstores, outbytes) *test str.* noofrecs:= segsin * 512.0 / maxlength *test str.* for strings1:= 1, 10, 100, 1000 do begin passes:= outbytes *test str.* select_merge_strategy *test str.* write(out, <: shares, strings1: :>, << dddd>, shares, strings1, <:<10> segs:>) *test str.* for pass:= 1 step 1 until passes do write(out, << ddd>, segsout(pass)) *test str.* write(out, <:<10> insegs :>) *test str.* for pass:= 1 step 1 until passes do write(out, << ddd>, insegs(pass)) *test str.* write(out, <:<10> m.p. :>) *test str.* for pass:= 1 step 1 until passes do write(out, << ddd>, mergepower(pass)) *test str.* end end end end end; \f comment find out whether the sort can be done without merging passes. passes = 0 signals to the procedure select_work_file, that this is the case. if passes = 0 upon the return it is possible to do without merge; passes:= if noofrecs < 0 or noofrecs > (freebytes - maxkeyposition - bytes_per_inblock - bytes_per_outblock) // maxlength_plus_four then 1 else 0; selectworkfile(1, names); comment -test-< write(out, <:<10>test 3 :>, discstores, passes, segsin); shares:= 1; if passes = 0 then begin comment sort without merge; segsoutzero:= segsperoutblock; recs:= noofrecs; <* records in sortzone *> end no merging else begin comment select a reasonable blocklength and number of shares for the string generating pass (pass 0). this is done in a quite intuitive manner, as a flat optimum is assumed; segsoutzero:= (free_bytes - bytes_per_inblock - 3*maxlength_plus_four)//(3*512); comment as a start take a third of the available core minus room for inputblock, 2 records in sortzone and maxkeyposition. this ensures a minimum stringlength of 2 records which in turn ensures that the string chaining records (6 bytes each) will not take up as much room as the data records; \f if messbufs_for_inp > 0 then begin comment doublebuffering is possible; i:= segsoutzero // segsperinblock; if i >= 1+2*3 or ( i >= 1+2*1 and discstores = 2) then begin comment choose doublebuffering; shares:= 2; segsoutzero:= (segsoutzero - segsperinblock) // shares; end enough core for doublebuffering; end enough messagebuffers for doublebuffering; i:= 4 * segsperinblock; if segsoutzero > i then segsoutzero:= i; if segsoutzero < minsegsperblock then segsoutzero:= minsegsperblock; comment calculate the number of records in sortzone; recs:= (freebytes - maxkeyposition - (segsperinblock + segsoutzero) * 512 * shares) // maxlength_plus_four; if noofrecs >= 0 and recs > noofrecs then recs:= noofrecs; end select segsoutzero and shares; comment -test-< write(out, <:<10>test 4, shares, segsoutzero, recs::>, shares, segsoutzero, recs); \f pass:= 0; begin comment block for pass 0, string generation. the algorithm used here does very much look like the correspon- ding algorithm in the procedure tapesortadp. in order to speed up the code a bit, most quantities used in the inner loops are local; boolean fixedlength; integer i, active, dead, reclength, recsin, minlength, actual_maxlength, out_block_size, trap_situation; real endfile; integer field ifld, length, recordnumber; real field endmark; real array shortrec(1:(maxkeyposition + 2)//4); long field lfld; real array field first_part, out_base; zone zin(shares*segsperinblock*128, shares, docerror), zout(shares*segsout_zero*128, shares, errorinoutfile), zsort((recs+1)*maxlength_plus_four//4+3*(noofkeys+1)+6, 1, docerror), zcomp(3*noofkeys+6, 1, stderror); \f comment initialization of constants; fixedlength:= param(4) = 1; length:= 2; endmark:= if fixedlength then 4 else 8; endfile:= eof; recsin:= noofrecs; reclength:= maxlength; minlength:= keydescr(1*2+2); if recsin < 0 and minlength < 8 then minlength:= 8; comment the minimum length of a variable length record is the position of the first keyfield or endoffilemark, if it is used; i:= 1; if content = 21 then opensq(zin, string names(increase(i)), 1 shift 18, (if fixedlength then maxlength else 0) shift 12 add (if param(4) = 0 then 0 else 1)) else open(zin, 4, names, 1 shift 18); i:= 1; if passes = 0 and content = 21 then opensq(zout, string sortfiles(increase(i)), 0, (if fixedlength then maxlength else 0) shift 12 add 2) else open(zout, 4, sortfiles, 0); if param(4) = 0 and content <> 21 then begin comment checksum control wanted, set free zone parameter; integer array ia(1:20); getzone6(zin, ia); ia(11):= 1 shift 23; setzone6(zin, ia); end invar with checksum; actual_maxlength:= if fixedlength then maxlength else minlength; out_block_size:= segs_out_zero * 512; outrec6(zout, out_block_size); out_base:= strings1:= outsegment:= outbytes:= recsout:= recs:= active:= dead:= 0; if recsin = 0 then goto endoffile; \f comment initialize zsort and zcomp. an extra keyfield, recordnumber, is included for zsort in order to retain the ordering of synonyms. zcomp is used in calls of sortcomp to determine whether a record can participate in the current string or not. in this comparison it is needless and impossible to in- clude the recordnumber; i:= 2*noofkeys + 4; begin integer array local_keydescr(3:i); local_keydescr(i-1):= 2; local_keydescr(i ):= recordnumber:= maxlength + 2; for i:= i-2 step -1 until 3 do local_keydescr(i):= keydescr(i); startsort6(zcomp, local_keydescr, noofkeys , recordnumber); i:= startsort6(zsort, local_keydescr, noofkeys + (if recordnumber < 2047 then 1 else 0), recordnumber); end initialize zsort and zcomp; for i:= i step -1 until 1 do begin comment initial filling of zsort; if recsin = 0 then goto endoffile; if fixedlength then inrec6(zin, reclength) else begin invar(zin); reclength:= zin.length; if reclength > actual_maxlength then begin actual_max_length:= reclength; if reclength > maxlength then alarm(9, reclength); end a greater record else if reclength < minlength then alarm(9, reclength); end variable length; if recsin > 0 then recsin:= recsin - 1 else if zin.endmark = endfile then goto endoffile; recs:= recs + 1; newsort(zsort); active:= active + 1; for lfld:= maxkeyposition step -4 until reclength + 2 do zsort.lfld:= 0; comment set undefined keyfields to zero; tofrom(zsort, zin, reclength); zsort.recordnumber:= recs; end initial filling of zsort; comment start the writing of the first string; \f trap(trap_pass_0); nextout: trap_situation:= 1; <* comparison in zsort *> outsort(zsort); active:= active - 1; comment test trap - sqrt(param(8)-2); fromlifesort: recsout:= recsout + 1; if -,fixedlength then reclength:= zsort.length; if out_base + reclength > out_block_size then begin comment change the output block; if pass = passes then begin for ifld:= out_base + 2 step 2 until out_blocksize do zout.ifld:= 0; first_part:= 0; end zero fill in last pass else begin first_part:= out_blocksize - out_base; tofrom(zout.out_base, zsort, first_part); end not final output; outrec6(zout, out_blocksize); out_base:= reclength - first_part; tofrom(zout, zsort.firstpart, out_base); end change output block else begin if reclength < 26 then begin comment move with a for statement; for lfld:= reclength step -4 until 4 do zout.outbase.lfld:= zsort.lfld; if lfld = 2 then zout.outbase.length:= zsort.length; end short rec else tofrom(zout.outbase, zsort, reclength); outbase:= reclength + outbase; end normal record change; \f if recsin = 0 then goto endoffile; if fixedlength then inrec6(zin, reclength) else begin invar(zin); reclength:= zin.length; if reclength > actual_maxlength then begin actual_maxlength:= reclength; if reclength > maxlength then alarm(9, reclength); end a greater record else if reclength < minlength then alarm(9, reclength); end variable length; if recsin > 0 then recsin:= recsin - 1 else if zin.endmark = endfile then goto endoffile; recs:= recs + 1; \f if reclength >= maxkeyposition then begin comment there are no keyfields outside the record; trap_situation:= 2; <* in sortcomp zin *> comment test trap - sqrt(param(8)-3); if sortcomp(zcomp, zin, zsort) < 0 then begin comment the record just read cannot participate in the current string, because it should have preceded the last written; deadsort(zsort); dead:= dead + 1; end else begin comment the record can be included in the current string; newsort(zsort); active:= active + 1; end sortcomp; if reclength < 26 then begin comment move with for statement; for lfld:= reclength step -4 until 4 do zsort.lfld:= zin.lfld; if lfld = 2 then zsort.length:= zin.length; end short rec else tofrom(zsort, zin, reclength); zsort.recordnumber:= recs; if active > 0 then goto nextout; end no reset of undefined else begin comment the input record does not contain all keyfields, the undefined keyfields are set to zero before any comparison; for lfld:= maxkeyposition step -4 until reclength + 2 do shortrec.lfld:= 0; tofrom(shortrec, zin, reclength); trap_situation:= 3; <* in sortcomp shortrec *> comment test trap - sqrt(param(8)-4); if sortcomp(zcomp, shortrec, zsort) < 0 then begin comment see the comments above; deadsort(zsort); dead:= dead + 1; end else begin newsort(zsort); active:= active + 1; end; tofrom(zsort, shortrec, maxkeyposition); zsort.recordnumber:= recs; if active > 0 then goto nextout; end reset undefined; \f comment only dead records are now left in the sortzone. terminate the current string, activate the dead records, and continue; strings1:= strings1 + 1; endoutstring(zout, out_base, out_block_size); trap_situation:= 1; <* comparison in zsort *> lifesort(zsort); active:= dead - 1; dead:= 0; goto fromlifesort; trap_pass_0: if trap_situation = 1 then print_zsort(zsort) else begin <* error in sortcomp *> print_rec(zsort.recordnumber, zsort); if trap_situation = 2 then print_rec(recs, zin) else print_rec(recs, shortrec); end in sortcomp; alarm(15, 0); endoffile: recsin:= 0; comment the remaining active records are written to the current string. recsin = 0 makes the program return to endoffile without at- tempting further input; if active > 0 then goto nextout; strings1:= strings1 + 1; if dead > 0 then begin comment terminate the current string, activate the dead records, and write the last string; endoutstring(zout, out_base, out_block_size); lifesort(zsort); active:= dead - 1; dead:= 0; goto fromlifesort; end remaining dead; changerec6(zout, out_base); if param(2) = 1 then begin comment remove the input file; i:= monitor(48, zin, 0, insegs); if i <> 0 then alarm(7, i); end clearinput else close(zin, true); if pass = passes then endoffilerec(zout); comment endoffilerec jumps to returnfromdiscsort; comment find the length of the file generated; getposition(zout, 0, segsin); segsin:= (out_base - 1)//512 + 1 + segsin; setposition(zout, 0, 0); <* forces the last block out *> if actual_maxlength < max_keyposition then actual_maxlength:= max_keyposition; maxlength := actual_maxlength; maxlength_plus_four:= actual_maxlength + 4; comment -test-< write(out, <:<10>test 5 , strings1, segsin, recs: :>, strings1, segsin, recs); end string generation; \f comment now sorted strings have been output to the file, the name of which is stored in sortfiles(1:2); select_work_file(2, sortfiles); comment a new workfile has been created. its name is stored in sortfiles (3:4). the number of disc stores available for the merge is specified by the value of the integer discstores, and the value of passes specifies the the possible numbers of passes; comment -test-< write(out, <:<10>test 6 , discstores, passes: :>, discstores, passes); select_merge_strategy; comment -test-< write(out, <:<10>test 7 , passes, shares: :>, passes, shares); comment -test-< for pass:= 1 step 1 until passes do write(out, <:<10>test 8, s.in, s.out, m.p.: :>, insegs(pass), segsout(pass), mergepower(pass)); comment the value of passes defines the number of remaining passes. insegs, segsout(1:passes) contains the blocklengths of all passes, and mergepower(1:passes) contains the mergepowers. the value of shares specifies the number of shares to be used for input/output; \f pass:= 1; next_merging_pass: begin comment block for one merging pass; boolean fixedlength; integer i, reclength, instring, active, insegment, inbytes, nextrecsin, in_block_size, out_block_size; integer field ifld, length, stringno; integer array recsin, save_in_base(1:mergepower(pass)); long field lfld; real array field first_part, in_base, out_base; zone array z(mergepower(pass), shares*in_segs(pass)*128, shares, docerror); zone zout(shares*segsout(pass)*128, shares, errorinoutfile), zmerge((mergepower(pass)+1)*maxlength_plus_four//4+3*(noofkeys+1)+6, 1, docerror); procedure end_of_in_block(z, rec); zone z; real array rec; comment reads the next record from z, taking the first part from the current block and the second part from the next; begin first_part:= in_blocksize - in_base; tofrom(rec, z.in_base, first_part); in_base:= inrec6(z, 0); inrec6(z, in_base); if reclength = 1 000 000 then reclength:= z.length; in_base:= reclength - first_part; tofrom(rec.first_part, z, in_base); end end_of_in_block; comment initialize local constants; fixedlength:= param(4) = 1; reclength:= maxlength; length:= 2; comment open all the input/output zones, and initialize the merge zone; \f for instring:= 1 step 1 until mergepower(pass) do begin i:= if pass extract 1 = 1 then 1 else 3; open(z(instring), 4, string sortfiles(increase(i)), 1 shift 18); end open instrings; i:= if pass extract 1 = 1 then 3 else 1; if pass = passes and content = 21 then opensq(zout, string sortfiles(increase(i)), 0, (if fixedlength then maxlength else 0) shift 12 add 2) else open(zout, 4, string sortfiles(increase(i)), 0); in_block_size := in_segs (pass) * 512; out_block_size:= (if pass = passes then segs_per_out_block else segs_out(pass)) * 512; outrec6(zout, out_block_size); out_base:= 0; insegment:= outsegment; inbytes:= outbytes; nextrecsin:= recsout; comment these variables defines the position and length of the first input string for the merge, it is identical to the last created output string; outsegment:= outbytes:= recsout:= 0; comment -test-< write(out, <:<10>test 20 , start pass: :>, pass); if nextrecsin = 0 then goto terminatepass; comment only relevant for zero records to sort; comment initialize zmerge, stringno is included as the keyfield of the lowest priority in order to retain the ordering of synonyms. the sorting on the stringnumber is in decreasing order in passes with odd numbers because the strings here are read in reverse order; i:= 2*noofkeys + 4; begin integer array local_keydescr(3:i); local_keydescr(i-1):= if pass extract 1 = 1 then -2 else 2; local_keydescr(i ):= stringno:= maxlength + 2; for i:= i-2 step -1 until 3 do local_keydescr(i):= keydescr(i); startsort6(zmerge, local_keydescr, noofkeys + (if stringno < 2047 then 1 else 0), stringno); end initialize zmerge; active:= 0; \f trap(trap_pass_n); if false then begin trap_pass_n: if active >= 2 then print_zsort(zmerge); alarm(15, pass); end trap; fillzmerge: for instring:= 1, instring + 1 while instring <= mergepower(pass) and insegment >= 0 do begin comment position the zone corresponding to instring in front of the next string, read the string chaining record, and the first real record. the first record is inserted in zmerge; setposition(z(instring), 0, insegment); in_base:= in_bytes; i:= inrec6(z(instring), 0); inrec6(z(instring), i); comment -test-< write(out, <:<10>test 21 , instring, insegment, :>, <:inbytes, nextrecsin: :>, instring, insegment, inbytes, nextrecsin); recsin(instring):= nextrecsin - 1; comment recsin holds a counter of records for each instring; if insegment = 0 and inbytes = 0 then insegment:= -1 else begin comment read the string chaining record; ifld:= 0; for i:= 1 step 1 until 3 do begin if in_base >= in_blocksize then begin in_base:= inrec6(z(instring), 0); inrec6(z(instring), in_base); in_base:= 2; end blockchange else in_base:= in_base + 2; nextrecsin:= z(instring).in_base.ifld; case i of begin insegment:= nextrecsin; inbytes := nextrecsin; <* nextrecsin is ok *> end case i; end for i; end chain to next instring; \f newsort(zmerge); active:= active + 1; if -,fixedlength then reclength:= if in_base < in_block_size then z(instring).in_base.length else 1 000 000; if in_base + reclength > in_block_size then end_of_in_block(z(instring), zmerge) else begin tofrom(zmerge, z(instring).in_base, reclength); in_base:= reclength + in_base; end normal record change; save_in_base(instring):= in_base; for ifld:= reclength + 2 step 2 until maxkeyposition do zmerge.ifld:= 0; comment set undefined keyfields to zero; zmerge.stringno:= instring; comment -test-< printkey(recsout, instring, zmerge); end initial filling of zmerge; \f nextout: outsort(zmerge); comment test trap - sqrt(param(8)-5); instring:= zmerge.stringno; recsout:= recsout + 1; comment -test-< if recsin(instring) mod param(8) = 0 then printkey(recsout, instring, zmerge); if -,fixedlength then reclength:= zmerge.length; if out_base + reclength > out_block_size then begin comment change the output block; if pass = passes then begin for ifld:= out_base + 2 step 2 until out_blocksize do zout.ifld:= 0; first_part:= 0; end zero fill in last pass else begin first_part:= out_blocksize - out_base; tofrom(zout.out_base, zmerge, first_part); end intermediate pass; outrec6(zout, out_blocksize); out_base:= reclength - first_part; tofrom(zout, zmerge.first_part, out_base); end change out block else begin if reclength < 26 then begin comment move with for statement; for lfld:= reclength step -4 until 4 do zout.out_base.lfld:= zmerge.lfld; if lfld = 2 then zout.out_base.length:= zmerge.length; end short rec else tofrom(zout.out_base, zmerge, reclength); out_base:= reclength + out_base; end normal record change; \f if recsin(instring) = 0 then goto endofstring; recsin(instring):= recsin(instring) - 1; newsort(zmerge); in_base:= save_in_base(instring); if -,fixedlength then reclength:= if in_base < in_block_size then z(instring).in_base.length else 1 000 000; if in_base + reclength > in_block_size then end_of_in_block(z(instring), zmerge) else begin tofrom(zmerge, z(instring).in_base, reclength); in_base:= reclength + in_base; end normal record change; save_in_base(instring):= in_base; for ifld:= reclength + 2 step 2 until max_key_position do zmerge.ifld:= 0; comment set undefined keyfields to zero; zmerge.stringno:= instring; goto nextout; endofstring: active:= active - 1; if active > 0 then goto nextout; comment now all strings have been emptied, and a new round should be started if there are more strings left; if insegment >= 0 then begin endoutstring(zout, out_base, out_block_size); goto fillzmerge; end take a new round; \f terminatepass: changerec6(zout, out_base); if pass = passes then begin comment remove input file; i:= monitor(48, z(1), 0, recsin); if i <> 0 then alarm(7, i); comment end the sort by procedure endoffilerec, which jumps to the label return_from_discsort; endoffilerec(zout); end it was the last pass; setposition(zout, 0, 0); comment setposition forces the last block out; comment -test-< write(out, <:<10>test 9 , pass, segsin, recsout: :>, pass, segsin, recsout); pass:= pass + 1; goto next_merging_pass; end block for one merging pass; alarmcall: system(9, segsin, string sortfiles(1)); comment the alarm is called here in order to obtain a simple alarm address; returnfromdiscsort: ;comment -test-< write(out, <:<10>test 10, blocksread: :>, blocksread); end discsort; end end ▶EOF◀