DataMuseum.dk

Presents historical artifacts from the history of:

DKUUG/EUUG Conference tapes

This is an automatic "excavation" of a thematic subset of
artifacts from Datamuseum.dk's BitArchive.

See our Wiki for more about DKUUG/EUUG Conference tapes

Excavated with: AutoArchaeologist - Free & Open Source Software.


top - metrics - download
Index: C T

⟦b30367c39⟧ TextFile

    Length: 2492 (0x9bc)
    Types: TextFile
    Names: »CTWC.doc«

Derivation

└─⟦276d19d6e⟧ Bits:30007243 EUUGD5_I: X11R5
    └─⟦4856bf7e7⟧ »./mit-4/mit-4.00« 
        └─⟦635ff9e7e⟧ 
            └─⟦this⟧ »mit/doc/I18N/Xsi/Xlc/CTWC.doc« 

TextFile

$Header: CTWC.doc,v 1.2 90/12/29 22:19:08 morisaki Exp $
$Date: 90/12/29 22:19:08 $

                           CT and WC


   In this implementation the communication codeset is Compound Text
encoding, abbreviated to CT, the internal processing codeset is wide
character encoding, abbreviated to WC.  The CT and WC occuring wherever
in this implementation is defined as following:

1. The CT encoding conforms with X standard document "Compound Text
   Encoding, Version 1.1" except:

	Section 6.  Non-Standard Character Set Encoding
	Section 7.  Directionality
	Section 10. Extensions

   The CT string is terminated with null character(NULL).

2. The WC is restricted to the following part of DIS 10646.  

	. canonical form, 4 octets(4-byte: group/plane/row/cell).
        . in the next sentences all number is decimal, and default group
          is 032, plane is 032.
	. row 032 for ISO 8859-1(L.2,p18) 
	. right-hand half of row 033 for right half of ISO8859-2(L4,p18) 
	. right-hand half of row 040 for right half of ISO8859-5(S25.2,p19)
	. right-hand half of row 042 for right half of ISO8859-7(S26.2,p19)
	. right-hand half of row 044 for right half of ISO8859-6(S27.1,p19)
	. right-hand half of row 047 for JIS X 0201(S33,p21)
	. plane 040 of group 032 for Chinese GB 2312(S9,p9)
	. plane 048 of group 032 for Japanese JIS X 0208(S10,p9)
	. plane 056 of group 032 for Korean KS C5601(S10,p9)
	
   
   where the numbers in paranthesis are line#(L), section#(S) and page#(p)
   in document of DIS 10646.

   The other registered charset in CT are put on the following place of 
   DIS 10646.

	. right-hand half of row 034 for right half of ISO8859-3
	. right-hand half of row 035 for right half of ISO8859-4
	. right-hand half of row 036 for right half of ISO8859-9
  
   The WC string is terminated with WNULL character.  The WNULL is 
   implementation-defined. 
 
3. Conversion between WC and CT.

   The conversion between WC and CT will be done accoding to the above
   correspondency.  In CT a charset can be designated both to graphic
   left(GL) and to graphic right(GR); in WC there is no such GL and GR.
   So the conversion function translates GL and GR of CT to same place
   of WC.  In the reverse, the conversion function always translates
   the WC to the GL of CT as following:

	. plane 040 of Chinese to "ESC$(A", not "ESC$)A"
	. plane 048 of Japanese to "ESC$(B", not "ESC$)B"
	. plane 056 of Korean to "ESC$(C", not "ESC$)C"
	. others one to one, no ambigious.