1 {-

2 - Decode.hs

3 -

4 - Module containing the code to decode LZW encodings

5 -

6 - Paul Sanders, Applications Research Division, BTL 1992

7 -

8 - DEC_VERSION 1 uses a list with keys in ascending order as a table, ie.

9 - entry n is given by table!!n.

10 -

11 - DEC_VERSION 2 uses a list with keys in descending order as a table, ie.

12 - entry n is given by table!!(#table-n). We don't need to calculate the

13 - length of the table however as this is given by the value of the next

14 - code to be added.

15 -

16 - DEC_VERSION 3 uses a balanced binary tree to store the keys. We can do

17 - this cheaply by putting the key in the correct place straight away and

18 - therefore not doing any rebalancing.

19 -}

22 where

25 import Defaults

26 import BinConv

30 {- We ideally want to store the table as an array but these are inefficient

31 - so we use a list instead. We don't use the tree used by encode since we

32 - can make use of the fact that all our keys (the codes) come in order and

33 - will be placed at the end of the table, at position 'code'.

34 -

35 - An entry of (SOME n, 'c') indicates that this code has prefix code n

36 - and final character c.

37 -}

40 {- Kick off the decoding giving the real function the first code value and

41 - the initial table.

42 -}

45 decode []

46 = []

47 decode cs

50 {- decode` decodes the first character which is special since no new code

51 - gets added for it. It is also special in so far as we know that the

52 - code is a singleton character and thus has prefix NONE. The '@' is a

53 - dummy character and can be anything.

54 -}

59 where

62 {- do_decode decodes all the codes bar the first.

63 -

64 - If the code is in the table (ie the code is less than the next code to be

65 - added) then we output the string for that code (using unfold if a prefix

66 - type) and add a new code to the table with the final character output as

67 - the extension and the previous code as prefix.

68 -

69 - If the code is not one we know about then we give it to decode_special for

70 - special treatment

71 -}

78 where

84 {- decode_special decodes a code that isn't in the table.

85 -

86 - The algorithm in Welch describes why this works, suffice it to say that

87 - the output string is given by the last character output and the string

88 - given by the previous code. An entry is also made in the table for the

89 - last character output and the old code.

90 -}

94 where

100 {- unfold a prefix code.

101 -

102 - chain back through the prefixes outputting the extension characters as we

103 - go.

104 -}

106 unfold n t_len t

110 where

112 SOME n' = prefix

114 data DecompTable = Branch DecompTable DecompTable | Leaf (Optional Int, Char) deriving (Show{-was:Text-})

116 {- Insert a code pair into the table. The position of the code is given by

117 - the breakdown of the key into its binary digits

118 -}

122 {- We can place a code exactly where it belongs using the following algorithm.

123 - Take the code's binary rep expanded to the maximum number of bits. Start

124 - at the first bit, if a 0 then insert the code to the left, if a 1 then

125 - insert to the right. Carry on with the other bits until we run out and are

126 - thus at the right place and can construct the node.

127 -}

130 = Leaf v

140 {- For a lookup we use the same mechanism to locate the position of the item

141 - in the tree but if we find that the route has not been constructed or the

142 - node has the dummy value then that code is not yet in the tree. The way

143 - in which the decode algorithm works this should never happen.

144 -}

149 = v