Truncation of strings

The Fractal Sequence applet defaults to filter method ``T.'' In fact, any character other than

will run this filter method. In our example, this method replaces every substring ``12'' by the substring ``1,'' and uses the result to drive the chaos game. This means that the resulting object has points in exactly those addresses that contain no substring ``21'' (addresses are read in the opposite direction from sequences).
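The filter and the address convention can be sketched in a few lines of Python. This is only an illustration, not the applet's code: the function names are hypothetical, and it assumes the replacement is applied until no ``12'' remains, so a whole block of 2s following a 1 is dropped.

```python
import re
from itertools import product


def truncate(seq: str) -> str:
    """Drop every block of 2s that immediately follows a 1.

    Assumption: the rule "12 -> 1" is applied until no "12" is left,
    which is what this single regular expression does.
    """
    return re.sub(r"12+", "1", seq)


def address(seq: str) -> str:
    """Addresses are read in the opposite direction from sequences."""
    return seq[::-1]


def legal_addresses(n: int) -> list[str]:
    """All length-n addresses over {1,2,3,4} that avoid the substring "21"."""
    words = ("".join(w) for w in product("1234", repeat=n))
    return [w for w in words if "21" not in w]


filtered = truncate("3122412")
print(filtered, address(filtered))
print(len(legal_addresses(2)))
```

Under this reading the filtered sequence never contains ``12'', so its reversal (the address) never contains ``21''.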

I believe that the resulting object is a near-fractal. Clearly there is a good deal of self-similarity: sub-squares $1$, $3$, and $4$ are exact half-scale replicas of the entire object, but this similarity breaks down for sub-square $2$. Every attempt (of mine...) to describe this object using a finite set of affine transformations misses a small portion. Of course, the fact that I've been unable to find a suitable finite set of affine transformations doesn't prove that there are none. However, below (see Section 4) I prove that there is no ``really nice'' set of affine transformations that does the job, where by ``really nice'' I mean a finite set of affine transformations that are each a contraction, whose ranges do not overlap, and whose ranges correspond to sub-squares of some finite size.

If you render this object in the Fractal Sequence applet using

, you'll find that some addresses appear denser than others. For example, addresses

, and

are ``hot spots'' (dense), whereas

, and

are considerably sparser (of course

is empty). Does this reflect the structure of the object, or simply the choice of probabilities? I think it is the latter (the choice of probabilities), but it may not be possible to choose an optimal set of probabilities.

Here's my reasoning. Suppose $s_1$ and $s_2$ are two equal-length substrings over the digits $\{1,2,3,4\}$. If I denote the corresponding areas as $A_{s_1}$ and $A_{s_2}$ (the areas of those sub-squares with addresses $s_1$-reversed and $s_2$-reversed), I would like $p_{s_1}/A_{s_1} = p_{s_2}/A_{s_2}$. For strings of length $1$ this means that

$\displaystyle \frac{p_2}{3/4} = p_i\qquad i = 1,3,4,$

$\displaystyle \frac{(2/3)(1-3p_1)}{3/4} = p_1 \Longrightarrow p_1 = p_3 = p_4 = \frac{8}{33}, \quad p_2 = \frac{3}{11}.$
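The arithmetic in the last condition can be checked with exact fractions. This is a small sketch; the variable `c` is just a name for the coefficient $(2/3)/(3/4)$, and the check uses $p_2 = 1 - 3p_1$ with $p_1 = p_3 = p_4$.

```python
from fractions import Fraction

# Coefficient multiplying (1 - 3*p1) in the condition: (2/3) / (3/4) = 8/9.
c = Fraction(2, 3) / Fraction(3, 4)

# c * (1 - 3*p1) = p1  =>  p1 = c / (1 + 3*c)
p1 = c / (1 + 3 * c)

# The four probabilities sum to one, with p1 = p3 = p4.
p2 = 1 - 3 * p1

print(p1, p2)  # 8/33 3/11
```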

However, ``hot-spots'' remain at addresses $11$, $31$, and $41$. To see why this occurs, notice that whenever a string $12\cdots2$ is truncated by removing the one (or more) trailing $2$s, a new string $1\alpha$ is created, where $\alpha$ is one of $1$, $3$, or $4$: the non-$2$ character to the right of the block of $2$s. So the frequency of strings $1\alpha$ is increased, which corresponds to the number of points in addresses $\alpha 1$. Since these addresses have areas equivalent to $\alpha 3$ and $\alpha 4$, their relative density increases.
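The effect on addresses $\alpha 1$ can be observed empirically by counting how many points fall into each length-2 address. The sketch below is not the applet's code. It assumes the filter drops every block of 2s that follows a 1, uses the probabilities $8/33$ and $3/11$ derived above, and counts consecutive pairs of the filtered sequence rather than plotting points. The function names, the sample size, and the seed are arbitrary choices.

```python
import random
import re
from collections import Counter


def truncate(seq: str) -> str:
    """Assumed filter: drop every block of 2s that immediately follows a 1."""
    return re.sub(r"12+", "1", seq)


def address_counts(probs, n_chars=200_000, seed=0):
    """Count points per length-2 address for a random filtered sequence.

    probs are the probabilities of the characters 1, 2, 3, 4 (in that order).
    """
    rng = random.Random(seed)
    raw = "".join(rng.choices("1234", weights=probs, k=n_chars))
    s = truncate(raw)
    # The pair s[i-1], s[i] of the sequence puts a point in the address
    # s[i] + s[i-1], because addresses read in the opposite direction.
    return Counter(s[i] + s[i - 1] for i in range(1, len(s)))


counts = address_counts([8 / 33, 3 / 11, 8 / 33, 8 / 33])
for addr, n in sorted(counts.items()):
    print(addr, n)
```

Dividing each count by the area of its address gives a rough density. Address 21 never appears, and the counts for addresses ending in 1 come out larger than those for addresses ending in 3 or 4.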

Is it possible to ``tweak'' the probabilities $p_i$ so that all the addresses of length $2$ are rendered with equal density? I don't think so, for the following reason. There are fifteen legal addresses of length $2$ (address $21$ is generated by a sequence that contains the forbidden substring). Among these fifteen, addresses $12$, $22$, $32$, and $42$ have $3/4$ the area of the others, since they each have a sub-square (e.g. $121$) removed. Recalling that sequences are read in the opposite order from addresses, this means that we'd like (after filtering) to end up with addresses

so that:

$\displaystyle p_2 p_2 = p_2 p_1 \Longrightarrow p_2 = p_1$

$\displaystyle p_2 p_1 = \frac{3}{4} p_1 p_1 \Longrightarrow p_2 = \frac{3p_1}{4}.$