last week we talked a bit about generating permutations and I told you that
this week will be about combinations. Not true; there's a bit more to tell
about permutations and that's what this Tip Of The Week is about. Maybe later
we'll talk a bit about combinations (if anyone's interested).
The little utility class we implemented last week was able to generate a
next permutation (if any) given a current permutation. But what if we want
to generate the m-th permutation directly? For instance given a list of
six permutations of length three what is the, say, third permutation?
We obviously don't want to walk that entire list, generate all those next
permutations until we've got the m-th permutation. for the third permutation
it wouldn't be too bad but imagine we want the 1,000,000 permutation out
of 13 length strings of symbols? I certainly am too impatient to wait
that long before my little laptop comes up with an answer.
This tip defines such an algorithm for unique permutation sequences, i.e.
all characters in a permutation sequence are unique. Here's an example:
Expand|Select|Wrap|Line Numbers
- 0: ABC
- 1: ACB
- 2: BAC
- 3: BCA
- 4: CAB
- 5: CBA
first column shows the 'rank' of the permutations. I want an algorithm that,
given the length of the permutation (== 3 in the example) and the rank number
(also == 3 in the example), would give me 'BCA' for me reasonably efficient,
prefereably in big-Oh time proportional to n.
If you have a close look at the leftmost characters in the permutations above
you'd notice a regularity: A, A, B, B, C, C. In general for a permutation of
size n the leftmost character stays the same for n!/n consecutive permutations.
In our example n == 3 and n!/n == (n-1)! == 6/3 == 2! == 2 and indeed the
leftmost character stays the same every two consecutive permutations.
The first (n-1)! permutations have the same leading character, the second
(n-1)! permutations have the same leading character and so on.
So if we know the rank of the permutation (3 in the example) we immediately
know the leftmost available thing: we divide the rank by (n-1)! and we pick
the 3/2! == the 1st available character where A is the zeroth, B is the first
and C is the second character (we always count starting at zero).
We can repeat this trick: we now have a rank of 3%2! == 1 left and two
characters A and C are left for the rest of the permutation.
For a permutation of length two and the characters A and C we have to pick
the 1st (counting from zero) letter out of this list: 0: A and 1: C
This may look like Kaballah of Voodoo but there is a strict regularity: let
'r' be the rank of the permutation we're looking for then the first character
is the available character number r/(n-1)!. Next we're looking for the r%(n-1)!
rank and select the second character and so on and so on.
This looks a bit like a recursive definition and indeed it is but we can solve
it iteratively. Here's the algorithm:
Expand|Select|Wrap|Line Numbers
- 1: public static <T> List<T> permute(List<T> list, int r) {
- 2: int n= list.size();
- 3: int f= fac(n);
- 4: List<T> perm= new ArrayList<T>();
- 5: list= new ArrayList<T>(list);
- 6: for (list= new ArrayList<T>(list); n > 0; n--, r%= f) {
- 7: f/= n;
- 8: perm.add(list.remove(r/f));
- }
- 9: return perm;
- }
In line 1 we pass it a List<T> of available 'things' of type T and the rank
number 'r'.
In line two 'n' is calculated, i.e. the length of the permutation is equal to
the length of the available things T; they must be all unique.
In line 3 we calculate n! (see below for the trivial implementation of method
'fac'.
In line 4 we build a new empty list that can contain things of type T.
Line 5 is a bit of a technicality: the algorithm needs to change the 'list'
parameter. But what if the List<T> cannot be altered? The algorithm circumvents
that by making an alterable copy of the list. Now the original list can stay
the same and we only change a local copy of that list.
Line 6 does a whole lot: for every position in the list we calculate the
rank of the available character and decrement the size of the available
character list. The variable 'f' has the values n!, (n-1)!, ... 1 respectively.
Note that strictly speaking variable 'n' is not needed because it is always
equal to the length of variable 'list'.
Line 7 updates the value of 'f' (see above).
Line 8 removes the wanted thing T from the list of available things T and
appends it to the permutation being generated.
Finally line 9 returns the generated permutation given the rank and available
things T.
The next method exercises the method above a bit; it defines a list of three
Strings (the instantiation of a thing T) and generates all permutations:
Expand|Select|Wrap|Line Numbers
- public static void main(String[] args) {
- List<String> list= Arrays.asList("A", "B", "C");
- for (int i= 0, n= 3, f= fac(n); i < f; i++)
- System.out.println(i+": "+perm(list, i));
- }
Expand|Select|Wrap|Line Numbers
- List<Integer> list= Arrays.asList(1, 2, 3);
little method and it can handle all sorts of types without altering or
adding anything.
Note how a read-only list is generated by the Arrays.toList method. It doesn't
matter because our 'permute' method doesn't alter it, i.e. it builds a local
writable copy of the list and alters the copy.
So now we cannot just generate permutations given a previous permutation but
we can generate them in any particular order. There's a flipside of the medal:
the algorithm we developed in the previous tip of the week didn't care how
large the permutation sequence was. Given a permutation of, say 1,000,000
things it could calculate the next permution. This algorithm however has
some severe numerical limitations: 13! is the largest faculty that can be
stored in an int so at most permutations of at most 13 things can be calculated.
Changing all ints to longs buys you just a couple more things in the sequence
and redoing the entire algorithm using BigIntegers would solve the size
limitation completely; I leave that as an exercise for the reader ;-)
Oh, before I forget, here's the definition of the 'fac' method:
Expand|Select|Wrap|Line Numbers
- public static int fac(int n) {
- int f= 1;
- for (; n > 0; f*= n--);
- return f;
- }
come into play. Some programmers don't know when to apply them and their designs
end up worse than they deserve; on the other hand, other programmers use every
single design pattern in all of their programs and their designs look a bit
like fully decorated Christmas trees. We'll talk a bit about that next week.
kind regards,
Jos