Hi all,
I think this is a pretty interesting algorithm problem so I'll post a link under 'programming' as well. But I'm hoping for a python solution, so it's going to live here.
I have a list of items like this, *as strings*.
C.x
C.p
C.T
C.delta[1]
C.delta[2]
C.delta[3]
C.sat.x
C.sat.p
C.h
C.delta[5]
and I would like to convert that into something more readable, like
C: x, p, T, delta[1-3,5], h
C.sat: x, p
* values in brackets can be numbers, in which case they can be assumed integers, and sequences of integers should be condensed like [1,2,3,5] to [1-3,5]
* values in brackets can also be 'symbols', in which case they are just listed out, eg [symbol1, symbol2, symbol3].
* brackets can act as a suffix for any node, not just the final one
* nodes are either array member nodes like x in "A[1]", or symbol nodes, like "A[symbol1]" or named nodes, like "A.x".
* the algorithm should zipper together as much as possible, as shown
Optional: it would be cool if
A.x[1].p
A.x[2].p
could be output as "A: x[1-2].p". Obviously this makes the problem quite a bit harder.
I thought that it might be possible to tackle this using python's list.reduce(func) method. But I haven't had any luck yet. The only way this is going to be solvable seems to be to re-construct the 'tree' then reduce it somehow with a top-down approach of some sort.
Cheers
JP
by: anthrax759Posted on 2006-02-02 at 11:17:20ID: 15856012
I can offer a suggestion as to how you'd do this... first and foremost, you'll want to loop entirely through each string and create an entry for it... I'd suggest having a switch statement that catches any numerical value and sets it aside into that category.. then letter values and sets them aside in that category... then catch the open bracket [ and anything up until the close bracket ] and repeat the process for each term, and even separate the terms by . so for example have a data structure with a sub1,2,3,4,5,etc and bkt1,2,3,4,5,etc and num1,2,3,4,5,etc.
We could take this:
C.x[12].pqzed.o
and as it looped through, it would be stored as follows:
C -> it's a letter, store it in sub1 then check next value
. -> it's a dot, so we'll take it as a separation
x -> it's a letter, so store it in sub2 and check next value
[ -> ooh it's a bracket, so we store not only the letter before it but also the element(s) inside it as well as the number itself in
] -> end the bracket
. -> next term
p -> letter, check next...
q -> letter, check next...
z -> letter, check next...
e -> letter, check next...
d -> letter, check next...
. -> terminate the last string and start new...
o -> letter, end of term...
Archive the term and start fresh for the next one...
Once you've achieved this, you'll want to do data comparisons...
Check each of the sub-terms to see if they're the same or at least similar, differing only by numbers... and if so, order the numbers...
As for the 1-7, that's a simple counting algorithm... compare the numerical values, and increment to see if they're in sequence... if so, continue until that sequence is broken, and once it's broken you have the starting term and the term that breaks the cycle(minus one) and just separate them with a dash... then, continue for the rest of them... I could probably write this up in a day or two although I write in C/C++ more than Python...
I hope this helps.
-aNthrax