Learn the essential features and functions of the popular JavaScript framework for building mobile, desktop and web applications.

Hi all,

I think this is a pretty interesting algorithm problem so I'll post a link under 'programming' as well. But I'm hoping for a python solution, so it's going to live here.

I have a list of items like this, *as strings*.

C.x

C.p

C.T

C.delta[1]

C.delta[2]

C.delta[3]

C.sat.x

C.sat.p

C.h

C.delta[5]

and I would like to convert that into something more readable, like

C: x, p, T, delta[1-3,5], h

C.sat: x, p

* values in brackets can be numbers, in which case they can be assumed integers, and sequences of integers should be condensed like [1,2,3,5] to [1-3,5]

* values in brackets can also be 'symbols', in which case they are just listed out, eg [symbol1, symbol2, symbol3].

* brackets can act as a suffix for any node, not just the final one

* nodes are either array member nodes like x in "A[1]", or symbol nodes, like "A[symbol1]" or named nodes, like "A.x".

* the algorithm should zipper together as much as possible, as shown

Optional: it would be cool if

A.x[1].p

A.x[2].p

could be output as "A: x[1-2].p". Obviously this makes the problem quite a bit harder.

I thought that it might be possible to tackle this using python's list.reduce(func) method. But I haven't had any luck yet. The only way this is going to be solvable seems to be to re-construct the 'tree' then reduce it somehow with a top-down approach of some sort.

Cheers

JP

I think this is a pretty interesting algorithm problem so I'll post a link under 'programming' as well. But I'm hoping for a python solution, so it's going to live here.

I have a list of items like this, *as strings*.

C.x

C.p

C.T

C.delta[1]

C.delta[2]

C.delta[3]

C.sat.x

C.sat.p

C.h

C.delta[5]

and I would like to convert that into something more readable, like

C: x, p, T, delta[1-3,5], h

C.sat: x, p

* values in brackets can be numbers, in which case they can be assumed integers, and sequences of integers should be condensed like [1,2,3,5] to [1-3,5]

* values in brackets can also be 'symbols', in which case they are just listed out, eg [symbol1, symbol2, symbol3].

* brackets can act as a suffix for any node, not just the final one

* nodes are either array member nodes like x in "A[1]", or symbol nodes, like "A[symbol1]" or named nodes, like "A.x".

* the algorithm should zipper together as much as possible, as shown

Optional: it would be cool if

A.x[1].p

A.x[2].p

could be output as "A: x[1-2].p". Obviously this makes the problem quite a bit harder.

I thought that it might be possible to tackle this using python's list.reduce(func) method. But I haven't had any luck yet. The only way this is going to be solvable seems to be to re-construct the 'tree' then reduce it somehow with a top-down approach of some sort.

Cheers

JP

Experts Exchange Solution brought to you by

Enjoy your complimentary solution view.

Get this solution by purchasing an Individual license!
Start your 7-day free trial.

I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

You've explained the issue of string parsing the lines here but that's really not the part I have a problem with. I can do this with simple regexps or DParser or something. The tough part is the "data comparisons" part, and that is the only part I am struggling to conceptualise properly. I really think that one needs to construct a tree and then traverse it in a clever way. But this is the part that I'm looking for input on.

Cheers

JP

from operator import itemgetter

def fold(data):

""" fold sorted numeric sequence data into ranged representation:

>>> fold([1, 4,5,6, 10, 15,16,17,18, 22, 25,26,27,28])

'[1,4-6,10,15-18,22,25-28]

"""

folded = []

for k, g in groupby(enumerate(data), lambda (i,x):i-x):

seq = map(itemgetter(1), g)

if len(seq) > 1:

x = '%s-%s' % (seq[0], seq[-1])

else:

x = str(seq[0])

folded.append(x)

return folded and '[%s]' % ','.join(folded) or ''

def reduce(names):

"""reduce a list of items nto something more readable:

>>> data = 'C.x C.p C.T C.delta[1] C.delta[2] C.delta[3] C.sat.x C.sat.p C.h C.delta[5]'.split()

>>> res = reduce(data)

>>> for k in sorted(res):

... print '%s: %s' % (k, res[k])

C: T, delta[1-3,5], h, p, x

C.sat: p, x

"""

data = sorted([n.split('.') for n in names], key=len)

res = {}

for k, g in groupby(data, lambda x: len(x)):

item = g.next()

assert len(item) == k

key = '.'.join(item[:-1])

indexed = {}

seq = set(get(indexed, item))

for item in g:

seq.add(get(indexed, item))

res[key] = ', '.join(i+fold(indexed.get(

return res

def get(indexed, item):

item = item[-1]

if item.endswith(']'):

item, idx = item[:-1].split('[')

indexed.setdefault(item, []).append(int(idx))

return item

Enjoy!

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trialHave just been trying your suggestion out now. I have a minor problem though, mostly due to my example not being general enough. In some cases, my list will look like this:

Sl.phirdelta_r2[49]

Sl.phirdelta_r2[50]

Sl.phirdelta_r2[51]

Sl.dPSI_ddelta[56]

Sl.DELTA[56]

Sl.tau

rhof

rhol

T

Sl.delta

Sl.d1

As you can see, there are multiple root nodes in this list, eg 'rhof', 'rhol', 'T'. It seems to be a possibility that isn't handled by your code, and the code ends up splitting strings into characters in this case. Any suggestions for a quick fix on this?

Cheers

JP

key = '.'.join(item[:-1]) or ''

seq = set([get(indexed, item)])

Global names will be grouped under ''. Put whatever name you want (eg. 'GLOBAL') instead of ''.

S2.d1

S2.rho

S2.t1

S2.phi0tau

S2.phirtau_r2[8]

S2.phirtau_r2[9]

S2.phirtau_r2[10]

S2.phirtau_r2[11]

S2.phirtau_r2[12]

S2.phirtau_r2[13]

S2.phirtau_r2[14]

S2.phirtau_r2[15]

S2.phirtau_r2[16]

S2.phirtau_r2[17]

S2.phirtau_r2[18]

S2.phirtau_r2[19]

S2.phirtau_r2[20]

S2.phirtau_r2[21]

S2.phirtau_r2[22]

S2.phirtau_r2[23]

S2.phirtau_r2[24]

S2.phirtau_r2[25]

S2.phirtau_r2[26]

S2.phirtau_r2[27]

S2.phirtau_r2[28]

S2.phirtau_r2[29]

S2.phirtau_r2[30]

S2.phirtau_r2[31]

S2.phirtau_r2[32]

S2.phirtau_r2[33]

S2.phirtau_r2[34]

S2.phirtau_r2[35]

S2.phirtau_r2[36]

S2.phirtau_r2[37]

S2.phirtau_r2[38]

S2.phirtau_r2[39]

S2.phirtau_r2[40]

S2.phirtau_r2[41]

S2.phirtau_r2[42]

S2.phirtau_r2[43]

S2.phirtau_r2[44]

S2.phirtau_r2[45]

S2.phirtau_r2[46]

S2.phirtau_r2[47]

S2.phirtau_r2[48]

S2.phirtau_r2[49]

S2.phirtau_r2[50]

S2.phirtau_r2[51]

S2.phirdelta_r2[9]

S2.phirdelta_r2[10]

S2.phirdelta_r2[11]

S2.phirdelta_r2[12]

S2.phirdelta_r2[13]

S2.phirdelta_r2[14]

S2.phirdelta_r2[15]

S2.phirdelta_r2[16]

S2.phirdelta_r2[17]

S2.phirdelta_r2[18]

S2.phirdelta_r2[19]

S2.phirdelta_r2[20]

S2.phirdelta_r2[21]

S2.phirdelta_r2[22]

S2.phirdelta_r2[23]

S2.phirdelta_r2[24]

S2.phirdelta_r2[25]

S2.phirdelta_r2[26]

S2.phirdelta_r2[27]

S2.phirdelta_r2[28]

S2.phirdelta_r2[29]

S2.phirdelta_r2[30]

S2.phirdelta_r2[31]

S2.phirdelta_r2[32]

S2.phirdelta_r2[33]

S2.phirdelta_r2[34]

S2.phirdelta_r2[35]

S2.phirdelta_r2[36]

S2.phirdelta_r2[37]

S2.phirdelta_r2[38]

S2.phirdelta_r2[39]

S2.phirdelta_r2[40]

S2.phirdelta_r2[41]

S2.phirdelta_r2[42]

S2.phirdelta_r2[43]

S2.phirdelta_r2[44]

S2.phirdelta_r2[45]

S2.phirdelta_r2[46]

S2.phirdelta_r2[47]

S2.phirdelta_r2[48]

S2.phirdelta_r2[49]

S2.phirdelta_r2[50]

S2.phirdelta_r2[51]

S2.r3_b1[52]

S2.r3_b1[53]

S2.r3_b1[54]

S2.PSI[55]

S2.dPSI_dtau[55]

S2.dPSI_ddelta[55]

S2.theta[55]

S2.dDELTA_ddelta[55]

S2.DELTA[55]

S2.dDELTAbi_ddelta[55]

S2.dDELTAbi_dtau[55]

S2.theta[56]

S2.dDELTA_ddelta[56]

S2.DELTA[56]

S2.dDELTAbi_ddelta[56]

S2.dDELTAbi_dtau[56]

S2.PSI[56]

S2.dPSI_dtau[56]

S2.dPSI_ddelta[56]

T_ho

C.DT_2

S2.phirdelta

C.LMTD

q

C.h_h2

S2.phirdelta_r2[8]

S2.phirtau

S2.tau

S2.delta

and I got the entry 'LMTD' showing up under 'S2'. Any further thoughts?

def fold(data):

""" fold sorted numeric sequence data into ranged representation:

>>> fold([1, 4,5,6, 10, 15,16,17,18, 22, 25,26,27,28])

'[1,4-6,10,15-18,22,25-28]

"""

folded = []

for k, g in groupby(enumerate(sorted(d

seq = map(itemgetter(1), g)

if len(seq) > 1:

x = '%s-%s' % (seq[0], seq[-1])

else:

x = str(seq[0])

folded.append(x)

return folded and '[%s]' % ','.join(folded) or ''

def reduce(names):

"""reduce a list of items nto something more readable:

>>> data = 'C.x C.p C.T C.delta[1] C.delta[2] C.delta[3] C.sat.x C.sat.p C.h C.delta[5]'.split()

>>> res = reduce(data)

>>> for k in sorted(res):

... print '%s: %s' % (k, res[k])

C: T, delta[1-3,5], h, p, x

C.sat: p, x

"""

data = sorted([n.split('.') for n in sorted(names)], key=len)

res = {}

for k, g in groupby(data, lambda x: len(x)):

if k == 1:

indexed = {}

seq = set(get(indexed, item) for item in g)

res['[global]'] = ', '.join(i+fold(indexed.get(

else:

for key, g1 in groupby(g, lambda x: '.'.join(x[:-1])):

indexed = {}

seq = set(get(indexed, item) for item in g1)

res[key] = ', '.join(i+fold(indexed.get(

return res

Python

From novice to tech pro — start learning today.

Experts Exchange Solution brought to you by

Enjoy your complimentary solution view.

Get this solution by purchasing an Individual license!
Start your 7-day free trial.

We could take this:

C.x[12].pqzed.o

and as it looped through, it would be stored as follows:

C -> it's a letter, store it in sub1 then check next value

. -> it's a dot, so we'll take it as a separation

x -> it's a letter, so store it in sub2 and check next value

[ -> ooh it's a bracket, so we store not only the letter before it but also the element(s) inside it as well as the number itself in

] -> end the bracket

. -> next term

p -> letter, check next...

q -> letter, check next...

z -> letter, check next...

e -> letter, check next...

d -> letter, check next...

. -> terminate the last string and start new...

o -> letter, end of term...

Archive the term and start fresh for the next one...

Once you've achieved this, you'll want to do data comparisons...

Check each of the sub-terms to see if they're the same or at least similar, differing only by numbers... and if so, order the numbers...

As for the 1-7, that's a simple counting algorithm... compare the numerical values, and increment to see if they're in sequence... if so, continue until that sequence is broken, and once it's broken you have the starting term and the term that breaks the cycle(minus one) and just separate them with a dash... then, continue for the rest of them... I could probably write this up in a day or two although I write in C/C++ more than Python...

I hope this helps.

-aNthrax