Should the simple name-value pairs also be represented as lists with two elements?
I'm looking for an elegant solution for a string parsing problem.
The string contains bracketed name/value pairs.
I want an elegant solution.
- Don't expect to get points for a quick and dirty solution
Especially one that "almost" works
I don't want, and won't accept a LEX/YACC solution.
I must be able to implement it in Python 2.1 (so, pyParsing won't work).
I will accept a complete pseudo-code algorithm that I can implement
Notes:
- Name & value are separated by a single space
- Value may be empty (e.g., "[name ]")
- Values may contain other bracketed name/value pairs.
- Nested result may be arbitrarily deep
- solution must be able to handle nesting at least 7 deep
Sample input:
Unfortunately, not everything is a name value pair...
Sorry for not being more clear.
I see that you put underscores where the blanks were.
Q: How should this be interpreted?
[[XXXX_XXXXXXXX.XXXXXXXXXX
A: I would like to have a list returned with 2 name pair elements,
the first would contain:
[ 'XXXX', 'XXXXXXXX.XXXXXXXXXXXXXXXX
and the second would contain:
['XXXXX', 'XXXX']
So, that specific string should return the following list:
[ [ 'XXXX', XXXXXXXX.XXXXXXXXXXXXXXXXX
The thing with which I have been having real trouble is when a value contains nested brackets...
For example (with single character "items"):
'[ A [ [ [ B C ] [ D E ] [ [ F G ] [ H I ] ] ] [ [ J K ] [ L M ] [ [ N O ] [ P Q ] ] ] ] ]'
Here is what I would like to see:
1) [adam amanda]
['adam', 'amanda']
2) [[adam amanda][brian brenda]]
[['adam', 'amanda'], ['brian', 'brenda'] ]
3) [[adam amanda][brian brenda][craig ]]
[ ['adam', 'amanda'], ['brian', 'brenda'], ['craig', ''] ]
4) [adam ]
['adam', '' ]
5) [adam amanda][brian brenda]
This would not occur. The data is guaranteed to be well formed.
go ahead and throw an exception
6) [adam [brian [craig cassy]]]
['adam', [ 'brian', ['craig', 'cassy'] ] ]
>> or am i completely off the track there?
Not at all. You seem to understand my challenge... ;-)
OK. Let's try the snippet below first to clarify how it should work. The code is based on a finite automaton (it processes single characters and changes its state to capture the situation). However, a finite automaton (theoretically of the same power as regular expressions) cannot capture nested pair structures on its own. The nesting could be handled recursively (but that is not as easy or efficient because the characters are processed as a stream) or with a stack that holds the unfinished levels. I have chosen the second approach. The resulting list can be accessed as stack[0]. The code still needs to be cleaned up. It also depends on your clarification.
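What follows is only a minimal sketch of the stack idea described above, not the code that was originally posted. The function name parse_pairs and the per-bracket state names are my own (hypothetical). It assumes that a value may contain spaces, that blanks between or after nested lists are ignored, and that a second top-level bracket is an error. It tries to stay within features that I believe Python 2.1 already had, but treat that as an assumption.

```python
def parse_pairs(text):
    """Parse a bracketed name/value string into nested lists (sketch)."""
    stack = []      # one frame per still-open bracket: [items, state, chars]
    result = None   # the finished top-level list, set when its ']' is seen
    for ch in text:
        if stack:
            frame = stack[-1]
        else:
            frame = None
        if ch == '[':
            if frame is None:
                if result is not None:
                    raise ValueError('more than one top-level bracket')
            elif frame[1] in ('start', 'list'):
                frame[1] = 'list'        # this bracket holds nested lists
            elif frame[1] == 'value' and not frame[2]:
                frame[1] = 'nested'      # the value is itself a bracket
            else:
                raise ValueError('unexpected "["')
            stack.append([[], 'start', []])
        elif ch == ']':
            if frame is None or frame[1] in ('start', 'name'):
                raise ValueError('unexpected "]"')
            if frame[1] == 'value':
                frame[0].append(''.join(frame[2]))   # may be the empty string
            stack.pop()
            if stack:
                stack[-1][0].append(frame[0])        # attach to the parent
            else:
                result = frame[0]
        elif frame is None:
            if ch != ' ':
                raise ValueError('text outside brackets')
        elif frame[1] == 'start':
            if ch != ' ':                # assumption: skip leading blanks
                frame[1] = 'name'
                frame[2].append(ch)
        elif frame[1] == 'name':
            if ch == ' ':                # first blank ends the name
                frame[0].append(''.join(frame[2]))
                frame[1] = 'value'
                frame[2] = []
            else:
                frame[2].append(ch)
        elif frame[1] == 'value':
            frame[2].append(ch)          # assumption: blanks belong to the value
        elif ch != ' ':
            raise ValueError('text after a nested list')
    if stack or result is None:
        raise ValueError('unbalanced or empty input')
    return result
```

With this sketch, parse_pairs('[adam [brian [craig cassy]]]') gives ['adam', ['brian', ['craig', 'cassy']]], and '[adam amanda][brian brenda]' raises ValueError. Because the open levels live in an explicit list rather than in recursive calls, the nesting depth is limited only by memory.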
Studying http:#a25114013 ...
Yes. Sorry for the "mistake". This was not intentional. You can also completely get rid of it by replacing
s = ''.join(lstStr) # collected string
lstStr = None # init
lst.append(s) # add to the current list
by
lst.append(''.join(lstStr))
lstStr = None # init
I am not sure, but even the original code could be correct. The name s is reused, while the reference to the original string can be remembered by the code implementing the for loop. Anyway, even if that was the case, I do not like tricks like that. The code should be as understandable as possible.
The doc says (http://docs.python.org/re
"The expression list is evaluated once; it should yield an iterable object. An iterator is created for the result of the expression_list. The suite is then executed once for each item provided by the iterator..."
This means that you should not observe an error in the produced result. Anyway, you are right.
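A small illustration of the point, as a sketch only: the function name collect_chars is made up for this example. The name s is rebound inside the loop, yet the loop keeps walking the string it started with, because the expression after "in" is evaluated only once.

```python
def collect_chars(s):
    """Loop over s while rebinding the name s inside the loop body."""
    seen = []
    for c in s:
        seen.append(c)
        s = 'rebound'   # changes what the name refers to, not the loop's source
    return seen, s
```

Calling collect_chars('abc') returns (['a', 'b', 'c'], 'rebound'): all three original characters are visited even though s points elsewhere after the first pass.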
Yes, I did exactly that (http:#a25118218)
>> "The expression list is evaluated once; it should yield an iterable object. An iterator is created for the result of the expression_list. The suite is then executed once for each item provided by the iterator..."
I agree completely. However, as I pointed out, this is for Python 2.1, which preceded iterators... :-)
You are definitely good at programming, so you know what to do ;) The code can be simplified. For example, you can remove the
lstStr = None # init
and
assert lstStr is None
It is there only to make the potential errors more visible. You know. However, other people may be reading this...
Have a nice day ;)
by: pepr Posted on 2009-08-17 at 01:53:36 ID: 25112461
There is a sequence like (spaces replaced by _ to visualize them)
XXXXXXXXXX XXXXXXXXX] _[XXXXX XXXX]_]
[[XXXX_XXXXXXXX.XXXXXXXXXX
How should this be interpreted? To simplify further, the X'es are shortened to A, B, C, D respectively:
[[A_B]_[C_D]_]
If name-value pairs may have a missing value, what is the name for the last empty value? In other words, is it only a syntax feature, and should the above be considered the same as
[[A_B]_[C_D]]
I.e. what to do with the space in front of the last ] ?