sjmont
asked on
Convert tabs to spaces in tcl??
Using tcl, I'm reading through a file generated by a report-writer. Some records have tabs in them, some don't. I need to parse out the fields (columns in the report) positionally. The problem is that string range and crange treat each \t in a record as a single character, so the character positions no longer line up with the report columns.
I tried doing
regsub -all {\t} $rec " " rec
but since the amount of whitespace in a tab is not static, I was getting undesired results.
Example of report line:
10/27/08 801062347 801062347 .00 Charge 70.99 N
Corresponding example of octal display from the report file:
0001460 \r \n 1 0 / 2 7 / 0 8 \t \t
0001500 8 0 1 0 6 2 3 4 7 \t
0001520 8 0 1 0 6 2 3 4 7 \t \t .
0001540 0 0 C h a r g e \t \t
0001560 7 0 . 9 9 N \r \n
I'd prefer to handle this in the tcl script but if necessary, I suppose I can manipulate the file using ksh beforehand, although I'm not sure how I would do that either.
Any ideas?
Can this be used:
% set conv_rec [lrange $rec 0 6]
10/27/08 801062347 801062347 .00 Charge 70.99 N
ASKER
Yes, that's a valid thought. I guess I should have mentioned that the reason I was parsing the data positionally is that a field is sometimes missing from the record, for example:
DATE      TRANS #   CUSTOMER   RESPONSIBLE PARTY  BALANCE  PMT/CHG TYPE  AMOUNT  POSTED
--------  --------  ---------  -----------------  -------  ------------  ------  ------
11/01/08            801000349  801000349              .00  Charge         63.00  N
11/01/08  00566317  801000886  801000886              .00  Charge         11.25  N
So in such a case, treating the record as a list would shift the data into the wrong list elements.
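To illustrate the offset problem, here is a small sketch. The two records are made up for the example, and it assumes a missing TRANS # shows up as two consecutive tabs; the real report may pad differently.

```tcl
# Hypothetical records modelled on the report rows above.
# Assumption: a missing TRANS # field appears as two consecutive tabs.
set full    "11/01/08\t00566317\t801000886\t801000886\t.00\tCharge\t11.25\tN"
set missing "11/01/08\t\t801000349\t801000349\t.00\tCharge\t63.00\tN"

# Tcl list parsing collapses runs of whitespace, so the empty field
# disappears and everything after the date moves up by one element.
puts [lindex $full 1]      ;# 00566317  (the TRANS # field)
puts [lindex $missing 1]   ;# 801000349 (really the CUSTOMER field)
puts [llength $full]       ;# 8
puts [llength $missing]    ;# 7
```

So element 1 means TRANS # in one record and CUSTOMER in the other, which is why a positional parse is needed.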
Would defining the start and end of each field and then using "string range" work for you? For example:
set dt_start 0
set dt_end 7
set trans_start 8
set trans_end 19
...
...
set dt [string range $rec $dt_start $dt_end]
set trans_nu [string range $rec $trans_start $trans_end]
...
...
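The same idea can be written as a table of offsets instead of one pair of variables per field. This is only a sketch: the proc name parse_fixed, the field names and the offsets for the third column are invented for illustration, it needs Tcl 8.5 or later for dict, and it assumes the tabs in the record have already been expanded to spaces.

```tcl
# Hypothetical column layout: field name, first index, last index.
# The real offsets would have to be read off the report header.
set layout {
    date      0  7
    trans     8 19
    customer 20 31
}

# Cut one record into fields by position and trim the padding.
proc parse_fixed {rec layout} {
    set fields [dict create]
    foreach {name first last} $layout {
        dict set fields $name [string trim [string range $rec $first $last]]
    }
    return $fields
}

# A space-padded record with an empty TRANS # field (illustrative data).
set rec [format "%-8s%-12s%-12s" 11/01/08 "" 801000349]
set f [parse_fixed $rec $layout]
puts "date=[dict get $f date] trans='[dict get $f trans]' customer=[dict get $f customer]"
# prints: date=11/01/08 trans='' customer=801000349
```

Because each field is taken from a fixed range, an empty TRANS # simply comes back as an empty string and the later fields stay where they belong.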
If I understand your question correctly, you want to replace tabs with spaces while keeping the same "positioning" that the tabs produce in a terminal?
Presumably so that you can match the positions of the headers with the data fields? I think that is also roughly what has been suggested above.
The attached procedure could perhaps be of some assistance with replacing tabs with spaces, at least. It is probably very slow, but might be good enough for small amounts of data, or you can just use the idea and write a better-performing version :-)
It simply walks through the original string one character at a time; when it finds a tab, it substitutes enough spaces to pad up to the next tab stop (the column width is defined by the tabstop variable).
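proc tab2space { str } {
    set tabstop 8   ;# column width of one tab stop
    set n 0         ;# index into the original string
    set nspc 0      ;# extra characters added so far by expanding tabs
    set newstr ""   ;# initialised so that an empty input returns "" instead of raising an error
    set strlen [string length $str]
    while { $n < $strlen } {
        set c [string index $str $n]
        if { $c eq "\t" } {
            # Current output column = input index + extra spaces added so far
            set nn [expr {$n + $nspc}]
            # Number of spaces needed to reach the next tab stop
            set spaces [expr {$tabstop - ($nn % $tabstop)}]
            incr nspc [expr {$spaces - 1}]
            append newstr [string repeat " " $spaces]
        } else {
            append newstr $c
        }
        incr n
    }
    return $newstr
}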
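One possible "better performing version" of the same idea is to split the record on tabs and pad after each piece, rather than looking at every character. This is a sketch, not a tested replacement: the name expand_tabs and the optional tabstop argument are made up here, and it assumes every character takes one column and that the record contains no embedded newline.

```tcl
# Expand tabs to spaces so that text lands on the same columns a terminal
# would show. expand_tabs is a hypothetical helper name.
proc expand_tabs {str {tabstop 8}} {
    set out ""
    set segments [split $str "\t"]
    set last [expr {[llength $segments] - 1}]
    set i 0
    foreach seg $segments {
        append out $seg
        # Every segment except the final one was followed by a tab:
        # pad the output up to the next multiple of tabstop.
        if {$i < $last} {
            set pad [expr {$tabstop - ([string length $out] % $tabstop)}]
            append out [string repeat " " $pad]
        }
        incr i
    }
    return $out
}

# Start of the sample record from the octal dump: date, two tabs, number.
set rec [expand_tabs "10/27/08\t\t801062347"]
puts [string first 801062347 $rec]   ;# 24
```

After the expansion, string range with fixed start and end positions should see the same columns as the printed report.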
which matches the given text better
Regards
Friedrich