Regular expressions help parsing a txt file in PHP

I need the following data parsed and each value put into a variable. You'll notice the variable amount of spaces and the extra nonimportant data at the end. This is how the file reads:
ticket          open time       type  lot    curr    open      sl      tp        close time   close   swap  profit  pp
----------------------------------------------------------------------------------------------------------------------
#206237  2001.01.02 19:02    balance  0.0            0.00    0.00    0.00  2001.01.02 19:02    0.00   0.00 10000.00   0 0.00 0.00 "" 0
#206238  2001.03.22 06:59       sell  0.5  GBPUSD  1.4208  1.4258  1.4158  2001.03.23 02:00  1.4236  -1.84  -99.84 -28 0.00 994.56 "" 0
#206239  2001.03.26 02:00       sell  0.5  GBPUSD  1.4272  1.4322  1.4222  2001.03.26 03:43  1.4322   0.00 -175.00 -50 0.00 999.04 "" 0
#206242  2001.03.27 12:02        buy  0.1  GBPUSD  1.4350  1.4340  0.0000  2001.03.27 12:05  1.4340   0.00   -7.00 -10 0.00 200.90 "" 0
#206241  2001.03.27 08:59       sell  0.5  GBPUSD  1.4337  1.4387  1.4287  2001.03.27 13:42  1.4287   0.00  175.00  50 0.00 1003.59 "" 0
#206243  2001.03.27 20:59       sell  0.5  GBPUSD  1.4331  1.4381  1.4281  2001.03.28 06:36  1.4281  -1.84  173.16  50 0.00 1003.17 "" 0
#206244  2001.03.28 11:00       sell  0.5  GBPUSD  1.4318  1.4368  1.4268  2001.03.28 12:12  1.4368   0.00 -175.00 -50 0.00 1002.26 "" 0
#206247  2001.04.04 17:59       sell  0.5  GBPUSD  1.4324  1.4374  1.4274  2001.04.04 22:44  1.4374   0.00 -175.00 -50 0.00 1002.68 "" 0
#206245  2001.04.03 20:45       sell  0.5  GBPUSD  1.4330  1.4380  1.4280  2001.04.05 03:49  1.4380  -7.35 -182.35 -50 0.00 1003.10 "" 0
#206246  2001.04.04 06:59        buy  0.5  GBPUSD  1.4344  1.4294  1.4394  2001.04.05 03:52  1.4394   2.36  177.36  50 0.00 1004.08 "" 0
#206248  2001.04.04 23:00        buy  0.5  GBPUSD  1.4371  1.4321  1.4421  2001.04.05 06:24  1.4321   2.36 -172.64 -50 0.00 1005.97 "" 0
#206251  2001.04.16 09:59       sell  0.5  GBPUSD  1.4327  1.4377  1.4277  2001.04.16 09:59  1.4330   0.00  -10.50  -3 0.00 1002.89 "" 0
#206250  2001.04.13 08:57        buy  0.5  GBPUSD  1.4375  1.4325  1.4425  2001.04.16 10:14  1.4325   1.58 -173.42 -50 0.00 1006.25 "" 0
#206252  2001.04.17 02:59        buy  0.5  GBPUSD  1.4403  1.4353  1.4453  2001.04.17 07:09  1.4353   0.00 -175.00 -50 0.00 1008.21 "" 0
#206253  2001.04.17 08:00       sell  0.5  GBPUSD  1.4310  1.4360  1.4260  2001.04.17 15:59  1.4292   0.00   63.00  18 0.00 1001.70 "" 0
#206240  2001.03.30 06:27        buy  0.1  GBPUSD  1.4245  0.0000  0.0000  2001.04.17 15:59  1.4289   2.99   33.79  44 0.00 199.43 "" 0
#206254  2001.04.18 09:59       sell  0.5  GBPUSD  1.4230  1.4280  1.4180  2001.04.18 13:07  1.4280   0.00 -175.00 -50 0.00 996.10 "" 0
#206255  2001.04.19 11:00        buy  0.5  GBPUSD  1.4363  1.4313  1.4392  2001.04.19 14:33  1.4392   0.00  101.50  29 0.00 1005.41 "" 0


Can someone help me write the regular expressions or other method to parse these into variables?
LVL 10
stormistAsked:
Who is Participating?
 
TeRReFCommented:
How about this (you'll end up with an array that contains all values):

<?php

/*
$file = 'ticket          open time       type  lot    curr    open      sl      tp        close time   close   swap  profit  pp
----------------------------------------------------------------------------------------------------------------------
#206237  2001.01.02 19:02    balance  0.0            0.00    0.00    0.00  2001.01.02 19:02    0.00   0.00 10000.00   0 0.00 0.00 "" 0
#206238  2001.03.22 06:59       sell  0.5  GBPUSD  1.4208  1.4258  1.4158  2001.03.23 02:00  1.4236  -1.84  -99.84 -28 0.00 994.56 "" 0
#206239  2001.03.26 02:00       sell  0.5  GBPUSD  1.4272  1.4322  1.4222  2001.03.26 03:43  1.4322   0.00 -175.00 -50 0.00 999.04 "" 0
#206242  2001.03.27 12:02        buy  0.1  GBPUSD  1.4350  1.4340  0.0000  2001.03.27 12:05  1.4340   0.00   -7.00 -10 0.00 200.90 "" 0
#206241  2001.03.27 08:59       sell  0.5  GBPUSD  1.4337  1.4387  1.4287  2001.03.27 13:42  1.4287   0.00  175.00  50 0.00 1003.59 "" 0
#206243  2001.03.27 20:59       sell  0.5  GBPUSD  1.4331  1.4381  1.4281  2001.03.28 06:36  1.4281  -1.84  173.16  50 0.00 1003.17 "" 0
#206244  2001.03.28 11:00       sell  0.5  GBPUSD  1.4318  1.4368  1.4268  2001.03.28 12:12  1.4368   0.00 -175.00 -50 0.00 1002.26 "" 0
#206247  2001.04.04 17:59       sell  0.5  GBPUSD  1.4324  1.4374  1.4274  2001.04.04 22:44  1.4374   0.00 -175.00 -50 0.00 1002.68 "" 0
#206245  2001.04.03 20:45       sell  0.5  GBPUSD  1.4330  1.4380  1.4280  2001.04.05 03:49  1.4380  -7.35 -182.35 -50 0.00 1003.10 "" 0
#206246  2001.04.04 06:59        buy  0.5  GBPUSD  1.4344  1.4294  1.4394  2001.04.05 03:52  1.4394   2.36  177.36  50 0.00 1004.08 "" 0
#206248  2001.04.04 23:00        buy  0.5  GBPUSD  1.4371  1.4321  1.4421  2001.04.05 06:24  1.4321   2.36 -172.64 -50 0.00 1005.97 "" 0
#206251  2001.04.16 09:59       sell  0.5  GBPUSD  1.4327  1.4377  1.4277  2001.04.16 09:59  1.4330   0.00  -10.50  -3 0.00 1002.89 "" 0
#206250  2001.04.13 08:57        buy  0.5  GBPUSD  1.4375  1.4325  1.4425  2001.04.16 10:14  1.4325   1.58 -173.42 -50 0.00 1006.25 "" 0
#206252  2001.04.17 02:59        buy  0.5  GBPUSD  1.4403  1.4353  1.4453  2001.04.17 07:09  1.4353   0.00 -175.00 -50 0.00 1008.21 "" 0
#206253  2001.04.17 08:00       sell  0.5  GBPUSD  1.4310  1.4360  1.4260  2001.04.17 15:59  1.4292   0.00   63.00  18 0.00 1001.70 "" 0
#206240  2001.03.30 06:27        buy  0.1  GBPUSD  1.4245  0.0000  0.0000  2001.04.17 15:59  1.4289   2.99   33.79  44 0.00 199.43 "" 0
#206254  2001.04.18 09:59       sell  0.5  GBPUSD  1.4230  1.4280  1.4180  2001.04.18 13:07  1.4280   0.00 -175.00 -50 0.00 996.10 "" 0
#206255  2001.04.19 11:00        buy  0.5  GBPUSD  1.4363  1.4313  1.4392  2001.04.19 14:33  1.4392   0.00  101.50  29 0.00 1005.41 "" 0';
*/

$file = file_get_contents('/path/to/your_file.txt');

// change values in the keys array in order to let them correspond with the values in the final $vars array
$keys = array('col1', 'col2', 'col3', 'col4', 'col5',
              'col6', 'col7', 'col8', 'col9', 'col10',
              'col11', 'col12', 'col13', 'col14', 'col15');

$lines = preg_split('/\n/', $file);

array_shift($lines);
array_shift($lines);

$vars = array();

foreach ($lines as $line) {
  $vals = array_chunk(preg_split('/\s+/', $line), 15);
  $vars[] = array_combine($keys,$vals[0]);
}

print_r($vars);

?>
0
 
hernst42Commented:
You could use something like this to spilt up the columns:

$rowdata = preg_split('/\s+/', $line);
$i = 0;
$ticket = $rowdata[$i++];
$opentime = $rowdata[$i++] ." ". $rowdata[$i++];
if ($rowdata[$i] == 'balance') {
   $type = $rowdata[$i++];
   $lot = $rowdata[$i++];
   $cur = '';
} else {
   $type = $rowdata[$i++];
   $lot = $rowdata[$i++];
   $cur = $rowdata[$i++];
}
$open = $rowdata[$i++];
...

One problem with your code TeRReF is that it could not handle the first balance line correctly.
0
 
Cornelia YoderArtistCommented:
If you are doing one row at a time, this will give you an array of the values in the row.

$arrayofvalues = explode(" ",$inputstring);

http://us2.php.net/manual/en/function.explode.php
0
Cloud Class® Course: Ruby Fundamentals

This course will introduce you to Ruby, as well as teach you about classes, methods, variables, data structures, loops, enumerable methods, and finishing touches.

 
netmunkyCommented:
it's a fixed width format, and unless you can guarantee all fields exist under certain conditions, it's best to stick to the fixed width format.
you're probably better off using http://www.php.net/preg-match
$results = array();
$keys = array( "ticket", "datetime", ...etc... );

.. while loop through reading of file ...
  $pattern = "/(.{10})(.{16})(.{11})...etc.../"
  preg_match( $pattern, $line, $matches );
  array_push( $results, array_combine( $keys, $matches ) );
0
 
Stacy SpearPresident/Principal ConsultantCommented:
Terref's file looks like it will work just fine.

Although it will work, the issue I see that needs to be addressed elsewhere in your code is the validation of data. I believe that's what you wanted the regex for, to not only pull it, but to validate it at the same time. For instance, a proper record with have the open ticket time before the close time. I disagree with the format chosen for that field as its designed to be human readable, instead of being computational friendly. Storing time as a Unix timestamp for instance is a far better choice. I typically modify the epoch used based on the company.

Based on all that, once you pull the data into a structure, then use various means to validate it. I think an all inclusive regex is not the right way (although it too would "get" all the data, doing the comparisons will eat lots of CPU cycles).
0
 
netmunkyCommented:
if you want to convert string to time, you can use http://www.php.net/str_replace (to replace the . with -)
$fixed_date = str_replace(".","-","2006.03.12 12:03");

then you can get use http://www.php.net/strtotime for $time = $strtotime( $fixed_date );

and terraf's solution almost works, except that it assumes that all fields exist on all lines, which in the case of the 'balance' line they do not. hence the better solution is to follow the original fixed width format. (copy/paste the original text into a file fixed width font, and you will see that all columns line up exactly)
0
 
stormistAuthor Commented:
Thanks all
0
 
TeRReFCommented:
You're welcome :)
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.