Link to home
Start Free TrialLog in
Avatar of stormist
stormist

asked on

Regular expressions help parsing a txt file in PHP

I need the following data parsed and each value put into a variable. You'll notice the variable amount of spaces and the extra nonimportant data at the end. This is how the file reads:
ticket          open time       type  lot    curr    open      sl      tp        close time   close   swap  profit  pp
----------------------------------------------------------------------------------------------------------------------
#206237  2001.01.02 19:02    balance  0.0            0.00    0.00    0.00  2001.01.02 19:02    0.00   0.00 10000.00   0 0.00 0.00 "" 0
#206238  2001.03.22 06:59       sell  0.5  GBPUSD  1.4208  1.4258  1.4158  2001.03.23 02:00  1.4236  -1.84  -99.84 -28 0.00 994.56 "" 0
#206239  2001.03.26 02:00       sell  0.5  GBPUSD  1.4272  1.4322  1.4222  2001.03.26 03:43  1.4322   0.00 -175.00 -50 0.00 999.04 "" 0
#206242  2001.03.27 12:02        buy  0.1  GBPUSD  1.4350  1.4340  0.0000  2001.03.27 12:05  1.4340   0.00   -7.00 -10 0.00 200.90 "" 0
#206241  2001.03.27 08:59       sell  0.5  GBPUSD  1.4337  1.4387  1.4287  2001.03.27 13:42  1.4287   0.00  175.00  50 0.00 1003.59 "" 0
#206243  2001.03.27 20:59       sell  0.5  GBPUSD  1.4331  1.4381  1.4281  2001.03.28 06:36  1.4281  -1.84  173.16  50 0.00 1003.17 "" 0
#206244  2001.03.28 11:00       sell  0.5  GBPUSD  1.4318  1.4368  1.4268  2001.03.28 12:12  1.4368   0.00 -175.00 -50 0.00 1002.26 "" 0
#206247  2001.04.04 17:59       sell  0.5  GBPUSD  1.4324  1.4374  1.4274  2001.04.04 22:44  1.4374   0.00 -175.00 -50 0.00 1002.68 "" 0
#206245  2001.04.03 20:45       sell  0.5  GBPUSD  1.4330  1.4380  1.4280  2001.04.05 03:49  1.4380  -7.35 -182.35 -50 0.00 1003.10 "" 0
#206246  2001.04.04 06:59        buy  0.5  GBPUSD  1.4344  1.4294  1.4394  2001.04.05 03:52  1.4394   2.36  177.36  50 0.00 1004.08 "" 0
#206248  2001.04.04 23:00        buy  0.5  GBPUSD  1.4371  1.4321  1.4421  2001.04.05 06:24  1.4321   2.36 -172.64 -50 0.00 1005.97 "" 0
#206251  2001.04.16 09:59       sell  0.5  GBPUSD  1.4327  1.4377  1.4277  2001.04.16 09:59  1.4330   0.00  -10.50  -3 0.00 1002.89 "" 0
#206250  2001.04.13 08:57        buy  0.5  GBPUSD  1.4375  1.4325  1.4425  2001.04.16 10:14  1.4325   1.58 -173.42 -50 0.00 1006.25 "" 0
#206252  2001.04.17 02:59        buy  0.5  GBPUSD  1.4403  1.4353  1.4453  2001.04.17 07:09  1.4353   0.00 -175.00 -50 0.00 1008.21 "" 0
#206253  2001.04.17 08:00       sell  0.5  GBPUSD  1.4310  1.4360  1.4260  2001.04.17 15:59  1.4292   0.00   63.00  18 0.00 1001.70 "" 0
#206240  2001.03.30 06:27        buy  0.1  GBPUSD  1.4245  0.0000  0.0000  2001.04.17 15:59  1.4289   2.99   33.79  44 0.00 199.43 "" 0
#206254  2001.04.18 09:59       sell  0.5  GBPUSD  1.4230  1.4280  1.4180  2001.04.18 13:07  1.4280   0.00 -175.00 -50 0.00 996.10 "" 0
#206255  2001.04.19 11:00        buy  0.5  GBPUSD  1.4363  1.4313  1.4392  2001.04.19 14:33  1.4392   0.00  101.50  29 0.00 1005.41 "" 0


Can someone help me write the regular expressions or other method to parse these into variables?
ASKER CERTIFIED SOLUTION
Avatar of TeRReF
TeRReF
Flag of Netherlands image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
You could use something like this to spilt up the columns:

$rowdata = preg_split('/\s+/', $line);
$i = 0;
$ticket = $rowdata[$i++];
$opentime = $rowdata[$i++] ." ". $rowdata[$i++];
if ($rowdata[$i] == 'balance') {
   $type = $rowdata[$i++];
   $lot = $rowdata[$i++];
   $cur = '';
} else {
   $type = $rowdata[$i++];
   $lot = $rowdata[$i++];
   $cur = $rowdata[$i++];
}
$open = $rowdata[$i++];
...

One problem with your code TeRReF is that it could not handle the first balance line correctly.
If you are doing one row at a time, this will give you an array of the values in the row.

$arrayofvalues = explode(" ",$inputstring);

http://us2.php.net/manual/en/function.explode.php
it's a fixed width format, and unless you can guarantee all fields exist under certain conditions, it's best to stick to the fixed width format.
you're probably better off using http://www.php.net/preg-match
$results = array();
$keys = array( "ticket", "datetime", ...etc... );

.. while loop through reading of file ...
  $pattern = "/(.{10})(.{16})(.{11})...etc.../"
  preg_match( $pattern, $line, $matches );
  array_push( $results, array_combine( $keys, $matches ) );
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of stormist
stormist

ASKER

Thanks all
You're welcome :)