Solved

Regular expressions help parsing a txt file in PHP

Posted on 2007-04-11
Last Modified: 2008-02-01
I need the following data parsed, with each value put into a variable. Note the variable number of spaces between columns and the extra, unimportant data at the end of each line. This is how the file reads:
ticket          open time       type  lot    curr    open      sl      tp        close time   close   swap  profit  pp
----------------------------------------------------------------------------------------------------------------------
#206237  2001.01.02 19:02    balance  0.0            0.00    0.00    0.00  2001.01.02 19:02    0.00   0.00 10000.00   0 0.00 0.00 "" 0
#206238  2001.03.22 06:59       sell  0.5  GBPUSD  1.4208  1.4258  1.4158  2001.03.23 02:00  1.4236  -1.84  -99.84 -28 0.00 994.56 "" 0
#206239  2001.03.26 02:00       sell  0.5  GBPUSD  1.4272  1.4322  1.4222  2001.03.26 03:43  1.4322   0.00 -175.00 -50 0.00 999.04 "" 0
#206242  2001.03.27 12:02        buy  0.1  GBPUSD  1.4350  1.4340  0.0000  2001.03.27 12:05  1.4340   0.00   -7.00 -10 0.00 200.90 "" 0
#206241  2001.03.27 08:59       sell  0.5  GBPUSD  1.4337  1.4387  1.4287  2001.03.27 13:42  1.4287   0.00  175.00  50 0.00 1003.59 "" 0
#206243  2001.03.27 20:59       sell  0.5  GBPUSD  1.4331  1.4381  1.4281  2001.03.28 06:36  1.4281  -1.84  173.16  50 0.00 1003.17 "" 0
#206244  2001.03.28 11:00       sell  0.5  GBPUSD  1.4318  1.4368  1.4268  2001.03.28 12:12  1.4368   0.00 -175.00 -50 0.00 1002.26 "" 0
#206247  2001.04.04 17:59       sell  0.5  GBPUSD  1.4324  1.4374  1.4274  2001.04.04 22:44  1.4374   0.00 -175.00 -50 0.00 1002.68 "" 0
#206245  2001.04.03 20:45       sell  0.5  GBPUSD  1.4330  1.4380  1.4280  2001.04.05 03:49  1.4380  -7.35 -182.35 -50 0.00 1003.10 "" 0
#206246  2001.04.04 06:59        buy  0.5  GBPUSD  1.4344  1.4294  1.4394  2001.04.05 03:52  1.4394   2.36  177.36  50 0.00 1004.08 "" 0
#206248  2001.04.04 23:00        buy  0.5  GBPUSD  1.4371  1.4321  1.4421  2001.04.05 06:24  1.4321   2.36 -172.64 -50 0.00 1005.97 "" 0
#206251  2001.04.16 09:59       sell  0.5  GBPUSD  1.4327  1.4377  1.4277  2001.04.16 09:59  1.4330   0.00  -10.50  -3 0.00 1002.89 "" 0
#206250  2001.04.13 08:57        buy  0.5  GBPUSD  1.4375  1.4325  1.4425  2001.04.16 10:14  1.4325   1.58 -173.42 -50 0.00 1006.25 "" 0
#206252  2001.04.17 02:59        buy  0.5  GBPUSD  1.4403  1.4353  1.4453  2001.04.17 07:09  1.4353   0.00 -175.00 -50 0.00 1008.21 "" 0
#206253  2001.04.17 08:00       sell  0.5  GBPUSD  1.4310  1.4360  1.4260  2001.04.17 15:59  1.4292   0.00   63.00  18 0.00 1001.70 "" 0
#206240  2001.03.30 06:27        buy  0.1  GBPUSD  1.4245  0.0000  0.0000  2001.04.17 15:59  1.4289   2.99   33.79  44 0.00 199.43 "" 0
#206254  2001.04.18 09:59       sell  0.5  GBPUSD  1.4230  1.4280  1.4180  2001.04.18 13:07  1.4280   0.00 -175.00 -50 0.00 996.10 "" 0
#206255  2001.04.19 11:00        buy  0.5  GBPUSD  1.4363  1.4313  1.4392  2001.04.19 14:33  1.4392   0.00  101.50  29 0.00 1005.41 "" 0


Can someone help me write the regular expressions or other method to parse these into variables?
Question by:stormist
8 Comments
 
LVL 29

Accepted Solution

by:
TeRReF earned 400 total points
ID: 18890365
How about this (you'll end up with an array that contains all values):

<?php

/*
$file = 'ticket          open time       type  lot    curr    open      sl      tp        close time   close   swap  profit  pp
----------------------------------------------------------------------------------------------------------------------
#206237  2001.01.02 19:02    balance  0.0            0.00    0.00    0.00  2001.01.02 19:02    0.00   0.00 10000.00   0 0.00 0.00 "" 0
#206238  2001.03.22 06:59       sell  0.5  GBPUSD  1.4208  1.4258  1.4158  2001.03.23 02:00  1.4236  -1.84  -99.84 -28 0.00 994.56 "" 0
#206239  2001.03.26 02:00       sell  0.5  GBPUSD  1.4272  1.4322  1.4222  2001.03.26 03:43  1.4322   0.00 -175.00 -50 0.00 999.04 "" 0
#206242  2001.03.27 12:02        buy  0.1  GBPUSD  1.4350  1.4340  0.0000  2001.03.27 12:05  1.4340   0.00   -7.00 -10 0.00 200.90 "" 0
#206241  2001.03.27 08:59       sell  0.5  GBPUSD  1.4337  1.4387  1.4287  2001.03.27 13:42  1.4287   0.00  175.00  50 0.00 1003.59 "" 0
#206243  2001.03.27 20:59       sell  0.5  GBPUSD  1.4331  1.4381  1.4281  2001.03.28 06:36  1.4281  -1.84  173.16  50 0.00 1003.17 "" 0
#206244  2001.03.28 11:00       sell  0.5  GBPUSD  1.4318  1.4368  1.4268  2001.03.28 12:12  1.4368   0.00 -175.00 -50 0.00 1002.26 "" 0
#206247  2001.04.04 17:59       sell  0.5  GBPUSD  1.4324  1.4374  1.4274  2001.04.04 22:44  1.4374   0.00 -175.00 -50 0.00 1002.68 "" 0
#206245  2001.04.03 20:45       sell  0.5  GBPUSD  1.4330  1.4380  1.4280  2001.04.05 03:49  1.4380  -7.35 -182.35 -50 0.00 1003.10 "" 0
#206246  2001.04.04 06:59        buy  0.5  GBPUSD  1.4344  1.4294  1.4394  2001.04.05 03:52  1.4394   2.36  177.36  50 0.00 1004.08 "" 0
#206248  2001.04.04 23:00        buy  0.5  GBPUSD  1.4371  1.4321  1.4421  2001.04.05 06:24  1.4321   2.36 -172.64 -50 0.00 1005.97 "" 0
#206251  2001.04.16 09:59       sell  0.5  GBPUSD  1.4327  1.4377  1.4277  2001.04.16 09:59  1.4330   0.00  -10.50  -3 0.00 1002.89 "" 0
#206250  2001.04.13 08:57        buy  0.5  GBPUSD  1.4375  1.4325  1.4425  2001.04.16 10:14  1.4325   1.58 -173.42 -50 0.00 1006.25 "" 0
#206252  2001.04.17 02:59        buy  0.5  GBPUSD  1.4403  1.4353  1.4453  2001.04.17 07:09  1.4353   0.00 -175.00 -50 0.00 1008.21 "" 0
#206253  2001.04.17 08:00       sell  0.5  GBPUSD  1.4310  1.4360  1.4260  2001.04.17 15:59  1.4292   0.00   63.00  18 0.00 1001.70 "" 0
#206240  2001.03.30 06:27        buy  0.1  GBPUSD  1.4245  0.0000  0.0000  2001.04.17 15:59  1.4289   2.99   33.79  44 0.00 199.43 "" 0
#206254  2001.04.18 09:59       sell  0.5  GBPUSD  1.4230  1.4280  1.4180  2001.04.18 13:07  1.4280   0.00 -175.00 -50 0.00 996.10 "" 0
#206255  2001.04.19 11:00        buy  0.5  GBPUSD  1.4363  1.4313  1.4392  2001.04.19 14:33  1.4392   0.00  101.50  29 0.00 1005.41 "" 0';
*/

$file = file_get_contents('/path/to/your_file.txt');

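// Rename the entries in $keys so that they describe the values they are paired with in the final $vars array
$keys = array('col1', 'col2', 'col3', 'col4', 'col5',
              'col6', 'col7', 'col8', 'col9', 'col10',
              'col11', 'col12', 'col13', 'col14', 'col15');

// Split into lines, accepting both \n and \r\n line endings
$lines = preg_split('/\r?\n/', $file);

// Drop the header line and the dashed separator line
array_shift($lines);
array_shift($lines);

$vars = array();

foreach ($lines as $line) {
  // Skip blank lines (for example a trailing newline at the end of the file)
  if (trim($line) === '') {
    continue;
  }
  // Split on runs of whitespace and keep only the first 15 fields
  $vals = array_chunk(preg_split('/\s+/', trim($line)), 15);
  $vars[] = array_combine($keys, $vals[0]);
}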

print_r($vars);

?>
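
The 15 placeholder keys line up with the 13 header columns once you count the date and time halves of each timestamp as separate tokens. A small sketch with descriptive key names, applied to one regular trade line from the question. The key names are my own choice, not part of the original answer, and a line without a currency value would still shift by one column:

```php
<?php
// Key names are illustrative; each timestamp splits into a date token and a time token.
$keys = array('ticket', 'open_date', 'open_time', 'type', 'lot',
              'curr', 'open', 'sl', 'tp', 'close_date',
              'close_time', 'close', 'swap', 'profit', 'pp');

$line = '#206238  2001.03.22 06:59       sell  0.5  GBPUSD  1.4208  1.4258  1.4158  2001.03.23 02:00  1.4236  -1.84  -99.84 -28 0.00 994.56 "" 0';

// Split on runs of whitespace and keep the first 15 tokens.
$tokens = preg_split('/\s+/', trim($line));
$row    = array_combine($keys, array_slice($tokens, 0, 15));

print_r($row);
```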
 
LVL 48

Expert Comment

by:hernst42
ID: 18890418
You could use something like this to split up the columns:

$rowdata = preg_split('/\s+/', $line);
$i = 0;
$ticket = $rowdata[$i++];
$opentime = $rowdata[$i++] ." ". $rowdata[$i++];
if ($rowdata[$i] == 'balance') {
   $type = $rowdata[$i++];
   $lot = $rowdata[$i++];
   $cur = '';
} else {
   $type = $rowdata[$i++];
   $lot = $rowdata[$i++];
   $cur = $rowdata[$i++];
}
$open = $rowdata[$i++];
...

One problem with your code, TeRReF, is that it does not handle the first (balance) line correctly: that line has no currency value, so every field after it shifts one column to the left.
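
If you would rather stay with a single regular expression, one way to cope with the balance line is to make the currency column optional. A minimal sketch, assuming the layout shown in the question; parse_trade_line() and the group names are hypothetical, and the trailing fields after pp are ignored:

```php
<?php
// Hypothetical helper: parse one report line into named fields.
// The currency group is optional, so the "balance" row still lines up.
function parse_trade_line($line)
{
    $num  = '-?\d+(?:\.\d+)?';                  // signed decimal, e.g. -99.84
    $time = '\d{4}\.\d{2}\.\d{2} \d{2}:\d{2}';  // e.g. 2001.03.22 06:59

    $pattern = '/^#(?P<ticket>\d+)\s+'
        . '(?P<open_time>' . $time . ')\s+'
        . '(?P<type>\w+)\s+'
        . '(?P<lot>' . $num . ')\s+'
        . '(?:(?P<curr>[A-Z]+)\s+)?'
        . '(?P<open>' . $num . ')\s+'
        . '(?P<sl>' . $num . ')\s+'
        . '(?P<tp>' . $num . ')\s+'
        . '(?P<close_time>' . $time . ')\s+'
        . '(?P<close>' . $num . ')\s+'
        . '(?P<swap>' . $num . ')\s+'
        . '(?P<profit>' . $num . ')\s+'
        . '(?P<pp>-?\d+)\b/';

    if (!preg_match($pattern, trim($line), $m)) {
        return null; // line does not look like a trade row
    }

    // Keep only the named groups, dropping the numeric duplicates.
    $row = array();
    foreach ($m as $key => $value) {
        if (is_string($key)) {
            $row[$key] = $value;
        }
    }
    return $row;
}

print_r(parse_trade_line('#206237  2001.01.02 19:02    balance  0.0            0.00    0.00    0.00  2001.01.02 19:02    0.00   0.00 10000.00   0 0.00 0.00 "" 0'));
```

The header and separator lines do not match the pattern, so the function returns null for them and they can simply be skipped.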
 
LVL 27

Expert Comment

by:yodercm
ID: 18890427
If you are doing one row at a time, this will give you an array of the values in the row.

$arrayofvalues = explode(" ",$inputstring);

http://us2.php.net/manual/en/function.explode.php
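
One caveat with this data: the columns are separated by runs of spaces, and explode() on a single space returns an empty string for every extra space. A short sketch of the difference, using the start of one line from the question; preg_split() with PREG_SPLIT_NO_EMPTY is one way around it:

```php
<?php
$line = '#206238  2001.03.22 06:59       sell  0.5';

// explode() splits on every single space, so runs of spaces yield empty strings.
$exploded = explode(' ', $line);

// preg_split() on \s+ treats each run of whitespace as one separator.
$split = preg_split('/\s+/', $line, -1, PREG_SPLIT_NO_EMPTY);

print_r($exploded);
print_r($split);
```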
 
LVL 8

Expert Comment

by:netmunky
ID: 18892821
It's a fixed-width format, and unless you can guarantee that every field is present on every line, it's best to parse it as fixed width.
You're probably better off using http://www.php.net/preg-match
$results = array();
$keys = array( "ticket", "datetime", ...etc... );

.. while loop through reading of file ...
  $pattern = "/(.{10})(.{16})(.{11})...etc.../";
  preg_match( $pattern, $line, $matches );
  // $matches[0] holds the whole match, so drop it before pairing values with keys
  array_push( $results, array_combine( $keys, array_slice( $matches, 1 ) ) );
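
The same fixed-width idea can also be written with substr() instead of a regex. This is only a sketch: the offsets and widths below are my own measurements from the sample rows as they appear in this thread, parse_fixed_width() is a hypothetical helper, and only the first six columns are covered. Check the positions against the real file before relying on them.

```php
<?php
// Assumed layout: (name, zero-based offset, width), measured from the sample
// rows as pasted in this thread. Verify against the real file.
$columns = array(
    array('ticket',     0,  7),
    array('open_time',  9, 16),
    array('type',      25, 11),
    array('lot',       36,  5),
    array('curr',      41,  8),
    array('open',      49,  8),
);

// Hypothetical helper: cut each column out by position and trim the padding.
function parse_fixed_width($line, array $columns)
{
    $row = array();
    foreach ($columns as $col) {
        list($name, $offset, $width) = $col;
        // The cast covers older PHP versions where substr() may return false.
        $row[$name] = trim((string) substr($line, $offset, $width));
    }
    return $row;
}

$balance = '#206237  2001.01.02 19:02    balance  0.0            0.00    0.00    0.00';
print_r(parse_fixed_width($balance, $columns));
```

Because each value is read by position, a blank currency column comes back as an empty string and the following columns stay where they belong.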
 
LVL 23

Assisted Solution

by:Stacy Spear
Stacy Spear earned 50 total points
ID: 18894180
TeRReF's code looks like it will work just fine.

Although it will work, the issue I see that needs to be addressed elsewhere in your code is validation of the data. I believe that's what you wanted the regex for: not only to pull the data, but to validate it at the same time. For instance, a proper record will have the open time before the close time. I disagree with the format chosen for that field, as it's designed to be human-readable rather than computation-friendly. Storing time as a Unix timestamp, for instance, is a far better choice. I typically modify the epoch used based on the company.

Based on all that, once you pull the data into a structure, use various means to validate it. I think an all-inclusive regex is not the right way (although it too would "get" all the data, doing the comparisons will eat lots of CPU cycles).
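
As an illustration of validating after parsing, here is a rough sketch of the open/close check. times_in_order() is a hypothetical helper, and the UTC time zone is an assumption since the report does not name one. It accepts equal times, because the sample data contains rows that open and close in the same minute:

```php
<?php
// Hypothetical validator: true when the open time is not later than the close time.
function times_in_order($openTime, $closeTime)
{
    $tz = new DateTimeZone('UTC'); // assumption: the report does not state a time zone

    // The leading "!" resets fields missing from the format (seconds) to zero
    // instead of filling them in from the current clock.
    $open  = DateTime::createFromFormat('!Y.m.d H:i', $openTime, $tz);
    $close = DateTime::createFromFormat('!Y.m.d H:i', $closeTime, $tz);

    if ($open === false || $close === false) {
        return false; // timestamp could not be parsed
    }
    return $open <= $close;
}

var_dump(times_in_order('2001.03.22 06:59', '2001.03.23 02:00'));
```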
 
LVL 8

Assisted Solution

by:netmunky
netmunky earned 50 total points
ID: 18895089
If you want to convert the string to a time, you can use http://www.php.net/str_replace (to replace the . with -):
$fixed_date = str_replace(".","-","2006.03.12 12:03");

Then you can use http://www.php.net/strtotime:
$time = strtotime( $fixed_date );
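
Put together as a runnable sketch. report_time_to_timestamp() is a hypothetical name, and setting the default time zone to UTC is an assumption, since strtotime() interprets the string in whatever default time zone is configured:

```php
<?php
// Assumption: interpret timestamps as UTC; the report itself names no time zone.
date_default_timezone_set('UTC');

// Hypothetical helper: "2006.03.12 12:03" -> Unix timestamp, or false on failure.
function report_time_to_timestamp($value)
{
    $fixed = str_replace('.', '-', $value); // becomes 2006-03-12 12:03
    return strtotime($fixed);
}

$ts = report_time_to_timestamp('2006.03.12 12:03');

// Format the timestamp back to check the round trip.
echo date('Y-m-d H:i', $ts), "\n";
```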

TeRReF's solution almost works, except that it assumes all fields exist on all lines, which is not the case for the 'balance' line. Hence the better solution is to follow the original fixed-width format. (Copy and paste the original text into a file and view it in a fixed-width font, and you will see that all columns line up exactly.)
 
LVL 10

Author Comment

by:stormist
ID: 18903869
Thanks all
 
LVL 29

Expert Comment

by:TeRReF
ID: 18903907
You're welcome :)