Solved

Regular expressions help parsing a txt file in PHP

Posted on 2007-04-11
Last Modified: 2008-02-01
I need the following data parsed and each value put into a variable. You'll notice the variable number of spaces between fields and the extra unimportant data at the end of each line. This is how the file reads:
ticket          open time       type  lot    curr    open      sl      tp        close time   close   swap  profit  pp
----------------------------------------------------------------------------------------------------------------------
#206237  2001.01.02 19:02    balance  0.0            0.00    0.00    0.00  2001.01.02 19:02    0.00   0.00 10000.00   0 0.00 0.00 "" 0
#206238  2001.03.22 06:59       sell  0.5  GBPUSD  1.4208  1.4258  1.4158  2001.03.23 02:00  1.4236  -1.84  -99.84 -28 0.00 994.56 "" 0
#206239  2001.03.26 02:00       sell  0.5  GBPUSD  1.4272  1.4322  1.4222  2001.03.26 03:43  1.4322   0.00 -175.00 -50 0.00 999.04 "" 0
#206242  2001.03.27 12:02        buy  0.1  GBPUSD  1.4350  1.4340  0.0000  2001.03.27 12:05  1.4340   0.00   -7.00 -10 0.00 200.90 "" 0
#206241  2001.03.27 08:59       sell  0.5  GBPUSD  1.4337  1.4387  1.4287  2001.03.27 13:42  1.4287   0.00  175.00  50 0.00 1003.59 "" 0
#206243  2001.03.27 20:59       sell  0.5  GBPUSD  1.4331  1.4381  1.4281  2001.03.28 06:36  1.4281  -1.84  173.16  50 0.00 1003.17 "" 0
#206244  2001.03.28 11:00       sell  0.5  GBPUSD  1.4318  1.4368  1.4268  2001.03.28 12:12  1.4368   0.00 -175.00 -50 0.00 1002.26 "" 0
#206247  2001.04.04 17:59       sell  0.5  GBPUSD  1.4324  1.4374  1.4274  2001.04.04 22:44  1.4374   0.00 -175.00 -50 0.00 1002.68 "" 0
#206245  2001.04.03 20:45       sell  0.5  GBPUSD  1.4330  1.4380  1.4280  2001.04.05 03:49  1.4380  -7.35 -182.35 -50 0.00 1003.10 "" 0
#206246  2001.04.04 06:59        buy  0.5  GBPUSD  1.4344  1.4294  1.4394  2001.04.05 03:52  1.4394   2.36  177.36  50 0.00 1004.08 "" 0
#206248  2001.04.04 23:00        buy  0.5  GBPUSD  1.4371  1.4321  1.4421  2001.04.05 06:24  1.4321   2.36 -172.64 -50 0.00 1005.97 "" 0
#206251  2001.04.16 09:59       sell  0.5  GBPUSD  1.4327  1.4377  1.4277  2001.04.16 09:59  1.4330   0.00  -10.50  -3 0.00 1002.89 "" 0
#206250  2001.04.13 08:57        buy  0.5  GBPUSD  1.4375  1.4325  1.4425  2001.04.16 10:14  1.4325   1.58 -173.42 -50 0.00 1006.25 "" 0
#206252  2001.04.17 02:59        buy  0.5  GBPUSD  1.4403  1.4353  1.4453  2001.04.17 07:09  1.4353   0.00 -175.00 -50 0.00 1008.21 "" 0
#206253  2001.04.17 08:00       sell  0.5  GBPUSD  1.4310  1.4360  1.4260  2001.04.17 15:59  1.4292   0.00   63.00  18 0.00 1001.70 "" 0
#206240  2001.03.30 06:27        buy  0.1  GBPUSD  1.4245  0.0000  0.0000  2001.04.17 15:59  1.4289   2.99   33.79  44 0.00 199.43 "" 0
#206254  2001.04.18 09:59       sell  0.5  GBPUSD  1.4230  1.4280  1.4180  2001.04.18 13:07  1.4280   0.00 -175.00 -50 0.00 996.10 "" 0
#206255  2001.04.19 11:00        buy  0.5  GBPUSD  1.4363  1.4313  1.4392  2001.04.19 14:33  1.4392   0.00  101.50  29 0.00 1005.41 "" 0


Can someone help me write the regular expressions or other method to parse these into variables?
Question by:stormist
 
LVL 29

Accepted Solution

by:
TeRReF earned 400 total points
ID: 18890365
How about this (you'll end up with an array that contains all values):

<?php

/*
$file = 'ticket          open time       type  lot    curr    open      sl      tp        close time   close   swap  profit  pp
----------------------------------------------------------------------------------------------------------------------
#206237  2001.01.02 19:02    balance  0.0            0.00    0.00    0.00  2001.01.02 19:02    0.00   0.00 10000.00   0 0.00 0.00 "" 0
#206238  2001.03.22 06:59       sell  0.5  GBPUSD  1.4208  1.4258  1.4158  2001.03.23 02:00  1.4236  -1.84  -99.84 -28 0.00 994.56 "" 0
#206239  2001.03.26 02:00       sell  0.5  GBPUSD  1.4272  1.4322  1.4222  2001.03.26 03:43  1.4322   0.00 -175.00 -50 0.00 999.04 "" 0
#206242  2001.03.27 12:02        buy  0.1  GBPUSD  1.4350  1.4340  0.0000  2001.03.27 12:05  1.4340   0.00   -7.00 -10 0.00 200.90 "" 0
#206241  2001.03.27 08:59       sell  0.5  GBPUSD  1.4337  1.4387  1.4287  2001.03.27 13:42  1.4287   0.00  175.00  50 0.00 1003.59 "" 0
#206243  2001.03.27 20:59       sell  0.5  GBPUSD  1.4331  1.4381  1.4281  2001.03.28 06:36  1.4281  -1.84  173.16  50 0.00 1003.17 "" 0
#206244  2001.03.28 11:00       sell  0.5  GBPUSD  1.4318  1.4368  1.4268  2001.03.28 12:12  1.4368   0.00 -175.00 -50 0.00 1002.26 "" 0
#206247  2001.04.04 17:59       sell  0.5  GBPUSD  1.4324  1.4374  1.4274  2001.04.04 22:44  1.4374   0.00 -175.00 -50 0.00 1002.68 "" 0
#206245  2001.04.03 20:45       sell  0.5  GBPUSD  1.4330  1.4380  1.4280  2001.04.05 03:49  1.4380  -7.35 -182.35 -50 0.00 1003.10 "" 0
#206246  2001.04.04 06:59        buy  0.5  GBPUSD  1.4344  1.4294  1.4394  2001.04.05 03:52  1.4394   2.36  177.36  50 0.00 1004.08 "" 0
#206248  2001.04.04 23:00        buy  0.5  GBPUSD  1.4371  1.4321  1.4421  2001.04.05 06:24  1.4321   2.36 -172.64 -50 0.00 1005.97 "" 0
#206251  2001.04.16 09:59       sell  0.5  GBPUSD  1.4327  1.4377  1.4277  2001.04.16 09:59  1.4330   0.00  -10.50  -3 0.00 1002.89 "" 0
#206250  2001.04.13 08:57        buy  0.5  GBPUSD  1.4375  1.4325  1.4425  2001.04.16 10:14  1.4325   1.58 -173.42 -50 0.00 1006.25 "" 0
#206252  2001.04.17 02:59        buy  0.5  GBPUSD  1.4403  1.4353  1.4453  2001.04.17 07:09  1.4353   0.00 -175.00 -50 0.00 1008.21 "" 0
#206253  2001.04.17 08:00       sell  0.5  GBPUSD  1.4310  1.4360  1.4260  2001.04.17 15:59  1.4292   0.00   63.00  18 0.00 1001.70 "" 0
#206240  2001.03.30 06:27        buy  0.1  GBPUSD  1.4245  0.0000  0.0000  2001.04.17 15:59  1.4289   2.99   33.79  44 0.00 199.43 "" 0
#206254  2001.04.18 09:59       sell  0.5  GBPUSD  1.4230  1.4280  1.4180  2001.04.18 13:07  1.4280   0.00 -175.00 -50 0.00 996.10 "" 0
#206255  2001.04.19 11:00        buy  0.5  GBPUSD  1.4363  1.4313  1.4392  2001.04.19 14:33  1.4392   0.00  101.50  29 0.00 1005.41 "" 0';
*/

$file = file_get_contents('/path/to/your_file.txt');

// change values in the keys array in order to let them correspond with the values in the final $vars array
$keys = array('col1', 'col2', 'col3', 'col4', 'col5',
              'col6', 'col7', 'col8', 'col9', 'col10',
              'col11', 'col12', 'col13', 'col14', 'col15');

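// split on any line ending (\n, \r\n or \r)
$lines = preg_split('/\R/', $file);

// drop the header line and the dashed separator line
array_shift($lines);
array_shift($lines);

$vars = array();

foreach ($lines as $line) {
  // split on runs of whitespace and keep only the first 15 tokens
  $vals = array_slice(preg_split('/\s+/', trim($line)), 0, 15);
  // skip blank or short lines, which array_combine() would reject
  if (count($vals) === 15) {
    $vars[] = array_combine($keys, $vals);
  }
}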

print_r($vars);

?>
 
LVL 48

Expert Comment

by:hernst42
ID: 18890418
You could use something like this to split up the columns:

$rowdata = preg_split('/\s+/', $line);
$i = 0;
$ticket = $rowdata[$i++];
$opentime = $rowdata[$i++] ." ". $rowdata[$i++];
if ($rowdata[$i] == 'balance') {
   $type = $rowdata[$i++];
   $lot = $rowdata[$i++];
   $cur = '';
} else {
   $type = $rowdata[$i++];
   $lot = $rowdata[$i++];
   $cur = $rowdata[$i++];
}
$open = $rowdata[$i++];
...

One problem with your code, TeRReF, is that it cannot handle the first balance line correctly.
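A fuller sketch of the same idea, in case it helps. It assumes that only 'balance' rows lack the curr column (which is all the sample shows), and the function name and key names are made up for illustration:

```php
<?php
// Hypothetical helper: split one row on whitespace and re-insert the
// empty "curr" column when the row type is "balance".
// Returns null when the row has fewer than 15 tokens.
function parse_row_tokens(string $line): ?array
{
    // Illustrative key names; the date and time halves are separate tokens.
    $keys = array('ticket', 'open_date', 'open_time', 'type', 'lot', 'curr',
                  'open', 'sl', 'tp', 'close_date', 'close_time', 'close',
                  'swap', 'profit', 'pp');

    $tokens = preg_split('/\s+/', trim($line));

    if (isset($tokens[3]) && $tokens[3] === 'balance') {
        // Assumption: balance rows have no currency pair, so put '' at index 5.
        array_splice($tokens, 5, 0, array(''));
    }

    // Keep the 15 wanted columns and ignore the trailing data.
    $tokens = array_slice($tokens, 0, 15);
    if (count($tokens) < 15) {
        return null;
    }

    return array_combine($keys, $tokens);
}

$balance = parse_row_tokens('#206237  2001.01.02 19:02    balance  0.0            0.00    0.00    0.00  2001.01.02 19:02    0.00   0.00 10000.00   0 0.00 0.00 "" 0');
$sell = parse_row_tokens('#206238  2001.03.22 06:59       sell  0.5  GBPUSD  1.4208  1.4258  1.4158  2001.03.23 02:00  1.4236  -1.84  -99.84 -28 0.00 994.56 "" 0');
```

If other row types can also have blank columns, this check will not catch them and the values will shift again.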
 
LVL 27

Expert Comment

by:yodercm
ID: 18890427
If you are doing one row at a time, this will give you an array of the values in the row.

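// consecutive spaces produce empty strings, so filter them out
$arrayofvalues = array_values(array_filter(explode(" ", $inputstring), 'strlen'));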

http://us2.php.net/manual/en/function.explode.php
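Because the fields are separated by a variable number of spaces, splitting on a single space returns an empty string for every extra space. A small sketch of the difference, using a shortened row from the question (the variable names are only illustrative):

```php
<?php
// A shortened sample row with runs of spaces between the fields.
$inputstring = '#206238  2001.03.22 06:59       sell  0.5  GBPUSD';

// explode() on a single space yields '' for every extra space.
$exploded = explode(' ', $inputstring);

// preg_split() on runs of whitespace, dropping any empty pieces.
$split = preg_split('/\s+/', $inputstring, -1, PREG_SPLIT_NO_EMPTY);

print_r($split);
```

Note that either way the date and the time end up as two separate values.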
 
LVL 8

Expert Comment

by:netmunky
ID: 18892821
It's a fixed-width format, and unless you can guarantee that all fields exist under certain conditions, it's best to stick to the fixed-width format.
You're probably better off using http://www.php.net/preg-match
$results = array();
$keys = array( "ticket", "datetime", ...etc... );

.. while loop through reading of file ...
  $pattern = "/(.{10})(.{16})(.{11})...etc.../";
  preg_match( $pattern, $line, $matches );
  // $matches[0] holds the whole match, so drop it before combining with $keys
  array_push( $results, array_combine( $keys, array_slice( $matches, 1 ) ) );
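The same fixed-width idea can also be sketched with substr() instead of a pattern. The offsets and widths below were measured from the sample rows as they appear in the question, so treat them as assumptions and re-measure against the real file; the helper name and key names are made up, and only the first six columns are cut out here:

```php
<?php
// Column start offsets and widths, measured from the sample rows in the
// question. These are assumptions and must be checked against the real file.
$layout = array(
    'ticket'    => array(0, 7),
    'open_time' => array(9, 16),
    'type'      => array(25, 11),
    'lot'       => array(36, 5),
    'curr'      => array(41, 8),
    'open'      => array(49, 8),
);

// Hypothetical helper: cut each field out by position and trim the padding.
function parse_fixed_width(string $line, array $layout): array
{
    $row = array();
    foreach ($layout as $name => list($start, $length)) {
        $row[$name] = trim(substr($line, $start, $length));
    }
    return $row;
}

$balance = parse_fixed_width('#206237  2001.01.02 19:02    balance  0.0            0.00    0.00    0.00  2001.01.02 19:02    0.00   0.00 10000.00   0 0.00 0.00 "" 0', $layout);
$sell = parse_fixed_width('#206238  2001.03.22 06:59       sell  0.5  GBPUSD  1.4208  1.4258  1.4158  2001.03.23 02:00  1.4236  -1.84  -99.84 -28 0.00 994.56 "" 0', $layout);
```

Because each field is taken by position, the blank curr column in the balance row simply comes back as an empty string and nothing after it shifts.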
LVL 23

Assisted Solution

by:Stacy Spear
Stacy Spear earned 50 total points
ID: 18894180
TeRReF's code looks like it will work just fine.

Although it will work, the issue I see that needs to be addressed elsewhere in your code is the validation of the data. I believe that's what you wanted the regex for: not only to pull the data, but to validate it at the same time. For instance, a proper record will have the open ticket time before the close time. I disagree with the format chosen for that field, as it's designed to be human-readable instead of computation-friendly. Storing time as a Unix timestamp, for instance, is a far better choice. I typically modify the epoch used based on the company.

Based on all that, once you pull the data into a structure, use various means to validate it. I think an all-inclusive regex is not the right way (although it too would "get" all the data, doing the comparisons will eat lots of CPU cycles).
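As a rough sketch of the open/close check, done after the values are already in a structure: the function name is made up, and treating equal open and close times as valid is an assumption based on the sample, where some rows open and close in the same minute.

```php
<?php
// Hypothetical helper: true when both timestamps match the file's
// "YYYY.MM.DD HH:MM" format and the open time is not after the close time.
function times_are_consistent(string $open, string $close): bool
{
    // The leading "!" resets fields missing from the format (seconds) to
    // zero instead of filling them in from the current time.
    $format = '!Y.m.d H:i';

    $opened = DateTime::createFromFormat($format, $open);
    $closed = DateTime::createFromFormat($format, $close);

    if ($opened === false || $closed === false) {
        return false; // at least one timestamp does not match the format
    }

    return $opened <= $closed;
}

var_dump(times_are_consistent('2001.03.22 06:59', '2001.03.23 02:00'));
```

Calling getTimestamp() on either DateTime object gives a Unix timestamp if you want to store the value that way.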
 
LVL 8

Assisted Solution

by:netmunky
netmunky earned 50 total points
ID: 18895089
If you want to convert the string to a time, you can use http://www.php.net/str_replace (to replace the . with -)
$fixed_date = str_replace(".","-","2006.03.12 12:03");

Then you can use http://www.php.net/strtotime for $time = strtotime( $fixed_date );

And TeRReF's solution almost works, except that it assumes that all fields exist on all lines, which in the case of the 'balance' line they do not. Hence the better solution is to follow the original fixed-width format. (Copy/paste the original text into a file, view it in a fixed-width font, and you will see that all columns line up exactly.)
 
LVL 10

Author Comment

by:stormist
ID: 18903869
Thanks all
 
LVL 29

Expert Comment

by:TeRReF
ID: 18903907
You're welcome :)