Solved

Regular expressions help parsing a txt file in PHP

Posted on 2007-04-11
Last Modified: 2008-02-01
I need the following data parsed, with each value put into a variable. Note the variable number of spaces between columns and the extra, unimportant data at the end of each line. This is how the file reads:
ticket          open time       type  lot    curr    open      sl      tp        close time   close   swap  profit  pp
----------------------------------------------------------------------------------------------------------------------
#206237  2001.01.02 19:02    balance  0.0            0.00    0.00    0.00  2001.01.02 19:02    0.00   0.00 10000.00   0 0.00 0.00 "" 0
#206238  2001.03.22 06:59       sell  0.5  GBPUSD  1.4208  1.4258  1.4158  2001.03.23 02:00  1.4236  -1.84  -99.84 -28 0.00 994.56 "" 0
#206239  2001.03.26 02:00       sell  0.5  GBPUSD  1.4272  1.4322  1.4222  2001.03.26 03:43  1.4322   0.00 -175.00 -50 0.00 999.04 "" 0
#206242  2001.03.27 12:02        buy  0.1  GBPUSD  1.4350  1.4340  0.0000  2001.03.27 12:05  1.4340   0.00   -7.00 -10 0.00 200.90 "" 0
#206241  2001.03.27 08:59       sell  0.5  GBPUSD  1.4337  1.4387  1.4287  2001.03.27 13:42  1.4287   0.00  175.00  50 0.00 1003.59 "" 0
#206243  2001.03.27 20:59       sell  0.5  GBPUSD  1.4331  1.4381  1.4281  2001.03.28 06:36  1.4281  -1.84  173.16  50 0.00 1003.17 "" 0
#206244  2001.03.28 11:00       sell  0.5  GBPUSD  1.4318  1.4368  1.4268  2001.03.28 12:12  1.4368   0.00 -175.00 -50 0.00 1002.26 "" 0
#206247  2001.04.04 17:59       sell  0.5  GBPUSD  1.4324  1.4374  1.4274  2001.04.04 22:44  1.4374   0.00 -175.00 -50 0.00 1002.68 "" 0
#206245  2001.04.03 20:45       sell  0.5  GBPUSD  1.4330  1.4380  1.4280  2001.04.05 03:49  1.4380  -7.35 -182.35 -50 0.00 1003.10 "" 0
#206246  2001.04.04 06:59        buy  0.5  GBPUSD  1.4344  1.4294  1.4394  2001.04.05 03:52  1.4394   2.36  177.36  50 0.00 1004.08 "" 0
#206248  2001.04.04 23:00        buy  0.5  GBPUSD  1.4371  1.4321  1.4421  2001.04.05 06:24  1.4321   2.36 -172.64 -50 0.00 1005.97 "" 0
#206251  2001.04.16 09:59       sell  0.5  GBPUSD  1.4327  1.4377  1.4277  2001.04.16 09:59  1.4330   0.00  -10.50  -3 0.00 1002.89 "" 0
#206250  2001.04.13 08:57        buy  0.5  GBPUSD  1.4375  1.4325  1.4425  2001.04.16 10:14  1.4325   1.58 -173.42 -50 0.00 1006.25 "" 0
#206252  2001.04.17 02:59        buy  0.5  GBPUSD  1.4403  1.4353  1.4453  2001.04.17 07:09  1.4353   0.00 -175.00 -50 0.00 1008.21 "" 0
#206253  2001.04.17 08:00       sell  0.5  GBPUSD  1.4310  1.4360  1.4260  2001.04.17 15:59  1.4292   0.00   63.00  18 0.00 1001.70 "" 0
#206240  2001.03.30 06:27        buy  0.1  GBPUSD  1.4245  0.0000  0.0000  2001.04.17 15:59  1.4289   2.99   33.79  44 0.00 199.43 "" 0
#206254  2001.04.18 09:59       sell  0.5  GBPUSD  1.4230  1.4280  1.4180  2001.04.18 13:07  1.4280   0.00 -175.00 -50 0.00 996.10 "" 0
#206255  2001.04.19 11:00        buy  0.5  GBPUSD  1.4363  1.4313  1.4392  2001.04.19 14:33  1.4392   0.00  101.50  29 0.00 1005.41 "" 0


Can someone help me write the regular expressions or other method to parse these into variables?
Question by:stormist
LVL 29

Accepted Solution

by:
TeRReF earned 400 total points
ID: 18890365
How about this (you'll end up with an array that contains all values):

<?php

/*
$file = 'ticket          open time       type  lot    curr    open      sl      tp        close time   close   swap  profit  pp
----------------------------------------------------------------------------------------------------------------------
#206237  2001.01.02 19:02    balance  0.0            0.00    0.00    0.00  2001.01.02 19:02    0.00   0.00 10000.00   0 0.00 0.00 "" 0
#206238  2001.03.22 06:59       sell  0.5  GBPUSD  1.4208  1.4258  1.4158  2001.03.23 02:00  1.4236  -1.84  -99.84 -28 0.00 994.56 "" 0
#206239  2001.03.26 02:00       sell  0.5  GBPUSD  1.4272  1.4322  1.4222  2001.03.26 03:43  1.4322   0.00 -175.00 -50 0.00 999.04 "" 0
#206242  2001.03.27 12:02        buy  0.1  GBPUSD  1.4350  1.4340  0.0000  2001.03.27 12:05  1.4340   0.00   -7.00 -10 0.00 200.90 "" 0
#206241  2001.03.27 08:59       sell  0.5  GBPUSD  1.4337  1.4387  1.4287  2001.03.27 13:42  1.4287   0.00  175.00  50 0.00 1003.59 "" 0
#206243  2001.03.27 20:59       sell  0.5  GBPUSD  1.4331  1.4381  1.4281  2001.03.28 06:36  1.4281  -1.84  173.16  50 0.00 1003.17 "" 0
#206244  2001.03.28 11:00       sell  0.5  GBPUSD  1.4318  1.4368  1.4268  2001.03.28 12:12  1.4368   0.00 -175.00 -50 0.00 1002.26 "" 0
#206247  2001.04.04 17:59       sell  0.5  GBPUSD  1.4324  1.4374  1.4274  2001.04.04 22:44  1.4374   0.00 -175.00 -50 0.00 1002.68 "" 0
#206245  2001.04.03 20:45       sell  0.5  GBPUSD  1.4330  1.4380  1.4280  2001.04.05 03:49  1.4380  -7.35 -182.35 -50 0.00 1003.10 "" 0
#206246  2001.04.04 06:59        buy  0.5  GBPUSD  1.4344  1.4294  1.4394  2001.04.05 03:52  1.4394   2.36  177.36  50 0.00 1004.08 "" 0
#206248  2001.04.04 23:00        buy  0.5  GBPUSD  1.4371  1.4321  1.4421  2001.04.05 06:24  1.4321   2.36 -172.64 -50 0.00 1005.97 "" 0
#206251  2001.04.16 09:59       sell  0.5  GBPUSD  1.4327  1.4377  1.4277  2001.04.16 09:59  1.4330   0.00  -10.50  -3 0.00 1002.89 "" 0
#206250  2001.04.13 08:57        buy  0.5  GBPUSD  1.4375  1.4325  1.4425  2001.04.16 10:14  1.4325   1.58 -173.42 -50 0.00 1006.25 "" 0
#206252  2001.04.17 02:59        buy  0.5  GBPUSD  1.4403  1.4353  1.4453  2001.04.17 07:09  1.4353   0.00 -175.00 -50 0.00 1008.21 "" 0
#206253  2001.04.17 08:00       sell  0.5  GBPUSD  1.4310  1.4360  1.4260  2001.04.17 15:59  1.4292   0.00   63.00  18 0.00 1001.70 "" 0
#206240  2001.03.30 06:27        buy  0.1  GBPUSD  1.4245  0.0000  0.0000  2001.04.17 15:59  1.4289   2.99   33.79  44 0.00 199.43 "" 0
#206254  2001.04.18 09:59       sell  0.5  GBPUSD  1.4230  1.4280  1.4180  2001.04.18 13:07  1.4280   0.00 -175.00 -50 0.00 996.10 "" 0
#206255  2001.04.19 11:00        buy  0.5  GBPUSD  1.4363  1.4313  1.4392  2001.04.19 14:33  1.4392   0.00  101.50  29 0.00 1005.41 "" 0';
*/

$file = file_get_contents('/path/to/your_file.txt');

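// Rename the entries in $keys to the column names you want as array keys in $vars
$keys = array('col1', 'col2', 'col3', 'col4', 'col5',
              'col6', 'col7', 'col8', 'col9', 'col10',
              'col11', 'col12', 'col13', 'col14', 'col15');

// Split into lines, accepting both Unix and Windows line endings
$lines = preg_split('/\r?\n/', $file);

// Drop the header row and the dashed separator row
array_shift($lines);
array_shift($lines);

$vars = array();

foreach ($lines as $line) {
  // Skip blank lines (for example, after a trailing newline at the end of the file)
  if (trim($line) === '') {
    continue;
  }
  // Split on runs of whitespace and keep the first 15 tokens
  $vals = array_chunk(preg_split('/\s+/', trim($line)), 15);
  $vars[] = array_combine($keys, $vals[0]);
}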

print_r($vars);

?>
 
LVL 48

Expert Comment

by:hernst42
ID: 18890418
You could use something like this to split up the columns:

$rowdata = preg_split('/\s+/', $line);
$i = 0;
$ticket = $rowdata[$i++];
$opentime = $rowdata[$i++] ." ". $rowdata[$i++];
if ($rowdata[$i] == 'balance') {
   $type = $rowdata[$i++];
   $lot = $rowdata[$i++];
   $cur = '';
} else {
   $type = $rowdata[$i++];
   $lot = $rowdata[$i++];
   $cur = $rowdata[$i++];
}
$open = $rowdata[$i++];
...

One problem with your code, TeRReF, is that it cannot handle the first balance line correctly.
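
For reference, the token-based idea above could be completed roughly as follows. This is only a sketch: the helper name `parse_trade_line`, the key names, and the assumption that 'balance' rows are the only rows missing the currency column are illustrative and not confirmed anywhere in this thread.

```php
<?php
// Illustrative key names, following the header row of the sample file.
$keys = array('ticket', 'open_time', 'type', 'lot', 'curr', 'open', 'sl',
              'tp', 'close_time', 'close', 'swap', 'profit', 'pp');

// Hypothetical helper: turn one data line into a keyed array, or return null
// for lines that are not trade rows (header, dashed separator, blank lines).
function parse_trade_line($line, array $keys)
{
    $tokens = preg_split('/\s+/', trim($line));

    // Trade rows start with "#ticket" and have at least 14 tokens.
    if (strpos($tokens[0], '#') !== 0 || count($tokens) < 14) {
        return null;
    }

    // Assumption: only 'balance' rows lack the currency column. Insert an
    // empty placeholder so the remaining tokens line up with the other rows.
    if ($tokens[3] === 'balance') {
        array_splice($tokens, 5, 0, array(''));
    }
    if (count($tokens) < 15) {
        return null;
    }

    $values = array(
        $tokens[0],                      // ticket
        $tokens[1] . ' ' . $tokens[2],   // open time (date + time)
        $tokens[3],                      // type
        $tokens[4],                      // lot
        $tokens[5],                      // curr ('' for balance rows)
        $tokens[6],                      // open
        $tokens[7],                      // sl
        $tokens[8],                      // tp
        $tokens[9] . ' ' . $tokens[10],  // close time (date + time)
        $tokens[11],                     // close
        $tokens[12],                     // swap
        $tokens[13],                     // profit
        $tokens[14],                     // pp
    );

    return array_combine($keys, $values);
}

$line = '#206237  2001.01.02 19:02    balance  0.0            0.00    0.00    0.00  2001.01.02 19:02    0.00   0.00 10000.00   0 0.00 0.00 "" 0';
print_r(parse_trade_line($line, $keys));
```

The trailing tokens after `pp` are simply ignored, which matches the question's description of them as unimportant.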
 
LVL 27

Expert Comment

by:Cornelia Yoder
ID: 18890427
If you are doing one row at a time, this will give you an array of the values in the row.

$arrayofvalues = explode(" ",$inputstring);

http://us2.php.net/manual/en/function.explode.php
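
One thing to watch with this sample data: the columns are separated by runs of spaces, and `explode(" ", ...)` splits on every single space, so the result contains empty strings. The sketch below (the variable names are just for illustration) shows the difference and two ways to get a clean list of values.

```php
<?php
$inputstring = '#206238  2001.03.22 06:59       sell  0.5  GBPUSD';

// explode() splits on each individual space, so every extra space in a run
// produces an empty string in the result.
$raw = explode(' ', $inputstring);

// preg_split() on runs of whitespace, dropping empty pieces.
$clean = preg_split('/\s+/', $inputstring, -1, PREG_SPLIT_NO_EMPTY);

// Alternative: keep explode() but filter out the empty strings and reindex.
$filtered = array_values(array_filter($raw, 'strlen'));

print_r($clean);
```

Either cleaned-up form still treats the date and the time as two separate values, and still cannot tell that the currency column is missing on the balance row.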

 
LVL 8

Expert Comment

by:netmunky
ID: 18892821
It's a fixed-width format, and unless you can guarantee that every field is present on every line, it's best to parse it as fixed-width rather than split on whitespace.
You're probably better off using http://www.php.net/preg-match
$results = array();
$keys = array( "ticket", "datetime", ...etc... );

.. while loop through reading of file ...
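  $pattern = "/(.{10})(.{16})(.{11})...etc.../";
  if ( preg_match( $pattern, $line, $matches ) ) {
    // $matches[0] holds the whole match, so drop it before combining with $keys
    array_push( $results, array_combine( $keys, array_slice( $matches, 1 ) ) );
  }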
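
A fixed-width parse can also be done with `substr()` instead of a regular expression. The sketch below covers only the first six columns, and the offsets and lengths are assumptions measured from the sample rows as they appear in this thread; they should be checked against the real file before use. The helper name `parse_fixed_width` and the key names are made up for the example.

```php
<?php
// Assumed layout: column name => array(offset, length), measured from the
// sample rows in the question. Verify against the real file.
$layout = array(
    'ticket'    => array(0, 7),
    'open_time' => array(9, 16),
    'type'      => array(25, 11),
    'lot'       => array(36, 5),
    'curr'      => array(41, 8),
    'open'      => array(49, 8),
);

// Hypothetical helper: cut each column out by position and trim the padding.
function parse_fixed_width($line, array $layout)
{
    $row = array();
    foreach ($layout as $name => $span) {
        list($offset, $length) = $span;
        // The (string) cast covers older PHP versions, where substr() returns
        // false when the line is shorter than the offset.
        $row[$name] = trim((string) substr($line, $offset, $length));
    }
    return $row;
}

$line = '#206238  2001.03.22 06:59       sell  0.5  GBPUSD  1.4208';
print_r(parse_fixed_width($line, $layout));
```

Because each column is taken by position, a blank currency field on a balance row simply comes back as an empty string and the columns after it stay aligned.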
 
LVL 23

Assisted Solution

by:Stacy Spear
Stacy Spear earned 50 total points
ID: 18894180
TeRReF's code looks like it will work just fine.

Although it will work, the issue I see that needs to be addressed elsewhere in your code is validation of the data. I believe that's what you wanted the regex for: not only to pull the data, but to validate it at the same time. For instance, a proper record will have the open time before the close time. I disagree with the format chosen for that field, as it's designed to be human-readable instead of computationally friendly. Storing time as a Unix timestamp, for instance, is a far better choice. I typically modify the epoch used based on the company.

Based on all that, once you pull the data into a structure, use various means to validate it. I think an all-inclusive regex is not the right way (although it too would "get" all the data, doing the comparisons will eat lots of CPU cycles).
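
As a rough illustration of validating after parsing, the sketch below checks one already-parsed row. The function name `validate_trade` and the array keys are assumptions for the example, and `DateTime::createFromFormat()` requires PHP 5.3 or later.

```php
<?php
// Hypothetical validator: returns a list of problems found in one parsed row.
// Assumes the row uses the keys shown below and "Y.m.d H:i" timestamps.
function validate_trade(array $row)
{
    $errors = array();

    // The leading "!" resets unspecified fields (such as seconds) to zero,
    // so two identical strings always compare as equal.
    $format = '!Y.m.d H:i';
    $open   = DateTime::createFromFormat($format, $row['open_time']);
    $close  = DateTime::createFromFormat($format, $row['close_time']);

    if ($open === false || $close === false) {
        $errors[] = 'unparseable open or close time';
    } elseif ($open > $close) {
        $errors[] = 'open time is after close time';
    }

    // Numeric columns should contain numbers and nothing else.
    foreach (array('lot', 'open', 'close', 'swap', 'profit') as $field) {
        if (!is_numeric($row[$field])) {
            $errors[] = $field . ' is not numeric';
        }
    }

    return $errors;
}
```

An empty array means the row passed every check; anything else can be logged or used to reject the row.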
 
LVL 8

Assisted Solution

by:netmunky
netmunky earned 50 total points
ID: 18895089
If you want to convert the string to a time, you can use http://www.php.net/str_replace (to replace the . with -):
$fixed_date = str_replace(".","-","2006.03.12 12:03");

Then you can use http://www.php.net/strtotime to get the timestamp: $time = strtotime( $fixed_date );

TeRReF's solution almost works, except that it assumes that all fields exist on all lines, which in the case of the 'balance' line they do not. Hence the better solution is to follow the original fixed-width format. (Copy and paste the original text into a file and view it in a fixed-width font, and you will see that all columns line up exactly.)
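
Put together, the two steps might look like the sketch below. The function name `trade_time_to_timestamp` is made up for the example, and the UTC time zone is an assumption, since the thread does not say which zone the file's times are in.

```php
<?php
// Assumption: the file's times are treated as UTC. Change this if the
// source system used a different zone.
date_default_timezone_set('UTC');

// Hypothetical helper: convert "2006.03.12 12:03" to a Unix timestamp,
// or return null if the value cannot be parsed.
function trade_time_to_timestamp($value)
{
    // strtotime() understands "2006-03-12 12:03", so swap the dots first.
    $timestamp = strtotime(str_replace('.', '-', $value));
    return $timestamp === false ? null : $timestamp;
}

$time = trade_time_to_timestamp('2006.03.12 12:03');
echo date('Y-m-d H:i', $time), "\n";
```

Once both the open and close times are timestamps, comparing them or computing how long a trade was open becomes plain integer arithmetic.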
 
LVL 10

Author Comment

by:stormist
ID: 18903869
Thanks all
 
LVL 29

Expert Comment

by:TeRReF
ID: 18903907
You're welcome :)
Question has a verified solution.