Solved

HOW DO I ORGANIZE AND DISPLAY TXT VIA PHP

Posted on 2009-05-04
Last Modified: 2012-05-07
I have a large, unorganized text file that is generated by a system.  Attached are two files: beginningfile.txt and endresult.txt.  I would like the contents organized and displayed on a web page via PHP.  ENDRESULT is just an example; whatever layout is easiest will be fine.  Essentially, instead of searching or printing the text file, looking for '123456' and '456789' as the unique IDs, and manually matching up the rest of their respective data, I would like PHP to search for 123456 or 456789 (in this case) and put the matching pieces together.  In one place in the file the ID appears as 02123456, in another as 04123456, and in yet another as just 123456 (Section III, column I, if that piece of the text were in a CSV file).

Bottom line: there are three separate pieces of data that should be together, but aren't.  (For convenience, I have labeled them Section I, II, and III, although those titles don't exist in the real file.)  I want to gather these three separated values for the same unique ID together somehow.  Alternative layouts are welcome.  Many thanks!

Again, ENDRESULT of course wouldn't be an endresult.txt file; it would just be HTML in Firefox or another browser.
BEGINNINGFILE.txt
ENDRESULT.txt
Question by:weklica
15 Comments
 
LVL 4

Expert Comment

by:davidsperling
ID: 24302294
0
 

Author Comment

by:weklica
ID: 24307112
Thanks.  I will look into these and see what I can come up with.  If anybody has code examples they would like to post, keep in mind that 123456 and 456789 are just examples.  There will be thousands of such numbers in the file.
0
 
LVL 19

Expert Comment

by:NerdsOfTech
ID: 24320485
When you say there are thousands of numbers in the file, are they separated into sections? If so, how are the other sections "linked"?

Thanks
0
 

Author Comment

by:weklica
ID: 24321300
The file would look like the following, only a hundred times longer.  Just remember to think of 02123456 and 02456789 as examples only.  The first section blocked off below would likely have 200+ different numbers (all prefixed with 02).

 ---------------------                                                                                                                                                                                                                                                                    
02123456        test1       test1    321321                                                                                                                                                                                                                                                                                
02456789        test2         test2   654654                                                                                                                                                                                                                                                                                
 ---------------------                                                                                                                                                                                                                                                           04123456       123456789121 1234567     26    001  0002172NPBAR                                         V11.22 V33.6                                                                                  7654321   98765432                                                                                                        AB1

04456789      123456789121 1234567     25    002  0002172ASDF                                         V11.22 V33.6                                                                                  7654321   98765432                                                                                                        AB2

04456789      123456789121 1234567     25    002  0002172ASDF                                         V11.22 V33.6                                                                                  7654321   98765432                                                                                                        AB3

TESTER2,TEST2,B,1234,ASDFF,DR,,ASDFASDF,EE,12345,123456,98798,ETC,ETC,ETC,ETC,ETC,ETC,ETC,ETC,ETC,ETC,ETC,ETC
TESTER,TEST,B,1234,ASDFF,DR,,ASDFASDF,EE,12345,456789,98798,ETC,ETC,ETC,ETC,ETC,ETC,ETC,ETC,ETC,ETC,ETC,ETC
TESTER,TEST,B,1234,ASDFF,DR,,ASDFASDF,EE,12345,123456,98798,ETC,ETC,ETC,ETC,ETC,ETC,ETC,ETC,ETC,ETC,ETC,ETC
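Going only by the sample above, one way to recognize each kind of line and pull out its bare 6-digit ID might look like the sketch below. The function name `classifyLine` and the `$csvIdColumn` parameter are hypothetical, and the rules are assumptions read off the sample rather than confirmed facts about the real file: Section I IDs carry an 02 prefix, Section II IDs carry an 04 prefix, and Section III lines are comma-separated with the bare ID in the 11th field (index 10) of the sample as posted. The question mentions "column I", so the index is left as a parameter.

```php
<?php
// Hypothetical helper: classify one line and return [section, 6-digit ID],
// or null when the line carries no recognizable ID (blank lines, dashed rules).
// Assumptions taken from the posted sample only:
//   - Section I   IDs look like 02 + six digits
//   - Section II  IDs look like 04 + six digits
//   - Section III lines are comma-separated, with no quoted fields,
//     and hold the bare six-digit ID at index $csvIdColumn
function classifyLine(string $line, int $csvIdColumn = 10): ?array
{
    $line = trim($line);
    if ($line === '') {
        return null;
    }

    // Section III: treat any line containing a comma as a CSV record.
    if (strpos($line, ',') !== false) {
        $fields = explode(',', $line);
        $id = trim($fields[$csvIdColumn] ?? '');
        return preg_match('/^\d{6}$/', $id) ? ['III', $id] : null;
    }

    // Section I / II: first standalone token of 02 or 04 plus exactly six digits.
    // The lookarounds keep longer digit runs from matching.
    if (preg_match('/(?<!\d)(02|04)(\d{6})(?!\d)/', $line, $m)) {
        return [$m[1] === '02' ? 'I' : 'II', $m[2]];
    }

    return null;
}
```

Because the prefix match is not anchored to the start of the line, an 04 record that ends up on the same physical line as a dashed separator would still be picked up.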
0
 
LVL 19

Expert Comment

by:NerdsOfTech
ID: 24323265
Okay I see 3 sections:
     
*SECTIONI*
*SECTIONII*
*SECTIONIII*


Does it look like this:
*SECTIONI*
#
# x 2
*SECTIONII*
#
# x 2
*SECTIONIII*
#
#
# x 1000
#
#

Or like this:

*SECTIONI*
#
# x 1000
#
*SECTIONII*
#
# x 1000
#
*SECTIONIII*
#
# x 1000
#

Basically what I am trying to get at is: Is any one section BIGGER than the others?
0
 

Author Comment

by:weklica
ID: 24325241
Yes,

Section II may have the same number a couple of times.  Section I will only have the number once, and Section III may have double the numbers.

Really, I only need the Matches for Section I displayed with its corresponding Section II and Section III.  

If Section III has no corresponding Section II or Section I, I don't need it displayed.  You will notice in the 'endresult.txt' file that Section I is the first line, Section II is the next four lines (or 8, or however many, depending on whether it shows up more than once or twice), and Section III follows.  I am not overly concerned about how it is organized, just that the pieces are grouped together a bit better.  Attached is a screencast of the general idea.

I can type in the numbers I am concerned about, and the Console on the Mac will display all chunks of the file that contain the given string.  That is essentially what I am after here.


http://screencast.com/t/urAtCxogih
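The matching rules described above (Section I drives the output, and Section II and III rows are attached only when their ID also appears in Section I) could be sketched roughly as follows. This is only an illustration under assumptions: the function names `groupBySectionOne` and `renderGroups` are made up for the example, the records are hand-written stand-ins for lines that have already been tagged with a section and a bare 6-digit ID, and the HTML layout is arbitrary.

```php
<?php
// Sketch: group already-classified records by unique ID, keeping only IDs
// that appear in Section I. Each record is [section, id, text].
function groupBySectionOne(array $records): array
{
    $groups = [];
    foreach ($records as [$section, $id, $text]) {
        // Note: PHP turns numeric-string keys such as '123456' into integers.
        $groups[$id][$section][] = $text;
    }
    // Drop IDs that never showed up in Section I.
    return array_filter($groups, fn(array $g): bool => isset($g['I']));
}

// Render each group as a heading plus its Section I, II, III lines in order.
function renderGroups(array $groups): string
{
    $html = '';
    foreach ($groups as $id => $sections) {
        $html .= '<h3>' . htmlspecialchars((string) $id) . "</h3>\n<pre>";
        foreach (['I', 'II', 'III'] as $name) {
            foreach ($sections[$name] ?? [] as $text) {
                $html .= htmlspecialchars("$name: $text") . "\n";
            }
        }
        $html .= "</pre>\n";
    }
    return $html;
}

// Illustrative records only; real ones would come from parsing the file.
$records = [
    ['I',   '123456', '02123456 test1 test1 321321'],
    ['II',  '123456', '04123456 123456789121 1234567 26 001'],
    ['II',  '456789', '04456789 123456789121 1234567 25 002'],
    ['III', '123456', 'TESTER2,TEST2,B,1234'],
    ['III', '999999', 'ORPHAN,ROW'],
];
$groups = groupBySectionOne($records);
echo renderGroups($groups);
```

Keying the array by ID means the file only has to be read once, however many times an ID repeats in Section II or III, which matters if there are thousands of IDs.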

0
 
LVL 19

Expert Comment

by:NerdsOfTech
ID: 24331412
Do you have control over the output format of this file?
0

 

Author Comment

by:weklica
ID: 24333433
It is txt format only, although I could script it to become CSV if necessary.
0
 
LVL 19

Expert Comment

by:NerdsOfTech
ID: 24353018
First, let's get rid of all that whitespace!

Rename your input file to bfile.txt.

=NerdsOfTech
<?php
$handle = fopen('bfile.txt', 'r'); // open file
echo '<pre>';
// fgets() returns false at end of file, so test its result directly;
// looping on !feof() alone can process one extra, empty read
while (($data = fgets($handle)) !== false) {
 $data = trim($data); // trim whitespace from front and back
 $data = preg_replace('/\s+/', ' ', $data); // regex: collapse each run of whitespace to 1 space
 echo htmlspecialchars($data), "\n"; // output line, escaped for HTML
}
echo '</pre>';
?>


regex-trim-all-spaces-plustrim.png
0
 
LVL 19

Expert Comment

by:NerdsOfTech
ID: 24353031
Almost forgot the fclose :)
<?php
$handle = fopen('bfile.txt', 'r'); // open file
echo '<pre>';
// fgets() returns false at end of file, so test its result directly
while (($data = fgets($handle)) !== false) {
 $data = trim($data); // trim whitespace from front and back
 $data = preg_replace('/\s+/', ' ', $data); // regex: collapse each run of whitespace to 1 space
 echo htmlspecialchars($data), "\n"; // output line, escaped for HTML
}
fclose($handle); // close file
echo '</pre>';
?>


0
 
LVL 19

Accepted Solution

by:
NerdsOfTech earned 500 total points
ID: 24353091
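Next step: remove the blank lines.
<?php
$handle = fopen('bfile.txt', 'r'); // open file
echo '<pre>';
// fgets() returns false at end of file, so test its result directly
while (($data = fgets($handle)) !== false) {
 $data = trim($data); // trim whitespace from front and back
 $data = preg_replace('/\s+/', ' ', $data); // collapse each run of whitespace to 1 space
 if ($data !== '') { // after trim(), a blank line is an empty string; skip it
  echo htmlspecialchars($data), "\n"; // output line, escaped for HTML
 }
}
fclose($handle); // close file
echo '</pre>';
?>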


0
 
LVL 19

Expert Comment

by:NerdsOfTech
ID: 24353108
Now we are ready to format this data!

:)
newlines-removed-after-multiwhit.png
0
 
LVL 19

Expert Comment

by:NerdsOfTech
ID: 24353189
The numbers 1234567..blah are confusing. Is there a way to make this example more unique?

I want to analyze the patterns to match these rows correctly.

Basically, I want to know which numbers from section I match the subsequent sections II, and III.

Thanks
0
 

Author Comment

by:weklica
ID: 24353968
Yes, I will come up with some better examples.
0
 

Author Closing Comment

by:weklica
ID: 31591462
We didn't really get too far with this, but thanks for your efforts.  I will have a couple of more side projects for you to do in the near future if you are up to it.  I will get you the details when I am ready so you can quote it out.  Thanks
0
