Tom Knowlton

More RegEx help - splitting a large block of text at a certain point

Consider the following snippet

		FL_FILL_NO_DATA=NO
		FL_RIGHT_JUSTIFY=NO
		FL_BOUNDARY=NO
		FL_UPDATE=S
		FL_VERIFY=K
		FL_MUST_ENTER=NO
		FL_MUST_COMPLETE=NO
		FL_INDEX_FIELD=NO
		FL_REGISTER=NO
		FL_DESKEW=NO
		FL_RESERVE_PUNC=NO
		FL_FIELD_LIST=NO
		FL_BB_NUMBER=0
		FL_CD_NUMBER=0
		FL_CD_MULTI=NO
		FL_VT_NUMBER=0
		FL_VT_INVALID=NO
		FL_NONDISP=NO
		FL_TAB_STOP=NO
		FL_RASCY=NO
		FL_FASCY=NO
		FL_COND_LINK=NO
		FL_ORIGINX=-1
		FL_ORIGINY=-1
		FL_TEXT_BUCKET=
{
FW_BM_URL_HEIGHT=0

FW_BM_URL_WIDTH=0

FW_DISABLE_SG=0

FW_FORMWARE_SG=1

FW_SUBMIT_TYPE=0

FW_GEN_FORM_END=0

FW_URL_WIN=0

FW_ST_DEFAULT=0

FW_MS_MINIMUM=0

FW_MS_MAXIMUM=1


}
		FL_EDIT_TYPE=0
		FL_ENGINE_USE=18
		FL_OCR_OFFSET=0
		FL_OCR_LENGTH=0
		FL_BLACK=0
		FL_WHITE=0
		FL_RVERIFY=YES
		FL_FEEDIT=NO
		FL_FSEDIT=NO
		FL_EXPORT_POS=0
		FL_PASSWORD=NO
		FL_NONKEY=NO
		FL_TEXT_FIELD=NO
		FL_CREATE_FULL_PAGE_FILE=NO
	}
	FIELD=
	{
		FL_NAME=po_number_line_1
		FL_TYPE=ANY
		FL_LENGTH=50
		FL_ROW=21
		FL_COL=10
		FL_ZONETYPE=KEY
		FL_ZONEX=-1
		FL_ZONEY=-1
		FL_ZONEH=0
		FL_ZONEW=0
		FL_ZOOM=100
		FL_BKCL=16777215
		FL_TEXTCL=0
		FL_OCR_FONT=REG
		FL_OCR_READ_LENGTH=FIXED
		FL_OCR_READ_FORMAT=WORD
		FL_OCR_CONFIDENCE=50
		FL_OCR_ERRORS_ALLOWED=99
		FL_OCR_CHARS_INCH=0
		FL_OCR_DOTS_INCH=0
		FL_LOAD_NEXT_IMAGE=NO





The snippet above is just a small part of the file.  There is a lot of text on either side.

But this part:


FIELD=
      {
            FL_NAME=po_number_line_1


has some special features that I want to take advantage of in order to SPLIT the file.


This is the first location where FL_NAME begins using the "_<<integer>>" pattern, in this case it is "_1" because it is the first FL_NAME entry to use this formatting.


Where I want to split the file is at the instance of  "FIELD=" that immediately precedes
the FIRST occurrence of the "_1" naming convention.

So, the first part of the file would end right before "FIELD="

The second part of the file would begin at the 'F' in "FIELD="

Does that make sense?


Caution is warranted because there are instances of both FIELD= and FL_NAME that precede this particular point in the file, but none of those instances have FL_NAME entries that end with "underscore integer" naming.  I just want to find the first area where this happens and split the file as I described.


Many thanks for your speed.

(Please provide working code in C#.  If you can split my little snippet into 2 parts, I think it will work on the larger file as well, so you can use that for your test.)


Tom
.NET Programming, Regular Expressions, C#

Last Comment
Tom Knowlton

8/22/2022 - Mon
ASKER CERTIFIED SOLUTION
wdosanjos

Tom Knowlton

ASKER
This works, thanks!


Question:

filesplit[0] has the first part of the file.

Is there a way to put all of the rest of the file into

filesplit[1]

???

Right now filesplit has 101 strings.  I just want it to have 2 elements; the first part and the remainder of the file.
Tom Knowlton

ASKER
Great job!
wdosanjos

Yes, just add the number of elements to the Split call as follows:

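using System.IO;
using System.Text.RegularExpressions;

// Zero-width lookahead: matches the position just before a FIELD= block whose FL_NAME contains "_<digits>".
var re = new Regex(@"(?=FIELD=\s+\{\s+FL_NAME=\w+_\d+)");

var test = File.ReadAllText(@"C:\Temp\test.txt");

// The count of 2 limits the result to the first part and the remainder.
string[] filesplit = re.Split(test, 2);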
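
For readers who want to try this without the full file, here is a minimal sketch of the same call wrapped in a helper. The `SplitDemo` class and the `SplitAtFirstNumberedField` method are made-up names for illustration; only the pattern and the `Split(input, count)` overload come from the post above. Because the pattern is a zero-width lookahead, nothing is consumed by the match, so the second piece begins at the 'F' of "FIELD=". The count of 2 stops the splitting after the first match, which is why the rest of the file stays in one string.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SplitDemo
{
    // Lookahead from the post above (brace escaped here for readability).
    private static readonly Regex NumberedField =
        new Regex(@"(?=FIELD=\s+\{\s+FL_NAME=\w+_\d+)");

    // Hypothetical helper: returns at most two pieces, the text before the
    // first numbered FIELD= block and everything from that block onward.
    public static string[] SplitAtFirstNumberedField(string text)
    {
        return NumberedField.Split(text, 2);
    }
}
```

Called on a small sample with one plain field followed by two numbered fields, this returns two strings, and the second one starts with "FIELD=". The same pattern with `Split(text)` and no count would return three strings for that sample, one extra for each numbered block, which is the likely reason the original result had 101 elements.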


Tom Knowlton

ASKER
That did it.

Thanks!
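
An alternative sketch, not taken from the thread: when only the split position matters, `Regex.Match` reports the index of the first match and `Substring` can cut the text there, so there is no array to limit. The names `FieldSplitter` and `SplitAtFirstMatch` are hypothetical. This version also appends `(?!\w)` so the name has to end with the digits, which follows the "underscore integer" wording in the question; that tightening is an assumption about the file and is not part of the accepted pattern.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FieldSplitter
{
    // Plain (consuming) pattern; (?!\w) rejects names such as "a_2_name"
    // where more word characters follow the digits.
    private static readonly Regex FirstNumberedField =
        new Regex(@"FIELD=\s+\{\s+FL_NAME=\w+_\d+(?!\w)");

    // Returns { head, tail }. If no numbered field is found, the whole
    // text is returned as the head and the tail is empty.
    public static string[] SplitAtFirstMatch(string text)
    {
        Match m = FirstNumberedField.Match(text);
        if (!m.Success)
        {
            return new[] { text, string.Empty };
        }
        // m.Index is the position of the 'F' in "FIELD=".
        return new[] { text.Substring(0, m.Index), text.Substring(m.Index) };
    }
}
```

Unlike `Split`, this always returns exactly two elements, so callers do not need to check the array length before reading the second part.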