Excel filling in data

Hello,
I have an excel which is attached for reference.
The issues I am having are as follows:
1.  (Workbook=Original)  
In column A the text contains text as Art. & then number. When the text is only Art. I can separate it conveniently in Column I. However when the text is contained in a string, I can use the formula to separate it in another column L.
The required result should be that the Art. with no. should all be in column I & after that removed from Column A irrespective of the fact if it's alone or contained in the string.
2. (Workbook = Modified)
a) The text which is assigned to columns from C to F should be removed from column A
b)  The relationship between columns is that
      F is child of E
      E is child of D
      D is child of C
   Whenever there is a value in F the value of E is repeated accordingly, same goes for other parent child relationship.
I have tried to put this in the workbook to make it clear, however please let me know if something is not clear.

Lastly the column I will always have values, that means if the row is empty in Column I, it will be deleted.
Original.xlsx
Modified.xlsx
LVL 11
AcklesAsked:
Who is Participating?

[Product update] Infrastructure Analysis Tool is now available with Business Accounts.Learn More

x
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

Christopher Jay WolffWiggle My Legs, OwnerCommented:
Hi.

Please take a look at what I put in your column M.  If I understand you correctly it is what you want in column L.  I did not take a look at the rest of the issues yet.  I think if you use IF statements to test for conditions as I did in M it will come to you.  I think your errors are maybe all from using the FIND without capturing the error generated when a cell in column A does not have an Art.  When FIND looks for something not there, it generates the #VALUE error.

My edited version of your first sheet is attached.

The excerpt from the excel documentation on FIND is here.
Excel Documentation on Text function FIND and error modes.

Please let us know if this is what you want and if you need some more info.
EEExcelQ.xlsx
AcklesAuthor Commented:
Awesome!!!
That was perfect!

Can you PLEASE help me on other issues?
Regards,
A
AcklesAuthor Commented:
Is it possible to delete the text from column A which you have put in column M?

The issue that the same text is being split in other columns with Art. will be resolved if after pulling the value in Column M the Art is removed...
Learn SQL Server Core 2016

This course will introduce you to SQL Server Core 2016, as well as teach you about SSMS, data tools, installation, server configuration, using Management Studio, and writing and executing queries.

AcklesAuthor Commented:
I managed to remove the text from Column A by using find & replace with wildcard.

Now, the other issues of text.....
Christopher Jay WolffWiggle My Legs, OwnerCommented:
Glad to help!!

Do you have to remove data from A?

I normally want to keep my input column intact and try to have data just fill in by reading from A rather than taking away from A.  Then changing A with new input is maybe all that needs to be done for each new set of data.  New columns of the separated data from A then appear in the assigned columns without messing up A.  Maybe this won't work for your application.
AcklesAuthor Commented:
All right, if I keep the data intact, I still want the other issue to be resolved, which is parent child relation....
Christopher Jay WolffWiggle My Legs, OwnerCommented:
Hi.
Pretty busy.  I understand I think.  I was also asking that for the purpose of disassembling the outline without having to remove data, so could try shorter formula. I'll try to get back in 12 to 24 hours.  If you want to try, I was going to look at multiple nested IF and look for the period that occurs after Alpha, Roman, lowercase, etc., Then check if the period is preceded by "Art" and if not, then proceed to subdivide by taking LEFT((Find(". ",$A5))   Maybe need IFERROR there.
AcklesAuthor Commented:
I understand & appreciate your help.
Thanks a lot!
Christopher Jay WolffWiggle My Legs, OwnerCommented:
In my spare time, this will take some duration of time.  Wanted to let you know I've not quit.  We have holiday weekend here in USA, but will keep at it.
AcklesAuthor Commented:
Thanks!!!
I understand, if you get time have a look, otherwise have a relaxed weekend!!!

Regards,
A
AcklesAuthor Commented:
Why is this question marked neglected?
Christopher Jay WolffWiggle My Legs, OwnerCommented:
Hi again Ackles.

For the sake of readers of the thread, I'll restate what you already knew in PMs.  This outline breakdown should be done with VB to be error free and to cover all situations of random almost infinite depth unpredictable outlines.  In other words, If we nest IFs up to 64 deep limit, the outline tree structure in col A might exceed that limit and cause failures.  Bad.  

if I understand correctly, we are to do this in formula and have strong beliefs that if the outline exceeds our formula limits or constraints we can flag the error.  And if the outline will not exceed main headings of A to Z, and sub-headings of Roman Numerals will not exceed I to XXVI, and sub-headings of Arabic Numerals will not exceed 1-26, and sub-headings of Lowercase Letters will not exceed a to z, then we can pursue this formula but it doesn't feel good.  :-)

Since we are proceeding this way, are there limits that you are working with for the outline?  My example above of 26 for each of the headings was completely arbitrary.
AcklesAuthor Commented:
If vb is the way to go, then please do it.
I was sticking to formula coz I'm not good with it
Christopher Jay WolffWiggle My Legs, OwnerCommented:
Hi Ackles.
My apologies for the misunderstanding.  I thought you had to have a formula.  Could you please post any limits you have for the outline tree?  For instance, will the main heading of the outline ever exceed A to Z?  And if so, will it look like Excel which goes from A to AA, AB, AC, to AZ, etc., or is it impossible for the outline to exceed A to M for example?  Same question for Roman, Arabic, and Lowercase.  I see your formula covers up to I,V,X,L.  So does that mean the outline will never exceed those combinations and we do not have to code for C and M?

Over some lengthy period of time, I might be able to do the VB for you.  Better to help you find someone else.  But the limits above would be helpful.
AcklesAuthor Commented:
Shall I send you the link to original document?
Christopher Jay WolffWiggle My Legs, OwnerCommented:
Better answered by the someone else I'm trying to get to help you I think.  My questions were simply to obtain as many facts as we have about the incoming data.  Is it a government site and the outline limits are out of your control?  Is it a site you admin?  Things like that.
AcklesAuthor Commented:
It's the Swiss Law, free of copyright & yes it's a govt. site: https://www.admin.ch/opc/de/classified-compilation/19110009/index.html

See on the right side are the pdf's , the last pages of the pdf are the index which I am playing with.....
Christopher Jay WolffWiggle My Legs, OwnerCommented:
In case you end up assuming A-Z is enough, and so forth, and want a formula, I thought I would share my idea that may or may not work.

In order to maintain your previous column entries, I was working with the concept of one formula for each column.  Easier to mentally track, and then autofill down the column.  I used ". " rather than "." in case they ever included a URL or something in the title.

I only took it as far as this.

=IF(IFERROR(FIND(". ",$A6),"")="","",IF(AND(FIND(". ",$A6)=2,FIND(LEFT($A6,1),"ABCDEFGHIJKLMNOPQRTSTUVWXYZ")),$A6,""))

Open in new window


Then was going to investigate changing the very last $A6 to nested IF with testing of cell contents from directly above to test for conditions to separate Capitals from Roman.  Testing like IF(LEFT($A6,1)=LEFT($A5,1) and If I get it to work in on one cell, then autofill should help.  I didn't want to convert string to numeric if I could do it.  Then similar formula in the columns for Roman and Numbers and Lowercase, and then autofill column.

spreadsheet attached.
EEExcelQOutline.xlsx
AcklesAuthor Commented:
Jay,
I already have values separated per the criteria,the only piece of puzzle is to fill the values in left columns when the value appears in right column.... the parent child relationship...
aikimarkCommented:
Test this with a blank worksheet
You will need to remove the "0" values at the top.
Q_28710182.xlsm
aikimarkCommented:
I think the filldown the code is doing is overly generous
aikimarkCommented:
This version looks better.  No need to eliminate the "0" values.
Q_28710182.xlsm
AcklesAuthor Commented:
From the first look, it looks Awesome!!!
I will check it carefully & come back...

Can you please tell me exactly how you did it? I mean did you just work on my example (which I presume) or did you download the pdf & do it?

I ask specifically because I had to do lot to get to this point....

Again, THANKS!!!!
aikimarkCommented:
Look at my code.  I grabbed the HTML via an MSXML2 object and parsed the result with regular expressions.  After you prior related question (and lots of experience), I try and avoid parsing data from PDFs.
aikimarkCommented:
@Ackles

Are you still testing this code?
AcklesAuthor Commented:
Hey,
Sorry didn't get there, will do tomorrow & report back....
Sorry was tied up with other stuff
AcklesAuthor Commented:
Hey Mark,
I tried again, it does but not in the way I was expecting... maybe I should have put the complete file in the first instance, So Sorry.

Please find attached the Source PDF & the Excel I cleaned out to reflect it.
although there is a lot of manual work in excel because of my limited knowledge or skills, it's the starting point.

Please see if you can alter the code for the attached file, the pdf will give you the exact layout of what I want to achieve....

Thanks a lot for your help!!!
A
SampleIndex.xlsx
Index-ZGB.pdf
aikimarkCommented:
1. What do you need it to look like?
2. Why isn't the last section of the index included?
3. What is the rationale for including/excluding some of the non-article entries? (headers)
AcklesAuthor Commented:
Mark,
Please find attached the same Excel with some modifications till Row: 74. I have marked it Grey to mark end of edited.
The Yellow marked is exception & can be ignored.

The Column K: Article No. should not have any empty rows, as this will mark it consistent (I have done it manually & this is not the main concern)

Please have a look & let me know if it makes things clear.

2. The last section is not part of this index, I couldn't separate it from pdf.
3. The rationale is to mark them as parent child relationship, so if you see row 14, The Article belongs to Column H, & Column G is the parent of H & E is of G & so on.....
This is actually to put them in other format later, create a table & to identify where they belong.
SampleIndex.xlsx
aikimarkCommented:
I've added the Art. parsing.  You will need to manually correct the 28c-28f entry.
Q_28710182.xlsm
AcklesAuthor Commented:
Hi,
I see you omitted Column B to F, unfortunately they are also needed....
Sorry for misunderstanding...
aikimarkCommented:
There's no reason to have those columns, since there's no data for them.
AcklesAuthor Commented:
what do you mean there is no data?
Please have a look at the file I sent last, Edited....
aikimarkCommented:
Let me expand...
* there are four levels of title
* not all levels were moved into your data columns
* there is no consistent fill-down of the title data

============
Since I don't speak/read German, it would be helpful if you explicitly defined what items in the outline/hierarchy are being used and how they are mapped.  I'm pretty sure I understand the levels starting with capital letters.
AcklesAuthor Commented:
Hey Mark,
I understand your concern of German, unfortunately my level of German is also limited....

Now, all levels were not moved because they don't apply, if you see Column A, it should go down till row 153 as the Zweiter Teil comes there (which means Second).

In Column B the first entry comes in Row 154 as only Zweiter Teil has the first Abteilung.

The point is I will manually fill Column A - F (as in the last file I sent) now based on the values which I have already filled, is it possible to repeat them? (truly hope you say Yes)

Thanks again.
A
aikimarkCommented:
If you just want to fill down the blank cells, you can do this.
Sub FillDownOutline()
    Dim rngTgt As Range
    Set rngTgt = ActiveSheet.Range(ActiveSheet.Range("B3"), ActiveSheet.Range("B1").CurrentRegion.Cells.SpecialCells(xlCellTypeLastCell).Offset(0, -2))
    rngTgt.Cells.SpecialCells(xlCellTypeBlanks).FormulaR1C1 = "=IF(LEN(r[-1]c)<>0,r[-1]c,"""")"
    rngTgt.Value = rngTgt.Value
End Sub

Open in new window


Note: This isn't as intelligent as the original formula
AcklesAuthor Commented:
Thanks,
It works well on the sample excel I attached for the first 4 columns!

Is it possible to incorporate both (one in your excel & the above mentioned) in the same workbook?

I'm sorry, but my skills with VBA are very limited or none...

Regards,
A
AcklesAuthor Commented:
I mean, the first should be filling the columns B - F
& then as you did earlier rest of the columns & deleting rows where there are no Article No's

Thanks in advance!
aikimarkCommented:
Do you need the <h1> title or just h2 and below?
AcklesAuthor Commented:
Sorry, I don't understand
aikimarkCommented:
see attached
Q_28710182.xlsm
AcklesAuthor Commented:
Hi Mark,
Looks good, except for one thing, in my Excel Column F (Unterabschnitt) has the first entry in Art. 360 Row 557, however in your result it's added in Column E.

This should be in Column F & Column E should go on with it's own value....
aikimarkCommented:
The title column placement is related to the (HTML) header level.  If you look at your PDF source, you will notice that there are two minor headings for Art. 360.  These two headings are at the same indentation level, which matches the same HTML header level that I am processing.  
Both of these headings/titles are at the same level, so the go in the same column.

Erster Abschnitt: Die eigene Vorsorge
Erster Unterabschnitt: Der Vorsorgeauftrag


I am going to look at the code one more time, because Art. 359 should have a different set of headers.  However, you will need to tweak my results manually.
AcklesAuthor Commented:
Mark,
I see what you mean, I was assuming that you are woking from Excel.

From what you say it makes sense & you are right, however from my point of view (if you see Excel) it's a seperated or a subsection of Erster Abschnitt...

Is it possible to have it the way I am hoping?

Regards,
A
aikimarkCommented:
Something out of the ordinary happens around these articles (see below).  The code is expecting data and there is only title headings or article headings.  The English translation indicates that items were repealed/deleted.  Look for blank values in column F.

Art. 134
Art. 359
Art. 373
Art. 389
Art. 398
Art. 418
Art. 419
Art. 442
Art. 450e
Art. 450f
Art. 915
aikimarkCommented:
* I am working from the HTML for the reasons I already stated.

* I posted a fill down code snippet earlier.  Did you try that?
Here is an updated version that matches what the parsing code is doing:
Sub FillDownOutline()
    Dim rngTgt As Range
    Set rngTgt = ActiveSheet.Range(ActiveSheet.Range("b4"), ActiveSheet.Cells.SpecialCells(xlCellTypeLastCell).Offset(0, -5))
    rngTgt.Cells.SpecialCells(xlCellTypeBlanks).FormulaR1C1 = "=IF(LEN(r[-1]c)<>0,r[-1]c,"""")"
    Set rngTgt = ActiveSheet.Range(ActiveSheet.Range("B3"), ActiveSheet.Range("B1").CurrentRegion.Cells.SpecialCells(xlCellTypeLastCell).Offset(0, -2))
    rngTgt.Cells.SpecialCells(xlCellTypeBlanks).FormulaR1C1 = "=IF(LEN(r[-1]c)<>0,r[-1]c,"""")"
    rngTgt.Value = rngTgt.Value
End Sub

Open in new window


* Since the data involves a language with which I am not fluent (German), it is neither practical nor wise for me to parse with respect to the German column indicators.

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
AcklesAuthor Commented:
This code works fine, however is it possible to run it in two phases, i.e. First from Column B-F
& then from Column G to K

I say so because in the initial stage there will be some empty rows in Column K & if we delete them those values in Column B-F will also get deleted, so we first manipulate the Columns B-F & then the rest to delete empty rows.....

I don't know how to manage them....
aikimarkCommented:
The revised code I just posted does two phases.  It first fills down the header/title cells and then fills down the outline data cells.
AcklesAuthor Commented:
do I have to run it twice?
When I run once it only does the Columns B - F
AcklesAuthor Commented:
I mean the empty rows in Article are still left....
aikimarkCommented:
Are you running the updated code snippet I posted?
AcklesAuthor Commented:
yes
aikimarkCommented:
If you have extra columns in your data, or have a different starting point for your data, you will need to adjust the range references in the code.  The code is written to do the fill-down for four header/title columns (B:E) and then the outline data (F:H).
AcklesAuthor Commented:
oh, I see now:
Can you please edit it to do the following:
1st: Column B-F
2nd: Column G-J
3rd: Column K , to delete empty rows.

I'm so sorry , I know M getting greedy, but it would be perfect!!!!
aikimarkCommented:
You can do that (code tweak and row deletion) yourself.
AcklesAuthor Commented:
Thanks!!!
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Microsoft Excel

From novice to tech pro — start learning today.