Justin

asked on

Convert PDF to Excel

Hi, I have very old accounting software, and I can only download the accounts as a PDF (see attached).
I need to convert it into an Excel spreadsheet in order to create a reconciliation, and I need to do this every day. Can someone help? The most important items I need in spreadsheet cells are in the final column, titled "Montant".
Accountingfile.pdf
Danny Child

This has obviously been printed to paper, given the hand-written annotations.

Does the system produce a pure PDF at any point, or is the physical hard copy just scanned?

I've run the file through the Optical Character Recognition function in my PDF editor, PDF-XChange, and this is what I got:
M--Personal-ee---PDF-OCR.xlsx
However, you'll see that some of the data hasn't converted cleanly: see row 31 or 34, where the 000s are read as CCC or ccc.

This is due to the poor quality of the scan. Hence, a "pure" PDF (i.e. Print to PDF) is better than one created by scanning paper.

It would help a bit if it weren't annotated as well, as that just confuses things.
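For what it's worth, OCR swaps of this kind tend to be fairly regular, so they can be partly repaired after the fact. Here is a minimal Python sketch, assuming the token is already known to be an amount field (applying it to text columns would corrupt them). The substitution table and the `repair_amount` name are illustrative assumptions, not a feature of PDF-XChange or any other tool:

```python
# Look-alike characters that OCR can produce in place of digits.
# Assumption: only applied to tokens that should contain digits,
# dots and a comma, never to names or descriptions.
OCR_FIXES = str.maketrans({
    "C": "0", "c": "0", "O": "0", "o": "0",
    "S": "5", "B": "8",
})

def repair_amount(token: str) -> str:
    """Replace look-alike letters with the digits they probably were."""
    return token.translate(OCR_FIXES)

if __name__ == "__main__":
    for raw in ["36.0CC.COC,CO", "SS,OO", "B,08"]:
        print(raw, "->", repair_amount(raw))
```

A repair like this only guesses at the intended digits, so the results should still be checked against the source document.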

The next question is extracting data from this file, and you're going to be using either Text to Columns or the LEFT, MID and RIGHT functions to do this.
Sometimes, even (especially!) with really old systems, you can do a Print To File, and that way you get a pure text version of the document, often without the page breaks too.

You may need to install a text-only printer driver first:
Add Printer.. Local.. select Generic in the list, then Generic/Text Only.
Then open up the new queue: Properties.. Ports.. and tick the File box.
Justin

ASKER

Here's a text file from the PDF. Can you see if you can get this into a good format?
Accounting-Text-file.TXT
OK, I've gone a bit further with it. It looks like the D/C column is only populated on lines where there are numbers.
So, I've done two Find/Replaces: one for " D " (i.e. space, D, space), replacing it with a semicolon (;),
and one more for " C ", also replacing it with ;
This means that the ; marker shows where your Montant ("Amount"?) value actually starts.

I've then used a MID function to pull out the whole number here:
=IFERROR(MID(B9,FIND(";",B9)+1,LEN(B9)-FIND(";",B9)-3),"")
and a RIGHT function to get the decimals:
=IF(ISNUMBER(FIND(";",B9)),RIGHT(B9,2)/100,"")

[the IFERROR and ISNUMBER above just make it skip any pure text lines]
- this is as far as the attached example goes
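The same idea (find the D/C flag, then treat what follows as the amount) can also be sketched outside Excel. Below is a rough Python version, assuming clean text lines that end with a D or C flag followed by a French-formatted amount. The `split_amount` name and the pattern are hypothetical, written for illustration rather than taken from the attached workbook:

```python
import re

# A D/C flag surrounded by whitespace, then an amount such as
# 38.000.000,00 at the very end of the line.
AMOUNT_AT_END = re.compile(r"\s([DC])\s+(\d{1,3}(?:\.\d{3})*,\d{2})\s*$")

def split_amount(line: str):
    """Return (flag, raw_amount) for data lines, or None for text-only lines."""
    match = AMOUNT_AT_END.search(line)
    if match is None:
        return None  # plays the role of IFERROR/ISNUMBER in the formulae
    return match.group(1), match.group(2)

if __name__ == "__main__":
    data = ("ECF 05/11/15 05/11/15 115210 13120000002 "
            "NBAD PRETS TERME NBAD LON CAPITAL C 38.000.000,00")
    header = "ECHEANCE FINALE N' contrat : 7675 Type : ETF"
    print(split_amount(data))
    print(split_amount(header))
```

Returning None for lines without a flag mirrors the way the formulae leave pure text rows blank.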

After that, I'd do a Copy/Paste Special.. Values on Col C (thus overwriting all the clever formulae with just values!)

The next problem would be that your values are formatted in a French style, with dots (.) as thousands separators. It's hard to do maths on these as they are. So, I'd remove them with a Find/Replace on Col C, replacing all dots with "" (i.e. nothing).

After that, you can just add the two columns together to get the amounts.
Of course, all the garbled characters still need removal, but that's discussed in my earlier points.
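For anyone scripting this instead of using Find/Replace, the French-to-plain number conversion is a short string operation. Here is a minimal Python sketch, assuming the amount has already been isolated as text. `parse_french_amount` is a made-up helper name, and `Decimal` is used so that cents are not subject to floating-point rounding:

```python
from decimal import Decimal

def parse_french_amount(text: str) -> Decimal:
    """Convert e.g. '38.000.237,50' to Decimal('38000237.50').

    Dots are thousands separators and the comma is the decimal mark.
    Stray spaces are dropped as well.
    """
    cleaned = text.strip().replace(".", "").replace(" ", "")
    return Decimal(cleaned.replace(",", "."))

if __name__ == "__main__":
    amounts = ["38.000.237,50", "237,50", "5.364,17"]
    print(sum(parse_french_amount(a) for a in amounts))
```

This does the dot removal and the whole-plus-decimals addition in one step, so there is no separate decimals column to add back in.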
M--Personal-ee---PDF-OCR-v2.xlsx
We cross-posted, but similar logic applies.
Give me a few minutes...
Justin

ASKER

Hi Dan, I used the Adobe Acrobat Pro DC PDF to Excel converter and got the attached. It looks good, but there are 4 rows of numbers per cell in the Montant column. How can I get them into one cell per number? I also need to isolate the contract number, shown as "No Contrat: (number)", so I can do a VLOOKUP against it in another system. Any ideas?
ACCOUNTING-EXCEL-FILE.xlsx
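One possible approach to both questions, sketched in Python rather than Excel. It assumes the multi-value cells separate their numbers with line breaks, and that the header text reads like `No Contrat: 7675` or an OCR variant such as `N' contrat : 7675`. The function names and the pattern are hypothetical, and the sample values are only for illustration:

```python
import re

# "N" plus up to two trailing characters (o, ', ", ...), then "contrat",
# a colon, and the contract number itself.
CONTRACT = re.compile(r"\bN\S{0,2}\s*contrat\s*:\s*(\w+)", re.IGNORECASE)

def contract_number(text: str):
    """Return the contract number found in a header line, or None."""
    match = CONTRACT.search(text)
    return match.group(1) if match else None

def split_cell(cell: str):
    """Split one multi-line cell into a list of individual values."""
    return [part.strip() for part in cell.splitlines() if part.strip()]

if __name__ == "__main__":
    print(contract_number("ECHEANCE FINALE N' contrat : 7675 Type : ETF"))
    print(split_cell("941.994.635,83\n942.000.000,00\n5.364,17"))
```

Each value from `split_cell` can then be written to its own row next to the contract number, which gives a lookup key on every line.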
OK, I opened your file in Excel, just did Finish on the TextToColumns wizard.
I added an extra column before the data, just for tidiness.

There are a lot of superscript 3s flying around now!
Then I copied the "3 D 3" part from your text string and did a Find/Replace on it with ;
Same again with "3 C 3", again replacing with ;

Then, the formulae from cols C and D can be pasted in.

The extra problem this time around is that there are now spaces to contend with, but luckily, if we remove those French dots again, Excel sees the values as numbers and drops the leading spaces as well.
So: Copy/Paste Special.. Values on Col C (trashing all the formulae).
Find/Replace "." with "" on Col C.
Add a new Col E, summing Cols C and D.
Add a final total.
M--Personal-ee--PDF-text.xlsx
That Adobe conversion is going to be a bit evil to work with. It's got hidden characters in each cell to force the line breaks. It can be done, but the text one is easier and much less error-prone.

The final step would be to get this all macro'd up, but I don't do VBA... you could post another question though!
You could then convert the whole text sheet, with subtotals, at the press of a button...
ASKER CERTIFIED SOLUTION
Roy Cox
An Adobe OCR operation produced the following text from the first page, starting with the first data header:
Ccci Dc.te Cat e Nc No de L􀅥belle No. Libelle D/C Hontant
EVE Corr,ptabl Vc.lsLir CCE Corr,pte MaiEcn Ccrr,pte Client/Cc􀃡rtier HorJtar::t
ECHEANCE fINALE N' cor.trat : C2C Type : PTF Agence : NEAC PPJU:: Cl:'er.t : NEF'!; LCN
Ecr 05/11/15 05/11/15 121 6001211 0 14 0 ADIE WASHINGTON NBAD LON MONTANT ECHU 0 38.000.237,50
ECF 05/11/15 05/11/15 115210 13120000002 NBAD PRETS TERME NBAD LON CAPITAL C 38.000.000,00
ECF 05/11/15 05/11/15 990900 38210000002 POSITION DE CHANGE NBAD LON INTERETS ECHUS C 237,50
ECF 05/11/15 05/11/15 990900 38210010400 CV POSITION USD NBAC LON CV IN COURUS 0 218,23
ECF 05/11/15 05/11 1 15 750100 70132000003 INT PERCUS LON TERME USD NBAD LON CV IN COURUS C 218,23
ECHEANCE FINALE N' contrat : 7675 Type : ETF Agence : NBAC PARIS Client : NBAD H.O.
ECF 05/11/15 05/11/15 111 11010030100 BDl' TARGET2 NBAC HO MONTANT ECHU C 941.994.635,83
ECF 05/11/15 05/11/15 116210 13220000002 NBAC EMPRT TERME NBAC HO CAPITAL D 942.000.000,00
ECF 05/11/15 05/11/15 630000 60132000005 INT SERVIS HO NBAD HO INTERETS ECHUS C 5.364,17
ECHEANCE FINALE N" contrat : 7676 Type : ETF Agence : NBAD PARIS Client : NBAD LON
ECF 05/11/15 05/11115 121 64012210100 EURO NEAD LON NBAD LON MONTANT ECHU C 2.493.985.798,06
ECF 05/11/15 05/11/15 116210 13220000002 NBAC EMPRT TERME NBAD LON CAPITAL 0 2.494.000.000,00
ECf 05/11/15 05/11/15 630000 60132000002 INTS SERVIS LON E TERME NBAD LON INTERETS ECHUS C 14.201,94
ECHEANCE INT. N" controt : 381301 Type : PIMMO$ Agence : NBAD Pl'.RIS Client : AL MAZROUEI
ECI 05/11/15 05/11/15 251 70038130100 E N COMPTE EUR 3813 MONTANT ECHU 0 7.108,58
ECI 05/11/15 05/11/15 240000 74338130100 PRETS IMMO PAR NR 3813 CAPITAL C 5.555,56
ECI 05/1 1/1 5 05/11 / 15 726000 70215000003 INT PERCUS IMMO PARTICUL 3813 lNTERETS ECHUS C 1. 553,02
t1ISE P. CISPO. N' ccntrct : eLi Type : PH Ager..cE : NEAC PP.RI S CL.er.t : NEF'!; LCN
111']) 05/11/15 G5/11/15 121 6CC12l1G14G Acr E I\'}>.SHINGTON / NBhC LCN CAPITAL C 36.0CC.COC,CO
HAC 05/11/15 05/11/15 11521C 1312GOGCCC2 NEP.D PRETS TERME NBAC LDN CAPITP.L 0 :;B.OCC.Cce,cc ,
I􀅦ISE P. CISFC. N' contrc.t : 7f77 Type : ETF Ager.c€ : NN.D FARIS CLent : NEAr: H.O.
I1l-L C5/1 1115 C5/11/j.5 111 1101CC3C1C( ECF TP.RGET2
􀀋 NEAr: HO CAFITP.L D 926.(.CC.COC,CC
11l-L C5/1􀃢/:'S C5/11/1􀁂 :;'􀁗E2􀁘C 1 322CCCCCC2 NEP'!; El􀁲PRT TERME NEF.C HO CAFIT}>.L C 92E.r.CC,CCC,CC ./
I􀇊ISE }>_ CIHe. H' CQT.trct : 7E7E TYFe : ETF J..qET.ce : NEIL FMI:: Cl􀎭er.t : NH.C LCN
11l-L C5/:;'1/1': C􀁙/ll / :􀁙 ::21 E􀁚CIL21CICC El'RO NEAr: LCI, ./ NEP.C LCN CAFITl-.L C 2.El7.CCC.CCC,CC
11l-L GS/l:i/1S C􀁙/11/15 11 E21C 13LLCCCCCC2 NU.C E11FRT TEF.l􀁳E NEP.L LCI, UJ'IT}>.L C 2.E17.CCC.CCC,CC
/
REESCCI1FTE INT. N' ccr.trGt : CO:; TYFe : FL'EC3H F.ger.C€ : NUL P}>.RI :: CLer.t : NU.C LCH
RES 05111115 05/11/15 310100 13172000002 PROD A REC P TERME NBAD LON I.COURUS TOTAUX 0 59,86
RES 05111115 05/11115 990900 38210000002 POSITION DE CHANGE NBAD LON I.COURUS TOTAUX C 59,86
RES 05111115 05/11115 990900 38210010400 CV POSITION USD NBAD LON CV IN COURUS 0 55,00
ReS 0􀁐/11115 05/11/15 750100 70132000003 INT PERCUS LON TERME USD NBJ\D LDN CV IN COURUS C 55,00
RES 06/11115 06/11115 310100 13172000002 PROD A REC P TERI'lE NBAD LON I.COURUS TOTAUX C 59,86
RES 06/11/15 06/11/15 990900 38210000002 POSITION DE CHANGE NBAD LON I.COURUS TOTAUX 0 59,86
Kt;􀎫 06/11/15 06/11/15 990900 38210010400 CV POSITION USD NBAD Lor, CV IN COURUS C SS,OO
RES 06/11115 06/11/15 750100 70132000003 INT PERCUS LON TERME USD NBAD LON CV IN COURUS 0 55,00
REESCOMPTE INT. N" contrat : 010 Type : PUBD3M Agence : NBAC PARIS Client : NBAD LON
RES 05/11115 05/11/15 310100 13172000002 PROD A REC P TERME NBAD LON I.COURUS TOTAUX 0 8,08
RES 05/11115 05/11115 990900 38210000002 POSITION DE CHANGE NBAD LON I.COURUS TOTAUX C 8,08
RES 05/11/15 05/11/15 990900 38210010400 CV POSITION USD NBAD LDN CV IN COURUS D 7,42
RES 05/11/15 05/11/15 750100 701320(l(lO03 INT PER("TJS T.f)N TF:RHF: nSf) NRAO LON r.v TN r.OIlRIlS r. 7 ,o4?
RES 06111115 06/11/15 310100 13172000002 PROD A REC P TERME NBAD LON I.COURUS TOTAUX C B,08
RES 06/11115 06/11/15 990900 38210000002 POSITION DE CHANGE NBAD LON I.COURUS TOTAUX 0 8,08
RES 06/11115 06/11115 990900 38210010400 CV POSITION USD NEAD LON CV IN COURUS C 7,42
RES 06/11115 06/11115 750100 70132000003 INT PERCUS LDN TERME USD NBAD LON CV IN COURUS 0 7,42
REESCOMPTE INT. N' contrat : 017 Type : PGBDHI Agence : NBAC PARIS Client : NBAD LON
RES 05/11115 05/11115 310100 13172000002 PROD A REC P TERME NBAD LON I.COURUS TOTAUX 0 268, 60
RES 05/11115 05/11115 990900 38210000002 POSITION DEVISE NBAD LON I.COURUS TOTAUX C 268, 60
RES 05/11/15 05/11115 990900 38210010006 CV POSITION GBP NBAC LON CV IN COURUS 0 376,93
RES 05/11/15 05/11/15 150100 70132060003 INT PERC. LDN TER. GBP NBAD LON CV IN COURUS C 376,93
RES 06/11/15 06/11115 310100 13172000002 PROD A REC P TERME NBAD LON I.COURUS TOTAUX C 268,60
RES 06/11115 06/11/15 990900 38210000002 POSITION DEVISE NBAD LDN I.COURUS TOTAUX D 268,60
RES 06/11115 06/11/15 990900 38210010006 CV POSITION GBP NBAD LDN CV IN COURUS C 376,93
RES 06/11115 06/11/15 750100 70132060003 INT PERC. LON TER. GBP NBAD LON CV IN COURUS 0 376,93
REESCOMPTE INT. N" contrat : 021 Type : PTF Agence : NBAD PARIS Client : NBAD LON
RES 05/11/15 05/11/15 310100 13172000002 PROD A REC P TERME NBAC LON I.COURUS TOTAUX 0 237,50
RES 05/11115 05/11/15 990900 38210000002 POSITION DE CHANGE NBAC LON I.COURUS TOTAUX C 237,50
RES 05111115 05/11/15 990900 38210010400 CV POSITION USD NBAD LON CV IN COURUS D 218,23
RES 05/11115 05/11/15 750100 70132000003 INT PERCUS LDN TERHE USD NBAD LON CV IN COURUS C 218,23
RES 06/11115 06/11115 310100 13172000002 PROD A REC P TERI'fE NBAD LON I.COURUS TOTAUX C 231,50
RES 06/11/15 06/11/15 990900 392100000:)2 POSITION DE CHANSE NBAD LDN I. COURUS TOT."DX D 231,50
RES 06/11115 06/11115 B0900 332IJ01D40::> CV POSITION USD NBAD LDN :::v IN COURUS C 218,23
RES 06/11115 06/11115 150100 70132000003 INT PERCUS LDN TERME USD NBAD LDN CV IN COURUS D 218,23
REESCOMPTE INT. N" c:mtrat : 32432 Type : PHll'lO Agence : NBAD PARIS Client : HID GROUP blJ() 
RES 05/11115 05111115 310300 20570000001 PROD A REC PRET IMMO 3243 I.COURUS TOTAUX 0 6,83
RES 05111115 05/11115 726000 70215000002 INT PERCUS lMi'10 SOC 3243 I.COURUS TOTAUX C 6,83 qJ>rJ .. 1, .


Parsing with the following regex pattern:
(\w{3}) ([^ ]*) ([\d\/]*) (\d{3,}) (\d{3,}) ([A-Z].*?) ((?:NBA. [A-Z]+|\d+)) ([A-Z].*?) (\w) (\d[^ ]*)(?:\r\n|$)
Produced the following parsed output of 35 data lines out of 70 input lines:
Submatches(0):
            [0] => ECF
            [1] => ECF
            [2] => ECF
            [3] => ECF
            [4] => ECF
            [5] => ECF
            [6] => ECF
            [7] => ECF
            [8] => ECf
            [9] => ECI
            [10] => ECI
            [11] => RES
            [12] => RES
            [13] => RES
            [14] => RES
            [15] => RES
            [16] => RES
            [17] => RES
            [18] => RES
            [19] => RES
            [20] => RES
            [21] => RES
            [22] => RES
            [23] => RES
            [24] => RES
            [25] => RES
            [26] => RES
            [27] => RES
            [28] => RES
            [29] => RES
            [30] => RES
            [31] => RES
            [32] => RES
            [33] => RES
            [34] => RES

Submatches(1):
            [0] => 05/11/15
            [1] => 05/11/15
            [2] => 05/11/15
            [3] => 05/11/15
            [4] => 05/11/15
            [5] => 05/11/15
            [6] => 05/11/15
            [7] => 05/11/15
            [8] => 05/11/15
            [9] => 05/11/15
            [10] => 05/11/15
            [11] => 05111115
            [12] => 05111115
            [13] => 05111115
            [14] => 06/11115
            [15] => 06/11/15
            [16] => 06/11115
            [17] => 05/11115
            [18] => 05/11115
            [19] => 05/11/15
            [20] => 06/11115
            [21] => 06/11115
            [22] => 05/11/15
            [23] => 05/11/15
            [24] => 06/11/15
            [25] => 06/11115
            [26] => 06/11115
            [27] => 06/11115
            [28] => 05/11/15
            [29] => 05/11115
            [30] => 05111115
            [31] => 05/11115
            [32] => 06/11115
            [33] => 06/11115
            [34] => 05/11115

Submatches(2):
            [0] => 05/11/15
            [1] => 05/11/15
            [2] => 05/11/15
            [3] => 05/11/15
            [4] => 05/11/15
            [5] => 05/11/15
            [6] => 05/11115
            [7] => 05/11/15
            [8] => 05/11/15
            [9] => 05/11/15
            [10] => 05/11/15
            [11] => 05/11/15
            [12] => 05/11115
            [13] => 05/11115
            [14] => 06/11115
            [15] => 06/11/15
            [16] => 06/11/15
            [17] => 05/11/15
            [18] => 05/11115
            [19] => 05/11/15
            [20] => 06/11/15
            [21] => 06/11115
            [22] => 05/11115
            [23] => 05/11/15
            [24] => 06/11115
            [25] => 06/11/15
            [26] => 06/11/15
            [27] => 06/11/15
            [28] => 05/11/15
            [29] => 05/11/15
            [30] => 05/11/15
            [31] => 05/11/15
            [32] => 06/11115
            [33] => 06/11115
            [34] => 05111115

Submatches(3):
            [0] => 115210
            [1] => 990900
            [2] => 990900
            [3] => 111
            [4] => 116210
            [5] => 630000
            [6] => 121
            [7] => 116210
            [8] => 630000
            [9] => 251
            [10] => 240000
            [11] => 310100
            [12] => 990900
            [13] => 990900
            [14] => 310100
            [15] => 990900
            [16] => 750100
            [17] => 310100
            [18] => 990900
            [19] => 990900
            [20] => 990900
            [21] => 750100
            [22] => 990900
            [23] => 150100
            [24] => 310100
            [25] => 990900
            [26] => 990900
            [27] => 750100
            [28] => 310100
            [29] => 990900
            [30] => 990900
            [31] => 750100
            [32] => 310100
            [33] => 150100
            [34] => 310300

Submatches(4):
            [0] => 13120000002
            [1] => 38210000002
            [2] => 38210010400
            [3] => 11010030100
            [4] => 13220000002
            [5] => 60132000005
            [6] => 64012210100
            [7] => 13220000002
            [8] => 60132000002
            [9] => 70038130100
            [10] => 74338130100
            [11] => 13172000002
            [12] => 38210000002
            [13] => 38210010400
            [14] => 13172000002
            [15] => 38210000002
            [16] => 70132000003
            [17] => 13172000002
            [18] => 38210000002
            [19] => 38210010400
            [20] => 38210000002
            [21] => 70132000003
            [22] => 38210010006
            [23] => 70132060003
            [24] => 13172000002
            [25] => 38210000002
            [26] => 38210010006
            [27] => 70132060003
            [28] => 13172000002
            [29] => 38210000002
            [30] => 38210010400
            [31] => 70132000003
            [32] => 13172000002
            [33] => 70132000003
            [34] => 20570000001

Submatches(5):
            [0] => NBAD PRETS TERME
            [1] => POSITION DE CHANGE
            [2] => CV POSITION USD
            [3] => BDl' TARGET2
            [4] => NBAC EMPRT TERME
            [5] => INT SERVIS HO
            [6] => EURO NEAD LON
            [7] => NBAC EMPRT TERME
            [8] => INTS SERVIS LON E TERME
            [9] => E N COMPTE EUR
            [10] => PRETS IMMO PAR NR
            [11] => PROD A REC P TERME
            [12] => POSITION DE CHANGE
            [13] => CV POSITION USD
            [14] => PROD A REC P TERI'lE
            [15] => POSITION DE CHANGE
            [16] => INT PERCUS LON TERME USD
            [17] => PROD A REC P TERME
            [18] => POSITION DE CHANGE
            [19] => CV POSITION USD
            [20] => POSITION DE CHANGE
            [21] => INT PERCUS LDN TERME USD
            [22] => CV POSITION GBP
            [23] => INT PERC. LDN TER. GBP
            [24] => PROD A REC P TERME
            [25] => POSITION DEVISE
            [26] => CV POSITION GBP
            [27] => INT PERC. LON TER. GBP
            [28] => PROD A REC P TERME
            [29] => POSITION DE CHANGE
            [30] => CV POSITION USD
            [31] => INT PERCUS LDN TERHE USD
            [32] => PROD A REC P TERI'fE
            [33] => INT PERCUS LDN TERME USD
            [34] => PROD A REC PRET IMMO

Submatches(6):
            [0] => NBAD LON
            [1] => NBAD LON
            [2] => NBAC LON
            [3] => NBAC HO
            [4] => NBAC HO
            [5] => NBAD HO
            [6] => NBAD LON
            [7] => NBAD LON
            [8] => NBAD LON
            [9] => 3813
            [10] => 3813
            [11] => NBAD LON
            [12] => NBAD LON
            [13] => NBAD LON
            [14] => NBAD LON
            [15] => NBAD LON
            [16] => NBAD LON
            [17] => NBAD LON
            [18] => NBAD LON
            [19] => NBAD LDN
            [20] => NBAD LON
            [21] => NBAD LON
            [22] => NBAC LON
            [23] => NBAD LON
            [24] => NBAD LON
            [25] => NBAD LDN
            [26] => NBAD LDN
            [27] => NBAD LON
            [28] => NBAC LON
            [29] => NBAC LON
            [30] => NBAD LON
            [31] => NBAD LON
            [32] => NBAD LON
            [33] => NBAD LDN
            [34] => 3243

Submatches(7):
            [0] => CAPITAL
            [1] => INTERETS ECHUS
            [2] => CV IN COURUS
            [3] => MONTANT ECHU
            [4] => CAPITAL
            [5] => INTERETS ECHUS
            [6] => MONTANT ECHU
            [7] => CAPITAL
            [8] => INTERETS ECHUS
            [9] => MONTANT ECHU
            [10] => CAPITAL
            [11] => I.COURUS TOTAUX
            [12] => I.COURUS TOTAUX
            [13] => CV IN COURUS
            [14] => I.COURUS TOTAUX
            [15] => I.COURUS TOTAUX
            [16] => CV IN COURUS
            [17] => I.COURUS TOTAUX
            [18] => I.COURUS TOTAUX
            [19] => CV IN COURUS
            [20] => I.COURUS TOTAUX
            [21] => CV IN COURUS
            [22] => CV IN COURUS
            [23] => CV IN COURUS
            [24] => I.COURUS TOTAUX
            [25] => I.COURUS TOTAUX
            [26] => CV IN COURUS
            [27] => CV IN COURUS
            [28] => I.COURUS TOTAUX
            [29] => I.COURUS TOTAUX
            [30] => CV IN COURUS
            [31] => CV IN COURUS
            [32] => I.COURUS TOTAUX
            [33] => CV IN COURUS
            [34] => I.COURUS TOTAUX

Submatches(8):
            [0] => C
            [1] => C
            [2] => 0
            [3] => C
            [4] => D
            [5] => C
            [6] => C
            [7] => 0
            [8] => C
            [9] => 0
            [10] => C
            [11] => 0
            [12] => C
            [13] => 0
            [14] => C
            [15] => 0
            [16] => 0
            [17] => 0
            [18] => C
            [19] => D
            [20] => 0
            [21] => 0
            [22] => 0
            [23] => C
            [24] => C
            [25] => D
            [26] => C
            [27] => 0
            [28] => 0
            [29] => C
            [30] => D
            [31] => C
            [32] => C
            [33] => D
            [34] => 0

Submatches(9):
            [0] => 38.000.000,00
            [1] => 237,50
            [2] => 218,23
            [3] => 941.994.635,83
            [4] => 942.000.000,00
            [5] => 5.364,17
            [6] => 2.493.985.798,06
            [7] => 2.494.000.000,00
            [8] => 14.201,94
            [9] => 7.108,58
            [10] => 5.555,56
            [11] => 59,86
            [12] => 59,86
            [13] => 55,00
            [14] => 59,86
            [15] => 59,86
            [16] => 55,00
            [17] => 8,08
            [18] => 8,08
            [19] => 7,42
            [20] => 8,08
            [21] => 7,42
            [22] => 376,93
            [23] => 376,93
            [24] => 268,60
            [25] => 268,60
            [26] => 376,93
            [27] => 376,93
            [28] => 237,50
            [29] => 237,50
            [30] => 218,23
            [31] => 218,23
            [32] => 231,50
            [33] => 218,23
            [34] => 6,83


If I had made the pattern stricter so that it only picked up correctly formed dates, far fewer lines (14) would have matched.
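To make the trade-off concrete, here is a small Python sketch that applies the pattern quoted above line by line, alongside a stricter variant. The strict version is only a guess at what "correct dates" could mean (two digits, slash, two digits, slash, two digits) and is not the pattern actually tested. The three sample lines are copied from the OCR text above, and `parse` is a hypothetical helper:

```python
import re

# Shared tail of the pattern, copied from the regex quoted in this thread.
TAIL = (
    r" (\d{3,}) (\d{3,}) ([A-Z].*?) ((?:NBA. [A-Z]+|\d+))"
    r" ([A-Z].*?) (\w) (\d[^ ]*)(?:\r\n|$)"
)
LOOSE = re.compile(r"(\w{3}) ([^ ]*) ([\d\/]*)" + TAIL)
# Assumption: "correct" dates look like dd/mm/yy.
STRICT = re.compile(r"(\w{3}) (\d{2}/\d{2}/\d{2}) (\d{2}/\d{2}/\d{2})" + TAIL)

def parse(line: str, pattern: re.Pattern):
    """Return the ten captured fields, or None if the line does not match."""
    match = pattern.match(line.rstrip("\r\n"))
    return match.groups() if match else None

SAMPLE = [
    "ECHEANCE FINALE N' contrat : 7675 Type : ETF Agence : NBAC PARIS Client : NBAD H.O.",
    "ECF 05/11/15 05/11/15 116210 13220000002 NBAC EMPRT TERME NBAC HO CAPITAL D 942.000.000,00",
    "RES 05111115 05/11/15 310100 13172000002 PROD A REC P TERME NBAD LON I.COURUS TOTAUX 0 59,86",
]

if __name__ == "__main__":
    for line in SAMPLE:
        loose_hit = parse(line, LOOSE) is not None
        strict_hit = parse(line, STRICT) is not None
        print(loose_hit, strict_hit, line[:40])
```

The third sample shows the trade-off: its first date was read as 05111115, so the loose pattern keeps the row (with a damaged date to fix later) while the strict one drops it.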
Jeez, all that work and no points!  Harsh....
I just parsed the accounting file above with https://docparser.com and that worked pretty well. It needed minimal post-processing to merge some fields, and the image skew and poor character quality were not an issue.