Solved

Parsing Internet Banking data from multiple banks, cards.. ideas?

Posted on 2012-03-13
10
979 Views
Last Modified: 2012-03-18
("The Question" is at the end)
-------------------------------------

I use the term "banks" to include all sources such as credit cards and building societies.

I am a practicing accountant (which is relevant in this question as it provides context)

Clients (if I am lucky) send me CSV files of bank data, transaction lists, the same sort of thing cloud accounting systems import automatically, at least some do.

Cloud systems tend to use the user's private login details to log in and "scrape" the data or download files for import. These services cost from say 10 pounds/dollars/euros per month upwards, a bit less for a few in their minimum guise.

There is also a service not fully rolled out yet that direct links to banks and downloads data daily, this uses separate login and does not entail use of private login details. This costs about 10 pounds/dollars/euros per month (per client). Data is delivered in a homogenous layout which would solve this Question. Problem is not enough banks are covered, data cannot be historic (that's a killer) and costs make too large a hole in the fee.

The last para above illustrates my question. It's the layouts. Excel has a number of ways to parse data and does it very well.

So does the Mac assist with the ability to create "hot" folders that for example when I drop in a dozen csv files, will concatenate them into one. On my PC that's easily done with a batch file copy *.csv big.fil (or type > etc)

So now we have these files which are invariably delivered upside down (i.e. newest transactions first, which is a real pita because the files arrive oldest files first (filenames)) so we end up with the initial sheet having blocks of data oldest first but the data within newest first. Grr! IF the client has not been VERY careful to commence each download immediately after the last one, this causes time consuming checks because the balances month to do not tally.

The solution involves sorting the transactions into proper ascending order, while maintaining then original (but reversed) order, then adding a calculated balance column, and comparing with reported balance (usually in the csv, not always) and highlighting discrepancies, while allowing for more than one, at this point I fix the first and see what is next, a clever program could take the error, adjust the calculated balances and find following errors too. That would be cool.. actually the calc balance column could do that. I'm just so pissed off I don't bother.

The above is Problem 1.


Next problem is this, I have say 7 banks' csv files, some with many (that means 1,000 or so) transactions and most with a few, credit cards, savings accounts and the like.

These files are delivered with different column headings (field names) and there lies my problem.

Sure I can manually import and muck about with import headings and then spreadsheet table headings etc, which indeed I do and have done for some time - there has been a process where few then some and now nearly all can/will deliver in this form, some much better than others.  One uses pdf.. (expletive deleted).

I suppose what I am looking for, though I am open to suggestions here, is how to do a sort of "save settings" for each bank.

Which I just realised is exactly what one can do. However, it doesn't solve my problem, which is the order of the fields/headings. These differ in order (order of headings/columns/fields across the page) and content (balances, amounts, descriptions, references) from bank to bank.

Powerpivot springs to mind. PC Only mind, won't work on my mac, though MS Query will.

So that's Problem 2

Having acquired the data in the right columns order, which at this point would still need problem 1 to be addressed) there is then the issue that banks insist on concatenating all sorts of, dare I say rubbish into what would be a perfectly good payee name. Things like Tesco transaction numbers, such that instead of all Tesco entries being the same (great! I could code that) they are all unique. Bear in mind the payee name is random, we do not have a nice list of payees with which to identify.

I have noticed that judicious use of =mid(etc) can work reasonably effectively because the banks I suspect have fixed field lengths, but as I have found this cannot be guaranteed effective in the application, some entries fail. What I do there is insert a helper column and insert the mid's number of chars from which to start, ref that in the mid formula and manually over ride when one item fails.

Parsing what has already been parsed.

And so that's problem 3

THE QUESTION

I am NOT asking for applications to be written to solve this.

I am asking what features and perhaps techniques you think I could explore.

For example external data into Pivot Tables, Powerpivot or MS Query, all of which can be saved and I am familiar with, wherein lies part of the problem. However, I need a system not just one method. A means of managing and re-using so if I have done the work fo banks 1 2 & 3 I can save it such that I do not have to remember anything other than say which method I use, so PP or MSQ or whatever.

I do not want to use any non Excel third party applications (I am thinking of parsers here.... maybe I shouldn't be so quick to say that actually).

Anthony
0
Comment
Question by:anthonymellorfca
  • 6
  • 2
  • 2
10 Comments
 
LVL 17

Assisted Solution

by:nepaluz
nepaluz earned 250 total points
ID: 37716424
OBITER DICTA 1 - per your Problem 2, why on earth would you worry about the payee information in a csv file? If you are importing into any financial application (worth any salt, I may add) using financial data exchange files (OFX, QIF, QFX spring to mind), a non valid payee in a payee is ignored!
OBITER DICTA 2 - Are you stuck in MacWorld? If not, have a look at iCreateOFX Basic
0
 
LVL 9

Author Comment

by:anthonymellorfca
ID: 37717514
nepaluz,

Interesting converter . def worth a download and eval:

We endeavour to have a quick turn-around for submited headers, so check back later to download the script for the submitted header.




Anthony
0
 
LVL 26

Accepted Solution

by:
redmondb earned 250 total points
ID: 37717624
Hi, Anthony.

Definitely not for points...

Dave has an excellent solution here for converting a file to a "standard" output. His solution is for a single input format, but it would be very easy to extend this to handle multiple formats. (Perhaps a column holding an Input File Format ID. So, when you loaded a file you'd also specify its IFFID and it would then be automatically reorganised.)

A slightly more sophisticated version would also handle the parsing issues.

Regards,
Brian.
0
 
LVL 9

Author Comment

by:anthonymellorfca
ID: 37723893
sounds good.. where's Dave when I need him...



EDIT: given our time difference sleeping soundly I hope :-)
0
 
LVL 9

Author Closing Comment

by:anthonymellorfca
ID: 37732715
Points for helping me understand my question in non accounting terms. Thanks.
0
Is Your Active Directory as Secure as You Think?

More than 75% of all records are compromised because of the loss or theft of a privileged credential. Experts have been exploring Active Directory infrastructure to identify key threats and establish best practices for keeping data safe. Attend this month’s webinar to learn more.

 
LVL 17

Expert Comment

by:nepaluz
ID: 37732756
I hope you found the solution to your problem. The solution I pointed you towards, in my experience, covers a several banks (I am an accountant and use this regularly and have recomended it to manyofmy customers) with the flexibility of adding new ones. The drawback for yourself is probably the Mac thing, however if you overlook that, then aside from writing your own macros to parse each individual file, I'd be interested in knowing how you've gone about tackling it (if you can).
0
 
LVL 26

Expert Comment

by:redmondb
ID: 37733048
Thanks, Anthony. Hopefully we'll see a follow-up on this!
0
 
LVL 9

Author Comment

by:anthonymellorfca
ID: 37733642
certainly, it's incubating!
0
 
LVL 9

Author Comment

by:anthonymellorfca
ID: 37734786
0
 
LVL 9

Author Comment

by:anthonymellorfca
ID: 37734791
nepaluz: yes I am very tempted by the idea because of the service for adding scripts and that it applies a, or can apply, a standard; wondering if it will store saved profiles and import multiple files as a batch for a bank.

That said I am often amazed at what the EE mavens come up with, so we will see, I've linked to my follow on above.


edit: to answer your question about how I go about it: nothing more than the usual grinding years of experience doing it by hand with Excel; concatenate files, load into excel, sort, balance, paste columns into main sheet, sort, code, pivot.
0

Featured Post

Is Your Active Directory as Secure as You Think?

More than 75% of all records are compromised because of the loss or theft of a privileged credential. Experts have been exploring Active Directory infrastructure to identify key threats and establish best practices for keeping data safe. Attend this month’s webinar to learn more.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Title # Comments Views Activity
Delete texts with font color 16 30
Excel 2016 Not Responding Issues 6 27
Error in formula not increasing value 2 11
Dynamic Filter ? 4 12
Drop Down List with Unique/Distinct Values (enhancing the Combo-Box with a few steps and a little code) David miller (dlmille) Intro Have you ever created a data validation list from a database field or spreadsheet column (e.g., Zip Codes or Co…
This article descibes how to create a connection between Excel and SAP and how to move data from Excel to SAP or the other way around.
The viewer will learn how to simulate a series of coin tosses with the rand() function and learn how to make these “tosses” depend on a predetermined probability. Flipping Coins in Excel: Enter =RAND() into cell A2: Recalculate the random variable…
The viewer will learn how to create two correlated normally distributed random variables in Excel, use a normal distribution to simulate the return on different levels of investment in each of the two funds over a period of ten years, and, create a …

911 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

18 Experts available now in Live!

Get 1:1 Help Now