Angular: Need to use the UI to filter user confidential data

I need to allow users to upload their financial reports (PDF or CSV) to my site, but enable them to obscure private data (Business Name, Merchant Account Numbers, etc.) before any data gets stored to the disk.

I need to use Angular to create a list of the data fields extracted from the PDF and display them for approval, perhaps with a checkbox next to each field. The user would then check off the fields that contain the merchant's proprietary data, and those fields would be populated with XXXXX's.
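A minimal sketch of the masking step described above, assuming the fields have already been extracted somehow. The `ExtractedField` shape, the `hidden` flag, and the `maskFields` name are all hypothetical, made up for illustration; how the values get out of the PDF is not covered here.

```typescript
// Hypothetical shape: one extracted field, shown to the user next to a checkbox.
interface ExtractedField {
  name: string;    // label of the field, e.g. "Business Name"
  value: string;   // text extracted from the report
  hidden: boolean; // true when the user checked the box for this field
}

const MASK = "XXXXX";

// Returns a new array; every field the user marked as hidden has its
// value replaced by the mask. The input array is left unchanged.
function maskFields(fields: ExtractedField[]): ExtractedField[] {
  return fields.map((f) => ({
    name: f.name,
    value: f.hidden ? MASK : f.value,
    hidden: f.hidden,
  }));
}

// Example data (invented for this sketch).
const sampleFields: ExtractedField[] = [
  { name: "Business Name", value: "Acme Widgets", hidden: true },
  { name: "Merchant Account Number", value: "0012345", hidden: true },
  { name: "Monthly Volume", value: "18,250.00", hidden: false },
];

for (const f of maskFields(sampleFields)) {
  console.log(f.name + ": " + f.value);
}
```

In an Angular template, each checkbox could be bound to the `hidden` flag of its field (for example with `ngModel`), and only the masked copy would be sent on for storage.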

I do not want to store that PDF onto my server. Instead, I want the user to select the fields needing to be hidden. Or even ignore them altogether. I just do not want that PDF on my server.

So, the question is...

Can Angular extract the contents of a PDF that has been uploaded? And a PDF that does not exist on disk?

I could save the extracted data onto my server in some other fashion, not yet decided.

I prefer Angular 4, and hope this functionality exists in Angular 4.

It's an interesting challenge. I see Angular as a great platform for nested logic and as a place to store objects and lists of objects. And to do the first part of processing this data on the client is what intrigues me.

Asked by: newbieweb, Sr. Software Engineer

1. When you upload a file to a web server, it's going to be stored on disk unless you can find some plugin for your web server or special PHP extension (you mentioned WP in another post, so I'm assuming you're probably on PHP and Apache). You have to assume that if a file isn't uploaded and saved into a temp file on a physical disk, then it means that it has to be in memory, which doesn't scale well, so the default behavior is to save the uploaded data into a temp file on the hard disk and then tell PHP where to find it (along with the original filename, etc).

2. -Temporarily- storing something on disk might be acceptable, but it depends on the data and your compliance requirements. If you're storing CC data, for example, then any disk-based storage (even temporary) will require PCI compliance, which has a lot of other pieces. If you can store a file temporarily, then using PHP to extract the contents of the PDF would be your best bet, since it's server-side and would have access to the raw data.

3. Angular ultimately runs on the client side as JavaScript, which means that it isn't going to be able to access file data unless something (like PHP) makes that data available. I'm not an Angular expert, so maybe someone else can correct me if I'm wrong, but I don't think Angular can do what you're asking it to do.

4. If you're ultimately concerned about data being processed BEFORE it gets to the web server, then you'll have to depend on something like a Java browser applet (note that Java is not the same thing as JavaScript) or a desktop app like something written in .NET (e.g. C# or VB.NET) that will have access to the local file system and be able to parse a PDF file and extract/manipulate the data prior to sending it to the server.

The catch to that is compatibility. Some browsers don't support Java applets anymore at all. A .NET app isn't cross-platform. You could write a desktop Java app (instead of a browser applet) that would be cross-platform although it would require people to have that Java runtime installed.

Honestly, given what you're after (the "XXXXX"-ed out proprietary data), a desktop Java app seems like your best bet. It's more hassle to the user but it's the only way to mask the data prior to it being sent to your server.

newbieweb (Sr. Software Engineer, Author) commented:
So, this Java app would run on a Mac? And a different one for Windows?

On a different vein... given that I need the PDF in the event there is a bug I need to fix during the initial parsing process...

What kinds of steps are needed to ensure PCI compliance? And what does PCI stand for???
newbieweb (Sr. Software Engineer, Author) commented:
A Java app would run the same on Mac and Windows and Linux and anything else that runs Java - cross-platform compatibility is one of the main advantages of Java, and it's also why Java apps look a little "different" from the normal Windows or Mac apps - it's because they don't use the OS's layout / UI engines. So you would only need to write and maintain one application.

If you end up building a Java app and there's a bug in the PDF parsing process, you would probably include something that allows the user the CHOICE to submit their PDF to you so you can update the app to work with their PDF structure. That way, a user who doesn't want to submit their data to you would have the choice to opt out of doing that. Then if they do send it to you, you could upload it to the server via HTTPS and then work with it.

PCI compliance is a fairly big concept to explain in a comment like this. I'll give you the high-level stuff but you'd need to read more about it in-depth. There's a lot of material that covers it, including books you can buy or check out from your local library. PCI stands for Payment Card Industry - the whole idea behind it is a set of rules that you follow in order to properly protect credit card / financial data. It's USUALLY targeted at merchants that are processing credit cards, but it really applies to anyone who is potentially storing CC data. Compliance involves a variety of things, from POLICIES (e.g. if you have employees, then nobody has access to sensitive data except those who absolutely need it) to CODING RULES (e.g. you can NEVER store the CVV codes, among other types of data, from a card), to PROCESSES (e.g. performing regular vulnerability scans).
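To make the "coding rules" part a bit more concrete, here is a rough sketch of dropping a CVV value from a record before it is stored. The `NEVER_STORE` list, the field names, and the `stripForbidden` function are hypothetical, invented for illustration; this covers only the one rule mentioned above and is not a complete picture of what the standard requires.

```typescript
// Hypothetical list of field names that must never reach storage.
// Only the CVV rule from the comment above is represented here.
const NEVER_STORE: string[] = ["cvv"];

// Returns a copy of the record without any forbidden fields.
// Field names are compared case-insensitively.
function stripForbidden(record: { [key: string]: string }): { [key: string]: string } {
  const cleaned: { [key: string]: string } = {};
  for (const key of Object.keys(record)) {
    if (NEVER_STORE.indexOf(key.toLowerCase()) === -1) {
      cleaned[key] = record[key];
    }
  }
  return cleaned;
}

// Example data (invented for this sketch).
const submitted = { merchant: "Acme Widgets", CVV: "123", amount: "42.00" };
console.log(Object.keys(stripForbidden(submitted)).join(", "));
```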

Here's a good FAQ on it:

Just google for PCI compliance and you'll find oodles of information.
newbieweb (Sr. Software Engineer, Author) commented:
I loved your Java idea so much, and would so much prefer to stay clear of PCI compliance issues, that you inspired a new idea...