Jumpstarting my first Java app

I program in C# but need to write a Java app that can be downloaded and run on many platforms. The application will "scrub" confidential information from a merchant's credit card report (PDF).

It will remove the merchant's name, address, merchant ID (xxxx-xxxx-xxxx-xxxx), account number, etc.

It will do this by scanning the entire PDF and exporting non-confidential data to a CSV file.

I need this to be easy for non-techies to download and run.

What should I consider, and what should I rule out, before I start planning the design?

What frameworks help? Hurt?

What kind of installation program/script is needed?

I would host this download from my website...

newbieweb, Sr. Software Engineer, Asked:

Dustin Saunders, Director of Operations, Commented:
Your question is... well, sort of nebulous.

Are all the PDFs going to be in the same format? Can they all be processed with OCR? Who is the user? Are end users downloading this software and running it themselves, or are people in an office running it and sending out the results? (I.e., if it's downloaded to a computer, isn't the confidential data already there?)
newbieweb, Sr. Software Engineer, Author Commented:
I will not use OCR, and would only process downloaded PDF reports. I would likely have a list of "supported processors" and would need to keep updating my code to extend this capability to other credit card processors.

The users would be merchants, who would be shielding me from liability by scrubbing the reports on their own PCs. When a supported format is used, the output would be the CSV file expected by my website, for an upload of the data in a "scrubbed report."
To work with PDFs in Java, use PDFBox.
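
Once the text is out of the PDF (PDFBox's PDFTextStripper can return it as a plain String; that step is not shown here), the scrubbing and CSV steps need nothing beyond the standard library. The following is only a rough sketch. The class name ReportScrubber, the "[REDACTED]" placeholder, and the sample line are made up for illustration, and the pattern assumes each "x" in xxxx-xxxx-xxxx-xxxx is a digit. A real report would need one set of rules per supported processor.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper: scrubs merchant IDs from extracted text and builds CSV rows.
class ReportScrubber {

    // Assumption: a merchant ID is four groups of four digits joined by hyphens.
    private static final Pattern MERCHANT_ID =
            Pattern.compile("\\b\\d{4}-\\d{4}-\\d{4}-\\d{4}\\b");

    // Replace every merchant-ID-shaped token with a fixed placeholder.
    static String scrub(String text) {
        return MERCHANT_ID.matcher(text).replaceAll("[REDACTED]");
    }

    // Quote a field only if it contains a comma, a quote, or a line break;
    // embedded quotes are doubled, as CSV readers generally expect.
    static String csvField(String value) {
        boolean needsQuotes = value.contains(",") || value.contains("\"")
                || value.contains("\n") || value.contains("\r");
        if (!needsQuotes) {
            return value;
        }
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    // Join already-scrubbed fields into one CSV row (no trailing newline).
    static String toCsvRow(List<String> fields) {
        return fields.stream()
                .map(ReportScrubber::csvField)
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        String line = "MID 1234-5678-9012-3456 Visa sales 1,250.00";
        System.out.println(scrub(line));
        System.out.println(toCsvRow(List.of("Visa", "sales", "1,250.00")));
    }
}
```

Keeping the pattern matching separate from the PDF reading means the scrubbing rules can be unit tested against plain strings, without any sample PDFs.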

You probably need to say something about the relationship between the PDF file and the CSV file. An example "fictional" PDF would help.
newbieweb, Sr. Software Engineer, Author Commented: