Solved

Detecting illegal characters in CSV file using a ColdFusion Regular Expression

Posted on 2014-03-12
4
1,025 Views
Last Modified: 2014-03-12
I'm building a ColdFusion Application that allows a user to upload a CSV file, which then gets imported into a MySQL database.  

The properties of the uploaded CSV file are as follows:

1) is comma separated
2) has quotation mark qualifiers
3) contains some empty columns
4) has no column headers
5) contains only a single row of data
6) contains columns that sometimes have commas in them
7) is plain text and can be viewed easily using a text editor

I'm reading the file using CFFILE, and would like to check it for illegal characters.  More specifically, I only want to allow for letters (both upper and lower case), numbers, punctuation, carriage returns and line feeds -- you know -- the kinds of characters you would expect to find in a comma separated CSV file.  

There is one offending character that I constantly encounter in these uploaded CSV files.  I'm not exactly sure what it is, but you can clearly see an example of it you refer to the CSV file that I've attached to this post.  It's a "VT" that's enclosed in a black circle, .. and I need to alert the user regarding the presence of this unwanted character in the CSV file BEFORE they attempt to import it into the database ... since it always results in a ColdFusion error.  

Currently, I am using the following code:

<cffile action="read" file="C:\wwwroot\InspectionRequest_1.csv" variable="csvData">
					
<cfif refind("[^a-zA-Z0-9 ,.&'$()\-+*=/]",csvData)>
					
     <p class="errortext">AN ERROR HAS OCCURRED!!!</p>
					
</cfif>

Open in new window


How can I extend this regex so that it's forgiving of all the types of characters that you'd expect to find in a comma separated CSV file -- yet unforgiving of all other illegal characters?  

Thanks,
- Yvan
InspectionRequest-1.csv
0
Comment
Question by:egoselfaxis
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 2
  • 2
4 Comments
 
LVL 35

Accepted Solution

by:
Dan Craciun earned 500 total points
ID: 39924129
0x0B is the vertical tab.

With that out of the way, I would do it this way:
\w - all word characters and numbers. Will accept characters from other languages
\s - all space characters, like space, tab, etc
"' - all text fields will be enclosed in "" or ''
,.?!:; - punctuation signs
@~#$%^&*(){}-+=\[\] - accepted symbols, add to the list as needed. you need to escape []

For a test expression like this:
[\w\s"',.?!:;@~#$%^&*(){}-+=\[\]]

Open in new window


HTH,
Dan
0
 

Author Closing Comment

by:egoselfaxis
ID: 39924410
Thank you!  

Since you identified the mysterious character for me, ..  I was actually able to simplify my conditional statement as follows:

<cfif csvData CONTAINS chr(11)>

<p>Error!</p>

</cfif>

Cheers!
- Yvan
0
 
LVL 35

Expert Comment

by:Dan Craciun
ID: 39924426
Glad I could help!

Just as a curiosity: you're sure that is the only illegal character?
0
 

Author Comment

by:egoselfaxis
ID: 39924449
Now, I am not 100% sure just yet .. but this particular vertical tab character has been known to be particularly problematic for us for the last couple of years, so we've decided to handle each offending character on a case-by-case basis until we've identified all of them.

Thanks again!
- yg
0

Featured Post

Technology Partners: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Recently while working on a project I got a very annoying cfdocument has no body error message. I had never seen this error before. So I checked the code. The code was pretty simple; it was Just showing me the cfdocumnt tag and inside that tag a …
This article will guide you to convert a grid from a picture into Excel format using Microsoft OneNote and no other 3rd party application.
Graphs within dashboards are meant to be dynamic, representing data from a period of time that will change each time the dashboard is updated with new data. Rather than update each graph to point to a different set within a static set of data, t…
This Micro Tutorial will demonstrate how to create pivot charts out of a data set. I also added a drop-down menu which allows to choose from different categories in the data set and the chart will automatically update.

749 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question