Solved

Parse middle of string based on certain and sometimes repetitive characters

Posted on 2013-01-11
22
241 Views
Last Modified: 2013-01-16
Hello,

I have a table containing several thousand records of file names. I need to be able to pull out only part of the file name to use for other purposes. Below I have listed some example names and what the query result should be for each. Basically, I need to remove the .pdf file extension from the right, and I need the group of characters that contains the hyphen starting after the last underscore.

tblDrawings
fFileNames
SystemName: (query expression for parsed text)

IMC_FSB_L1_P_HUPWR-015.pdf                        HUPWR-015
IMC_FSB_L1U_P_CCW-001_R1_10.13.12.pdf      CCW-001_R1_10.13.12
IMC_FSB_P_L1_HPAr-024.1.pdf                        HPAr-024.1
IMC_FSB_P_L1_PCWR-006-R3.pdf                        PCWR-006-R3
IMC_FSB_P_L1_PCWS-021.1.pdf                        PCWS-021.1
IMC_FSB_P_L1_SWS-034_R1_12.27.12.pdf      SWS-034_R1_12.27.12
IMC_FSB_P_L1U_HFW-014.pdf                                HFW-014

Thanks for your help!
George
0
Comment
Question by:gedwardnelson
  • 7
  • 5
  • 3
  • +3
22 Comments
 
LVL 119

Expert Comment

by:Rey Obrero
ID: 38768352
try this query

select  fFileNames, Mid(Left([fFileNames], Len([fFileNames]) - 4), InStrRev([fFileNames], "_") + 1) as SystemName
from tbldrawings
0
 
LVL 74

Expert Comment

by:Jeffrey Coachman
ID: 38768360
There is no real consistency there

Perhaps it would be easier if you created a list of the known "Prefix" characters:
IMC_FSB_L1_P_
IMC_FSB_L1U_P_
IMC_FSB_P_L1U_
..and just got everything after these characters.
0
 
LVL 26

Expert Comment

by:jerryb30
ID: 38768372
Function shortfile(filename As String) As String
Dim a As Variant
Dim i  As Integer
Dim tempFile As String

a = Split(filename, "_")
For i = 3 To UBound(a)
tempFile = tempFile & a(i)
Next
shortfile = Replace(tempFile, ".pdf", "")
End Function
called in query
select shortfile(fileFieldName) from tableName
0
 
LVL 26

Expert Comment

by:jerryb30
ID: 38768379
Oh, left underscores out:

Function shortfile(filename As String) As String
Dim a As Variant
Dim i  As Integer
Dim tempFile As String

a = Split(filename, "_")
For i = 3 To UBound(a)
tempFile = tempFile & a(i) & "_"
Next
shortfile = Replace(tempFile, ".pdf", "")
shortfile = Left(shortfile, Len(shortfile) - 1)
End Function
0
 

Author Comment

by:gedwardnelson
ID: 38768387
I'm hoping it wouldn't need to be done that way because this will be ongoing and the prefix characters will have unlimited possible combinations. However, the information I need will always begin immediately following the final underscore. Is there a way to count the underscores prior to the "dash" or "hyphen" then parse/pull after that underscore and trim off the .pdf off the end?
0
 
LVL 119

Expert Comment

by:Rey Obrero
ID: 38768395
oops, you wiil need UDF codes to do that

can you upload a copy of the db with the table...

there maybe lot of variations of fFileNames..
0
 
LVL 26

Expert Comment

by:jerryb30
ID: 38768412
Should have added-Create a module with the posted code.
0
 
LVL 26

Expert Comment

by:jerryb30
ID: 38768439
Assumes there are always 4 elements separated by "_" before file name
The FOR statement should be
FOR i = 4 to ubound(a)
0
 

Author Comment

by:gedwardnelson
ID: 38768475
OK, Let me give an update to the suggestions from Capricorn and Jerry.

Cap,

Yours came close but worked only on the file names with no underscores after the after the characters I'm looking for. Any underscores after the dash are OK.

Jerry,

Yours also came close but it left in the last underscore and the characters between it and the previous underscore.

I'm attaching a copy of the table as capricorn1 suggested for you guys to look at. Thanks for the help.
FileName.accdb
0
 

Author Comment

by:gedwardnelson
ID: 38768478
Jerry, I did. Check out the copy I just uploaded. thanks. :-)
0
 
LVL 26

Expert Comment

by:jerryb30
ID: 38768513
Database is password protected.
0
Better Security Awareness With Threat Intelligence

See how one of the leading financial services organizations uses Recorded Future as part of a holistic threat intelligence program to promote security awareness and proactively and efficiently identify threats.

 
LVL 26

Expert Comment

by:jerryb30
ID: 38768529
If there are 4 elements before desired name,
FOR i = 4 to ubound(a)
If this may vary, send db without the credential requirement
0
 
LVL 119

Assisted Solution

by:Rey Obrero
Rey Obrero earned 100 total points
ID: 38768543
use this function

function getSystemName(str)

dim s, j
s=str

for j=1 to 4

     s = mid(s, InStr(s, "_") + 1)
Next
getSystemName = Left(s, Len(s) - 4)


end function


now use it in a query

select  fFileNames,getSystemName([fFileNames]) as SystemName
from tbldrawings
0
 
LVL 26

Accepted Solution

by:
jerryb30 earned 400 total points
ID: 38768572
If there are indeterminate number of "_" before desired file name:
Function shortfile(filename As String) As String
Dim a As Variant
Dim i  As Integer
Dim tempFile As String
Dim ctr As Integer
ctr = 0
Dim x As Integer
For x = 1 To Len(filename)
If Mid(filename, x, 1) = "-" Then
Exit For
End If
If Mid(filename, x, 1) = "_" Then
ctr = ctr + 1


End If
Next x
'MsgBox ctr
a = Split(filename, "_")
For i = ctr To UBound(a)
tempFile = tempFile & a(i) & "_"
Next i
shortfile = Replace(tempFile, ".pdf", "")
shortfile = Left(shortfile, Len(shortfile) - 1)
End Function

Open in new window

0
 
LVL 92

Expert Comment

by:Patrick Matthews
ID: 38768598
Is there any possibility that you will have values with different file extensions?

For example, ".txt" or ".csv" instead of ".pdf"?

If so, what would you want to happen?
0
 
LVL 49

Expert Comment

by:Gustav Brock
ID: 38769034
One line of code can do it:

Public Function ExtractValue(ByVal strFilename As String) As String
  ExtractValue = StrReverse(Mid(StrReverse(Split(strFilename, "_", 5)(4)), 5))
End Function

/gustav
0
 
LVL 92

Expert Comment

by:Patrick Matthews
ID: 38769075
gedwardnelson,

You wrote:

...and I need the group of characters that contains the hyphen starting after the last underscore.

But then look at two of your examples:

IMC_FSB_L1U_P_CCW-001_R1_10.13.12.pdf      CCW-001_R1_10.13.12
IMC_FSB_P_L1_SWS-034_R1_12.27.12.pdf      SWS-034_R1_12.27.12

If those two examples are correct, then very clearly it is NOT the last underscore.

I think maybe you need to restate your question :)

Patrick
0
 
LVL 74

Expert Comment

by:Jeffrey Coachman
ID: 38769222
Ultimately you may want to ask the party who creates this filename to use a specific "delimiter" between the two segments.
For example, perhaps a comma?
IMC_FSB_L1U_P_,CCW-001_R1_10.13.12.pdf

Even a double Underscore would probably do in a crunch...
IMC_FSB_L1U_P__CCW-001_R1_10.13.12.pdf

As long as you have some specific delimiter that is placed consistently (and only used) between the segments, ...
...then what you are asking for here is ridiculously easy...

JeffCoachman
0
 
LVL 74

Expert Comment

by:Jeffrey Coachman
ID: 38769231
Many time you get a file from a "source"
You think that is the way it will always have to be.

Sometimes if you just ask: "Hey, can you insert a comma between the segments?". ...they might say: "Sure, no problem, why didn't you ask sooner?"

Problem solved...


;-)

Jeff
0
 

Author Comment

by:gedwardnelson
ID: 38771200
I want to thank everybody for their help. I got way more responses than I'm used to. :-)

I have re-attached the file. It contains an example of the solution. I apologize for the issue earlier. This is actually a sharepoint database and I forgot about the required password when I copied the table over.

JerryB, Your second solution, ID: 38768572, worked perfectly. Like you mentioned, it takes care of the unknown and varying number of underscores.

Jeff, I agree with your suggestion about asking to put in a delimiter. But in this situation we are bound by our end client's naming convention. Each underscore separates a different piece of information. And besides, we still have a ways to go to get that team to be consistent enough already without asking them to be consistent enough to start adding commas. LOL

Mathews, Regarding your "catch" on my description, you're pulling a me. LOL. But if you'd have read the following sentence then that context would have clarified the prior sentence that you quoted. And no issue with different file extensions. The directories these come from contain mostly .pdfs. But I have the other occasional files filtered out when I first pull in the file paths.

Cactus, your proposed solution would not be flexible enough for this scenario.

Cap, your second solution worked partially but it didn't take into account varying underscores. Worked perfectly with the way the files are currently named, but if they change on the next project it wouldn't.

So, considering all of this and the amount of effort put into the various solutions I'm not sure how to break up the scores. Jerry provided the complete solution. Capricorn1 provided a couple partial solutions. I want to be fair on how to spread the points. Like I mentioned earlier I don't normally get this much help so it's much easier to score.

Thanks again everyone for your help,
George
FileName.accdb
0
 
LVL 92

Expert Comment

by:Patrick Matthews
ID: 38772612
George,

With respect, what would the "following sentence" have been?  The sentence I quoted was the last one in your original post.

Indeed, your second comment made exactly the same requirement:

However, the information I need will always begin immediately following the final underscore.

So, either it is the last underscore, or it isn't.  If it's not, then what is it?

Also...

I need the group of characters that contains the hyphen

That was from the original question.  What happens for values that have no hyphen?  Your sample database included the following entries with no hyphen:

FName
IMC_FSB_P_L1_A_Process Waste Lift Stations Rack_10.05.12.pdf
IMC_FSB_P_L1_A_Process Waste Lift Stations_10.05.12.pdf
IMC_FSB_P_L1_Seismic Support B_R2_10.26.12.pdf
IMC_FSB_P_L1_Seismic Support Pipe Layout.pdf
IMC_FSB_P_L1_Seismic Support_A_10.16.12.pdf
IMC_FSB_P_L1_Seismic Support_E_10.16.12.pdf
0492_001.pdf

Open in new window


What should be returned for those values?

Just trying to figure out what you really need, because so far you have not made it at all clear :)

Patrick
0
 

Author Comment

by:gedwardnelson
ID: 38772731
Patrick,

I think I was pretty clear in my explanation and the follow-up attempt to clarify. Though I realize what's clear in my mind may not always be clear to the oerson(s) I'm explaining it too. After re-reading it I agree it could have been more clearly stated. But if you look at the examples I gave they do show the underscores after the hyphen as being part of result needed. I'll do better next time. :-)

I regards to the records you found that absolutely do not fit the criteria they will be handled individually. They're only a few. This is a work in progress. If I'd have remembered they were in the sample I would have removed them to avoid the confusion.

In my previous response I stated that JerryB's solution was the one that worked. That was what I was after.

Thanks for your help!
George
0

Featured Post

Why You Should Analyze Threat Actor TTPs

After years of analyzing threat actor behavior, it’s become clear that at any given time there are specific tactics, techniques, and procedures (TTPs) that are particularly prevalent. By analyzing and understanding these TTPs, you can dramatically enhance your security program.

Join & Write a Comment

This article is a continuation or rather an extension from Cascading Combos (http://www.experts-exchange.com/A_5949.html) and builds on examples developed in detail there. It should be understandable alone, but I recommend reading the previous artic…
In a multiple monitor setup, if you don't want to use AutoCenter to position your popup forms, you have a problem: where will they appear?  Sometimes you may have an additional problem: where the devil did they go?  If you last had a popup form open…
Familiarize people with the process of retrieving data from SQL Server using an Access pass-thru query. Microsoft Access is a very powerful client/server development tool. One of the ways that you can retrieve data from a SQL Server is by using a pa…
With Microsoft Access, learn how to start a database in different ways and produce different start-up actions allowing you to use a single database to perform multiple tasks. Specify a start-up form through options: Specify an Autoexec macro: Us…

708 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

14 Experts available now in Live!

Get 1:1 Help Now