Smart Parsing and Validation Schema

Posted on 2006-05-03
Last Modified: 2010-05-01
I am working on an application where I receive text files from different sources:
Here is one record of one of the files:

I also have tables in SQL 2000 containing the parsing schema and validation rules. For this type of file (above) I have the following records in a table (fields are: FieldOrder, FieldName, StartPosition, Length, Rules):
1    OrderNumber       1     6   ANNNN
2    OrderDate           7    10   MM/DD/YYYY
2    OrderDate           7    10   <=today
3    ItemNumber       17     4   NNNN    
4    Quant                21     3   NNN
4    Quant                21     3   >0
5    UnitPrice           24    10   NNNNNNNNNN
5    UnitPrice           24    10   >0
6    TotalInvoice      34    10   NNNNNNNNNN
6    TotalInvoice      34    10   =Field4 * Field5

Where: A = alpha, N = number, etc
As you can notice a field can have more than 1 validation and some of them like Field 6 require access to other fields in the record (TotalInvoice = Quant * UnitPrice).
I am trying to create a routine in VB 6 to handle the parsing and validation based on the schema from the database so when the schema is changed (new fields, delete fields, new validation rules, etc), or a new schema is added to the database no need to change the program.

The VB 6 program would basically do this:
- open text file – I know how to do this
- read DB to identify schema (the one described above as an example) – I have several schemas – I know how to do this
- read each line of the text file and be smart enough to identify each field (X47222 is the ordernumber and so on) and do the validation based on the schema retrieved from the database. – I don’t know how to do this one…

If you think there is a better way to do it I am completely open, as long as the schema is on the database and the code is written in VB6.

Thanks in advance..

Question by:cetafudd
    LVL 76

    Expert Comment

    I think that the core of your problem is how to organise the parsing of each field.
    I suggest having a function for each rule, or group of rules, if some can be grouped. You would step through each field specification get the relevant part of the data string, and call the applicable function, something like this:

    Function ParseLine(strLine As String, rsParseTable As Recordset) As Boolean
        Dim strFieldData As String
        Dim strRule As String
        ParseLine = True
        Do Until rs.EOF
            strFieldData = Mid$(rs.Fields("StartPosition"), rs.Fields("Length"))
            strRule = rs.Fields("Rules")
            Select Case strRule
                Case "<=today"
                    ParseLine = IsBeforeToday(strFieldData) 'Suggested function provided below
                Case "MM/DD/YYYY"
                    ParseLine = IsDate(strFieldData)
                Case ">0"
                    ParseLine = IsGreaterThanZero(strFieldData) 'needs a function written
                Case Else
                    If Len(Trim$(Replace(Replace(strRule, "A", ""), "N", ""))) = 0 = 0 Then 'Only "A"s and "N"s
                        ParseLine = IsAlphaNumericPattern(strRule, strFieldData) 'needs a function written
                        If InStr(strRule, "Field") > 0 Then
                            ParseLine = OtherFieldValue(strRule, strFieldData) 'needs a function written
                            ParseLine = False
                            MsgBox "Unrecognised rule; " & strRule
                        End If
                    End If
            End Select
            If ParseLine = False Then
                Exit Function
            End If
    End Function

    Function IsBeforeToday(strDate As String) As Boolean
        If IsDate(strDate) Then
            If DateDiff("d", CDate(strDate), Now) > 0 Then
                IsBeforeToday = True
            End If
        End If
    End Function
    LVL 2

    Author Comment

    Thank you for replying Graham.. very nice code, I am sure I could use it.
    I still have a problem here, how do I validate field number 6 since it depends on 2 other fields? I can't have something hard coded because today the rule is that it has to be equal to Field4 * Field5, tomorrow could be Field4 * Field5 + 10% and so on....It has to be more dynamic so I dont have to change my VB code every time a rule changes or new rules are created.
    We would have basically 2 sets of rules:
    1) field formatting and datatype (those are field size, numeric, alpha, combination, dates, etc)
    2) field value - this can also involve some calculation like:
             a) a math method against this or any other record in the current record (like Field 6)
             b) condition...yikes....I will explain what I have in mind:
                          example of rule for field10: #field5 > 100,field5 * 0.10,field5 * 0.07#
                          this (or any other convention) would tell me that the correct value of field 10 will depend on the value of field 5, if field5 greater than 100 field 10 is 10% of it, otherwise it is 7 %....and so on

    It is more like having a generic routine in VB that would interpret the rules and do the right validation..
    If I am able to accomplish such code, I could change the rules for field10
    from #field5 > 100,field5 * 0.10,field5 * 0.07#
    to #(field5 / 2) + field57 #
    or have my rule for another field like #at(1, 4, 6, 15)# (the only valid value are 1, 4, 6, or 15)....and so on...

    Again, this design is still open and it can be changed as long as I keep the templates and rules in the DB and code is VB 6.

    What do you think?

    LVL 76

    Expert Comment

    If the maths is limited to a fairly strict pattern, you could write your own evaluator, otherwise you could consider using the Eval function that is available with the Microsoft Script control.
    LVL 2

    Author Comment


    It will follow a fairly strict pattern - could you provide some code based  the examples I mentioned?
    Besides the maths would I be able to do the conditions I mentioned on item 2???
    LVL 76

    Accepted Solution

    With the rules that reference other fields, I would plug in the values of the other fields first with a function like this:

    Function PlugFieldValues(strLine As String, strRule As String, strThisField, rsParseTable As Recordset)
        Dim rs As Recordset
        Dim strSplit() As String
        Dim iOtherFieldNumber As Integer
        Set rs = rsPS.Clone 'to avoid changing current record pointer of original recordset
        strSplit = Split(strRule, "Field")
        For i = 1 To UBound(strSplit)
            iOtherFieldNumber = Val(strSplit(i))
            Do Until i = iOtherFieldNumber
                i = i + 1
            strOtherFieldValue = Mid$(strLine, rs.Fields("StartPosition").Value, rs.Fields("Length").Value)
            strSplit(i) = Replace$(strSplit(i), CStr(iOtherFieldNumber), strOtherFieldValue)
        Next i
        PlugFieldValues = Join(strSplit)
    End Function

    So that #field5 > 100,field5 * 0.10,field5 * 0.07#

    becomes #99 > 100,99 * 0.10,99 * 0.07#

    Strip the #s off. Then use Eval to calculate and return the result:

    Function EvaluateConditional(strCondition As String) As String
        Dim strParts() As String
        strParts = Split(strCondition, ",")
        If Form1.ScriptControl1.Eval(strParts(0)) Then
            Conditional = Form1.ScriptControl1.Eval(strParts(1))
            Conditional = Form1.ScriptControl1.Eval(strParts(2))
        End If
    End Function


    Write Comment

    Please enter a first name

    Please enter a last name

    We will never share this with anyone.

    Featured Post

    Enabling OSINT in Activity Based Intelligence

    Activity based intelligence (ABI) requires access to all available sources of data. Recorded Future allows analysts to observe structured data on the open, deep, and dark web.

    Introduction I needed to skip over some file processing within a For...Next loop in some old production code and wished that VB (classic) had a statement that would drop down to the end of the current iteration, bypassing the statements that were c…
    Most everyone who has done any programming in VB6 knows that you can do something in code like Debug.Print MyVar and that when the program runs from the IDE, the value of MyVar will be displayed in the Immediate Window. Less well known is Debug.Asse…
    Get people started with the process of using Access VBA to control Outlook using automation, Microsoft Access can control other applications. An example is the ability to programmatically talk to Microsoft Outlook. Using automation, an Access applic…
    Get people started with the utilization of class modules. Class modules can be a powerful tool in Microsoft Access. They allow you to create self-contained objects that encapsulate functionality. They can easily hide the complexity of a process from…

    737 members asked questions and received personalized solutions in the past 7 days.

    Join the community of 500,000 technology professionals and ask your questions.

    Join & Ask a Question

    Need Help in Real-Time?

    Connect with top rated Experts

    18 Experts available now in Live!

    Get 1:1 Help Now