anu2004

How to parse the contents of a text file using VB.NET

I have a text file as follows:

<Task>  Testing for task </Task>

<TaskDetails> Testing for task details.This
can span multiple lines
like this. </TaskDetails>

<SubTask> Testing for SubTask. This can also
span mutliple lines</SubTask>

<SubTaskDetails> Sometimes there can be space like empty lines which must be ignored. This can happen between tasks or subtasks.</SubTaskDetails>

<SubTask> This is another subtask with
lines

between

lines
Andall the lines between must be preserved. </SubTask>

<SubTaskDetails>
The line after the last subtask and the start of this subtaskdetails however must be ignored.
</SubTaskDetails>

<SubTask>  Another subtask </SubTask>
<SubTaskDetails> Sub Task Details go here </SubTaskDetails>

<SubTask> Subtasks can have tags in separate lines as well.
</SubTask>

<SubTaskDetails> There can be a < or > in the body of hte text and they must be processed correctly.
Subtask details can also be multiple
lines like this.
</SubTaskDetails>

<Task>
Anotehr task
</Task>
<TaskDetails>
Task details for this task go here
</TaskDetails>

<SubTask> This is a subtask that belongs to this task and not the previous task.
</SubTask>

<SubTaskDetails>
Details of the subtask go here. THere can be lot more subtasks for this task.
</SubTaskDetails>

I want to parse the contents (i.e. the text between the tags) and populate text fields with them. How can I do this in VB.NET only?
The file is a plain text file, not an XML or HTML file.
Thanks
zulu_11

I believe the text you are describing is actually XML, and if that is the case, the best method would be to use the XML namespace (System.Xml).

Zulu

ASKER

It is absolutely not XML; it is only a .txt file.
Thanks
 
Could you post your file here?

zulu

ASKER

How do I post it?
What I meant was to paste the complete contents of the file that you want to parse, and to describe which text fields should be filled with which data. You need to be specific about the functionality you want!

Zulu

ASKER

The entire file is what I pasted before.
I want to get the contents between <Task></Task>, <TaskDetails></TaskDetails>, <SubTask></SubTask>, <SubTaskDetails></SubTaskDetails>, etc., and place them in strings so that I can load them into text boxes.
Thanks
Regular expressions are your best friend.  You need to have this statement at the top of your code file:

Imports System.Text.RegularExpressions

Then, in your code, declare a regular expression object.

Dim rex As Regex

'if you want to parse multiple delimiters in one line, use a collection
Dim rexColl As MatchCollection

'no square brackets here: "[<Task>]" is a character class that matches any single <, T, a, s, k or >
rex = New Regex("<Task>")

rexColl = rex.Matches(line) 'this will make a collection of the matches (and their indexes) found in a single string

'to extract the data you use a substring, such as the one below, which extracts everything between 2 matches
'(it needs at least two matches on the line, otherwise Item(1) throws an exception)
Dim first As Integer = rexColl.Item(0).Index + rexColl.Item(0).Length
data = line.Substring(first, rexColl.Item(1).Index - first)

'or, in the case of your file, reading line by line

If rex.IsMatch(line) Then
         'do whatever when you find a match
End If
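
As a side note on the pattern syntax, here is a minimal, self-contained sketch (the module name and the sample line are only for illustration). Square brackets in a regular expression define a character class, so a bracketed pattern matches single characters rather than the whole tag, while the unbracketed pattern matches the tag as a unit. The sketch also guards the Substring call so that it only runs when both tags were found on the line.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Public Module PatternDemo
    Sub Main()
        Dim line As String = "<Task>  Testing for task </Task>"

        'Character class: counts every single <, T, a, s, k or > in the line.
        Console.WriteLine(New Regex("[<Task>]").Matches(line).Count)

        'Literal patterns: each one matches a whole tag.
        Dim startTag As Match = Regex.Match(line, "<Task>")
        Dim endTag As Match = Regex.Match(line, "</Task>")

        'Only take the substring when both tags were actually found on this line.
        If startTag.Success AndAlso endTag.Success Then
            Dim first As Integer = startTag.Index + startTag.Length
            Console.WriteLine(line.Substring(first, endTag.Index - first).Trim())
        End If
    End Sub
End Module
```

Checking Success (or the Count of a MatchCollection) before indexing is what avoids an out-of-range exception on lines that do not contain the tags.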

I hope this helps
Majin Loki

ASKER

I have used a StreamReader, and when I use this line:
data = line.Substring(rexColl.Item(0).Index + 1, rexColl.Item(1).Index - rexColl.Item(0).Index - 1)

it raises an "IndexOutOfBounds" exception.

I'm sorry, I wasn't exactly clear. If you are reading line by line, you should not use a collection.  A collection is best when you have multiple instances of a certain string that occur at variable intervals on a single line, such as in comma-delimited or tab-delimited data.  If you read line by line, at least in your sample, you are not going to find more than one match per line.  Use the regular expression and do this test.

If rex.IsMatch(line) Then
         'do whatever when you find a match
End If


You can use another regular expression to test for the ending tag.  In the example I gave you, we are testing for the <Task> tag.  I'll try to elaborate a little more here to make things clearer.  Most of what I've learned has been through trial and error, so I am unsure as to why or how it works.  I didn't test this code, but you should be able to get the gist of it and do some experimenting yourself.

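Dim startRex As Regex
Dim endRex As Regex
Dim data As String, line As String

'create our starting and ending tag regular expressions
'(no square brackets: "[<Task>]" would match any single one of those characters)
startRex = New Regex("<Task>")
endRex = New Regex("</Task>")

'priming read from the file (oRead is assumed to be an open StreamReader)
line = oRead.ReadLine()

'until EOF (ReadLine returns Nothing at the end of the file)
While Not line Is Nothing
    If startRex.IsMatch(line) Then
        'skip past the start tag; 6 is the length of "<Task>"
        data = line.Substring(startRex.Match(line).Index + 6)
        If endRex.IsMatch(data) Then  'if the ending tag is on the same line
            data = data.Substring(0, endRex.Match(data).Index)
        Else 'we must read on.  I will assume that the file always contains ending tags
            line = oRead.ReadLine()
            While Not endRex.IsMatch(line)
                'ReadLine strips the line break, so add it back to preserve the lines
                data = data & vbCrLf & line
                line = oRead.ReadLine()
            End While

            'if we are outside the loop, we have found an end tag
            data = data & vbCrLf & line.Substring(0, endRex.Match(line).Index)
        End If
        'data now holds the text between <Task> and </Task>
    End If
    'move on to the next line
    line = oRead.ReadLine()
End While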
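
If the file is small enough to hold in memory, an alternative to reading line by line is to load it in one go and let a single regular expression find every tag pair. This is only a sketch under assumptions: the four tag names are taken from the question, the sample string stands in for the file contents, and the module and variable names are made up for illustration. RegexOptions.Singleline lets the dot match line breaks, so multi-line bodies and blank lines inside a tag are kept, and the \1 backreference requires the closing tag to match the opening one.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Public Module WholeFileParse
    Sub Main()
        'Sample text standing in for the file; in practice it could come from
        'StreamReader.ReadToEnd() on the opened file.
        Dim nl As String = Environment.NewLine
        Dim fileText As String = _
            "<Task>  Testing for task </Task>" & nl & nl & _
            "<SubTask> lines" & nl & nl & "between </SubTask>" & nl & _
            "<SubTaskDetails> a < or > in the body" & nl & "</SubTaskDetails>"

        'Group 1 is the tag name, group 2 is everything up to the matching closing tag.
        Dim tagRex As New Regex("<(Task|TaskDetails|SubTask|SubTaskDetails)>(.*?)</\1>", _
                                RegexOptions.Singleline)

        For Each m As Match In tagRex.Matches(fileText)
            Dim tagName As String = m.Groups(1).Value
            Dim content As String = m.Groups(2).Value.Trim()
            Console.WriteLine(tagName & ": [" & content & "]")
        Next
    End Sub
End Module
```

Because the body is matched lazily up to the matching closing tag, a stray < or > inside the text does not confuse it, and anything between a closing tag and the next opening tag is skipped. Trim() removes the whitespace around each body; drop it if leading and trailing blanks matter.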

If you need any more help or clarification, let me know.

Majin Loki

ASKER

Thank you. I will explain my task.
I have a text file as posted above. It contains the tags <Task></Task>, <TaskDetails></TaskDetails>, <SubTask></SubTask> and <SubTaskDetails></SubTaskDetails>.
These tags are fixed; all the data will be inside these tags only.
The file can contain any number of <Task></Task> or <SubTask></SubTask> entries, etc.
Until another <Task> tag is found in the file, all of the contents belong to the first <Task>.
So while reading the file, when a <Task> tag is encountered, the contents between <Task></Task> are saved in the database table named Task (which contains the columns Taskid (autonumber), Task and TaskDetails), and the <TaskDetails></TaskDetails> content is saved in that table under the same Taskid.
When <SubTask></SubTask> is encountered, its content is saved in the SubTask table (which contains the columns Taskid, SubTaskid (autonumber), SubTask and SubTaskDetails). The tables are related through the Taskid column.
And so on.
When another <Task> tag is encountered while reading the file, its contents and those of the subtasks that follow are saved in the Task and SubTask tables under a different Taskid and different SubTaskids respectively.
A task can contain any number of subtasks.
Every subtask of a task should be saved with the same Taskid and a different SubTaskid.

Please let me know how to do it.
Thanks
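
The grouping rule described above (everything up to the next <Task> belongs to the current task) could be sketched roughly as follows. This assumes the file has already been split into (tag name, text) pairs in file order by whatever parsing approach is used; the TaskItem and SubTaskItem classes and the BuildTasks function are hypothetical in-memory stand-ins for the two tables, and the database inserts themselves are left out.

```vbnet
Imports System
Imports System.Collections.Generic

'Hypothetical stand-in for a row of the SubTask table.
Public Class SubTaskItem
    Public Title As String
    Public Details As String
End Class

'Hypothetical stand-in for a row of the Task table plus its related subtasks.
Public Class TaskItem
    Public Title As String
    Public Details As String
    Public SubTasks As New List(Of SubTaskItem)
End Class

Public Module GroupingSketch
    'Walks the pairs in file order and attaches each piece to the most
    'recently started task or subtask.
    Function BuildTasks(ByVal pairs As List(Of KeyValuePair(Of String, String))) As List(Of TaskItem)
        Dim tasks As New List(Of TaskItem)
        Dim currentTask As TaskItem = Nothing
        Dim currentSub As SubTaskItem = Nothing

        For Each pair As KeyValuePair(Of String, String) In pairs
            Select Case pair.Key
                Case "Task"
                    'A new task starts; later subtasks belong to it.
                    currentTask = New TaskItem()
                    currentTask.Title = pair.Value
                    currentSub = Nothing
                    tasks.Add(currentTask)
                Case "TaskDetails"
                    If Not currentTask Is Nothing Then currentTask.Details = pair.Value
                Case "SubTask"
                    If Not currentTask Is Nothing Then
                        currentSub = New SubTaskItem()
                        currentSub.Title = pair.Value
                        currentTask.SubTasks.Add(currentSub)
                    End If
                Case "SubTaskDetails"
                    If Not currentSub Is Nothing Then currentSub.Details = pair.Value
            End Select
        Next
        Return tasks
    End Function

    Sub Main()
        'Made-up sample pairs in the order they would appear in the file.
        Dim pairs As New List(Of KeyValuePair(Of String, String))
        pairs.Add(New KeyValuePair(Of String, String)("Task", "Testing for task"))
        pairs.Add(New KeyValuePair(Of String, String)("TaskDetails", "Testing for task details."))
        pairs.Add(New KeyValuePair(Of String, String)("SubTask", "Testing for SubTask."))
        pairs.Add(New KeyValuePair(Of String, String)("SubTaskDetails", "Details of the subtask."))
        pairs.Add(New KeyValuePair(Of String, String)("SubTask", "Another subtask"))
        pairs.Add(New KeyValuePair(Of String, String)("Task", "Another task"))
        pairs.Add(New KeyValuePair(Of String, String)("SubTask", "Belongs to the second task."))

        For Each t As TaskItem In BuildTasks(pairs)
            Console.WriteLine(t.Title & ": " & t.SubTasks.Count.ToString() & " subtask(s)")
        Next
    End Sub
End Module
```

When saving, each TaskItem would be inserted first so that its autonumber Taskid is known, and that Taskid would then be used for every SubTaskItem in its SubTasks list.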





ASKER

I need a solution to my question urgently.

ASKER

Please close my question; I have solved the problem myself.
Thanks

ASKER

Please delete my question.
ASKER CERTIFIED SOLUTION
PashaMod
