Transform data set from list to table record

Good Afternoon Professionals,

I have a data set that came in the form of a pdf file.  The data was converted to *.txt format.

Issue:
The data set (as is) is formatted as below:


1/12/04
11:00 AM
EST
Confirmed
(Download
appointment to
Outlook or Palm)
Location:
Event: Hearing PROVISIONAL ORDER HEARING
Length:
Hearing Type: PR
1/23/04
10:00 AM
EST
Confirmed
Location:
Event: Hearing PROVISIONAL ORDER HEARING
Length:
Hearing Type: PR
6/21/04
9:00 AM
EST
Cancelled
Location:
Event: Hearing PROVISIONAL ORDER HEARING
Length:
Hearing Type: PR
6/29/04
9:00 AM
EST
Cancelled
Location:
Event: Hearing FINAL HEARING ON DIVORCE
Length:
Hearing Type: FH
7/13/04
9:00 AM
EST
Cancelled
Location:
Event: Hearing FINAL HEARING ON DIVORCE
Length:
Hearing Type: FH
7/20/04
2:15 PM
EST
Cancelled
Location:
Event: Hearing PROVISIONAL ORDER HEARING
Length:
Hearing Type: PR
7/26/04
11:00 AM
EST
Confirmed
Location:
Event: Hearing PROVISIONAL ORDER HEARING
Length:
Hearing Type: PR
8/3/04
9:00 AM
Location:
Page 2 of 87 Civil Case Detail
9/25/2009 https://www.doxpop.com/prod/court/ViewCaseDetails?caseId=1812404DR0001

EST
Cancelled
Event: Hearing FINAL HEARING ON DIVORCE
Length:
Hearing Type: FH
8/10/04
9:00 AM
EST
Cancelled
Location:
Event: Hearing FINAL HEARING ON DIVORCE
Length:
Hearing Type: FH
10/8/04
1:30 PM
EST
Cancelled
Location:
Event: Hearing OTHER
Length:
Hearing Type: OT
11/12/04

I need to get the data set into a record set of the format of, for example, I'll use the last record above to display the intended outcome:
f1: 10/8/04
f2: 1:30 PM
f3: EST
f4: Cancelled
f5: Location:
f6: Event: Hearing OTHER
f7: Length:
f8: Hearing Type: OT
Then, new record....   11/12/04

So I have to catch the "first record" starting with the date ##/##/####, THEN catch each line as a field of that record, UNTIL, the next date ##/##/####, and so on....

I am very open to all idea's !

Thank you,

MyRocco
MyroccoAsked:
Who is Participating?

Improve company productivity with a Business Account.Sign Up

x
 
SimonConnect With a Mentor Commented:
Hi Myrocco,

Having looked at the files, there are some very different formats to the example you posted in your question.

e.g. the entries appear to start with a line "Entry Date: 7/29/2009 " rather than just a date, and have "Minute entry:" fields in them with multiple lines of text.

These account for at least 80% of the file content.

I think we'd need to know which headings (columns) you'd like to capture the data under.

These are the entries followed by colons that appear most regularly in the example file you posted.
Location
Entry Date
Minute Entry
Notice Sent
Order on File
Type
Order
Event
Hearing Type
Length
EST
Confirmed
Cancelled
EDT
Order Location
0
 
SimonCommented:
The record you picked as an example seem straightforward enough, but how does this one fit to your column mapping:

8/3/04
9:00 AM
Location:
Page 2 of 87 Civil Case Detail
9/25/2009 https://www.doxpop.com/prod/court/ViewCaseDetails?caseId=1812404DR0001

EST
Cancelled
Event: Hearing FINAL HEARING ON DIVORCE
Length:
Hearing Type: FH 

Open in new window


It has more lines, so might need more columns unless some of it should be concatenated or disregarded?
0
 
MyroccoAuthor Commented:
Yes, that is correct.  The beginning of each record will be the date.  AND, you are also correct, there may be 5 fields, or up to 9 fields per record.
0
A proven path to a career in data science

At Springboard, we know how to get you a job in data science. With Springboard’s Data Science Career Track, you’ll master data science  with a curriculum built by industry experts. You’ll work on real projects, and get 1-on-1 mentorship from a data scientist.

 
SimonCommented:
So if there are only 5 fields, should they go in the first 5 columns, or attempt to get them in a consistent column, perhaps leaving columns 3,5,8 (for example) blank?

Is the LAST field (Hearing Type: ) consistent?

Is the FIRST field consistently a date on a line by itself - i.e. to distinguish it from this line (beginning with a date) that appears in the middle of a record:
9/25/2009 https://www.doxpop.com/prod/court/ViewCaseDetails?caseId=1812404DR0001
0
 
MyroccoAuthor Commented:
basically, what I am having an issue with is 'how to transpose the data from list to record form.  The current list will have the DATE as the beginning of the record, and, you are correct, there may be some blank fields in the table, but, who cares about that, I'll just concatenate those end fields.

So, the way I see this problem, is how to catch a beginning table primary key ie: the date, and get those items that are currently listed below the INTO a record, and, catch all data ASSOCIATED with that date.  

One way I was thinking was to write a query that shows the first record, then expr: 2nd record, expr1: 3rd record, expr:2 4 record, and so on, until the process SEE's a date, then, creates a new record.

what are your thoughts?
0
 
SimonCommented:
It's easy enough to parse a text file and start a new record every time it sees a date on a line by itself, but for me to spend time on it, I need answers to my questions first so that what I do meets your requirements without too much back and forth.

It would also help if you could post a full example of one of the text files so we can see how large they are and how much the data varies from record to record.
0
 
MyroccoAuthor Commented:
You got it !!

Please find the attached txt file.

I can not figure out how to start a new record by the date (field) and then pick up each line of data below the date (field) and transpose these lines into a new record.

Your assistance is MOST appreciated...

Thank you
0
 
MyroccoAuthor Commented:
0
 
MyroccoAuthor Commented:
Ummm, would there be a way to catch the field name of "location" as the first field, then grab the lines below the "location" and congeal that into one record.

I mean, I suppose I could manually create the records, however, this is only one of about 10 files that are formatted in this way.  

How would a person set up a query to recognize the "location" value, then, grab the subsequent lines, then, start a new record upon the lines named "location"

The variance between the records (number lines between 'each record') is kicking my butt...
0
 
MyroccoAuthor Commented:
I am going to re-post this question as the activity is lost. Points go to SimonAdept as it was his/her idea that gave me new direction.

The new direction is going to be something like this:

Given a listed data set with varying numbers of lines of data between "date points", I am thinking of a VBA solution that would go something like this. This is given using VBA with DAO and an incremented autonumber in the first field of the listed table (not seen below).

Data set:

1/12/04
11:00 AM
EST
Confirmed
(Download
appointment to
Outlook or Palm)
Location:
Event: Hearing PROVISIONAL ORDER HEARING
Length:
Hearing Type: PR
1/23/04
10:00 AM
EST
Confirmed
Location:
Event: Hearing PROVISIONAL ORDER HEARING
Length:
Hearing Type: PR
6/21/04
9:00 AM
EST
Cancelled
Location:
Event: Hearing PROVISIONAL ORDER HEARING
Length:
Hearing Type: PR
6/29/04

Proposed code construction model:

dim db1 as dao.recordset
dim db2 as dao.recordset
dim tbl1 as string
dim tbl2 as string
dim autonum as string

set db1 = currentdb.openrecordset ("NameOfTable", dbopendynaset)
set db2 = currentdb.openrecordset ("NameOfTable2", dbopendynaset)

db1.movefirst
db2.movefirst

set autonum =

Do while not db1.EOF

If db1.FieldName <> '##/##/####' then 'seeking the date as the first field of the record
   with db2
   .addnew
db2("Field2").value = db1![field2]

And so on.  I would need to figure out the VBA logic to pick up each subsequent line and place that into the newly created record based on a line that has a date in it.

Any help on this logic would function very well...

Thank you..
0
 
SimonCommented:
Thanks Myrocco.

When identifying the start of the record, remember that some entries appear to start with a line "Entry Date: 7/29/2009 " rather than just a date.

Look forward to seeing your revised question. It's all progress...
0
 
MyroccoAuthor Commented:
Thanks, I would run an update on the table first and replace the "entry date:" value with 'nothing' leaving just the  date itself.  Then, when a line with that date format would be picked up, I would use a counter string and update the record that was just created to place the below lined value in that record.  I just need to figure out the coded logic.... We'll what happens, because, i can not do anything in court until I get this historical data, and time IS of the essences. Thanks for your assistance !!
0
 
SimonCommented:
Your files aren't very large. You could post all 10 of them (or concatenate them into one file) in your next question. If you're going to use the data in court though, you need to be sure that whatever solution you get works reliably. You might need to store a column of metadata (to indicate which source file each result row came from).

Several different approaches might work, but as I showed in an earlier post, the data doesn't all 'look like' your initial example, so would be best to see the entirety of the data, if possible, before designing a solution.
0
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.

All Courses

From novice to tech pro — start learning today.