Link to home
Start Free TrialLog in
Avatar of pdvsa
pdvsaFlag for United States of America

asked on

Importing Excel data that doesn't have an [ID]

Experts,

I want to know how an expert would import excel data that does not have a true [ID].
I am importing an excel spreadsheet that I do not have control over.
If I did, I would add an [ID] but I dont know if that [ID] must stay fixed for the life of that record.  

I have something that is close to an [ID] in the excel file but its not unique in every case so I have to use [CompanyName] and another field [TradeNo] and these 2 are essentially my [ID] when I build reports, append data and display data on forms but doing this bothers me as I dont have one unique [ID].

I want to propose a meeting with the dept that controls this excel file and tell them that to import data correctly, I need a unique identifier for each record and the data set doesn't have one.  However, if I let them control this [ID] field then I don't know if simply dragging down the [ID] in excel is the approach to creating [ID]'s.  I have to import this excel data set almost daily and I don't know if an [ID] that was on say record 2 was for example [ID] = 10 and now for that same record is [ID] = 20 on the next import and then [ID] = 30 for the next import and so on and so forth.  Maybe it doesnt matter if the [ID] dynamically changes like that. The owners of the excel file constantly add and delete records and to populate the [ID] field, they would simply drag down [ID] from row 1 to row x.  Maybe there is no issue created if you drag down this [ID] field in excel because its not necessary to keep that [ID] fixed for the life of that record but to me, it seems I am setting myself up for big issues as I would have access reports and data displayed based on the [ID] if this field was added. And, I think this is the most important point: I also make edits to the records on my side and I want to keep those edits for each record and not lose all my edits the next time I import data so that [ID] can never change once its assigned to a record.  

I imagine this is a real issue many experts have faced and I am interested in knowing how experts handle this.  Any thoughts are helpful.  
Thank you
Avatar of Tom Farrar
Tom Farrar
Flag of United States of America image

You can import an Excel file without an ID.  I sounds like the concatenation of CompanyName and TradeNo is a unique ID.  It sounds like the combination of those two fields result in a unique transaction.  It would be more ideal perhaps to have a transaction number created from the system capturing the transactions, but it doesn't sound like that is the case.  Creating a sequential ID for the sake of doing that without controlling doesn't seem logical to me.  I am sure there are other thoughts out there.  Let's see.
I'm slightly lost on the details, but think I have it right.  You are importing data daily from an Excel spreadsheet and (this is very important) you want the records in the new data to be attached (or to update) the data in the old data (as opposed to always being new, unrelated data).  Is this correct?

If it is correct, then it's a very simple question to ask those who manage the spreadsheet: How am I to connect the new data to the old data?  They'll have to come up with something very specific, such as "If they CompanyName and TradeID match, then they should be tied together".  You'll then have to decide if that is really a unique pairing.  If it is, then you can make an [ID] from it and all is well.  If not, then press them further after showing them that their answer isn't sufficient.

Non-technical people have difficulty with things that are easy for them to sort out (e.g. Fred Smith, Freddie Smith, and Fred G. Smith are the same person) and don't understand why computers (actually, programmers) have such difficulty with these decisions.  People who work with spreadsheets a lot shouldn't have these difficulties.
Avatar of pdvsa

ASKER

ok thank you for the thoughts.  I will keep the question open to here from other experts.  I am listening.


@CompProbSolv

< You are importing data daily from an Excel spreadsheet and (this is very important) you want the records in the new data to be attached (or to update) the data in the old data (as opposed to always being new, unrelated data).  Is this correct?

==>Yes.  



As others said you give too much info.

The simple scenario....just import the Excel to a table e.g. ExcelImporttbl_tmp

So you finished the import.

Now you have a table with the same structure but it has an additional field ID --> Autonumber e.g  ExcelImporttbl

So perform an append query that appends the data from  ExcelImporttbl_tmp to  ExcelImporttbl...

i guess this should be enough

IF you want some other control then you need Excel Automation to iterate all cells ...examine the values and import accordingly

Some other thoughts on your statements:

"I want to know how an expert would import excel data that does not have a true [ID]."

What does the "data" represent?  Is each combination of CompanyName & TradeNo a business transaction that will show up in the financial records?  What purpose do you think the true ID would serve (not that I am against the ID, just to understand)?

"I am importing an excel spreadsheet that I do not have control over."

What is the source of the spreadsheet you are importing?  Did it come from some business system (i.e.SAP, JDE) ?  Are you suggesting that the data (transactions) that you import one month can be changed the next month?  What type of control are you trying to achieve?  

"If I did, I would add an [ID] but I dont know if that [ID] must stay fixed for the life of that record."

Without understanding what each record stands for, and assuming it is a business transaction of some sort, I would agree that whatever ID is assigned to the transaction should not change.  If there are updates to the transaction as captured in one month, the update (adjustment) should come through as a different ID.  This would be true even if you were using CompanyName & TradeNo.

Lastly I cannot imagine data (transactions) coming from a legitimate business system would not have some type of ID that is unique for that transaction.  Maybe the party providing you the spreadsheet are not giving you everything you need.
Avatar of pdvsa

ASKER

Hi Tom,


< What purpose do you think the true ID would serve (not that I am against the ID, just to understand)?

I update records on my side and almost every day I re-import data.  Its an entirely new dataset but some of the records I already have in the db and I have made edits to the data and I need to keep that data and not lose it on a new import.  


< What is the source of the spreadsheet you are importing?  Did it come from some business system (i.e.SAP, JDE) ?  Are you suggesting that the data (transactions) that you import one month can be changed the next month?  What type of control are you trying to achieve?  

Its an excel file and it doesnt come from any system like SAP, JDE or any other system.  Its a manually updated spreadsheet.  The record set is not big..only about 50 records.  The only control I would like is making sure that the [ID] if it was added, would be fixed with that record.  


< Lastly I cannot imagine data (transactions) coming from a legitimate business system would not have some type of ID that is unique for that transaction.

The data is not form a system like SAP.  




Well it sounds like the process is all manual.  How are you managing (identifying and eliminating) duplicate records that come in more than once?  What is it about that (your process for managing) that is not working?  Why would adding an ID to the records coming across make it a better process for you?
Avatar of pdvsa

ASKER

I have to check for duplicates.  It does happen and I have query for it.  The append is not working as I would like because of the ID  issue and the problem is that if I were to add an ID to excel it might accidentally change due dragging down the ID and the owner of the file not being careful to keep the ID fixed for a record.  If an ID coujld be added and remained fixed with that record its entire life then I could import data with more confidence that its correct.  

Not actually performing the import/append you are talking about, I am not sure I have further thoughts.  If you think the solutions for a unique ID lies with the originator of the spreadsheet (and I agree), and you are finding a workaround in Excel onerous and risky, then you are right in insisting the originators of your spreadsheet establish a methodology for assigning a unique ID that follows the record.  I would be interested in how this transpires.

If you share a sample of your Excel we could guide you better.

I am not sure if I understood correctly but if the issue is duplicate data you could first import them and then get rid of duplicates... probably constructing a PK on the way

Avatar of pdvsa

ASKER

Hi John, I have attached the excel data.  You can see there is no "real" ID.  Trade_No is close to unique but if you sort you can see there are dupes and dupes on co name too.  I have put comments in the file on each column that is not unique.  As discussed previous, adding an ID would be easy but I think that (I am not sure on this) that if that ID were to change then I could have a problem because in my db that record would have a different ID and updating to the current (new) ID would be difficult since all the other data in the recordset is not unique. 


SampleData_Import.xlsx

Certainly CompanyName & TradeNo is not unique.  The closest thing I saw as unique is Trade and Trade2, but it isn't.  I am assuming each line is a record of buy and sale of gasoline quantities (or something like that), buying from one company and selling to another.  Perhaps the bigger question here is what is the importance of capturing and keeping historical records of these records?   If these are purchases and sales I would guess these transactions would need to be recorded somewhere in a general ledger for financial record purposes.  Trying to keep your db consistent with what is recording financially would be a chore.  But keeping separate (not integrated) reportings (your spreadsheet and financial general ledger) is not only a duplication of work, but will result in inconsistent data at some point if it hasn't already.

Of course what I have said is based upon my assumption as to what the data on your spreadsheet represents and how it is used.  The one thing I will be clear about is some ID is needed, but the question is for what.  Each record?
Avatar of pdvsa

ASKER

Hi Tom, thanks for the response. 

< Perhaps the bigger question here is what is the importance of capturing and keeping historical records of these records?

I need to keep the history as that is a necessary part of my job.  I can not lose that history.  


<The one thing I will be clear about is some ID is needed, but the question is for what.  Each record?

Yes, it would be for each record.  Unique.  

My point on the history is that there most likely are financial transactions that also represent the history.  To me, those two sources of the transaction history need to be kept in balance.  Else someone's history is not correct.  Having said that I am of the opinion you are correct in requesting some type of ID to uniquely maintain the history.
Avatar of pdvsa

ASKER

Thank you very much Tom.  I appreciate the advice.  

ASKER CERTIFIED SOLUTION
Avatar of John Tsioumpris
John Tsioumpris
Flag of Greece image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial