I require some assistance/guidance with a general data management scenario. Allow me to provide some background first.
• I have a spreadsheet that may include thousands of records across approximately 25+ columns. Columns can be of types [date], [number], [text]. Attached XLS is a very simplified version that only contains only 14 records across four columns.
• My actual worksheet, however, may contain 2,000 rows/records (+ header row). And, I will have duplicate records (for a reason that doesn’t need to be covered in this thread) for each single record.
• So, essentially, in the actual data set, I may have only 1,000 unique records (based on 3 or 4 key fields). In the example data set, however, I only included 14 records where seven records are unique records (not across all fields though).
• Again, there are “slight differences” in each of the seven (7) paired records. Please refer to additional notes in cell range B17:D31 that specifically explain details on the record variances.
What I Would Like to Achieve:
• In essence, I want to review the 14 records and remove their associate 2nd duplicate record and ultimately end up with only 7 records.
• The trick is though, there’s no “general algorithm” that states that I will always keep the 1st record and trash the 2nd one (or vice versa). Instead, I want to **merge** two records into one but let the user decide that, e.g., for the record dated “07/15”, I will use [Submission ID] = 142 and overwrite ID 163. Alternatively, for the record dated 10/02, [Submission ID] = 189 should be kept by overwriting ID 190.
• Similarly, the same holds true for the field “Example Summary”, the user will pick what he/she believes depicts a better summary. Please keep in mind, for the [Example Summary] field, a longer summary does not mean more accurate summary. Again, it’s the user’s decision to pick one of the two.
What Makes This Process Tricky:
• Now, for another reason which doesn’t require to be explained in detail here, the user prefers using an **Access database** over and **MS-Excel spreadsheet**.
• If Excel were the tool of choice, a simple “drag ‘n drop” or “copying/pasting” one cell onto another cell would allow to accomplish the task somewhat easy (but it could be very tedious).
• Please keep in mind though, the actual data set will have thousands of records and dozens of columns. And the user does not want to add “helper columns” with formulae equal to “=If(B2=B3, “Ok”, “Error”.
• So, again, this is where the user prefers using an Access database to automate or at least semi-automated the record update process.
• Thus, finally, I have attached the same XLS data structure in MS-Access as well.
• Based on the background & information provided in this thread (combined with the notes in the XLS), do you have a specific recommendation for developing, e.g., an Access query or module that executes a “pair-wise comparison” (again, in this example, only based on [Date] field)?
• In other words, in the actual dataset, I want to shrink my, e.g., 2,000 records (or 1,000) pairs down to only 300 records (and 150 pairs) to be reviewed. Then, depending on the field’s content, the user can either pick value from record #1 and overwrite value for record #2 or vice versa.
• If so, how could this update process be accomplished (e.g., using a form) in Access?
Thank you for your assistance or advice in advance.