Reducing Data Duplication

Tom Farrar, Consultant
CERTIFIED EXPERT
Always working for better, faster and cheaper solutions!
Today companies are subjected to more and more data, and the flow won't stop any time soon. But there are obvious opportunities for reducing data, particularly data duplicated among companies.
"The enormous multiplication of books in every branch of knowledge is one of the greatest evils of this age; since it presents one of the most serious obstacles to the acquisition of correct information, by throwing in the reader's way piles of lumber in which he must painfully grope for the scraps of useful lumber, peradventure interspersed." -Edgar Allan Poe
 
Edgar Allan Poe died in 1849. His reference to “piles of lumber” was prescient for its time, but those piles were minuscule by today’s standards. Today there are near-infinite piles of lumber from which readers must extract “correct” information, and they are not just in books. Today’s corporate business systems are inundated with lumber too.

As computer technology advances, today’s data multiplies, and companies pedal faster to manage the ever-expanding data.  Though their efforts are herculean, data is growing faster than businesses can keep up.  This data overabundance complicates data management and impedes business productivity. 

So, what can companies do? Excluding Big Data*, and it is a big exclusion, efforts are currently underway to improve data management, and eliminating data duplication is in the mix. Though some efforts have eliminated much duplication within companies, cross-company duplication remains low-hanging fruit that has not been fully addressed.
 
 *“Big data is a broad term for data sets so large or complex that traditional data processing applications are inadequate.” – Wikipedia. For purposes of this article, Big Data includes image, video and audio forms of data. These forms can be addressed by Oracle, Google, Microsoft and other large data-handling companies, and by platforms such as Hadoop.
 
What Companies Have Done
Historically, a company’s business applications were developed independently and were not integrated into a common, unified system. Data was captured and processed in each application and not shared with other applications. Sales, Purchases and General Ledger applications each had their own data, with little to no sharing with the other applications. Data common to multiple applications was duplicated. Within companies, this duplication has largely been addressed.

Today, the previously disparate applications are integrated. Enterprise Resource Planning (ERP) systems (for example, JD Edwards and SAP) accomplished the integration with back-end databases where centralized data is shared across applications. This ERP integration has eliminated first-tier duplication and the errors that duplication caused.
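The idea of centralized data shared across applications can be sketched in a few lines of Python. This is only an illustration, not how JD Edwards or SAP is built: `CentralStore`, `Customer`, and the two application functions are hypothetical names invented here. The point it shows is that a change made once is seen by every application that reads the shared record.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    customer_id: str
    name: str
    address: str


class CentralStore:
    """Hypothetical stand-in for an ERP back-end database."""

    def __init__(self):
        self._customers = {}

    def save(self, customer):
        # One record per customer ID; saving again overwrites the old values.
        self._customers[customer.customer_id] = customer

    def get(self, customer_id):
        return self._customers[customer_id]


def sales_shipping_label(store, customer_id):
    # The Sales application reads the shared record instead of keeping a copy.
    c = store.get(customer_id)
    return f"Ship to: {c.name}, {c.address}"


def ledger_statement_header(store, customer_id):
    # The General Ledger application reads the very same record.
    c = store.get(customer_id)
    return f"Statement for: {c.name}, {c.address}"


store = CentralStore()
store.save(Customer("C-100", "Acme Supply", "12 Main St"))
store.save(Customer("C-100", "Acme Supply", "99 Oak Ave"))  # address corrected once
print(sales_shipping_label(store, "C-100"))
print(ledger_statement_header(store, "C-100"))
```

Because neither application holds its own copy of the address, there is no second version to fall out of date.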

But even with the integration ERPs provide, companies continue to drown in data as more of it is digitized. System complexity continues to grow and productivity slows. New approaches must be investigated. Creative, alternative data-handling solutions for reducing cross-company duplication must be among those approaches.
  
Eliminating Cross-Company Duplication
“Vendors And Customers Are Two Sides Of The Same Coin.”
Put simply, eliminating cross-company duplication is an extension of today’s ERP integration. The difference is that, instead of integrating data within a company, the integration must occur across companies. Such efforts would eliminate significant duplication.

Sharing “one set of data” between companies and eliminating duplicate, unnecessary data handling are two great starting points. Though both options have been pursued to some degree, progress has been slow, perhaps because of data-sharing security issues. Even so, moving forward is the only real option, as much remains to be done.

“One Set of Data” – The “one set of data” approach reduces duplication where companies independently capture and process the same data. This scenario is apparent where two companies process the same transaction, one as the vendor and the other as the customer. There are variations of this scenario, as the examples below show.

For the first example, take a business contract.  Each contract party has the contract terms and the related transactions, and a system to process them in.  This is duplication.  The result is two systems maintaining the same contract specifics, accounting for the same transactions, and tracking the same monies paid or collected. 
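As a rough sketch of the alternative, the Python below keeps a single contract record and derives each party's view from it. Everything here is hypothetical: `SharedContract`, `view_for`, and the field names are invented for illustration, amounts are held in whole cents, and the sketch ignores the access-control and security questions a real shared system would have to answer.

```python
from dataclasses import dataclass, field


@dataclass
class SharedContract:
    contract_id: str
    vendor: str
    customer: str
    amount_due_cents: int
    payments_cents: list = field(default_factory=list)

    def record_payment(self, cents):
        # A payment is captured once, for both parties.
        self.payments_cents.append(cents)

    def balance_cents(self):
        return self.amount_due_cents - sum(self.payments_cents)


def view_for(contract, party):
    """Return the same balance, labeled from the given party's side."""
    balance = contract.balance_cents()
    if party == contract.vendor:
        return {"role": "vendor", "receivable_cents": balance}
    if party == contract.customer:
        return {"role": "customer", "payable_cents": balance}
    raise ValueError(f"{party} is not a party to {contract.contract_id}")


contract = SharedContract("K-7", "Acme Supply", "Birch Retail", 100_000)
contract.record_payment(40_000)
print(view_for(contract, "Acme Supply"))
print(view_for(contract, "Birch Retail"))
```

The vendor's receivable and the customer's payable are the same number read from the same record, so there is nothing to reconcile between two systems.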

Another example, with a different twist, is maintaining personal name and address data. Credit-card companies, government agencies, doctors’ offices, and banks all capture the same names and addresses. In how many places is any one person’s data captured? A lot!

And in both examples cited, over time the data gets stale and inaccurate in company files, creating multiple versions (errors) of the “truth”.  So in addition to handling the data more than once, the multiple versions of the truth create a whole set of other problems to deal with.  The one-set-of-data approach can mitigate both problems.
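The “multiple versions of the truth” problem can be made concrete with a small Python sketch. It compares name and address records held by different organizations and reports the people for whom more than one address is on file. The sample records, the `normalize` rule, and the use of a name alone as the matching key are all simplifying assumptions made for illustration; real matching needs a far more reliable identifier.

```python
import re
from collections import defaultdict


def normalize(text):
    # Lowercase, drop punctuation, and collapse whitespace so that
    # "12 Main St." and "12 main st" compare as equal.
    text = re.sub(r"[^\w\s]", "", text.lower())
    return " ".join(text.split())


def versions_by_person(records):
    """Map each normalized name to the set of distinct addresses on file."""
    versions = defaultdict(set)
    for _source, name, address in records:
        versions[normalize(name)].add(normalize(address))
    return dict(versions)


def conflicts(records):
    """Return only the people with more than one address on file."""
    return {
        name: addresses
        for name, addresses in versions_by_person(records).items()
        if len(addresses) > 1
    }


# Hypothetical records: (source, name, address)
records = [
    ("bank", "Jane Doe", "12 Main St."),
    ("clinic", "JANE DOE", "12 main st"),
    ("card issuer", "Jane Doe", "99 Oak Ave"),
    ("agency", "John Roe", "5 Elm Rd"),
]
print(conflicts(records))
```

Here three organizations hold Jane Doe's address and two versions exist, one of them presumably stale. With one shared record there would be a single address to maintain and nothing to reconcile.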

Duplicate Data Handling – Besides the “one set of data” option, data handling can be reduced where companies exchange “paper” documents (whether real paper or PDFs sent by email). When such exchanges occur, and they occur a lot, the receiving company must recapture data that the sending company has already captured electronically.

Take for example invoices and purchase orders.  These documents (rather than the underlying data creating them) require the receiving party to parse, cleanse and re-enter data already captured in the sending party’s system.  These data-capture tasks are onerous, unnecessary, and duplicative.
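To illustrate the difference between sending a document and sending the underlying data, here is a minimal Python sketch in which the sender exports an invoice as structured JSON and the receiver loads it directly, with no parsing of a printed layout and no re-keying. The `Invoice` and `InvoiceLine` shapes, the field names, and the choice of JSON are assumptions made for this example, not a description of any particular product or standard.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class InvoiceLine:
    description: str
    quantity: int
    unit_price_cents: int  # whole cents avoid floating-point rounding


@dataclass
class Invoice:
    invoice_number: str
    po_number: str
    lines: list

    def total_cents(self):
        return sum(l.quantity * l.unit_price_cents for l in self.lines)


def export_invoice(invoice):
    """Sender side: serialize the data that produced the invoice."""
    return json.dumps(asdict(invoice))


def import_invoice(payload):
    """Receiver side: rebuild the invoice without manual re-entry."""
    data = json.loads(payload)
    lines = [InvoiceLine(**line) for line in data["lines"]]
    return Invoice(data["invoice_number"], data["po_number"], lines)


sent = Invoice(
    "INV-001",
    "PO-778",
    [InvoiceLine("Widget", 3, 1250), InvoiceLine("Gasket kit", 1, 4999)],
)
received = import_invoice(export_invoice(sent))
print(received == sent, received.total_cents())
```

In practice both parties would need to agree on the format beforehand; electronic data interchange (EDI) standards such as ANSI X12 and UN/EDIFACT exist for exactly this kind of exchange.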

Automating “paper” data transfers would improve data-handling productivity, save trees and, perhaps one day, eliminate the need for accounts receivable and accounts payable. But eliminating receivables and payables is a discussion for another time.
  
Finally (As Finally Can Be)
“It ain't over till it's over.” – Yogi Berra
Corollary: It’s never over….
With the help of technology, companies have spawned creative, productive solutions to data management, but not as fast as new data has been created.  The net-data additions have put companies in a data-management deficit.  Reducing cross-company duplication can mitigate this deficit.

ERP systems, with their back-end databases, eliminated substantial duplication within companies. As central repositories, these databases have consolidated data that existed in previously disparate applications. ERPs have been a great first step. Yet much duplication remains.

Pursuing “one set of data” and automating data transfers can further consolidate business data, eliminating duplication and improving productivity.  Though cross-company efforts have been tried, these efforts have been resisted for data-security reasons.

Companies are protective of their data, and rightfully so.  Future data-sharing efforts must be well thought out, keeping data secure.  Yet even with security challenges, cross-company duplication will be addressed.  The concept of capturing data once, then sharing it, cannot be disputed. 

With cross-company data consolidations, companies will advance technology, and reap the productivity it brings. Though the solutions may not come easily, companies and their employees will adapt because they always do; it’s embedded in the human genome.
 
 
 
 

Comments (3)

Tom Farrar (Author) commented:
Thanks for your frankness, Jim.

Tom Farrar (Author) commented:
Thanks for the comments, LHerrou. I will drop back and consider your suggestion. - Tom

Tom Farrar (Author) commented:
Thanks, Jim. I understand the controversy and issues around what I've proposed. I am also interested in how the ideas will be viewed. - Tom
