Design Theory Advice


I am looking for advice on how other users would approach the following. Let's say:

I add two APIs to my application.
I make use of the following two functions, one from each API.

So I have two different data sources, each of which:
 - have over 100K results
 - have integrated paging (i.e. TotalPages, TotalRecords, CurrentPage, RecordsPerPage)

My question, or rather what I would like others' take on, is this:

I will retrieve both result sets, based on CurrentPage and RecordsPerPage, on two separate threads to speed up collection. I then need to combine the two result sets into one.
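To make the setup concrete, here is a minimal Python sketch of fetching one page from each source on its own thread and concatenating the results. The functions fetch_page_a and fetch_page_b are hypothetical stand-ins for the two API calls (the real ones are not named in this thread), and plain concatenation is only one possible meaning of "combine".

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_page_a(current_page, records_per_page):
    # Hypothetical stand-in for the first API's paged call.
    start = (current_page - 1) * records_per_page
    return [{"source": "A", "id": start + i} for i in range(records_per_page)]


def fetch_page_b(current_page, records_per_page):
    # Hypothetical stand-in for the second API's paged call.
    start = (current_page - 1) * records_per_page
    return [{"source": "B", "id": start + i} for i in range(records_per_page)]


def fetch_combined(current_page, records_per_page):
    # Run both fetches concurrently, then join the two pages into one list.
    with ThreadPoolExecutor(max_workers=2) as pool:
        future_a = pool.submit(fetch_page_a, current_page, records_per_page)
        future_b = pool.submit(fetch_page_b, current_page, records_per_page)
        return future_a.result() + future_b.result()


combined = fetch_combined(current_page=2, records_per_page=3)
```

With two sources the combined page holds up to twice RecordsPerPage records, so the paging metadata of the merged view has to be recomputed rather than copied from either source.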

I know there will be plenty of theories on this, including design patterns, state management, and unified data structures, but I am just curious how others would approach it.


ravs120499 commented:
Here's what I would do.

You have two sources of data for the same data type (sales data) and you need to combine the data from both. You want to retrieve data from both sources simultaneously to speed things up.

In other words, you have two data pipes feeding into a data aggregator. The "pipes" are responsible for retrieving the data in the most optimal way (based on the details of the API, the network, the database, etc.). The data aggregator is responsible for the most efficient aggregation of this data. For example, should it wait for all the data to be delivered by both pipes before trying to aggregate them? That is not usually the quickest approach, but if the data is too loosely typed, or if there is no unique key, it might be the only feasible choice.
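As a rough illustration of the "don't wait for everything" option, here is a minimal Python sketch of an aggregator that merges each chunk as it arrives, keyed on a unique field. The class name IncrementalAggregator, the order_id key, and the sample records are all assumptions made for the example, as is the rule that records sharing a key are folded into one.

```python
class IncrementalAggregator:
    """Merges chunks as they arrive, using a unique key to match records."""

    def __init__(self, key):
        self._key = key
        self._merged = {}

    def add_chunk(self, records):
        # Fold every record into the entry that shares its key.
        for record in records:
            entry = self._merged.setdefault(record[self._key], {})
            entry.update(record)

    def result(self):
        # Return merged records ordered by key.
        return [self._merged[k] for k in sorted(self._merged)]


aggregator = IncrementalAggregator(key="order_id")
aggregator.add_chunk([{"order_id": 1, "amount": 10}])                    # chunk from pipe A
aggregator.add_chunk([{"order_id": 1, "region": "EU"},
                      {"order_id": 2, "region": "US"}])                  # chunk from pipe B
merged = aggregator.result()
```

Without a unique key there is nothing to match on, which is why waiting for all the data may then be the only workable choice.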

1. So at a high level, I would design the aggregator and pipe interfaces. I would definitely want the pipe classes to use a Factory pattern for instantiation - where there are two data sources, there could be many more. I would also want a Factory for my aggregator - it is not at all certain that one implementation of Aggregator would be optimal for all situations.
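A minimal Python sketch of point 1, under some assumptions: the interface names Pipe and Aggregator follow the wording above, but the methods (pages, add_chunk, result), the concrete classes, and the registry-based factories are hypothetical choices made for illustration.

```python
from abc import ABC, abstractmethod


class Pipe(ABC):
    @abstractmethod
    def pages(self, records_per_page):
        """Yield one list of records per page."""


class Aggregator(ABC):
    @abstractmethod
    def add_chunk(self, records):
        """Accept one chunk of records from any pipe."""

    @abstractmethod
    def result(self):
        """Return the combined records."""


class InMemoryPipe(Pipe):
    # Hypothetical pipe that pages over a list instead of calling an API.
    def __init__(self, records):
        self._records = list(records)

    def pages(self, records_per_page):
        for start in range(0, len(self._records), records_per_page):
            yield self._records[start:start + records_per_page]


class ConcatAggregator(Aggregator):
    # Simplest possible aggregation: append chunks in arrival order.
    def __init__(self):
        self._records = []

    def add_chunk(self, records):
        self._records.extend(records)

    def result(self):
        return list(self._records)


# Registries let new pipe or aggregator types be added without touching callers.
PIPE_TYPES = {"memory": InMemoryPipe}
AGGREGATOR_TYPES = {"concat": ConcatAggregator}


def _create(registry, kind, **kwargs):
    if kind not in registry:
        raise ValueError(f"unknown kind: {kind!r}")
    return registry[kind](**kwargs)


def create_pipe(kind, **kwargs):
    return _create(PIPE_TYPES, kind, **kwargs)


def create_aggregator(kind, **kwargs):
    return _create(AGGREGATOR_TYPES, kind, **kwargs)


pipes = [create_pipe("memory", records=[1, 2, 3]),
         create_pipe("memory", records=[10, 20])]
aggregator = create_aggregator("concat")
for pipe in pipes:
    for page in pipe.pages(records_per_page=2):
        aggregator.add_chunk(page)
```

Adding a third data source then means registering one more class in PIPE_TYPES; the calling code stays the same.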

2. As discussed earlier, I would want the pipes to retrieve data in chunks (pages or whatever unit suits the source) in the way that is most efficient for retrieval, feed the data immediately to the aggregator, and let the aggregator decide when to do the aggregation.

So, the aggregator would implement an Observer pattern so it can get notified whenever the pipe(s) have a chunk of data available. On notification, the aggregator pulls data from the pipe (the alternative is that the entire chunk of data is sent in the notification itself, which I don't think is a great idea). So, the pipe has to be able to handle "clogging" - i.e. if the pipe retrieves data faster than the aggregator pulls it.
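The notify-then-pull arrangement could be sketched in Python as follows. This is only one way to do it: the method names (subscribe, publish, pull, chunk_available) are invented for the example, and "clogging" is handled here by a bounded queue.Queue, so a pipe that gets ahead of the aggregator either blocks or gets queue.Full.

```python
import queue


class Pipe:
    def __init__(self, name, capacity=2):
        self.name = name
        self._buffer = queue.Queue(maxsize=capacity)  # bounded: limits clogging
        self._observers = []

    def subscribe(self, observer):
        self._observers.append(observer)

    def publish(self, chunk, block=True, timeout=None):
        # Buffer the chunk, then notify observers without sending the data.
        # When the buffer is full this blocks, or raises queue.Full if
        # block=False or the timeout expires.
        self._buffer.put(chunk, block=block, timeout=timeout)
        for observer in self._observers:
            observer.chunk_available(self)

    def pull(self):
        # Called by the observer to fetch the next buffered chunk.
        return self._buffer.get_nowait()


class Aggregator:
    def __init__(self):
        self.records = []

    def chunk_available(self, pipe):
        # On notification, pull the data from the pipe that announced it.
        self.records.extend(pipe.pull())


pipe_a, pipe_b = Pipe("A"), Pipe("B")
aggregator = Aggregator()
pipe_a.subscribe(aggregator)
pipe_b.subscribe(aggregator)
pipe_a.publish([1, 2])
pipe_b.publish([10])
pipe_a.publish([3])

# A pipe nobody pulls from fills up and refuses further chunks.
clogged = Pipe("C", capacity=1)
clogged.publish([1])
try:
    clogged.publish([2], block=False)
    clog_detected = False
except queue.Full:
    clog_detected = True
```

An aggregator that defers aggregation could record the notification and call pull later; the bounded buffer is what keeps a fast pipe from growing without limit in the meantime.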

3. Neither the pipes nor the aggregator should care about whether they are operating in a multi-threaded environment or not. This design should work just as well in a single threaded environment, or (with appropriate modifications) in a multi-process environment where the pipes and the aggregator communicate across a remoting interface! So, setting up the threads etc. should be outside these objects, in the initialization of the application.
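To illustrate point 3, here is a hedged Python sketch in which the same thread-agnostic pipe and aggregator run under two different wirings. All names (pipe_pages, ConcatAggregator, run_single_threaded, run_multi_threaded) are hypothetical, and funnelling chunks through a bounded queue to one consuming thread is just one way to keep locking out of the aggregator.

```python
import queue
import threading


def pipe_pages(records, records_per_page):
    # Thread-agnostic "pipe": yields one page of records at a time.
    for start in range(0, len(records), records_per_page):
        yield records[start:start + records_per_page]


class ConcatAggregator:
    # Thread-agnostic aggregator: no locks, it is only ever fed by one thread.
    def __init__(self):
        self.records = []

    def add_chunk(self, chunk):
        self.records.extend(chunk)


def run_single_threaded(pipes, aggregator):
    for pages in pipes:
        for chunk in pages:
            aggregator.add_chunk(chunk)


def run_multi_threaded(pipes, aggregator, capacity=4):
    handoff = queue.Queue(maxsize=capacity)  # bounded hand-off between threads
    done = object()                          # sentinel: one per finished pipe

    def drain(pages):
        try:
            for chunk in pages:
                handoff.put(chunk)
        finally:
            handoff.put(done)

    workers = [threading.Thread(target=drain, args=(pages,)) for pages in pipes]
    for worker in workers:
        worker.start()
    remaining = len(workers)
    while remaining:
        item = handoff.get()
        if item is done:
            remaining -= 1
        else:
            aggregator.add_chunk(item)  # always called from this one thread
    for worker in workers:
        worker.join()


source_a, source_b = list(range(5)), list(range(100, 103))

single = ConcatAggregator()
run_single_threaded([pipe_pages(source_a, 2), pipe_pages(source_b, 2)], single)

multi = ConcatAggregator()
run_multi_threaded([pipe_pages(source_a, 2), pipe_pages(source_b, 2)], multi)
```

Both runs collect the same records; only the arrival order can differ in the threaded run, and neither the pipe nor the aggregator had to change.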

- Ravs
REA_ANDREW (author) commented:
That is great advice. Cheers for the suggestion about using the Factory pattern too; your points about that make perfect sense.


Question has a verified solution.
