How to ensure that API-to-API communication is working correctly

ltpitt asked:
Hi all,

I am facing a challenge and I cannot think of a foolproof solution to this problem:
My API is querying an external API to get some entities.
Those entities are then used to create or remove folders on a server accordingly.

Everything was fine until somebody changed a request parameter on the other API, and instead of getting, for example, 1000 entities I got 500 entities.
The consequence was that 500 folders were deleted from the server.

The issue is now resolved, but I want to make my API more robust and ensure that as little damage as possible is done if a similar problem occurs.
Clearly, just checking for API functionality would have returned a happy 200 OK, and that is not sufficient in this case...
I thought about setting thresholds: for example, if the list retrieved over the last week is around 1000 units, I would consider anything from 900 to 1100 acceptable and stop creating or removing folders if I get a list smaller or larger than that.

But I am still not 100% sure or happy about this solution either.

Do you have any pointers or ideas for creating something that is as close as possible to a foolproof solution?
Eduard Ghergu, Architect - Coder - Mentor
There is no safety net against changes to the request query parameters. What you can do is try to detect differences between requests and act when there are large deviations. This can be done by keeping some audit info in a database, for example.
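One way to sketch that audit idea, assuming a SQLite store and a hypothetical `fetch_audit` table (table name, column names, and the 20% deviation cutoff are all illustrative, not prescribed by the answer):

```python
import sqlite3
from datetime import datetime, timezone

def record_fetch(conn, request_params, entity_count):
    """Append one audit row per API fetch and return the previous count (or None)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fetch_audit ("
        "fetched_at TEXT, request_params TEXT, entity_count INTEGER)"
    )
    row = conn.execute(
        "SELECT entity_count FROM fetch_audit ORDER BY rowid DESC LIMIT 1"
    ).fetchone()
    previous = row[0] if row else None
    conn.execute(
        "INSERT INTO fetch_audit VALUES (?, ?, ?)",
        (datetime.now(timezone.utc).isoformat(), request_params, entity_count),
    )
    conn.commit()
    return previous

conn = sqlite3.connect(":memory:")  # use a file path in production
record_fetch(conn, "page_size=1000", 1000)
previous = record_fetch(conn, "page_size=1000", 498)
if previous and abs(498 - previous) / previous > 0.2:
    print("deviation too large, skipping folder sync")
```

The point is that the audit table doubles as both the deviation detector and the "what happened when" record.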


That is indeed a good point, and it is more or less the solution I was starting to jot down.

May I ask for a small suggestion about what kind of data you'd recommend keeping?

Thanks for your useful input
ste5an, Senior Developer

Never trust (user) input. Thus never trust API input.

My approach here is ETL-style processing: always load what you receive into a separate database, together with timestamps and the job or request that produced it. Do the same for uploaded data. This covers your missing "what happened?" question.
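A minimal sketch of that staging step, assuming a SQLite `staging` table and a hypothetical job identifier (both are illustrative names, not part of the answer):

```python
import json
import sqlite3
import time

def stage_response(conn, job_id, payload):
    """Land the raw API payload in a staging table before acting on it."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS staging ("
        "job_id TEXT, loaded_at REAL, raw_json TEXT)"
    )
    conn.execute(
        "INSERT INTO staging VALUES (?, ?, ?)",
        (job_id, time.time(), json.dumps(payload)),
    )
    conn.commit()

conn = sqlite3.connect(":memory:")  # use a file path in production
stage_response(conn, "folder-sync-job", {"entities": ["a", "b", "c"]})

# Later, "what happened?" is answerable by querying the staging table:
rows = conn.execute("SELECT job_id, raw_json FROM staging").fetchall()
```

Because the raw payload is kept verbatim, any later dispute about what the external API actually returned can be settled from the staging table.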

Use a delta approach. Given the kind of "deletions" involved, your logic is flawed in that you need a "mark for deletion" flag, and you should not delete on the fly but only at certain points, e.g. after the daily backup has been made.
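The mark-and-purge idea can be sketched like this; the in-memory `folders` dict, the one-day grace period, and the function names are all assumptions for illustration (real code would call `shutil.rmtree` on actual paths):

```python
from datetime import datetime, timedelta, timezone

# Instead of deleting folders immediately, flag them and purge later.
folders = {
    "customer_a": {"marked_at": None},
    "customer_b": {"marked_at": None},
}

def mark_for_deletion(name, now=None):
    folders[name]["marked_at"] = now or datetime.now(timezone.utc)

def purge(grace=timedelta(days=1), now=None):
    """Actually delete only folders whose grace period has elapsed."""
    now = now or datetime.now(timezone.utc)
    doomed = [n for n, f in folders.items()
              if f["marked_at"] and now - f["marked_at"] >= grace]
    for n in doomed:
        del folders[n]  # real code: shutil.rmtree(path)
    return doomed

mark_for_deletion("customer_b")
purged_now = purge()  # grace period not over yet -> nothing deleted
purged_later = purge(now=datetime.now(timezone.utc) + timedelta(days=2))
```

With this split, a bad batch from the external API only *marks* folders; an operator (or the daily-backup hook) still has a window to notice and unmark them before anything is destroyed.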

For detecting such changes in the data flow, record numbers such as requests sent and objects read, and visualize them; then you can see anomalies at a glance.

Eduard Ghergu, Architect - Coder - Mentor
If the data that you receive is sensitive, you can save it all in a database and create comprehensive delta reports. If it's just a matter of detecting deviations, a record count may be enough. You should decide based on your specific use case.


What do you think of a calculation like this (I just jotted down some Python code to give a very rough idea)?
def is_automatic_provisioning_happening(previous_data, threshold):
    """Return True when the batch sizes are stable (variance below threshold)."""
    previous_data_lengths = [len(data) for data in previous_data]
    mean = sum(previous_data_lengths) / len(previous_data_lengths)
    variance = sum((xi - mean) ** 2 for xi in previous_data_lengths) / len(previous_data_lengths)
    return variance < threshold

previous_data_1 = ["some_data"] * 10
previous_data_2 = ["some_data"] * 10
previous_data_3 = ["some_data"] * 10
previous_data_4 = ["some_data"] * 10
previous_data_5 = ["some_data"] * 10
previous_data_6 = ["some_data"] * 10

previous_data = [previous_data_1, previous_data_2, previous_data_3,
                 previous_data_4, previous_data_5, previous_data_6]

print("Automatic Provisioning N. 1 - Normal scenario with variance < threshold")
print("is_automatic_provisioning_happening = {}".format(is_automatic_provisioning_happening(previous_data, 3)))

previous_data_7 = ["some_data"] * 5

previous_data = [previous_data_1, previous_data_2, previous_data_3,
                 previous_data_4, previous_data_5, previous_data_6, previous_data_7]

print("\nAutomatic Provisioning N. 2 - Unhealthy scenario with variance > threshold")
print("is_automatic_provisioning_happening = {}".format(is_automatic_provisioning_happening(previous_data, 3)))


Eduard Ghergu, Architect - Coder - Mentor

Looks like a good start :)

Of course, you can decide later on about the amount and the structure of this data.
Top Expert 2016

"somebody changed a request parameter on the other API and instead of having for example 1000 entities I got 500 entities."

Errors like that can be detected by versioning both the client and the server API. Suppose your client API at version 100 requires a response from server version 50, but you receive a response with version number 51. It is now easy to see that there might be a mismatch and to take appropriate action: either update the client API to accept version-51 responses, or do a version update on one side to get back in sync. You could also version each request and each response; then the server could even support both the old and the new API depending on the request.
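A tiny sketch of that guard, assuming the server reports its version in a header; the header name `X-API-Version` and the expected value are hypothetical, not something the external API is known to provide:

```python
EXPECTED_API_VERSION = "50"

def check_version(response_headers):
    """Refuse to act on a response from an unexpected server API version."""
    got = response_headers.get("X-API-Version")
    if got != EXPECTED_API_VERSION:
        raise RuntimeError(
            f"server API version {got!r} != expected {EXPECTED_API_VERSION!r}; "
            "refusing to create/delete folders"
        )

check_version({"X-API-Version": "50"})    # matches: proceed
# check_version({"X-API-Version": "51"})  # mismatch: raises RuntimeError
```

Failing loudly on a version mismatch turns a silent half-sized response into an explicit error before any folder is touched.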

Note: if the data in the response is fixed-size, it is good practice to pass the size of the data at the beginning of the transfer buffer. The requester can then check whether the requested size matches the received size, which is a simple but effective way to detect errors caused by newer data.
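The size-prefix idea might look like this, assuming a 4-byte big-endian length header (the framing format and function names are illustrative, not a protocol the external API actually uses):

```python
import struct

def frame(payload: bytes) -> bytes:
    """Prefix the payload with its declared byte length (4-byte big-endian)."""
    return struct.pack(">I", len(payload)) + payload

def unframe(buffer: bytes) -> bytes:
    """Reject the transfer when the declared size does not match what arrived."""
    (declared,) = struct.unpack(">I", buffer[:4])
    payload = buffer[4:]
    if len(payload) != declared:
        raise ValueError(f"declared {declared} bytes, received {len(payload)}")
    return payload

wire = frame(b"entity-1\nentity-2\nentity-3")
result = unframe(wire)
# A truncated transfer raises instead of being silently processed:
# unframe(wire[:-1])  -> ValueError
```

The receiver never has to guess whether a short payload is legitimate: the sender's declared size makes the discrepancy explicit.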

