We help IT Professionals succeed at work.

json to keyvalue pair in a specific order

277 Views
Last Modified: 2020-07-05
Hi,

please help

Have a below sample json file. Would like to generate a output of it like below. I can use python, shell, jq, perl etc. Struggling to come up with a logic to tell the custom script how to pick attributes and divide into 3 section within single line.

This question really helped me to a good start. since i have to handle things differently posting another for help.
https://www.experts-exchange.com/questions/29186144/Convert-Json-to-key-value-pair-in-linux.html

Kindly help me with a sample script please

# Sample input json
{
  "ignored": 0,
  "totalTime": 869325.6763669999,
  "threads": 5,
  "scenarios": 123,
  "failed": 79,
  "passed": 44,
  "elapsedTime": 181358,
  "failedRate": 64,
  "passedRate": 36,
  "squad": "teama",
  "microserviceteam": "searchapi",
  "environment": "qa",
  "measurement": "mymeasurement",
  "testingtype": "shakeout",
  "builduser": "jenkins",
  "buildnum": 8895,
}

Open in new window


Expected output:
mymeasurement,environment=qa,testingtype=shakeout,squad=teama,microserviceteam=searchapi ignored=0,threads=5,scenarios=123,failed=29,passed=94,failedRate=24,passedRate=76,totalTime=869325.6763669999,elapsedTime=181358,buildnum=8895

Open in new window


FYI
  1. First column value is for tablename
  2. All key value pair before the single space are for tags. E.g single space after value searchapi in above output.
  3. rest of the key value pair after the single space are fields
Comment
Watch Question

CERTIFIED EXPERT
Commented:
Unlock this solution and get a sample of our free trial.
(No credit card required)
UNLOCK SOLUTION

Author

Commented:
Thank you once again @pepr for your timely help
MURUGESAN NApplication Development and Automation Management
CERTIFIED EXPERT

Commented:
@enthuguy

I have tried the same using awk command :

        echo "{
  \"ignored\": 0,
  \"totalTime\": 869325.6763669999,
  \"threads\": 5,
  \"scenarios\": 123,
  \"failed\": 79,
  \"passed\": 44,
  \"elapsedTime\": 181358,
  \"failedRate\": 64,
  \"passedRate\": 36,
  \"squad\": \"teama\",
  \"microserviceteam\": \"searchapi\",
  \"environment\": \"qa\",
  \"measurement\": \"mymeasurement\",
  \"testingtype\": \"shakeout\",
  \"builduser\": \"jenkins\",
  \"buildnum\": 8895,
}" > ./input.json

Open in new window


/usr/bin/awk '{
        if ( ( "{" != $0 ) && ( "}" != $0 ) )
        {
                FIRSTQUOTE=index( $0, "\"")+1;
                SECONDQUOTE=index( $0, "\":")-4;
                arr[NR][1]=substr( $0, FIRSTQUOTE, SECONDQUOTE );
                arr[NR][2]=substr($0, index( $0, ": ")+2 );
        }
}
END {
        ENDNR=NR;
        for (indx1 in arr)
        for (indx2 in arr[indx1])
        {
                if ( 1 == indx2 )
                {
                        printf( "%s=", arr[indx1][indx2]);
                }
                else
                {
                        if ( indx1 == ENDNR-1 )
                        {
                                printf( "%s\n", substr( arr[indx1][indx2], 0, index( arr[indx1][indx2], "," )-1 ) );
                        }
                        else
                        {
                                if ( 0 != index(arr[indx1][indx2], "\"") )
                                {
                                        WITHOUTQUOTES=substr( arr[indx1][indx2], 2, length( arr[indx1][indx2] )-3 );
                                        printf( "%s,", WITHOUTQUOTES);
                                }
                                else
                                {
                                        printf( "%s", arr[indx1][indx2]);
                                }
                        }
                }
        }
}' ./input.json

Open in new window

Output:
ignored=0,totalTime=869325.6763669999,threads=5,scenarios=123,failed=79,passed=44,elapsedTime=181358,failedRate=64,passedRate=36,squad=teama,microserviceteam=searchapi,environment=qa,measurement=mymeasurement,testingtype=shakeout,builduser=jenkins,buildnum=8895

Open in new window

CERTIFIED EXPERT

Commented:
Hi Murugesan,

I doubt enthuguy needs you to provide code to recreate his input file, but your method for creating it seems overly complex, with all that escaping (\) of quotes.  Try a simple here document, like this:
#!/bin/bash

cat <<EOF >input.json
{
  "ignored": 0,
  "totalTime": 869325.6763669999,
  "threads": 5,
  "scenarios": 123,
  "failed": 79,
  "passed": 44,
  "elapsedTime": 181358,
  "failedRate": 64,
  "passedRate": 36,
  "squad": "teama",
  "microserviceteam": "searchapi",
  "environment": "qa",
  "measurement": "mymeasurement",
  "testingtype": "shakeout",
  "builduser": "jenkins",
  "buildnum": 8895,
}
EOF
Also, your final output seems quite different from enthuguy's expected output, and you haven't stated that (although it is evident from the output you provided).

What you've provided could be generated by a Perl one-liner like this:
perl -0ne 'print "$1=$2$3" while (/"(.+?)": "?(.*?)"?(,?)\n/g)' input.json
or this:
perl -ne 'push @a,"$1=$2" if /"(.+?)": "?(.+?)"?,?$/;END{print join ",",@a}' input.json
or this:
perl -ne 'print "$1=$2$3" if /"(.+?)": "?(.*?)"?(,)?$/' input.json
(Note that the above solutions rely on even the last value ending in a "," as in the sample input ("8895,"), so if that is not going to be the case then I'd have to make an adjustment.  I see Pepr also spotted this non-standard json issue.)

As you can see, the output is the same as yours, Murugesan:
ignored=0,totalTime=869325.6763669999,threads=5,scenarios=123,failed=79,passed=44,elapsedTime=181358,failedRate=64,passedRate=36,squad=teama,microserviceteam=searchapi,environment=qa,measurement=mymeasurement,testingtype=shakeout,builduser=jenkins,buildnum=8895

I would have expected a more concise awk script than you've written could do the above, but I don't know much awk so I'm not sure.

But I happened to notice in the question title ("json to keyvalue pair in a specific order") that output order is important.  And at the bottom of the original post, there are also other requirements which are too unclear for me to understand, namely:
  1. First column value is for tablename
  2. All key value pair before the single space are for tags. E.g single space after value searchapi in above output.
  3. rest of the key value pair after the single space are fields

As much as I'd love to help with a Perl solution to handle all the requirements:
1. I don't understand all the requirements (and I suggest enthuguy makes them clearer if he wants a full solution),
2. enthuguy has been ignoring some of my posts (which I put considerable effort into) and private messages regarding them, so I'm just not sure I'd get suitably rewarded for my efforts.

Author

Commented:
@Pepr, @tel2,
Sorry you are right last comma is not required. (Since I removed few attributes manually I left it there)

Thanks @Murugesan for your effort to help me out. I might help but yes looks bit complex to me because I dont know awk. So in case any slight modification I will struggle :(

HI tel2, apologies for not responding earlier, Thanks for your help today

Author

Commented:
HI @Pepr,
I liked your python way, easy to extend as well. however getting below, could you help please. I'm trying few things still not able to resolve

[ec2-user@ip-10-157-60-31 tig]$ python3 serialize.py
  File "serialize.py", line 40
    tag_lst = [f'{k}={d[k]}' for k in tags]
                           ^
SyntaxError: invalid syntax

Author

Commented:
I modified a bit to old school way. Not using pepr adv scripting (sorry pepr). Seems to be ok as well.

Thanks to pepr for providing another good start this weekend :)

import json

# The input file name with JSON data.
fname = 'input.json'

# Get the tablename somehow.
tablename = 'mymeasurement'

# Wanted tags in the order.
tags = [
  "environment",
  "testingtype",
  "squad",
  "microserviceteam",
]


# The wanted fields in the order.
fields = [
  "ignored",
  "totalTime",
  "threads",
  "scenarios",
  "failed",
  "passed",
  "elapsedTime",
  "failedRate",
  "passedRate",
  "measurement",
  "builduser",
  "buildnum"
]

# Loading the JSON data to get the dictionary.
with open(fname) as f:
    d = json.load(f)

# Getting the list of tags with the values.
tag_lst = ','.join(k + '='+ d[k] for k in tags)

# Getting the list of fields with the values.
field_lst = ','.join(k + '='+ str(d[k]) for k in fields)

# Putting all things together
finaltext = tablename + ',' + tag_lst + ' ' + field_lst

print(finaltext)


MURUGESAN NApplication Development and Automation Management
CERTIFIED EXPERT

Commented:
@enthuguy

most welcome.

I was in sleep mode when I wrote my comment yesterday and was thinking C program.

>> It might help but yes looks bit complex ...
>> So in case any slight modification .. struggle
ok :) np
Carry on
awk tutorial (during weekends /holidays)
https://www.tutorialspoint.com/awk/index.htm
this will be helpful on parsing/validating:
1. EDI files (Electronic Data interchange)
2. IDOC files (from SAP having multi language support document)
Hence written that using awk/gawk.exe

$ cat << IDOC_EOF >> Sample.idoc
@TEL2
Thank you for all your comments.

@enthuguy
I have written only for knowing other ways (I agree complexity).
I have written other ways => when old team members follow these kind of ways => we faced error during 2006.
Hence better to learn(including myself) all the ways.
Follow as per the comments from TEL2/...
EOF END OF FILE. Use EOF or related EOF_IDOC or EOF_EDI or ... as per the requirement :)
This might be used later in other script => easy to get the script name when we search using git/svn/find commands.
IDOC_EOF

Open in new window

CERTIFIED EXPERT

Commented:
Hi enthugy,

Good to hear you were able to adjust pepr's code to suit your needs, but do you realise it's not producing the same output as you specified in your original post?

Here's your expected output from your first post:

mymeasurement,environment=qa,testingtype=shakeout,squad=teama,microserviceteam=searchapi ignored=0,threads=5,scenarios=123,failed=29,passed=94,failedRate=24,passedRate=76,totalTime=869325.6763669999,elapsedTime=181358,buildnum=8895

and here's what your code above produces:

mymeasurement,environment=qa,testingtype=shakeout,squad=teama,microserviceteam=searchapi ignored=0,totalTime=869325.676367,threads=5,scenarios=123,failed=79,passed=44,elapsedTime=181358,failedRate=64,passedRate=36,measurement=mymeasurement,builduser=jenkins,buildnum=8895

So is the expected output in your original post wrong, or is your python code wrong, or what, exactly?


Hi Murugesan,
After learning some gawk, I came up with this gawk one-liner to produce the same output that your code does:
gawk 'match($0,/"([^"]+)": "?([^"]*)"?(,?)/,a){printf a[1]"="a[2]a[3]}' input.json
Notes:
- I used gawk (GNU awk) instead of awk because it seems the match() function is not included in the original awk.
- It also seems that awk/gawk regex doesn't support non-greedy matching (as Perl does with the '?'), so I had to use things like: "([^"]+)" instead of the more concise "(.+?)"
CERTIFIED EXPERT

Commented:
Hi enthuguy,

I should have warn you that you need at least Python 3.6 for the f-string interpolation like this.

tag_lst = [f'{k}={d[k]}' for k in tags]

Open in new window


Your solution is fine (I would only rename your  *_lst variables as they are not lists any more). The closest replacement of the f-string in older Python could also be -- .format() (appeared in Python 2.6; so, it is available in all supported versions of Python) or the oldest % (I do not know when this was introduced):

# Getting the list of tags with the values.
# The previous interpolation style.
tag_s = ','.join('{}={}'.format(k, d[k]) for k in tags)    # could also be '{0}={1}'.format(...

# Getting the list of fields with the values.
# The oldest interpolation style.
field_s = ','.join('%s=%s' % (k, d[k]) for k in fields)

Open in new window

Unlock the solution to this question.
Thanks for using Experts Exchange.

Please provide your email to receive a sample view!

*This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

OR

Please enter a first name

Please enter a last name

8+ characters (letters, numbers, and a symbol)

By clicking, you agree to the Terms of Use and Privacy Policy.