enthuguy asked:

Sed to replace text as it is without newline conversion

Hi,

Due to a product limitation:
1. I have to apply newline characters in XML content and store it in a file, updated.xml, so that the entire content is on a single line.
2. Then I use a sed command to put this content into a JSON file.
3. The last step invokes an API to send that JSON (right now it fails).

What happens is that after sed runs on the JSON file, I think it converts the newlines, and I see each XML element on a separate line, which makes the request invalid.

Could you advise how to store the XML content as is, on a single line, in the JSON please?

Sed command
dtoXMLData=$(cat escaped/${response_xml_for_conversion})
sed -i -e "s|CHANGE_ME_DTO_XML_DATA|${dtoXMLData}|g" ${src_finalize_project_json}



Converted XML -
<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\r\n<packageItemStatuses>\r\n<importProcessId>461a831c-04d8-455f-9ea9-09afe75b2b90</importProcessId>\r\n<packageItemStatus>\r\n<conflicts>\r\n<dest>0</dest>\r\n<id>9003</id>\r\n<source>20191115000000</source>\r\n</conflicts>\r\n<conflicts>\r\n<dest>0</dest>\r\n<id>9004</id>\r\n<source>20191115000000</source>\r\n</conflicts>\r\n<conflicts>\r\n<dest>0</dest>\r\n<id>9005</id>\r\n<source>20191115000000</source>\r\n</conflicts>\r\n<conflicts>\r\n<dest>0</dest>\r\n<id>9006</id>\r\n<source>1000</source>\r\n</conflicts>\r\n<conflicts>\r\n<id>9007</id>\r\n</conflicts>\r\n<exists>true</exists>\r\n<id>157752420</id>\r\n<merge>true</merge>\r\n<skip>false</skip>\r\n<type>21</type>\r\n</packageItemStatus>\r\n<packageItemStatus>\r\n<exists>true</exists>\r\n<id>158010210</id>\r\n<merge>false</merge>\r\n<skip>false</skip>\r\n<type>19</type>\r\n</packageItemStatus>\r\n<packageItemStatus>\r\n<exists>true</exists>\r\n<id>159064232</id>\r\n<merge>false</merge>\r\n<skip>false</skip>\r\n<type>20</type>\r\n</packageItemStatus>\r\n<packageItemStatus>\r\n<exists>false</exists>\r\n<id>158009962</id>\r\n<merge>false</merge>\r\n<skip>false</skip>\r\n<type>19</type>\r\n</packageItemStatus>\r\n<packageItemStatus>\r\n<exists>false</exists>\r\n<id>159062289</id>\r\n<merge>false</merge>\r\n<skip>false</skip>\r\n<type>20</type>\r\n</packageItemStatus>\r\n<packageItemStatus>\r\n<exists>true</exists>\r\n<id>158009960</id>\r\n<merge>false</merge>\r\n<skip>false</skip>\r\n<type>19</type>\r\n</packageItemStatus>\r\n<packageItemStatus>\r\n<exists>true</exists>\r\n<id>159066053</id>\r\n<merge>false</merge>\r\n<skip>false</skip>\r\n<type>20</type>\r\n</packageItemStatus>\r\n<packageItemStatus>\r\n<exists>true</exists>\r\n<id>158009958</id>\r\n<merge>false</merge>\r\n<skip>false</skip>\r\n<type>19</type>\r\n</packageItemStatus>\r\n<packageItemStatus>\r\n<exists>true</exists>\r\n<id>159062265</id>\r\n<merge>false</merge>\r\n<skip>false</skip>\r\n<typ
e>20</type>\r\n</packageItemStatus>\r\n<packageItemStatus>\r\n<exists>true</exists>\r\n<id>159062351</id>\r\n<merge>false</merge>\r\n<skip>false</skip>\r\n<type>20</type>\r\n</packageItemStatus>\r\n<packageItemStatus>\r\n<exists>true</exists>\r\n<id>157995292</id>\r\n<merge>false</merge>\r\n<skip>false</skip>\r\n<type>19</type>\r\n</packageItemStatus>\r\n<packageItemStatus>\r\n<exists>true</exists>\r\n<id>159062282</id>\r\n<merge>false</merge>\r\n<skip>false</skip>\r\n<type>20</type>\r\n</packageItemStatus>\r\n<packageItemStatus>\r\n<exists>true</exists>\r\n<id>157995289</id>\r\n<merge>false</merge>\r\n<skip>false</skip>\r\n<type>19</type>\r\n</packageItemStatus>\r\n<packageItemStatus>\r\n<exists>true</exists>\r\n<id>159062298</id>\r\n<merge>false</merge>\r\n<skip>false</skip>\r\n<type>20</type>\r\n</packageItemStatus>\r\n<packageItemStatus>\r\n<exists>true</exists>\r\n<id>158009964</id>\r\n<merge>false</merge>\r\n<skip>false</skip>\r\n<type>19</type>\r\n</packageItemStatus>\r\n<packageItemStatus>\r\n<exists>true</exists>\r\n<id>159077274</id>\r\n<merge>false</merge>\r\n<skip>false</skip>\r\n<type>20</type>\r\n</packageItemStatus>\r\n<packageItemStatus>\r\n<exists>true</exists>\r\n<id>159066050</id>\r\n<merge>false</merge>\r\n<skip>false</skip>\r\n<type>20</type>\r\n</packageItemStatus>\r\n</packageItemStatuses>



after sed with incorrect multiline
"body": {
                                        "mode": "raw",
                                        "raw": '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>^M
<packageItemStatuses>^M
<importProcessId>461a831c-04d8-455f-9ea9-09afe75b2b90</importProcessId>^M
<packageItemStatus>^M
<conflicts>^M
<dest>0</dest>^M
<id>9003</id>^M
<source>20191115000000</source>^M
</conflicts>^M
<conflicts>^M
<dest>0</dest>^M
<id>9004</id>^M
<source>20191115000000</source>^M


Duncan Roe

Pipe the sed output through unix2dos
enthuguy (ASKER)

thx Duncan, could you help me with the syntax pls?

https://winsmarts.com/new-line-characters-in-text-files-between-mac-and-windows-3695b7beb685
sed -e 's/$/\r/' inputfile > outputfile # unix2dos
@enthuguy

Can you try the following code:
Before running it, update the script:
1)
src.xml => path/relatedname.xml OR ./relatedname.xml OR relatedname.xml OR ./src.xml OR ...
2)
./updated.xml => path/updated.xml OR ./updatedRelatedName.xml OR updated.xml OR ./updated.xml OR ...
#!/bin/bash
if [ -f src.xml ]
then
	# On my system the file command does not report CRLF,
	# so use od and grep to check whether src.xml is a DOS file
	head -n 1 src.xml | od -bc | grep -E "\\\r" | grep -E -v "^$" >/dev/null 2>&1
	if [ 0 -eq $? ]
	then
		# src.xml is a DOS file:
		# remove \r in place and keep a backup as src.xml.Original.xml
		echo "sed -i.Original.xml \"s/\r//g;\" src.xml"
		sed -i.Original.xml "s/\r//g;" src.xml
	fi
	echo "ORIGINAL FILE     : src.xml.Original.xml"
	echo "LINUX FORMAT FILE : src.xml"
	awk '{
		printf( "%s", $0) > "./updated.xml";
	}' src.xml
	date
	echo "lines   words character count:"
	wc ./updated.xml
	ls -l ./updated.xml
else
	echo "Update this script using valid location of source file src.xml"
fi
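
For comparison, the same flattening step could be sketched in Python. This is only an illustration of what the script above does (drop carriage returns, then join all lines with no separator); the function name flatten_xml is made up for this example.

```python
def flatten_xml(text: str) -> str:
    """Drop CR characters and join all lines with no separator."""
    # Mirrors the sed step (remove every carriage return) followed by the
    # awk step (print each line without a trailing newline).
    return text.replace("\r", "").replace("\n", "")


if __name__ == "__main__":
    sample = "<a>\r\n<b>1</b>\r\n</a>\r\n"
    print(flatten_xml(sample))
```

Note that this removes the line breaks entirely; it does not insert the \r\n text shown in the Converted XML of the question.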



Test result on my host:
$ ./29177762.sh
sed -i.Original.xml "s/\r//g;" src.xml
ORIGINAL FILE     : src.xml.Original.xml
LINUX FORMAT FILE : src.xml
Sun Apr  5 20:39:37 BST 2020
lines   words character count:
   0    4 2753 updated.xml
-rwxrwSrwx 1 murugesandins murugesandins 2753 Apr  5  2020 updated.xml
$ ./29177762.sh
ORIGINAL FILE     : src.xml.Original.xml
LINUX FORMAT FILE : src.xml
Mon Apr  6 01:10:03 IST 2020
lines   words character count:
   0    4 2753 updated.xml
-rwxrwSrwx 1 murugesandins murugesandins 2753 Apr  6  2020 updated.xml
$ ./29177762.sh
ORIGINAL FILE     : src.xml.Original.xml
LINUX FORMAT FILE : src.xml
Mon Apr  6 01:11:02 IST 2020
lines   words character count:
   0    4 2753 updated.xml
-rwxrwSrwx 1 murugesandins murugesandins 2753 Apr  6  2020 updated.xml


I meant for you to fetch and install the unix2dos utility from SourceForge.
BUT, when you posted
<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\r\n<packageItemStatuses>\r\n<importProcessId>461a831c-04d8-455f-9ea9-09afe75b2b90</importProcessId>\r\n<packageItemStatus>\r\n<conflicts>\r\n<dest>0</dest>\r\n<id>9003</id>\r\n<source>20191115000000</source>\r\n</conflicts>\r\n<conflicts>\r\n<dest>0</dest>\r\n<id>9004</id>\r\n<source>20191115000000</source>\r\n</conflicts>\r\n<conflicts>\r\n<dest>0</dest>\r\n<id>9005</id>\r\n<source>20191115000000</source>\r\n</conflicts>\r\n<conflicts>\r\n<dest>0</dest>\r\n<id>9006</id>\r\n<source>1000</source>\r\n</conflicts>\r\n<conflicts>\r\n<id>9007</id>\r\n</conflicts>\r\n<exists>true</exists>\r\n<id>157752420</id>\r\n<merge>true</merge>\r\n<skip>false</skip>\r\n<type>21</type>\r\n</packageItemStatus>\r\n<packageItemStatus>\r\n<exists>true</exists>\r\n<id>158010210</id>\r\n<merge>false</merge>\r\n<skip>false</skip>\r\n<type>19</type>\r\n</packageItemStatus>\r\n<packageItemStatus>\r\n<exists>true</exists>\r\n<id>159064232</id>\r\n<merge>false</merge>\r\n<skip>false</skip>\r\n<type>20</type>\r\n</packageItemStatus>\r\n<packageItemStatus>\r\n<exists>false</exists>\r\n<id>158009962</id>\r\n<merge>false</merge>\r\n<skip>false</skip>\r\n<type>19</type>\r\n</packageItemStatus>\r\n<packageItemStatus>\r\n<exists>false</exists>\r\n<id>159062289</id>\r\n<merge>false</merge>\r\n<skip>false</skip>\r\n<type>20</type>\r\n</packageItemStatus>\r\n<packageItemStatus>\r\n<exists>true</exists>\r\n<id>158009960</id>\r\n<merge>false</merge>\r\n<skip>false</skip>\r\n<type>19</type>\r\n</packageItemStatus>\r\n<packageItemStatus>\r\n<exists>true</exists>\r\n<id>159066053</id>\r\n<merge>false</merge>\r\n<skip>false</skip>\r\n<type>20</type>\r\n</packageItemStatus>\r\n<packageItemStatus>\r\n<exists>true</exists>\r\n<id>158009958</id>\r\n<merge>false</merge>\r\n<skip>false</skip>\r\n<type>19</type>\r\n</packageItemStatus>\r\n<packageItemStatus>\r\n<exists>true</exists>\r\n<id>159062265</id>\r\n<merge>false</merge>\r\n<skip>false</skip>\r\n<typ
e>20</type>\r\n</packageItemStatus>\r\n<packageItemStatus>\r\n<exists>true</exists>\r\n<id>159062351</id>\r\n<merge>false</merge>\r\n<skip>false</skip>\r\n<type>20</type>\r\n</packageItemStatus>\r\n<packageItemStatus>\r\n<exists>true</exists>\r\n<id>157995292</id>\r\n<merge>false</merge>\r\n<skip>false</skip>\r\n<type>19</type>\r\n</packageItemStatus>\r\n<packageItemStatus>\r\n<exists>true</exists>\r\n<id>159062282</id>\r\n<merge>false</merge>\r\n<skip>false</skip>\r\n<type>20</type>\r\n</packageItemStatus>\r\n<packageItemStatus>\r\n<exists>true</exists>\r\n<id>157995289</id>\r\n<merge>false</merge>\r\n<skip>false</skip>\r\n<type>19</type>\r\n</packageItemStatus>\r\n<packageItemStatus>\r\n<exists>true</exists>\r\n<id>159062298</id>\r\n<merge>false</merge>\r\n<skip>false</skip>\r\n<type>20</type>\r\n</packageItemStatus>\r\n<packageItemStatus>\r\n<exists>true</exists>\r\n<id>158009964</id>\r\n<merge>false</merge>\r\n<skip>false</skip>\r\n<type>19</type>\r\n</packageItemStatus>\r\n<packageItemStatus>\r\n<exists>true</exists>\r\n<id>159077274</id>\r\n<merge>false</merge>\r\n<skip>false</skip>\r\n<type>20</type>\r\n</packageItemStatus>\r\n<packageItemStatus>\r\n<exists>true</exists>\r\n<id>159066050</id>\r\n<merge>false</merge>\r\n<skip>false</skip>\r\n<type>20</type>\r\n</packageItemStatus>\r\n</packageItemStatuses>


What utility did you use to view this file? I assumed \r\n were the control characters CR LF, but were they in fact literal text?
If they are control characters, then the file was always multi-line, but with CRLF (Windows) line endings.
If they are literal text, the solution will be different. Please advise about the \r\n characters.
I just tested, and for me sed does not alter \r\n, whether control characters or literal text. Any extra information you can provide would be helpful.
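
One way to settle the control-characters-versus-literal-text question without relying on a viewer is to look at the raw bytes. A minimal sketch in Python; the function name describe_line_endings is invented for this illustration:

```python
def describe_line_endings(data: bytes) -> str:
    """Say whether data holds real CR LF bytes, the literal text, or neither."""
    has_control = b"\r\n" in data      # the two bytes 0x0D 0x0A
    has_literal = b"\\r\\n" in data    # backslash, 'r', backslash, 'n'
    if has_control and has_literal:
        return "both"
    if has_control:
        return "control characters"
    if has_literal:
        return "literal text"
    return "neither"


if __name__ == "__main__":
    print(describe_line_endings(b"<a>\\r\\n</a>"))
    print(describe_line_endings(b"<a>\r\n</a>"))
```

To check a real file, read it in binary mode ("rb") and pass the bytes to the function, so that no newline translation happens on the way in.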
Hi enthuguy,
I'm also confused about your requirements.
I have some questions, labelled below Q1 - Q4.
In your original post I see you've provided:
- Your "sed command".
- "Converted XML".  Q1. Is that the input data we should be testing with?
- "after sed with incorrect multiline".  Q2. This is the output you're getting when you run that sed command on that converted XML, right?
Q3. Please provide the output that you want to get.
Q4. What is your answer to Duncan's question about literal \r & \n?

Please number your answers accordingly for clarity.

In general, I suggest that wherever appropriate, you always provide sample input data (which you seem to have), and required output data (which it seems you haven't in this case).  This makes your description of the problem clearer, and experts can see when their solutions have met your requirements, because their output should match yours.
Sorry about the confusion.

@Duncan
1. Due to a limitation of the target application, I have to insert the literal string "\r\n" into the request XML and make it a single line. Then I have to insert this XML into my request JSON as an attribute. When I do it manually, my request works fine and I get a proper response.
2. When I store the injected XML (with the literal strings) in a variable and then use sed to find/replace in the final JSON, sed turns the \r\n into real line breaks, and that makes the request invalid.
3. I use vi to view the final JSON, and cat as well.
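
A likely explanation for point 2, assuming GNU sed: the replacement part of an s command is not inserted literally, because sequences such as \r and \n are interpreted as escapes. A variable holding the literal text \r\n therefore ends up as real line breaks. Python's re.sub treats its replacement string in a similar way, while str.replace inserts text verbatim, so the difference can be sketched as follows (the template and XML are shortened for illustration):

```python
import re

PLACEHOLDER = "CHANGE_ME_DTO_XML_DATA"
template = '{"raw": "CHANGE_ME_DTO_XML_DATA"}'
# The payload holds backslash-r-backslash-n as plain text, not control characters.
payload = r"<a>\r\n<b>1</b>\r\n</a>"

# Regex-style substitution interprets \r and \n in the replacement string.
interpreted = re.sub(PLACEHOLDER, payload, template)

# Plain string replacement inserts the payload exactly as written.
verbatim = template.replace(PLACEHOLDER, payload)

if __name__ == "__main__":
    print(len(interpreted.splitlines()))  # 3: real line breaks were created
    print(len(verbatim.splitlines()))     # 1: still a single line
```

With sed itself, the usual workaround is to escape the backslashes (and the delimiter and &) in the replacement text before substituting; a verbatim string replace avoids that step.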

@tel2
1. Converted XML is the input data. I use Python to read the raw XML and inject the literal string "\r\n"; the result is the Converted XML.
2. Yes, that is the output I get after I use sed (the 3rd code snippet block).
3. The output should be the same as the Converted XML, copied as is into line 3, the "raw": value. Right now it shows up as multiple lines, but it has to stay on a single line, the same as the Converted XML.
4. Please see above.

Yesterday, I managed to achieve the desired result using Python. I think it worked fine. Today I will test further and update you all.

Thanks again all, much appreciated

Python snippet (I'm not an expert in Python):
# read converted xml file
with open('converted.xml') as fp:
    convertedXML = fp.read()

# find and replace (plain string replace, so backslashes are kept as is)
with open('request.json', 'rt') as fin:
    data = fin.read()
data = data.replace('string_to_replace_in_json', convertedXML)

# write back to same file
with open('request.json', 'wt') as fout:
    fout.write(data)
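
Since the target is a JSON file, the escaping step itself can probably be left to Python's json module rather than done by hand. A minimal sketch, assuming the raw XML uses CRLF line endings and that the placeholder sits inside a double-quoted JSON string; escape_for_json is a hypothetical helper name and the sample data is shortened:

```python
import json


def escape_for_json(text: str) -> str:
    """Return text escaped for use inside a double-quoted JSON string."""
    # json.dumps adds the surrounding quotes; strip them to keep only the body.
    return json.dumps(text)[1:-1]


if __name__ == "__main__":
    raw_xml = '<?xml version="1.0"?>\r\n<a>1</a>'
    template = '{"raw": "string_to_replace_in_json"}'
    request = template.replace("string_to_replace_in_json", escape_for_json(raw_xml))
    # The result is one line of valid JSON that decodes back to the original XML.
    assert json.loads(request)["raw"] == raw_xml
    print(request)
```

The escaped text has the same shape as the Converted XML above: quotes become \" and each line break becomes the literal text \r\n.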



Thanks for your answers, enthuguy.
"Yesterday, I managed achieve desired result by using python. I think it worked fine. Today I will test further an update you all."
So shall we just wait and leave it to you for now, then?
ASKER CERTIFIED SOLUTION
Duncan Roe
Thanks Duncan, for the alternate suggestion!

If my test with Python fails today, I will try your suggestion.