Triskelion asked:

CGI with ASM vs Pascal

Just for fun:
I created a CGI using Turbo Pascal and it works fine.
I created a CGI using 80x86 asm and it does not.
I exported the output from the two CGIs into files and ran a file comparison and the files are identical.
Does the CGI architecture not recognize the .COM file?
I renamed the file with various extensions and it still did not work.  
Is it a memory location thing?
Is it a security thing?

Here is the PASCAL Code:----------------------------------
BEGIN
   write('Content-type: text/html', #10, #10);
   writeln('<HTML>', #10,
   '<BODY bgcolor="yellow">', #10,
   '<br><h1>This is neat</h1>', #10,
   '</BODY>',#10,
   '</HTML>'
   );
END.

Here is the ASM code:----------------------------------
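mov ah, 09h        ; DOS function 09h: print the '$'-terminated string at DS:DX
mov dx, CONTENT    ; offset of the string to print
int 21h            ; call DOS
int 20h            ; terminate the .COM program
CONTENT:
db 'Content-type: text/html', 0ah, 0ah   ; header line plus blank line, LF only
HTML_2:
db   '<HTML>',0ah
db   '<BODY bgcolor="yellow">', 0ah
db   '<br><h1>This is neat</h1>',0ah
db   '</BODY>',0ah
db   '</HTML>',0dh, 0ah, 024h             ; CR, LF, then '$' (24h) ends the string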
jhurst

I have no idea, but I wonder: is the asm output going to stdout? The web server tends to assume this.

ASKER

Yes.  I can direct the output to a file from the command line.
Triskelion,

It's been a long time since I took those Pascal classes... ASM... hmm, that's dinosaur time.  Just kidding.

If I recall properly, Pascal, like most third-generation languages on DOS, appends \r\n to each writeln, while an explicit #10 in a write emits only \n (\n = line feed, \r = carriage return).

You might want to view the files in a hex editor to see whether they are "exactly" identical.

Another thing: your assumption that a .COM program is not so web friendly might be true.

I tried placing something.com in the Apache Win32 cgi-bin, and it produced an error.  Not the "Server Error" in the web page, but something that pops up from Windows.  I believe that has something to do with the server.

I viewed the output files with a hex editor and they are the same byte for byte.
I also used "fc /b" to compare both of them.
Identical.

I manually loaded the output from both in the browser (IE 6) and they looked fine (with the addition of the content-type flag sticking out first).
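
For anyone who wants to repeat the byte-for-byte check without "fc /b", here is a minimal Python sketch. The function name first_difference and the two sample byte strings are illustrative assumptions, not the actual program output from this thread. It reports the first offset at which two outputs differ, which is where a CR (0x0d) versus LF (0x0a) mismatch would show up.

```python
def first_difference(a: bytes, b: bytes):
    """Return (offset, byte_a, byte_b) for the first mismatch, or None if identical."""
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return offset, x, y
    if len(a) != len(b):
        # One input is a prefix of the other; report where the shorter one ends.
        offset = min(len(a), len(b))
        byte_a = a[offset] if offset < len(a) else None
        byte_b = b[offset] if offset < len(b) else None
        return offset, byte_a, byte_b
    return None


# Hypothetical sample outputs: same text, different line endings.
lf_only = b"Content-type: text/html\n\n<HTML>\n"
crlf = b"Content-type: text/html\r\n\r\n<HTML>\r\n"

print(first_difference(lf_only, lf_only))  # None
print(first_difference(lf_only, crlf))     # (23, 10, 13)
```

On real output you would read both redirected files in binary mode (open(path, "rb")) so that no newline translation hides a difference.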
What if you ran each program and redirected its output to an HTML file (you did mention you can do this)?


pascal-code.exe > pas.html
asm-code.com > asm.html

and then try to view the files in a web browser.

How was the result?  I suspect it would be OK.

If it is, then I believe that jhurst is on the right track.

We'll wait for your result.
Yes, I tried that early on and the output looked good.
It's not the output that you see in IE that is the problem, but the output after the HTTP header (Content-type...), which should have two carriage return/line feed pairs, i.e.
db 'Content-type: text/html', 0dh,0ah,0dh,0ah

Having said that, your Pascal program should fail for the same reason, so I'm not too sure why that should be working.

However, I would be a little surprised if a .com would actually work, simply because IIS doesn't know, by default, that it should execute a .com. It normally expects its CGIs to be .exe (unless you install some other stuff such as ActivePerl).
I've had a look and can't see how you could even configure this in.

To be honest, apart from as a point of interest, there is no good reason to ever write a CGI as a .com. The only real justification might be the size of the executable (and I'm not sure this is really true these days), but if that were an important criterion you would write an ISAPI application or filter.

Steve

Thanks for joining in.

The carriage returns (0x0d) are not necessary.  It only needs linefeeds (0x0a).  I have a lot of examples on Unix using other languages.  That's what makes this more confusing.

I'm running Apache on Windows 2000.

The ASM compiler I use (A86) only produces .com files.  Maybe if I used MASM, I could make it an EXE, and maybe it will respond differently... which is what I meant by asking if it is a memory location thing.

Simply renaming the file didn't work, so I think it'll actually have to be run from some other location than offset 0x100.
ASKER CERTIFIED SOLUTION
BeyondWu
This solution is only available to members.
I'm sorry, but you are wrong: the 0dh is necessary. This is something specific to the HTTP protocol: the header must end with an extra carriage return/line feed.

As per the RFC

full-Response = Status-Line
   *(General-Header
     | Response-Header
     | Entity-Header)
     CRLF

It is the appearance of two CRLFs (the first being the last characters of the last header line) that marks the end of the header and the start of the entity body.

You show me any program in any language that doesn't do this and I'll show you a CGI that doesn't work.

However, as I said before, I think your main problem is that it is a .com. Renaming will not do the trick because the internal structures of a .com and an .exe are different. I wouldn't think that using MASM would make any difference by itself; you would need to code it as an exe, as BeyondWu has indicated.

Steve
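
The structural difference mentioned above is visible in the first two bytes of the file: a DOS/Windows .exe begins with the signature "MZ", while a .com file has no header at all and is just the raw code and data that DOS loads at offset 100h. The Python sketch below checks for that signature. The function name looks_like_dos_exe and the sample byte strings are illustrative assumptions; the .com bytes are simply what one would expect the four instructions in the question to assemble to, not a dump of the real file. Whether Apache on Windows relies on this signature when launching a CGI is not established in this thread.

```python
def looks_like_dos_exe(image: bytes) -> bool:
    """True if the file image starts with the 'MZ' signature of a DOS/Windows EXE."""
    return image[:2] == b"MZ"


# Hypothetical .com image: mov ah,09h / mov dx,0109h / int 21h / int 20h, then data.
com_image = bytes([0xB4, 0x09, 0xBA, 0x09, 0x01, 0xCD, 0x21, 0xCD, 0x20]) \
    + b"Content-type: text/html\n\n"

# Hypothetical .exe image: only the signature matters for this check.
exe_image = b"MZ" + bytes(26)

print(looks_like_dos_exe(com_image))  # False
print(looks_like_dos_exe(exe_image))  # True
```

Renaming a file changes neither result, which fits the observation that changing the extension alone did not help.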

 


[1]
I dug out my floppy with an old version of MASM.
Luckily it still had its own linker on it.

I took BeyondWu's example and compiled and linked it.
It worked like a champ.  
This example contains only linefeeds (LFs) on the Content-type line.  
I even removed the carriage return at the end of the string and it continued to work.

[2]
I added carriage returns (CRs) to the existing linefeeds in the same example and it continued to work.  
Yes, I cleared the cache and deleted the original EXE before continuing.

[3]
I changed the LFs to be ONLY CRs (on the content-type line).
We did not discuss CRs only, but I wanted to complete the cycle.
It did not work.

[4]
I added CRLF pairs to my existing asm that produced the .com file and it still did not work.  I don't think the issue is the carriage returns in that case, just the memory model.

[5]
I added CRs to the existing LFs in the Pascal program.
It continued to work.

[6]
I used CRs only in the Pascal prog.  It failed.
---------
In these examples, adding or removing CRs did not matter as long as LFs existed.
I did not try the pairs in reverse order.

The CRLF issue is not the point.

My next task will be to find an old version of Turbo Pascal (that produces .com files) to see if it will fail.
I'm also going to try this with Turbo Prolog, COBOL and Fortran.
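
The results in [1] through [6] are what you would expect if the server reads the CGI header line by line, treats LF as the line terminator, and drops an optional CR in front of it. The Python sketch below models that behaviour. It is an assumption about how a tolerant parser might work, not Apache's actual code, and the name split_cgi_response is made up for illustration. As far as I know, the original CGI specification describes the header as ending with a blank line containing only a linefeed or CR/LF, which would explain why both forms are accepted.

```python
def split_cgi_response(raw: bytes):
    """Split raw CGI output into (header_lines, body).

    Lines end at LF; a CR directly before the LF is ignored.
    Returns None if no blank line terminates the header.
    """
    headers = []
    pos = 0
    while True:
        end = raw.find(b"\n", pos)
        if end == -1:
            return None  # input ended before a blank line was seen
        line = raw[pos:end]
        if line.endswith(b"\r"):
            line = line[:-1]  # tolerate CRLF as well as bare LF
        pos = end + 1
        if line == b"":
            return headers, raw[pos:]
        headers.append(line)


# Hypothetical outputs mirroring the three header variants that were tested.
lf_version = b"Content-type: text/html\n\n<HTML>\n"
crlf_version = b"Content-type: text/html\r\n\r\n<HTML>\n"
cr_version = b"Content-type: text/html\r\r<HTML>\n"

print(split_cgi_response(lf_version))    # ([b'Content-type: text/html'], b'<HTML>\n')
print(split_cgi_response(crlf_version))  # ([b'Content-type: text/html'], b'<HTML>\n')
print(split_cgi_response(cr_version))    # None
```

Under this model the CR-only header never produces a blank line, so the header never ends, which matches the failures in [3] and [6].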
I know that is what I said!

"However, I would be a little surprised if a .com would actually work, simply because IIS doesn't know, by default, that it should execute a .com. It normally expects its CGIs to be .exe (unless you install some other stuff such as ActivePerl)."

The fact that it works once compiled as an exe still surprises me (as I again said earlier), because using anything other than a CRLF after the HTTP header is contrary to the standard. Check the RFC for yourself if you don't believe me.

Just one thought, what browser are you using to view this?

I'm not saying I don't believe you.
I use IE 6.  The server is Apache.
not against anybody's opinion here:

If I recall, CGI is executed on the server side.  So what does it have to do with the browser?

RFCs are supposed to be the standard, and as long as a product at least satisfies what the RFC says, it can claim compliance with RFC nnnn.  However, some implementations push their product to somehow "exceed" what is in the RFC.

well.. my 5 cents.


Samri, it has to do with the browser because the browser is looking for the extra CRLF to determine the separation between the header and the body of the message. (Any proxies in the route also need to make this distinction.)

Triskelion: Forgetting the extra CRLF is one of the most common mistakes that people writing CGIs make. I have to confess that I haven't seen anyone actually do this since IE6 came out so maybe it is being a bit sloppy with the standard (as if MS would do such a thing). However as it runs contrary to the standard you can't assume that it will always work.

A demonstration of this principle that I saw a few years back was a large system that passed around a lot of URLs that looked something like this:

www.modomain.com/mycgi.cgi?loc=1:23:33:65

No problem on Netscape. Then Microsoft brought out IE2, and some bright spark thought they would just give it a quick test, only to find that it truncated the URL at the first illegal colon. It wasn't that IE was wrong, but that Netscape had allowed illegal characters through.

Steve




Mouatts, please understand.
I do not consider the carriage return necessary.
Did you see the results of the tests?

ALL of my unix CGIs do not use CR.  They ONLY use LF.

My actions are not a mistake, they are intentional.
Most of the examples I've seen don't use the CR.
I do not plan on adding CRs or intentionally using them in the future.

I'm spending so much time here talking about Carriage returns when I've determined that they had NOTHING to do with the problem.

I said my car wouldn't start and you told me my passenger side rear-view mirror is missing.
Oh, yeah...
I forgot this


   :)