Solved

break down the code

Posted on 2014-03-28
Last Modified: 2014-04-02
As you all know, Microsoft has shared the MS-DOS source code:

http://www.computerhistory.org/_static/atchm/microsoft-ms-dos-early-source-code/

After downloading it, how can we read it in a human-readable format?
Question by:marrowyung
19 Comments
 
LVL 1

Author Comment

by:marrowyung
ID: 39961128
 
LVL 52

Assisted Solution

by:Carl Tawn
Carl Tawn earned 50 total points
ID: 39961379
You can open the files for both in Notepad or any other text editor. The DOS source is assembly language; the Word files are C.
 
LVL 142

Assisted Solution

by:Guy Hengel [angelIII / a3]
Guy Hengel [angelIII / a3] earned 50 total points
ID: 39961567
As for a "human-readable format": since the code is assembler code, you cannot get truly human-readable code unless you pay someone for plenty of hours spent analyzing the code and writing higher-level commentary for the different parts.
Sorry, but this goes far beyond any utility for common people, and far beyond what experts on EE will be willing to do, I fear...
 
LVL 1

Author Comment

by:marrowyung
ID: 39964742
It seems the source files can be opened, but all the ASM and binary files can't be opened.
 
LVL 35

Expert Comment

by:mccarl
ID: 39965547
"but all ASM and binary can't be open."

What do you mean by "can't be open"? True, they probably don't have a default file association on your machine, but you should still be able to open the files (at least the ASM files, forget about the binaries) in something like WordPad, Notepad, etc.
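If a plain text editor shows odd characters at the end of a file, a small script can give a cleaner preview. The following is only a sketch in Python: the function name `preview_dos_text` is made up for illustration, and it assumes the files are DOS-era text that may be padded after a Ctrl-Z (0x1A) end-of-file byte, which is common for files of that period but not confirmed for every file in the download.

```python
def preview_dos_text(data: bytes, max_lines: int = 5) -> list[str]:
    """Return the first few lines of a DOS-era text file as Python strings."""
    # Assumption: anything after the first Ctrl-Z (0x1A) is padding, as was
    # common for text files of that period, so it is ignored here.
    eof = data.find(b"\x1a")
    if eof != -1:
        data = data[:eof]
    # cp437 is the original IBM PC code page, so the text is decoded with it.
    text = data.decode("cp437")
    # splitlines() copes with the CR/LF line endings DOS uses.
    return text.splitlines()[:max_lines]


if __name__ == "__main__":
    # Hypothetical sample bytes, not read from the actual download.
    sample = b"        MOV     CX,11\r\n        JZ      FREE\r\n\x1a\x00\x00"
    for line in preview_dos_text(sample):
        print(line)
```

To try it on a real file, read the file in binary mode (for example `open(path, "rb").read()`, where `path` is wherever you unpacked the ASM file) and pass the bytes in.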
 
LVL 1

Author Comment

by:marrowyung
ID: 39965661
"What do you mean by "can't be open"?"

I think I should have said that they can be opened but can't be read!
 
LVL 35

Expert Comment

by:mccarl
ID: 39965668
I still don't know exactly what you mean. By "can't be read", do you mean "can't be understood" (by you)?

If so, then you either need to learn assembler language so that you CAN understand it, or (as angel said above) pay someone that does know assembler to read it and explain it to you.

If by "can't be read" you meant something else, then please try and explain in more detail what is happening, what you are seeing. Perhaps a screenshot would help here.
 
LVL 1

Author Comment

by:marrowyung
ID: 39965690
"By "can't be read" do you mean "can't be understood" (by you) ?"

Yes, excellent!

"If so, then you either need to learn assembler language so that you CAN understand it, or (as angel said above) pay someone that does know assembler to read it and explain it to you."

But I also mean that it is some kind of machine language, no matter how much you train.

Compiled code can't be read anyway, can it?

Try downloading it and tell me what you think!
 
LVL 35

Assisted Solution

by:mccarl
mccarl earned 200 total points
ID: 39965710
"but I also mean it is some kind of machine language no matter how you train."

Just to be clear, so you are talking about something like this...
        MOV     AH,BYTE PTR [BX]
        OR      AH,AH                   ;End of directory?
        JZ      FREE
        CMP     AH,[DELALL]             ;Free entry?
        JZ      FREE
        MOV     SI,BX
        MOV     DI,OFFSET DOSGROUP:NAME1
        MOV     CX,11


Then yes, there are people out there who can read this and have a pretty good idea of what is going on. Secondly, it is quite likely (I can't say with 100% certainty, but I would be pretty sure) that the author of this code wrote it directly in assembler, i.e. there was never any *higher*-level language that this was written in and then compiled to produce this. Believe it or not, this is how computers were programmed in the 'good old days'! :)

So, no, I don't think that there would be ANY tool or automated thing out there to take the assembly code and somehow produce something more high-level.
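To give a feel for what "reading" the excerpt above means, here is a hand-written paraphrase of its first five instructions in Python. This is one reader's interpretation, not the output of any tool. The names `entry_status` and `DELETED_MARKER` are invented for illustration, and the value 0xE5 is an assumption: it is the conventional FAT marker for a deleted directory entry, but the excerpt does not show what `[DELALL]` actually holds.

```python
DELETED_MARKER = 0xE5  # assumption: conventional FAT "deleted entry" marker
NAME_LENGTH = 11       # 8-character name + 3-character extension (MOV CX,11)


def entry_status(entry: bytes, deleted_marker: int = DELETED_MARKER) -> str:
    """Classify a directory entry by its first byte, as the excerpt appears to."""
    first = entry[0]              # MOV AH,BYTE PTR [BX]
    if first == 0:                # OR AH,AH / JZ FREE   (end of directory)
        return "free"
    if first == deleted_marker:   # CMP AH,[DELALL] / JZ FREE
        return "free"
    # The excerpt then loads SI, DI and CX=11, presumably to work on the
    # 11-byte name field; what happens next is not shown, so we stop here.
    return "in use"


if __name__ == "__main__":
    print(entry_status(b"COMMAND COM"))
    print(entry_status(b"\x00" + b" " * (NAME_LENGTH - 1)))
```

Note how much of the meaning comes from the original comments ("End of directory?", "Free entry?") and label names. Without them, the paraphrase would be guesswork.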
 
LVL 1

Author Comment

by:marrowyung
ID: 39965717
"Then yes, there are people out there that can read this and have a pretty good idea about what is going on. Secondly, it is quite likely (I can't say for 100% certainty but I would be pretty sure) that the author of this code wrote the code directly in Assembler, ie. there was never any *higher* level language that this was written in and then compiled to produce this. Believe it or not, this is how computers were programmed in the 'good old days'! :)
"

No, you are looking at the source folder; I am sure that is assembler language.

"So, no, I don't think that there would be ANY tool or automated thing out there to take the assembly code and somehow produce something more high level."

Take a look at the v11object folder, and then you will know what I mean.

So the v11source is compiled into the v11object, right?
 
LVL 35

Expert Comment

by:mccarl
ID: 39965744
"no, you are checking the source folder, I am sure this is the assembler language"

Yes, I was looking at the "source" folders because I thought that we had already established that we were talking about the .ASM files (assembler files). In my first post I mentioned ASM and referred to "assembler" a few times so I thought that's what we were both talking about.

So yes, the "object" folders contain files that are compiled (or assembled) code, but they would only ever have been compiled from these source assembler files. There are probably tools out there to take the compiled .exe and .com files and "disassemble" them back into ASM source files (which you already have), but nothing any higher level than that. Also note that a disassembler still can't give you 100% of what the source files were, because the compiled code doesn't contain everything from the source, such as comments, labels, etc.
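To illustrate what a disassembler does and what it loses, here is a toy sketch in Python. It only knows a handful of 8086 opcodes (`MOV` with an immediate, `INT`, `RET`, `NOP`) and dumps everything else as `DB` bytes; real disassemblers handle the full instruction set. The function name `disassemble` and the sample bytes are made up for this example and are not taken from the MS-DOS release.

```python
REG8 = ["AL", "CL", "DL", "BL", "AH", "CH", "DH", "BH"]
REG16 = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]


def disassemble(code: bytes, origin: int = 0x100) -> list[str]:
    """Decode a few 8086 opcodes; unknown bytes are emitted as DB lines."""
    lines = []
    i = 0
    while i < len(code):
        op = code[i]
        if 0xB0 <= op <= 0xB7 and i + 1 < len(code):    # MOV r8, imm8
            text, size = f"MOV {REG8[op - 0xB0]},{code[i + 1]:02X}h", 2
        elif 0xB8 <= op <= 0xBF and i + 2 < len(code):  # MOV r16, imm16
            imm = code[i + 1] | (code[i + 2] << 8)      # little-endian word
            text, size = f"MOV {REG16[op - 0xB8]},{imm:04X}h", 3
        elif op == 0xCD and i + 1 < len(code):          # INT imm8
            text, size = f"INT {code[i + 1]:02X}h", 2
        elif op == 0xC3:
            text, size = "RET", 1
        elif op == 0x90:
            text, size = "NOP", 1
        else:                                           # not in our tiny table
            text, size = f"DB {op:02X}h", 1
        lines.append(f"{origin + i:04X}: {text}")
        i += size
    return lines


if __name__ == "__main__":
    # Hypothetical hand-assembled .COM program: print "Hi" via INT 21h, AH=09h.
    sample = bytes.fromhex("B409BA0801CD21C3") + b"Hi$"
    for line in disassemble(sample):
        print(line)
```

The output shows `MOV DX,0108h` rather than something like `MOV DX,OFFSET MSG`, and the string "Hi$" comes out as three anonymous `DB` bytes. The instructions are recovered, but the label names and comments that made the original source readable are gone.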
 
LVL 1

Author Comment

by:marrowyung
ID: 39965972
"And there would probably be tools out there to take the .exe .com compiled files and "disassemble" them back into ASM source files (which you already have), but nothing any highler level than that."

What concerns me is that the object folder has many more files than the source folder.

The object folder has fdisk.exe but the source folder does not, so we can't study the MS-DOS v1.1 fdisk.exe.

Whereas the source folder of MS-DOS 2.0 has many more source files than v1.1.

"And there would probably be tools out there to take the .exe .com compiled files and "disassemble" them back into ASM source files (which you already have), but nothing any higher level than that."

That's what I am interested in. Any suggestions for tools?

"because the compiled code still doesn't have everything from the source, such as comments, labels, etc"

OK, so I think the compiled code can't be read after all.
 
LVL 19

Expert Comment

by:simon3270
ID: 39970751
The v11source directory just contains the Assembler source for COMMAND.ASM (the DOS 1.1 command processor, or "shell") and MSDOS.ASM (effectively the DOS kernel).  The v11object directory as you say contains the compiled command.com (but not the compiled MSDOS.ASM), and a lot of utilities and BASIC programs.  So, you don't have the source for the v11 utilities.  Also the BASIC programs are in some odd Microsoft binary format, rather than just being the source text of the programs.

The v20source directory contains the same COMMAND.ASM and MSDOS.ASM (but for DOS 2 instead), along with the source for all of the utility programs.  The v20object directory contains compiled command.com and msdos.sys, along with the compiled utility programs, some descriptive documents and, oddly, the Pascal source for PROHOST.EXE (I've not used Pascal for 30 years - I'd forgotten how wordy it was!).

So, you have the source for all of the v2 programs, but not the v1.  As others have said, you can find disassembly programs which will take a binary program and produce the same form of Assembler source that you already have, but it will have no helpful text or labels.  For simple programs (and particularly for .COM programs which are a much simpler format than .EXE ones), working out what is going on *is* possible, but you need to have a very detailed knowledge of the entry points into MSDOS.SYS to see how the program interacts with the OS, and it would almost certainly be far more work than is sensible.  If I were you I'd stick with the v2 programs.
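As a rough illustration of why .COM programs are the simpler case: a DOS .EXE file starts with a header whose first two bytes are the letters "MZ", while a .COM file has no header at all and is just a raw memory image that DOS loads at offset 100h. A minimal sketch in Python for telling them apart (the function name `dos_program_kind` is made up for this example):

```python
def dos_program_kind(data: bytes) -> str:
    """Guess whether a DOS program image is EXE-style or COM-style."""
    # A DOS .EXE begins with the two-byte signature "MZ" followed by a
    # header describing relocations, segments and the entry point.
    if data[:2] == b"MZ":
        return "EXE (MZ header)"
    # A .COM file has no header: byte 0 of the file is the first instruction.
    # Note that any headerless data would land here too, so this is a guess.
    return "COM-style (no MZ header)"


if __name__ == "__main__":
    # Hypothetical sample bytes, not taken from the MS-DOS release.
    print(dos_program_kind(bytes.fromhex("B409BA0801CD21C3")))
    print(dos_program_kind(b"MZ" + bytes(26)))
```

The check goes by content rather than by file extension, since a file's name does not guarantee its format.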

Edit: If you want to try disassembly, there's a list at http://en.wikibooks.org/wiki/X86_Disassembly/Disassemblers_and_Decompilers
 
LVL 1

Author Comment

by:marrowyung
ID: 39971192
"The v11source directory just contains the Assembler source for COMMAND.ASM (the DOS 1.1 command process, or "shell") and MSDOS.ASM (effectively the DOS kernel).  The v11objects directory as you say contains the compiled command.com (but not the compiled MSDOS.ASM), and a lot of utilities and BASIC programs"

What I can see is that there are many more object files than source files, and this interests me.

"If I were you I'd stick with the v2 programs."

Why? Because it is much simpler? But we can see that there are many more files than in v1.1.
 
LVL 19

Expert Comment

by:simon3270
ID: 39971460
I'm not really sure what you want to do with these files.

For v11, there are only a couple of human-readable source files of interest, and a number of compiled utility programs for which you don't have the source.  You could spend a really long time (or a lot of money) generating the source code from the binaries so that you could read what the programs are doing.  There's nothing interesting in the fact that you have compiled programs but no source - it's probably just that Tim Paterson found a disk with these few files on, and didn't come across one with the source files for DISKCOMP.COM, FORMAT.COM, BASIC.COM and so on.

Or you could take the simpler route and use the v20 sources - there you have the compiled programs *and* their source files, so that you can read the assembler code and see what the compiled program does.  You have more advanced versions of almost all of the programs that you have in v11 (but oddly, not BASIC.COM), and you have a few more new programs to look at too.

So, the important question is, what do you want to achieve?
 
LVL 1

Author Comment

by:marrowyung
ID: 39971467
" it's probably just that Tim Paterson found a disk with these few files on, and didn't come across one with the source files for DISKCOMP.COM, FORMAT.COM, BASIC.COM and so on.
"

Yeah, probably, but I don't think he is that lazy!

"So, the important question is, what do you want to achieve? "

OK, to keep it simple: I just want to study a complete set of code, not one like v1.1, where we can't find the complete set.

If something is missing, we probably won't know the complete feature set!

I think v1.1 has a lot of things missing just because v1.1 is a copy of someone else's work, so they simply can't give you everything.

Starting from v2.0, they implemented a lot of features! And then with v3.0, they raised the license cost for all of Taiwan's PCs.
 
LVL 19

Accepted Solution

by:simon3270
simon3270 earned 200 total points
ID: 39971492
v1.1 has things missing because Tim couldn't find them - do you have a copy of every bit of code you wrote 30 years ago?

Be thankful that you have an apparently complete set of source for v2.0, and abandon any hope of reconstructing v1.1 - it really isn't worth it.

I think this question is answered - anything else we can add would just be a philosophical discussion and not really what this Experts Exchange topic is for.
 
LVL 1

Author Comment

by:marrowyung
ID: 39971566
"v1.1 has things missing because Tim couldn't find them - do you have a copy of every bit of code you wrote 30 years ago?"

Ahaha, good point! But I was just expecting a complete set! For learning, it is not good, as we will have missing parts, agree?

I might keep a copy, as I might patent it and sue someone who stole my code 30 years later.

Keeping that small amount of code is not hard with today's hardware.

But binary code/applications I intentionally do not keep, as the new applications are much better!

But the ones I build myself I might still have. It depends.


"and abandon any hope of reconstructing v1.1 - it really isn't worth it."

Yes, I agree! My colleague sitting in front of me has a good memory and told me the history! Basically, in the first round of IBM PC operating systems, there was another one competing with MS-DOS. That one died after IBM chose MS-DOS for some unknown reason!

And the first BASIC language Bill built for the Macintosh is called N-BASIC; I would be happy if I could get that source code too!
 
LVL 1

Author Closing Comment

by:marrowyung
ID: 39971569
Learned a lot from you guys.