The early history of UNIX

Original author: Amit Singh
This is a translation of a fragment of an article that, in my opinion, deserves a separate post. Main article: habrahabr.ru/post/193798

Project MAC (Multiple Access Computer, Machine-Aided Cognition, Man and Computer) began as a purely research project at MIT in 1963. It later grew into the Laboratory for Computer Science (LCS), and today it is known as the Computer Science and Artificial Intelligence Laboratory (CSAIL).

The early 1960s saw a surge of interest in time-sharing systems. In 1959 John McCarthy wrote a memo entitled "A Time Sharing Operator Program for Our Projected IBM 709". Corbató, Merwin-Daggett and Daley wrote in a 1962 paper that "we are on the verge of a third global change in the way we use computers, due to time sharing." At first time sharing was seen as a way to use the computer more efficiently, but it very quickly led to the idea of a multi-user system. Dennis Ritchie would later say that the slowest step in the write-compile-execute-debug cycle was set by the person, not by the machine.

Project MAC made a significant contribution to time-sharing systems, including the development of the operating system CTSS (Compatible Time-Sharing System). (The term "operating system" did not exist yet, but let us use it for definiteness. Translator's note.) In the second half of the 1960s several other time-sharing systems were created, for example BBN, DTSS, JOSS and SDC. None of them is relevant to this article, but the Multiplexed Information and Computing Service (Multics) is.

Multics


Multics was a joint project of MIT, Bell Telephone Laboratories (BTL) and General Electric (GE) to create a time-sharing OS for the GE-645 computer.

At that time, "using a computer" meant almost exclusively "programming." The goal, therefore, was to make the write-debug cycle mentioned above more efficient.

Multics was meant to be a system that could support up to 1000 users at a time. Its other requirements (quoted from "Introduction and Overview of the Multics System", Corbató and Vyssotsky, 1965):
  • Continuous 24x7 operation without failures
  • A framework that can be extended and improved as needed
  • Support for a variety of programming languages and user interfaces. The system itself was written mainly in the high-level language PL/I.
  • Support for a wide range of applications
  • Convenient, flexible and fast remote access
  • A hierarchical structure for control, resource allocation and authorization
  • A reliable file system
  • Data access control
  • Online documentation

BTL withdrew from the project in early 1969. Multics continued to develop as a commercial product through a series of resales: Honeywell bought GE's computer business, and Bull later bought Honeywell's. On the whole, the project was a success and noticeably influenced many later systems. The last computer running Multics was shut down on October 31, 2000.

Unix


Although BTL left the project, some of its employees wanted to continue on their own, among them Ken Thompson, Dennis Ritchie, Stu Feldman, Doug McIlroy, Bob Morris and Joe Ossanna. Thompson was working on the game Space Travel. It was first written for Multics and then rewritten in Fortran under GECOS on the GE-635. The game simulated the bodies of the solar system, and the player had to land a ship on a planet or moon. Neither the software nor the hardware of that computer suited such a game. Looking for an alternative, Thompson turned to an unclaimed PDP-7. It had 8K 18-bit words of memory and a vector display processor that produced graphics impressive for the time. With Ritchie's help, Thompson rewrote the game for the PDP-7 in assembler. Along the way they also wrote a floating-point arithmetic package. The game ran on bare metal, without an OS.

Zero Edition (end of 1969)


Thompson and Ritchie did all of this development by cross-assembling on the GE machine and carrying the code over on punched tape. Thompson strongly disliked this workflow, so he started writing an OS for the PDP-7, beginning with the file system. By the end of 1969 the system could be assembled on itself. It consisted of a kernel, an editor, an assembler, a simple shell and file utilities such as cat, cp and rm. It was called UNICS, a subtle jab at Multics; the name later mutated into UNIX. This can be considered the zero edition.

The first version of the cp command processed its arguments in pairs:

# cp file11 file12 file21 file22 ...
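
The pairing rule can be illustrated with a minimal Python sketch. This is not the original PDP-7 code: the function name `pair_arguments` is hypothetical, and raising an error on an odd number of arguments is an assumption made only for the example.

```python
def pair_arguments(args):
    """Group a flat argument list into (source, destination) pairs."""
    # Assumption for this sketch: an odd number of names is treated as an error.
    if len(args) % 2 != 0:
        raise ValueError("expected an even number of file names")
    # Even positions are sources, odd positions are destinations.
    return list(zip(args[0::2], args[1::2]))


pairs = pair_arguments(["file11", "file12", "file21", "file22"])
for source, destination in pairs:
    print(f"copy {source} -> {destination}")
```

Each source name is matched with the name that immediately follows it, so four arguments describe two independent copies.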


The dsw command (delete using switches) deleted files interactively.

The impact of Multics and the even earlier CTSS on modern Unix-like systems:
  • The shell. In Multics it was called exactly that: shell. In UNIX, command substitution is written `command`; in Multics it is [command].
  • Many commands, such as ls, pwd, chdir (cwd in Multics), mail and man (help in Multics)
  • Configuration through rc files. CTSS had a program called RUNCOM.
  • roff, a command for formatting text. In CTSS and Multics, documentation was prepared with the RUNOFF command.
  • A file as a simple stream of bytes
  • Text as a stream of characters and line breaks
  • A tree-structured file system
  • A disk access API that hides low-level hardware details
  • An argument structure for I/O functions consisting of a file handle, a buffer and a character count
  • I/O redirection
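
The handle, buffer and count convention survives in today's POSIX read and write calls, which Python exposes as `os.read` and `os.write`. The sketch below uses a pipe only so that the example is self-contained. Note one difference from the C interface: Python returns the bytes read instead of filling a buffer supplied by the caller.

```python
import os

# A pipe gives us two file descriptors: one to read from, one to write to.
read_fd, write_fd = os.pipe()

message = b"hello, world\n"
written = os.write(write_fd, message)   # arguments: descriptor, buffer
os.close(write_fd)

first = os.read(read_fd, 5)             # arguments: descriptor, byte count
rest = os.read(read_fd, 1024)           # read whatever is left
os.close(read_fd)

print(written, first, rest)
```

The same calls work unchanged on a descriptor that refers to a file or a terminal, which is what makes I/O redirection possible without changes to the program.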

Ritchie wrote in a historical article: “In general, UNIX is a very conservative system. Only a small fraction of the ideas it implements are really new. But for the CTSS legacy, even this is not bad.”

PDP-7 UNIX had a file system with inodes, but they held very little information: a list of physical blocks and minimal metadata (size, creation time and file type). Special files and directories were supported, but there were no file paths. Buffering, however, was present. Other significant limitations:
  • Directories and special files could only be created when the file system was created, not afterwards
  • There could be only one disk drive
  • Multiprogramming was not supported. Only one program could be in memory at any given time.
  • Physical disk access blocked the whole system
  • There were no fork, exec or wait calls. The shell relied on a workaround: when a program started, the shell exited, and the program was expected to restart the shell when it finished.
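
The shell workaround in the last item can be modelled with a toy Python simulation. It is only an illustration of the control flow described above, not a reconstruction of the PDP-7 code: the function `run_session` and the single `in_memory` slot are hypothetical names introduced for this sketch.

```python
def run_session(commands):
    """Simulate a machine that holds exactly one program in memory."""
    trace = []
    in_memory = "shell"  # the single memory slot initially holds the shell
    for name in commands:
        trace.append(f"{in_memory} reads command: {name}")
        in_memory = name      # the program replaces the shell in memory
        trace.append(f"{in_memory} runs")
        in_memory = "shell"   # on exit, the program reloads the shell
        trace.append(f"{in_memory} reloaded")
    return trace


for line in run_session(["cat", "cp"]):
    print(line)
```

Because the shell is gone while a command runs, nothing is left to wait for the command, which is why restarting the shell was the command's own responsibility.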

PDP-7 UNIX also gave rise to the high-level language B, which was created under the influence of BCPL. Dennis Ritchie said that B is C without types: it was BCPL squeezed into 8 KB of memory and carefully reworked by Thompson. B gradually grew into C. Recall that the kernel and programs of PDP-7 UNIX were written entirely in assembler.

[Listing: Towers of Hanoi example in BCPL]

UNIX also ran on the PDP-9. 1969 was also the year in which the ARPANET was first launched and the first RFC was published.

The UNIX developers talked BTL into buying a more advanced computer, the PDP-11/20 with 24 KB of memory. They promised to write a system for editing and formatting documents that would run without an OS, and to use UNIX only for development. UNIX was brought up on the new computer in early 1971. The kernel occupied 12 KB of memory, user programs a little more, and everything else was used as a RAM disk.

First Edition (November 1971)


The first edition ran on the PDP-11/20 without an MMU or hardware memory protection, so stability and fault tolerance were not up to par. There was still no multiprogramming, but file paths had appeared. The system calls were already documented.

The supported programming languages were assembler, B, BASIC and FORTRAN. C did not exist yet.

Files of the B and assembler development environment:
/bin/as  the assembler. The default output file is called a.out
/bin/ld  the link editor (judging by the context, a linker; translator's note). Only one user can work in a given directory at a time because of a conflict over temporary files
/bin/nm  displays the symbol table of the assembler or loader output
/bin/strip  removes unnecessary symbols from binaries
/bin/un  lists the undefined symbols in a program
/etc/as2  the second pass of the assembler
/etc/ba  the B assembler (prog.i -> prog.s)
/etc/bc  the B compiler (prog.b -> prog.i)
/etc/bilib  the B interpreter library
/etc/brt1, /etc/brt2  the B runtime
/etc/liba.a  a library of assembler routines
/etc/libb.a  a library of routines for B
/usr/b/rc  a shell script that compiles a B program into a binary. It follows the chain program.b -> program.i -> program.s -> a.out.

Copyright is not mentioned anywhere in the first edition. The documentation was an impressive manual in seven sections: cm.bell-labs.com/cm/cs/who/dmr/1stEdman.html . Summary:
  1. Commands. Programs that the user calls directly
  2. System calls. Invoked through a special processor instruction
  3. Subroutines. Called by user programs
  4. Special files
  5. File formats
  6. User-maintained programs
  7. Miscellaneous

An eighth section, on system maintenance, appeared in later editions.

Each logical page of the manual was called a man page and contained a title, a brief description, the main text, a list of affected files, cross-references, diagnostics, bugs and the author. The documentation was written in the ed editor and formatted with the roff program. The very first page was devoted to the cat command.

Second Edition (June 1972)


In the second edition, the C compiler cc was added. It was itself written in a different language. New commands and system calls appeared:
:(1)  does nothing. It began as a label for goto, so the shell had to be taught to ignore such lines
cc(1)  the C compiler
m6(1)  a general-purpose macro processor
opr(1)  sends a print job
tmg(1)  a compiler-compiler. TMG is a language for writing compilers.
tss(1)  an interface for remote access to the Honeywell TSS OS
salloc(3)  a library for working with strings of arbitrary length

The second edition still had no multiprogramming or memory protection, but a copyright notice appeared.

Third Edition (February 1973)


This version ran on the PDP-11/45, which had memory protection and supported much more memory, up to 256 KB.

The most notable new features were pipes and multiprogramming. In addition:
cdb(1)  a C debugger
crypt(3)  password encryption
proof(1)  a precursor of diff
ps(8)  the process list
sno(1)  a SNOBOL III compiler and interpreter
speak(1)  a speech synthesizer. It takes a stream of words as input and outputs their pronunciation
typo(1)  to quote the manual: "... searches for rare words, typos and hapax legomena in the document and prints them to standard output"
yacc(6)  a compiler-compiler

Fourth Edition (November 1973)


This is essentially the third edition rewritten in C. The new PDP-11 models /60 and /70 were also supported. Because it was developed in a higher-level language, the system grew in size by a third. The commands received minor updates, and the B language was dropped from the distribution.

More on this subject on Habr


habrahabr.ru/post/114588
habrahabr.ru/post/126369
habrahabr.ru/post/147774 - very well written. Its text overlaps with this one in places because both articles rely on the same original source.
habrahabr.ru/post/46432