Processes

To meet an OS' two central goals (multiprogramming and multitasking), it must be able to "juggle" kernel code, system programs, and user programs so that the user has the illusion that they are all active at once.

In the old days, systems only needed to run a single program at a time, so there was no need to address the modern concern of sharing both processing and memory resources between programs.

Once systems did need to share those resources, some programs would be the actively processing ones, and others would need to be temporarily "benched" until it was their turn.

This is precisely the distinction between a program and a process:

A program is a passive set of instructions stored on the disk (an executable) which does nothing until executed.

A process (AKA a job) is a program in execution: it is loaded into main memory, has an associated program counter tracking its execution, and has memory reserved for its required variables and data.

By analogy: program : potential energy :: process : kinetic energy (OK maybe not the best analogy, you do better!)

Note, however, that a single program can spawn multiple processes. What are some examples of programs we use daily that do this?
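
To make the program / process distinction concrete, here is a minimal Python sketch (not part of the original notes). It assumes a Python interpreter is available as sys.executable; the one-line PROGRAM and the helper names launch and run_two_copies are made up for illustration. The same passive program is started twice, and each copy becomes its own process with its own ID.

```python
import subprocess
import sys

# Hypothetical one-line "program": it only reports the ID of the process running it.
PROGRAM = "import os; print(os.getpid())"


def launch() -> subprocess.Popen:
    """Start one process that executes PROGRAM."""
    return subprocess.Popen(
        [sys.executable, "-c", PROGRAM],
        stdout=subprocess.PIPE,
        text=True,
    )


def run_two_copies() -> list:
    """Start the same program twice and collect the ID each process reports."""
    processes = [launch(), launch()]  # both are started before either is waited on
    pids = []
    for process in processes:
        output, _ = process.communicate()  # wait for the process and read its output
        pids.append(int(output.strip()))
    return pids


pids = run_two_copies()
print("one program, two processes:", pids)
```

The two numbers printed differ from run to run, but they are never equal to each other: one program, two distinct processes.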

So what exactly does a process look like? What are its components? What are things we can do with processes? Stay tuned!

Process Components

So what, precisely, is in a process? Let's start by looking at its footprint in memory.

Consider what you learned from CMSI 281 and from the Classwork 2 you just did. What are the 5 primary elements of a running process that are stored in RAM? (Hint: they are organized in the fashion represented below):

The five components are:

Text section: contains the instructions of the program itself.
Data section: used for storing global variables and consists of two components:
- BSS (Block Started by Symbol) section contains declared, but uninitialized, globals. Used for reserving space that is yet to be used.
- Data Blocks contain declared and initialized global variables.
Heap: contains memory for dynamically allocated variables.
- Can grow / shrink into the free space as the process requests more memory.
- Situated above the data section and grows upwards.
Stack: contains memory for locally allocated variables.
- Can grow / shrink into the free space as the process pushes and pops frames from the call stack.
- Situated above the free space and grows downwards, as necessary.
Free space contains memory blocked out for the process to be used for the stack and heap (discussed more in-depth later).

From the (completed) image above, when would we know that a process has exhausted the memory that it has available to it?

When the stack's pointer (moving top-down) meets the heap's pointer (moving bottom-up).
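
The "pointers meet" condition can be sketched with a toy model. This is only an illustration under simplifying assumptions: the class ToyAddressSpace, its method names, and the addresses are invented here, and a real OS manages this through virtual memory and per-process limits rather than two bare integers.

```python
class ToyAddressSpace:
    """Toy model: the heap grows upward, the stack grows downward."""

    def __init__(self, heap_start: int, stack_start: int) -> None:
        self.heap_top = heap_start    # first free address above the heap
        self.stack_top = stack_start  # lowest address currently used by the stack

    @property
    def free_space(self) -> int:
        """Bytes left between the top of the heap and the top of the stack."""
        return self.stack_top - self.heap_top

    def grow_heap(self, size: int) -> None:
        """Dynamic allocation: move the heap pointer upward."""
        if size > self.free_space:
            raise MemoryError("heap would collide with the stack")
        self.heap_top += size

    def push_frame(self, size: int) -> None:
        """Function call: move the stack pointer downward."""
        if size > self.free_space:
            raise MemoryError("stack would collide with the heap")
        self.stack_top -= size


space = ToyAddressSpace(heap_start=0x1000, stack_start=0x2000)
space.grow_heap(0x800)   # heap now ends at 0x1800
space.push_frame(0x800)  # stack now begins at 0x1800
print(space.free_space)  # 0: the two pointers have met
```

Any further call to grow_heap or push_frame with a positive size now raises MemoryError, which is the toy equivalent of the process running out of memory.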

We'll investigate more about how this memory is assigned and managed later in the course.

Process Control Blocks

We've heard the phrase "control block" in another context before -- what was the context and what was its purpose?

File control blocks were used to store file metadata; inodes were the Unix FCBs and held vital info about a file.

By the same token, we'll likely want to maintain some information on our processes.

A process control block (PCB) is the OS' record of the properties specific to a single process.

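Since each process' PCB holds the information the OS needs to manage that process, where would it make sense to store each PCB? (Hint: which part of the system must read and update it, and who must be kept from tampering with it?)

In kernel memory! The kernel keeps each PCB in its own protected memory (in Linux, for example, the PCB is a task_struct), out of reach of user processes.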

PCBs hold a lot of different properties about each process, which we'll examine next.

Fundamentals of PCB

PCBs contain rich information about a process from the OS' perspective, including details like:

Process number: a unique numerical ID corresponding to each process
Program counter: a pointer into the process' text section indicating where execution left off (the next instruction to execute).
CPU registers: contents of all process-centric registers.
Memory management: memory range allocated to the process.
I/O Status: I/O devices allocated to process + a list of any open files.
Etc. (others to be discussed)
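
As a rough sketch of how these fields fit together, the Python below models a PCB as a plain record. Every name here (ProcessControlBlock, ToyCPU, switch_process, the field names, and the sample addresses) is hypothetical and chosen for illustration; real kernels define their PCBs as C structures with many more fields. The switch_process function shows one reason the program counter and registers are stored: so a paused process can later resume exactly where it left off.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ProcessControlBlock:
    """Illustrative PCB holding the properties listed above."""
    pid: int                                                # process number
    program_counter: int = 0                                # next instruction in the text section
    registers: Dict[str, int] = field(default_factory=dict) # saved CPU registers
    memory_range: Tuple[int, int] = (0, 0)                  # (lowest, highest) address allocated
    io_devices: List[str] = field(default_factory=list)     # I/O devices allocated
    open_files: List[str] = field(default_factory=list)     # open files


@dataclass
class ToyCPU:
    """Stand-in for the processor state that a PCB must be able to save."""
    program_counter: int = 0
    registers: Dict[str, int] = field(default_factory=dict)


def switch_process(cpu: ToyCPU, outgoing: ProcessControlBlock,
                   incoming: ProcessControlBlock) -> None:
    """Save the CPU state into the outgoing PCB, then load the incoming one."""
    outgoing.program_counter = cpu.program_counter
    outgoing.registers = dict(cpu.registers)
    cpu.program_counter = incoming.program_counter
    cpu.registers = dict(incoming.registers)


editor = ProcessControlBlock(pid=1, program_counter=0x400)
player = ProcessControlBlock(pid=2, program_counter=0x800, registers={"r0": 7})
cpu = ToyCPU(program_counter=0x404, registers={"r0": 1})  # currently running "editor"

switch_process(cpu, outgoing=editor, incoming=player)
print(hex(editor.program_counter), hex(cpu.program_counter))  # 0x404 0x800
```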

To motivate one of the most important properties that a PCB tracks, consider the following question:

Multiprogramming was one of the high-level OS goals to make multiple programs appear to be running simultaneously even though there may be more processes than processors. As such, what must an OS do with processes to accomplish this?

It must "juggle" them by scheduling processes to use the processor's resources at any given time.

We'll look at process scheduling next time, but for now, we should consider that a process may be in a variety of different states that will determine when it is able to employ the CPU.

Process State

A process' state defines the current activity of a process, is used to determine how the process shall be scheduled, and is recorded in the PCB.

A process can be in any of 5 states:

New: the process has just been created, has had a PCB initialized for it, and is petitioning the OS for admittance.
Ready: process has been admitted and is waiting for its turn on the CPU.
Running: process' instructions are being executed by the CPU.
Waiting: process is waiting for some event to occur (like an I/O completion, a wait() returning, or a sleep ending).
Terminated: process has completed execution.
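
The five states can be sketched as a small state machine. The transitions and event names below (admit, dispatch, interrupt, and so on) are assumptions based on the classic five-state textbook model, not something these notes have defined yet, so treat this as one plausible reading rather than the official diagram.

```python
from enum import Enum


class State(Enum):
    NEW = "new"
    READY = "ready"
    RUNNING = "running"
    WAITING = "waiting"
    TERMINATED = "terminated"


# Assumed transitions of the five-state model; the event names are illustrative.
TRANSITIONS = {
    (State.NEW, "admit"): State.READY,
    (State.READY, "dispatch"): State.RUNNING,
    (State.RUNNING, "interrupt"): State.READY,
    (State.RUNNING, "wait_for_event"): State.WAITING,
    (State.WAITING, "event_done"): State.READY,
    (State.RUNNING, "exit"): State.TERMINATED,
}


def next_state(current: State, event: str) -> State:
    """Return the state reached from `current` on `event`, or raise ValueError."""
    try:
        return TRANSITIONS[(current, event)]
    except KeyError:
        raise ValueError(f"{event!r} is not valid in state {current.name}") from None


# Walk one process through a typical lifetime.
state = State.NEW
for event in ["admit", "dispatch", "wait_for_event", "event_done", "dispatch", "exit"]:
    state = next_state(state, event)
print(state.name)  # TERMINATED
```

Note that in this sketch a waiting process cannot jump straight back to running: it must first become ready and be dispatched again.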

Plainly, if a process has states, then there must be an associated state diagram!

We'll use the following figure to motivate scheduling in the next lecture, but consider what events might transition one state into another: