UNIX - Sample Questions and Answers

1.1. Windows NT is a 32-bit operating system (OS). What does it mean to be a 32-bit OS?
A 32-bit OS runs on at least a 32-bit processor and generally makes a flat, 32-bit virtual address space available to programs. Programs running on a 32-bit OS will generally use 32 bits as their fundamental word size. For instance, an integer will be a 32-bit value rather than a 16-bit value. The term “32-bit OS” has also come to imply a variety of technologies, such as preemptive multitasking and process isolation, that are common to most 32-bit operating systems but not inherent to a 32-bit design.
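
The size of a flat address space follows directly from the pointer width. As a small sketch of that arithmetic (the class and method names are made up for illustration, and Java's int is 32 bits by language definition, regardless of the OS):

```java
// Illustrative only: AddressSpace and addressableBytes are hypothetical names.
class AddressSpace {
    // Number of distinct byte addresses reachable with a pointer of the given width.
    static long addressableBytes(int pointerBits) {
        if (pointerBits < 1 || pointerBits > 62) {
            throw new IllegalArgumentException("pointerBits must be in 1..62");
        }
        return 1L << pointerBits;
    }

    public static void main(String[] args) {
        long bytes = addressableBytes(32);
        System.out.println("32-bit address space: " + bytes + " bytes");
        System.out.println("That is " + (bytes >> 30) + " GiB");
        System.out.println("Java int width: " + Integer.SIZE + " bits");
    }
}
```

A 32-bit flat address space therefore covers 4 GiB, although an OS usually reserves part of it for itself.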

1.2. Describe programs, processes, and threads.
A program is a collection of instructions and data that is kept in a regular file on disk. In its i-node the file is marked executable, and the file’s contents are arranged according to rules established by the kernel. This is another case of the kernel caring about the contents of a file.

Programmers can create executable files any way they choose. As long as the contents obey the rules and the file is marked executable, the program can be run. In practice, it usually goes like this: First, the source program, in some programming language (C or C++, say), is typed into a regular file, often referred to as a text file, because it is arranged into text lines. Next, another regular file, called an object file, is created that contains the machine-language translation of the source program. This job is done by a compiler or assembler (which are themselves programs). If this object file is complete (no missing subroutines), it is marked executable and may be run as is. If not, the linker (sometimes called a “loader” in UNIX jargon) is used to bind this object file with others previously created, possibly taken from collections of object files called libraries. Unless the linker could not find something it was looking for, its output is complete and executable.

In order to run a program, the kernel is first asked to create a new process, which is an environment in which a program executes. A process consists of three segments: instruction segment, user data segment, and system data segment. In UNIX jargon, the instruction segment is called the “text segment.” The program is used to initialize the instructions and user data. After this initialization, the process begins to deviate from the program it is running. Although modern programmers don’t normally modify instructions, the data does get modified. In addition, the process may acquire resources (more memory, open files, etc.) not present in the program.

While the process is running, the kernel keeps track of its threads, each of which is a separate flow of control through the instructions, all potentially reading and writing the same parts of the process’s data. Each thread has its own stack, however. When you are programming, you start with one thread, and that is all you get unless you execute a special system call to create another. So, beginners can think of a process as being single-threaded. Not every version of UNIX supports multiple threads. They are part of an optional feature called POSIX Threads, or “pthreads.”
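
The idea that threads share the process's data but keep their own stacks can be sketched as follows. This is a minimal illustration in Java rather than with pthreads; the class name and the values are arbitrary.

```java
// Sketch: two threads share one array (process data) but keep private locals (per-thread stack).
class SharedDataDemo {
    // Visible to every thread in the process.
    static final int[] shared = new int[2];

    static int[] run() {
        Thread[] workers = new Thread[2];
        for (int i = 0; i < workers.length; i++) {
            final int slot = i;
            workers[i] = new Thread(() -> {
                int local = 0;                  // lives on this thread's own stack
                for (int n = 1; n <= 100; n++) {
                    local += n;
                }
                shared[slot] = local + slot;    // stored in data all threads can see
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();                       // after join, the worker's writes are visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return shared.clone();
    }

    public static void main(String[] args) {
        int[] result = run();
        System.out.println(result[0] + " " + result[1]); // prints "5050 5051"
    }
}
```

Each worker writes to a different slot, so no synchronization is needed here beyond the join.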

Several concurrently running processes can be initialized from the same program. There is no functional relationship, however, between these processes. The kernel might be able to save memory by arranging for such processes to share instruction segments, but the processes involved can’t detect such sharing. By contrast, there is a strong functional relationship between threads in the same process.

A process’s system data includes attributes such as current directory, open file descriptors, accumulated CPU time, and so on. A process cannot access or modify its system data directly, since it is outside of its address space. Instead, there are various system calls to access or modify attributes.

A process is created by the kernel on behalf of a currently executing process, which becomes the parent of the new child process. The child inherits most of the parent’s system-data attributes. For example, if the parent has any files open, the child will have them open too. Heredity of this sort is absolutely fundamental to the operation of UNIX. This is different from a thread creating a new thread. Threads in the same process are equal in most respects, and there is no inheritance. All threads have equal access to all data and resources, not copies of them.
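
Inheritance of system data can be observed from a high-level language as well. The sketch below starts a child process and asks it for its current directory, which it inherits from the parent. It assumes a UNIX-like system with the standard pwd utility on the PATH; the class and method names are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.file.Paths;

// Sketch: a child process inherits the parent's current directory.
// Assumes a UNIX-like system where "pwd -P" is available.
class InheritDemo {
    // Current directory of this (parent) process, with symbolic links resolved.
    static String parentDirectory() {
        try {
            return Paths.get("").toAbsolutePath().toRealPath().toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Starts "pwd -P" as a child and returns the directory it reports.
    static String childDirectory() {
        try {
            Process child = new ProcessBuilder("pwd", "-P").start();
            try (BufferedReader out = new BufferedReader(
                    new InputStreamReader(child.getInputStream()))) {
                String line = out.readLine();
                child.waitFor();
                return line;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("parent: " + parentDirectory());
        System.out.println("child:  " + childDirectory());
    }
}
```

The child was never told which directory to use; it simply received the parent's attribute when it was created.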

The same concepts exist on Microsoft Windows, where they are described as follows. An application consists of one or more processes. A process, in the simplest terms, is an executing program. One or more threads run in the context of the process. A thread is the basic unit to which the operating system allocates processor time. A thread can execute any part of the process code, including parts currently being executed by another thread.

Each process provides the resources needed to execute a program. A process has a virtual address space, executable code, open handles to system objects, a security context, a unique process identifier, environment variables, a base priority, minimum and maximum working set sizes, and at least one thread of execution. Each process is started with a single thread, often called the primary thread, but can create additional threads from any of its threads.

A thread is the entity within a process that can be scheduled for execution. All threads of a process share its virtual address space and system resources. In addition, each thread maintains exception handlers, a scheduling priority, thread local storage, a unique thread identifier, and a set of structures the system will use to save the thread context until it is scheduled. The thread context includes the thread’s set of machine registers, the kernel stack, a thread environment block, and a user stack in the address space of the thread’s process. Threads can also have their own security context, which can be used for impersonating clients.

Microsoft Windows supports preemptive multitasking, which creates the effect of simultaneous execution of multiple threads from multiple processes. On a multiprocessor computer, the system can simultaneously execute as many threads as there are processors on the computer.

A job object allows groups of processes to be managed as a unit. Job objects are namable, securable, sharable objects that control attributes of the processes associated with them. Operations performed on the job object affect all processes associated with the job object.

1.3. What is the difference between a thread and a process?
A process is an OS-level task or service. A thread runs “inside” a process and may be scheduled by the OS itself or simulated by a user-level library. Generally speaking, threads share resources such as memory, whereas processes each have their own separate memory area and must take more elaborate steps to share resources.

Another name for thread is “lightweight process” to distinguish it from the “heavyweight” system processes.

In a nutshell, a process can contain multiple threads. In most multithreading operating systems, a process gets its own memory address space; a thread does not. Threads typically share the heap belonging to their parent process. Even though they share a common heap, threads have their own stack space. This is how one thread’s invocation of a method is kept separate from another’s. For instance, a JVM (Java Virtual Machine) runs in a single process in the host OS. Threads in the JVM share the heap belonging to that process; that is why several threads may access the same object. This is all a gross oversimplification, but it is accurate enough at a high level. Lots of details differ between operating systems.

A process is an execution of a program, and a program is a set of instructions; a thread is a single sequential stream of execution within a process. A single-threaded process can perform only one task at a time. In short, a thread is a flow of execution, whereas a process is a running instance of a program that the OS itself can stop and start.

Similarities between process and threads are:

• Both share the CPU.
• Both execute sequentially.
• Both can create children.
• If one thread is blocked, another can run, just as with processes.

Dissimilarities:

• Threads are not independent of one another, as processes are.
• All threads can access every address in the task, whereas a process cannot access another process’s address space.
• Threads are designed to assist one another, whereas processes may or may not cooperate.

1.4. What is the difference between a lightweight and a heavyweight process?
Lightweight and heavyweight processes refer to the mechanics of a multiprocessing system.

In a lightweight process, threads are used to divide the workload. Here you would see one process executing in the OS (for this application or service). This process would possess one or more threads. Each of the threads in this process shares the same address space. Because threads share their address space, communication between the threads is simple and efficient. Each thread could be compared to a process in a heavyweight scenario.

In a heavyweight process, new processes are created to perform the work in parallel. Here (for the same application or service), you would see multiple processes running. Each heavyweight process contains its own address space. Communication between these processes would involve additional communications mechanisms such as sockets or pipes.

The benefits of a lightweight process come from the conservation of resources. Since threads use the same code section, data section, and OS resources, fewer resources are used overall.

The drawback is that you now have to ensure your system is thread-safe. You have to make sure the threads do not step on each other. Fortunately, languages such as C++ and Java provide the necessary tools to allow you to do this.

1.5. What is a fiber?
A fiber is a unit of execution that must be manually scheduled by the application. Fibers run in the context of the threads that schedule them. Each thread can schedule multiple fibers. In general, fibers do not provide advantages over a well-designed multithreaded application. However, using fibers can make it easier to port applications that were designed to schedule their own threads.

1.6. What is multitasking?
A multitasking operating system divides the available processor time among the processes or threads that need it. The system is designed for preemptive multitasking; it allocates a processor time slice to each thread it executes. The currently executing thread is suspended when its time slice elapses, allowing another thread to run. When the system switches from one thread to another, it saves the context of the preempted thread and restores the saved context of the next thread in the queue.

The length of the time slice depends on the operating system and the processor. Because each time slice is small (approximately 20 milliseconds), multiple threads appear to be executing at the same time. This is actually the case on multiprocessor systems, where the executable threads are distributed among the available processors. However, you must use caution when using multiple threads in an application, because system performance can decrease if there are too many threads.
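
Time slicing can be pictured with a toy round-robin simulation. This is only a sketch of the idea, not how any real kernel is implemented; the thread workloads are invented, and the 20 ms slice is the approximate figure quoted above.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy simulation of preemptive time slicing; all names and numbers are hypothetical.
class RoundRobinDemo {
    // remainingMs[i] is the CPU time thread i still needs.
    // Returns the order in which threads were given a time slice.
    static List<Integer> schedule(int[] remainingMs, int sliceMs) {
        if (sliceMs <= 0) {
            throw new IllegalArgumentException("sliceMs must be positive");
        }
        int[] left = remainingMs.clone();
        Deque<Integer> ready = new ArrayDeque<>();
        for (int i = 0; i < left.length; i++) {
            if (left[i] > 0) {
                ready.add(i);
            }
        }
        List<Integer> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            int id = ready.poll();              // pick the next thread in the queue
            order.add(id);
            left[id] -= Math.min(sliceMs, left[id]);
            if (left[id] > 0) {
                ready.add(id);                  // slice elapsed: preempt and requeue
            }
        }
        return order;
    }

    public static void main(String[] args) {
        // Three threads needing 50, 20, and 30 ms of CPU time.
        System.out.println(schedule(new int[] {50, 20, 30}, 20)); // prints [0, 1, 2, 0, 2, 0]
    }
}
```

Because the turns interleave every 20 ms, all three threads appear to make progress at once even on a single processor.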

1.7. Describe synchronization with respect to multithreading.
With respect to multithreading, synchronization is the capability to control the access of multiple threads to shared resources. Without synchronization, it is possible for one thread to modify a shared variable while another thread is in the process of using or updating the same shared variable. This usually leads to significant errors.

1.8. Why is it sometimes necessary to synchronize the actions of multiple threads?
It is sometimes necessary to synchronize the actions of multiple threads to maintain read or write consistency between shared resources.

For example, say the account balance is $500.
Without synchronization:
Assume two threads.
Thread one reads the value = $500.
Thread one computes the new value = $450.
Thread one is preempted by the scheduler before it stores the new value.
Thread two reads the value = $500. It should have got the value $450.

Java provides synchronization through monitors (the synchronized keyword). C and C++ programs typically use mutexes or semaphores.
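
The account scenario can be sketched with a Java monitor. The Account class and the $50 withdrawals are hypothetical; the point is only that synchronized turns the read, compute, and store steps into one indivisible operation.

```java
// Sketch of the account example; Account and its methods are hypothetical names.
class Account {
    private int balance;

    Account(int openingBalance) {
        balance = openingBalance;
    }

    // synchronized: only one thread at a time may run this on a given Account.
    synchronized void withdraw(int amount) {
        int current = balance;            // read
        int updated = current - amount;   // compute
        balance = updated;                // store
    }

    synchronized int balance() {
        return balance;
    }

    // Two threads each withdraw $50 from a $500 account.
    static int demo() {
        Account account = new Account(500);
        Thread one = new Thread(() -> account.withdraw(50));
        Thread two = new Thread(() -> account.withdraw(50));
        one.start();
        two.start();
        try {
            one.join();
            two.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return account.balance();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 400
    }
}
```

Without the synchronized keyword, both threads could read $500 and one withdrawal could be lost, leaving $450.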

1.9. What are mutex and semaphore? What is the difference between them?
A mutex is a synchronization object that allows only one process or thread at a time to access a critical code block.

A semaphore, on the other hand, allows up to a fixed number of processes or threads to access a critical code block at once. In that sense a semaphore is a generalized mutex: a semaphore whose count is one behaves much like a mutex.
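
The difference can be observed by counting how many threads are inside the critical block at once. The sketch below uses java.util.concurrent.Semaphore for both cases, with one permit standing in for a mutex; the class and method names are hypothetical.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: a semaphore with N permits admits at most N threads at a time.
class SemaphoreDemo {
    // Runs `threads` workers guarded by a semaphore with `permits` permits and returns
    // the largest number of workers seen inside the critical block simultaneously.
    static int maxInside(int permits, int threads) {
        Semaphore gate = new Semaphore(permits);
        AtomicInteger inside = new AtomicInteger();
        AtomicInteger max = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                gate.acquireUninterruptibly();          // wait for a free permit
                try {
                    int now = inside.incrementAndGet();
                    max.accumulateAndGet(now, Math::max);
                    Thread.yield();                     // give other threads a chance to arrive
                    inside.decrementAndGet();
                } finally {
                    gate.release();                     // hand the permit back
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return max.get();
    }

    public static void main(String[] args) {
        System.out.println("1 permit:  " + maxInside(1, 8)); // always 1
        System.out.println("3 permits: " + maxInside(3, 8)); // never more than 3
    }
}
```

With three permits the observed maximum depends on scheduling, but it can never exceed three.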

1.10. How do you make methods thread-safe?
It is very difficult to prove that an object is thread-safe. The main rule of thumb for making thread-safe objects is, “Make all the instance variables private, and all the public accessor methods synchronized.” However, this is sometimes difficult to achieve in practice, due to exigencies of performance, architecture, or implementation.
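
As a minimal sketch of that rule of thumb, the hypothetical Stats class below keeps two related fields private and synchronizes every method that touches them, so no thread can ever see the count updated without the sum.

```java
// Sketch of the rule of thumb: private fields, synchronized accessors.
// Stats is a hypothetical class used only for illustration.
class Stats {
    private long count;
    private long sum;

    public synchronized void record(long value) {
        count++;
        sum += value;       // both fields change together under the same lock
    }

    public synchronized long count() {
        return count;
    }

    public synchronized double average() {
        return count == 0 ? 0.0 : (double) sum / count;
    }

    // Several threads record the value 10 repeatedly into one shared Stats object.
    static Stats demo(int threads, int perThread) {
        Stats stats = new Stats();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int n = 0; n < perThread; n++) {
                    stats.record(10);
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return stats;
    }

    public static void main(String[] args) {
        Stats stats = demo(4, 1000);
        System.out.println(stats.count() + " " + stats.average()); // prints "4000 10.0"
    }
}
```

Synchronizing every method is simple but can cost performance under heavy contention, which is one of the practical difficulties mentioned above.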