The UNIX operating system was designed to let a number of programmers access the computer at the same time and share its resources.
The operating system coordinates the use of the computer's resources, allowing one person, for example, to run a spell check program while another creates a document, another edits a document, another creates graphics, and yet another formats a document -- all at the same time, with each user oblivious to the activities of the others.
The operating system controls all of the commands from all of the keyboards and all of the data being generated, and permits each user to believe he or she is the only person working on the computer.
This real-time sharing of resources makes UNIX one of the most powerful operating systems ever.
Although UNIX was developed by programmers for programmers, it provides an environment so powerful and flexible that it is found in businesses, sciences, academia, and industry. Many telecommunications switches and transmission systems also are controlled by administration and maintenance systems based on UNIX.
While initially designed for medium-sized minicomputers, the operating system was soon moved to larger, more powerful mainframe computers. As personal computers grew in popularity, versions of UNIX found their way into these boxes, and a number of companies produce UNIX-based machines for the scientific and programming communities.
The uniqueness of UNIX
The features that made UNIX a hit from the start are:
Multitasking capability
Multiuser capability
Portability
UNIX programs
Library of application software
Multitasking
Many computers do just one thing at a time, as anyone who uses a PC or laptop can attest. Try logging onto your company's network while opening your browser while opening a word processing program. Chances are the processor will freeze for a few seconds while it sorts out the multiple instructions.
UNIX, on the other hand, lets a computer do several things at once, such as printing out one file while the user edits another file. This is a major feature for users, since users don't have to wait for one application to end before starting another one.
Multiusers
The same design that permits multitasking permits multiple users to use the computer. The computer can take the commands of a number of users -- determined by the design of the computer -- to run programs, access files, and print documents at the same time.
The computer can't tell the printer to print all the requests at once, but it does prioritize the requests to keep everything orderly. It also lets several users access the same document by compartmentalizing the document so that the changes of one user don't override the changes of another user.
System portability
A major contribution of the UNIX system was its portability, permitting it to move from one brand of computer to another with a minimum of code changes. At a time when different computer lines of the same vendor didn't talk to each other -- let alone machines of multiple vendors -- that meant great savings in both hardware and software upgrades.
It also meant that the operating system could be upgraded without having all the customer's data inputted again. And new versions of UNIX were backward compatible with older versions, making it easier for companies to upgrade in an orderly manner.
UNIX tools
UNIX comes with hundreds of programs that can be divided into two classes:
Integral utilities that are absolutely necessary for the operation of the computer, such as the command interpreter, and
Tools that aren't necessary for the operation of UNIX but provide the user with additional capabilities, such as typesetting capabilities and e-mail.
Tools can be added or removed from a UNIX system, depending upon the applications required.
UNIX Communications
E-mail is commonplace today, but it has only come into its own in the business community within the last 10 years. Not so with UNIX users, who have been enjoying e-mail for several decades.
UNIX e-mail at first permitted users on the same computer to communicate with each other via their terminals. Then users on different machines, even made by different vendors, were connected to support e-mail. And finally, UNIX systems around the world were linked into a world wide web decades before the development of today's World Wide Web.
Applications libraries
UNIX as it is known today didn't just develop overnight. Nor were just a few people responsible for its growth. As soon as it moved from Bell Labs into the universities, every computer programmer worth his or her own salt started developing programs for UNIX.
Today there are hundreds of UNIX applications that can be purchased from third-party vendors, in addition to the applications that come with UNIX.
How UNIX is organized
The UNIX system is functionally organized at three levels:
The kernel, which schedules tasks and manages storage;
The shell, which connects and interprets users' commands, calls programs from memory, and executes them; and
The tools and applications that offer additional functionality to the operating system
The three levels of the UNIX system: kernel, shell, and tools and applications.
The kernel
The heart of the operating system, the kernel controls the hardware and turns parts of the system on and off at the programmer's command. If you ask the computer to list (ls) all the files in a directory, the kernel tells the computer to read all the files in that directory from the disk and display them on your screen.
The shell
There are several types of shell, most notably the command-driven Bourne Shell and C Shell (no pun intended), and menu-driven shells that make it easier for beginners to use. Whatever shell is used, its purpose remains the same -- to act as an interpreter between the user and the computer.
The shell also provides the functionality of "pipes," whereby a number of commands can be linked together by a user, permitting the output of one program to become the input to another program.
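As a small illustration of a pipe, the sketch below counts the subdirectories of a directory by feeding the output of ls -l into grep. The function name count_dirs is made up for this example; only ls, grep, and the pipe itself come from the system.

```shell
#!/bin/sh
# count_dirs is a hypothetical helper name used only for this illustration.
# "ls -l" writes one line per directory entry; the pipe (|) hands those
# lines to grep, which counts (-c) the ones whose permission string starts
# with "d", that is, the directories.
count_dirs() {
    ls -l "$1" | grep -c '^d'
}

# Example: how many directories sit directly under the root directory?
count_dirs /
```

Neither program knows about the other. The shell sets up the connection, which is why small tools can be combined in ways their authors never planned.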
Tools and applications
There are hundreds of tools available to UNIX users, although some have been written by third party vendors for specific applications. Typically, tools are grouped into categories for certain functions, such as word processing, business applications, or programming.
This chapter discusses the trials and tribulations of creating, maintaining, and repairing file systems. While these tasks may appear simple from a user's standpoint, they are, in fact, intricate and contain more than a handful of nuances. In the course of this chapter, we'll step through many of these nuances and, hopefully, come to a strong understanding of the hows and whys of file systems.
Before we really jump into the topic, you should have a good understanding of UNIX directories, files, permissions, and paths. These are the key building blocks in understanding how to administer your file systems, and I assume you already have a mastery of them. If the statement, "Be sure to have /usr/bin before /usr/local/bin in your $PATH" confuses you in any way, you should be reading something more fundamental first. Refer to Part I, "Introduction to UNIX," for some basic instructions in UNIX.
This chapter goes about the explanation of file systems a bit differently than other books. We first discuss the maintenance and repair of file systems, then discuss their creation. This was done because it is more likely that you already have existing file systems you need to maintain and fix. Understanding how to maintain them also helps you better understand why file systems are created the way they are.
The techniques we cover here are applicable to most UNIX systems currently in use. The only exceptions are when we actually create the file systems. This is where the most deviation from any standard (if there ever was one) occurs. We cover the creation of file systems under the SunOS, Solaris, Linux, and IRIX implementations of UNIX. If you are not using one of these operating systems, you should check the documentation that came with your operating system for details on the creation of file systems.
CAUTION: Working with file systems is inherently dangerous. You may be surprised at how quickly and easily you can damage a file system beyond repair. In some instances, it is even possible to damage the disk drive as well. BE CAREFUL. When performing the actions explained in this chapter, be sure you have typed the commands in correctly and you understand the resulting function fully before executing it. When in doubt, consult the documentation that came from the manufacturer. Most importantly, the documentation that comes from the manufacturer is always more authoritative than any book.
NOTE: You should read the entire chapter before actually performing any of the tasks below. This will give you a better understanding of how all the components work together, thereby giving you more solid ground when performing potentially dangerous activities.
The file system is the primary means of file storage in UNIX. Each file system houses directories, which, as a group, can be placed almost anywhere in the UNIX directory tree. The topmost level of the directory tree, the root directory, begins at /. Subdirectories nested below the root directory may traverse as deep as you like so long as the longest absolute path is less than 1,024 characters.
With the proliferation of vendor-enhanced versions of UNIX, you will find a number of "enhanced" file systems. From the standpoint of the administrator, you shouldn't have to worry about the differences too much. The two instances where you will need to worry about vendor-specific details are in the creation of file systems and when performing backups. We will cover the specifics of:
SunOS 4.1.x, which uses 4.2
Solaris, which uses ufs
Linux, which uses ext2
IRIX, which uses efs and xfs
Note that the ufs and 4.2 file systems are actually the same.
A file system, however, is only a part of the grand scheme of how UNIX keeps its data on disk. At the top level, you'll find the disks themselves. These disks are then broken into partitions, each varying in size depending on the needs of the administrator. It is on each partition that the actual file system is laid out. Within the file system, you'll find directories, subdirectories, and, finally, the individual files.
Although you will rarely have to deal with the file system at a level lower than the individual files stored on it, it is critical that you understand two key concepts: inodes and the superblock. Once you understand these, you will find that the behavior and characteristics of files make more sense.
inodes
An inode maintains information about each file. Depending on the type of file system, the inode can contain more than 40 pieces of information. Most of it, however, is only useful to the kernel and doesn't concern us. The fields that do concern us are:
mode The permission mask and type of file.
link count The number of directories that contain an entry with this inode number.
user ID The ID of the file's owner.
group ID The ID of the file's group.
size Number of bytes in this file.
access time The time at which the file was last accessed.
mod time The time at which the file was last modified.
inode time The time at which this inode structure was last modified.
block list A list of disk block numbers which contain the first segment of the file.
indirect list A list of other block lists.
The mode, link count, user ID, group ID, size, and access time are used when generating file listings. Note that the inode does not contain the file's name. That information is held in the directory file (see below for details).
Superblocks
The superblock is the most vital information stored on the disk. It contains information on the disk's geometry (number of heads, cylinders, and so on), the head of the inode list, and the free block list. Because of its importance, the system automatically keeps mirrors of this data scattered around the disk for redundancy. You only have to deal with superblocks if your file system becomes heavily corrupted.
Types of Files
Files come in eight flavors:
Normal Files
Directories
Hard Links
Symbolic links
Sockets
Named Pipes
Character Devices
Block Devices
Normal Files
These are the files you use the most. They can be either text or binary files; however, their internal structure is irrelevant from a System Administrator standpoint. A file's characteristics are specified by the inode in the file system that describes it. An ls -l on a normal file will look something like this:
-rw-------   1 sshah    admin          42 May 12 13:09 hello
Directories
These are a special kind of file that contains a list of other files. Although there is a one-to-one mapping of inode to disk blocks, there can be a many-to-one mapping from directory entry to inode. When viewing a directory listing using the ls -l command, you can identify directories by their permissions starting with the d character. An ls -l on a directory looks something like this:
drwx------   2 sshah    admin         512 May 12 13:08 public_html
Hard Links
A hard link is actually a normal directory entry except instead of pointing to a unique file, it points to an already existing file. This gives the illusion that there are two identical files when you do a directory listing. Because the system sees this as just another file, it treats it as such. This is most apparent during backups because hard-linked files get backed up as many times as there are hard links to them. Because a hard link shares an inode, it cannot exist across file systems. Hard links are created with the ln command. For example, given this directory listing using ls -l, we see:
-rw-------   1 sshah    admin          42 May 12 13:04 hello
When you type ln hello goodbye and then perform another directory listing using ls -l, you see:
-rw-------   2 sshah    admin          42 May 12 13:04 goodbye
-rw-------   2 sshah    admin          42 May 12 13:04 hello
Notice how this appears to be two separate files that just happen to have the same file lengths. Also note that the link count (second column) has increased from one to two. How can you tell they actually are the same file? Use ls -il. Observe:
13180 -rw-------   2 sshah    admin          42 May 12 13:04 goodbye
13180 -rw-------   2 sshah    admin          42 May 12 13:04 hello
You can see that both point to the same inode, 13180.
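The same check can be scripted. The following is a minimal sketch, assuming only that ls -i prints the inode number as its first field; the helper names inode_of and same_file are invented for this example.

```shell
#!/bin/sh
# inode_of is a hypothetical helper: it prints the inode number of a path,
# which is the first field of "ls -di" output (-d keeps ls from listing
# the contents when the path is a directory).
inode_of() {
    ls -di "$1" | awk '{ print $1 }'
}

# same_file succeeds when both names refer to the same inode.
# This is only meaningful for two names on the same file system.
same_file() {
    [ "$(inode_of "$1")" = "$(inode_of "$2")" ]
}

# Demonstration in a scratch directory so no real files are touched.
demo=$(mktemp -d)
echo "hello, world" > "$demo/hello"
ln "$demo/hello" "$demo/goodbye"
if same_file "$demo/hello" "$demo/goodbye"; then
    echo "hello and goodbye share inode $(inode_of "$demo/hello")"
fi
rm -r "$demo"
```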
WARNING: Be careful when creating hard links, especially when hard linking to a directory. It is possible to corrupt a file system by doing so, since a hard link does not record that the inode being pointed to needs to be treated as a directory.
Symbolic Links
A symbolic link (sometimes referred to as a symlink) differs from a hard link because it doesn't point to another inode but to another filename. This allows symbolic links to exist across file systems as well as be recognized as a special file to the operating system. You will find symbolic links to be crucial to the administration of your file systems, especially when trying to give the appearance of a seamless system when there isn't one. Symbolic links are created using the ln -s command. A common thing people do is create a symbolic link to a directory that has moved. For example, if you are accustomed to accessing the directory for your home page in the subdirectory www, but at your new site home pages are kept in the public_html directory, you can create a symbolic link from www to public_html using the command ln -s public_html www. Performing an ls -l on the result shows the link.
drwx------   2 sshah    admin         512 May 12 13:08 public_html
lrwx------   1 sshah    admin          11 May 12 13:08 www -> public_html
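Because a symbolic link stores a name rather than an inode, it can be left pointing at nothing if its target is moved or removed. The sketch below shows one way to detect that with the standard test operators -L and -e; the helper name link_state is an assumption made for this example.

```shell
#!/bin/sh
# link_state is a hypothetical helper that reports what kind of link,
# if any, a name is. "test -L" is true for a symbolic link itself, while
# "test -e" follows the link and is true only if the target exists.
link_state() {
    if [ ! -L "$1" ]; then
        echo "not a symbolic link"
    elif [ -e "$1" ]; then
        echo "valid symbolic link"
    else
        echo "dangling symbolic link"
    fi
}

# Demonstration in a scratch directory.
demo=$(mktemp -d)
mkdir "$demo/public_html"
# The target is stored as given, so this relative name is resolved
# relative to the directory that holds the link.
ln -s public_html "$demo/www"
link_state "$demo/www"

# Renaming the target breaks the link; the link itself is unchanged.
mv "$demo/public_html" "$demo/old_html"
link_state "$demo/www"
rm -r "$demo"
```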
Sockets
Sockets are the means for UNIX to network with other machines. Typically, this is done using network ports; however, the file system has a provision to allow for interprocess communication through socket files. (A popular program that uses this technique is the X Window System.) You rarely have to deal with this kind of file and should never have to create it yourself (unless you're writing the program). If you need to remove a socket file, use the rm command. Socket files are identified by their permission settings beginning with an s character. An ls -l on a socket file looks something like this:
srwxrwxrwx   1 root     admin           0 May 10 14:38 X0
Named Pipes
Similar to sockets, named pipes enable programs to communicate with one another through the file system. You can use the mknod command to create a named pipe. Named pipes are recognizable by their permissions settings beginning with the p character. An ls -l on a named pipe looks something like this:
prw-------   1 sshah    admin           0 May 12 22:02 mypipe
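To see a named pipe in action, the following sketch sends one line of text through a freshly created pipe. It uses the POSIX mkfifo command rather than mknod (on older systems, mknod NAME p is the equivalent), and the helper name fifo_roundtrip is invented for this example.

```shell
#!/bin/sh
# fifo_roundtrip is a hypothetical helper: it sends its argument through a
# new named pipe and prints whatever arrives on the other side. The body is
# wrapped in ( ) so its variables do not leak into the calling shell.
fifo_roundtrip() (
    dir=$(mktemp -d) || exit 1
    mkfifo "$dir/mypipe" || exit 1
    # Opening a pipe for writing blocks until a reader opens it,
    # so the writer is started in the background.
    printf '%s\n' "$1" > "$dir/mypipe" &
    # The reader treats the pipe like an ordinary file.
    cat "$dir/mypipe"
    wait
    rm -r "$dir"
)

fifo_roundtrip "hello through the pipe"
```

Nothing is ever stored in the pipe file itself, which is why ls -l reports its size as 0.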
Character Devices
These special files are typically found in the /dev directory and provide a mechanism for communicating with system device drivers through the file system one character at a time. They are easily noticed by their permission bits starting with the c character. Each character file contains two special numbers, the major and minor. These two numbers identify which device driver that file communicates with. An ls -l on a character device looks something like this:
crw-rw-rw-   1 root     wheel     21,   4 May 12 13:40 ptyp4
Block Devices
Block devices also share many characteristics with character devices in that they exist in the /dev directory, are used to communicate with device drivers, and have major and minor numbers. The key difference is that block devices typically transfer large blocks of data at a time versus one character at a time. (A hard disk is a block device, whereas a terminal is a character device.) Block devices are identified by their permission bits starting with the b character. An ls -l on a block device looks something like this:
brw-------   2 root     staff     16,   2 Jul 29  1992 fd0c
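The first character of the ls -l permission string is not the only way to tell these file types apart. As a rough sketch, the standard test operators can classify a path from a script; the helper name file_kind is made up for this example. Note that a hard link cannot be detected this way, because it is an ordinary directory entry whose only trace is a link count greater than one.

```shell
#!/bin/sh
# file_kind is a hypothetical helper that names the type of a path using
# only the standard "test" operators. The symbolic link check comes first
# because the other operators follow links to their targets.
file_kind() {
    if   [ -L "$1" ]; then echo "symbolic link"
    elif [ -f "$1" ]; then echo "normal file"
    elif [ -d "$1" ]; then echo "directory"
    elif [ -p "$1" ]; then echo "named pipe"
    elif [ -S "$1" ]; then echo "socket"
    elif [ -c "$1" ]; then echo "character device"
    elif [ -b "$1" ]; then echo "block device"
    else echo "unknown or missing"
    fi
}

file_kind /dev/null
file_kind /
```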
Managing File Systems
Managing file systems is relatively easy. That is, once you can commit to memory the location of all the key files in the directory tree on each major variation of UNIX as well as your own layout of file systems across the network...
In other words, it can be a royal pain.
From a technical standpoint there isn't much to deal with. Once the file systems have been put in their correct places and the boot time configuration files have been edited so that your file systems automatically come online at every start up, there isn't much to do besides watch your disk space.
From a management standpoint, it's much more involved. Often you'll need to deal with existing configurations, which may not have been done "the right way," or you're dealing with site politics such as, "I won't let that department share my disks." Then you'll need to deal with users who don't understand why they need to periodically clean up their home directories. Don't forget the ever exciting vendor-specific nuisances and their idea of how the system "should be" organized.
This section covers the tools you need to manage the technical issues. Unfortunately, managerial issues are something that can't be covered in a book. Each site has different needs as well as different resources, resulting in different policies. If your site lacks any written policy, take the initiative to write one yourself.
Mounting and Unmounting File Systems
As I mentioned earlier in this chapter, part of the power in UNIX stems from its flexibility in placing file systems anywhere in the directory tree. This feat is accomplished by mounting file systems.
Before you can mount a file system, you need to select a mount point. A mount point is a directory in an already mounted file system onto which the root directory of a different file system is overlaid. UNIX keeps track of mount points and accesses the correct file system depending on which directory the user is currently in. A mount point may exist anywhere in the directory tree.
NOTE: While it is technically true that you can mount a file system anywhere in the directory tree, there is one place you will NOT want to mount it: the root directory. Remember that once a file system is mounted at a directory, that directory is overshadowed by the contents of the mounted file system. Hence, by mounting on the root directory, the system will no longer be able to see its own kernel or local configuration files. How long your system goes on before crashing depends on your vendor.
There is an exception to the rule. Some installation packages will mount a network file system to the root directory. This is done to give the installation software access to many packages that may not be able to fit on your boot disk. Unless you fully understand how to do this yourself, don't.
Mounting and Unmounting File Systems Manually
To mount a file system, use the mount command:
mount /dev/device /directory/to/mount
where /dev/device is the device name you want to mount and /directory/to/mount is the directory you want to overlay in your local file system. For example, if you wanted to mount /dev/hda4 to the /usr directory, you would type:
mount /dev/hda4 /usr
Remember that the directory must exist in your local file system before anything can be mounted there.
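A script can check these preconditions before calling mount. The sketch below is one possible approach, not a standard tool: the helper names is_mount_point and can_mount_on are invented here, and refusing to mount over an existing mount point is a cautious policy I am assuming, not a rule the system enforces. It relies on df -P, whose last output field is the directory a file system is mounted on, and it expects an absolute path without spaces.

```shell
#!/bin/sh
# is_mount_point is a hypothetical helper. "df -P DIR" prints a header line
# and then one line for the file system that holds DIR; the last field of
# that line is where the file system is mounted. If that field equals DIR
# itself, DIR is currently a mount point.
is_mount_point() (
    mounted_on=$(df -P "$1" 2>/dev/null | awk 'NR == 2 { print $NF }')
    [ "$mounted_on" = "$1" ]
)

# can_mount_on succeeds when DIR exists and nothing is mounted there yet.
# (Assumed policy for this sketch: never stack one mount on another.)
can_mount_on() {
    [ -d "$1" ] && ! is_mount_point "$1"
}

if can_mount_on /usr; then
    echo "/usr exists and is free to be used as a mount point"
else
    echo "/usr is missing or already has a file system mounted on it"
fi
```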
There are options that can be passed to the mount command. The most important characteristics are specified in the -o option. These characteristics are:
rw      read/write
ro      read only
bg      background mount (if the mount fails, place the process into the background and keep trying until success)
intr    interruptible mount (if a process is pending I/O on a mounted partition, it will allow the process to be interrupted and the I/O call dropped)
An example of these parameters being used is:
mount -o rw,bg,intr /dev/hda4 /usr
See the man page on your system for vendor-specific additions.
To unmount a file system, use the umount command. For example:
umount /usr
This unmounts the /usr file system from the current directory tree, unveiling the original directory underneath it.
There is, of course, a caveat. If users are using files on a mounted file system, you cannot unmount it. All files must be closed before this can happen, which on a large system can be tricky to say the least. There are three ways to handle this:
Use the lsof program (available at ftp://vic.cc.purdue.edu/pub/tools/unix/lsof) to list the users and their open files on a given file system. Then either wait until they are done, beg them to leave, or kill their processes off. Then unmount the file system. Often, this isn't very desirable.
Use the -f option with the umount command to force the unmount. This is often a bad idea because it leaves the programs (and users) accessing the partition confused. Files which are in memory that have not been committed to disk may be lost.
Bring the system to single user mode, then unmount the file system. While the largest inconvenience, it is the safest way because no one loses any work.
Mounting File Systems Automatically
At boot time, the system automatically mounts the root file system with read-only privileges. This enables it to load the kernel and read critical startup files. However, once it has bootstrapped itself, it needs guidance. Although it is possible for you to mount all the file systems by hand, it isn't realistic because you would then have to finish bootstrapping the machine yourself, and worse, the system could not come back online by itself. (Unless, of course, you enjoy coming into work at 2 a.m. to bring a system back up.)
To get around this, UNIX uses a special file called /etc/fstab (/etc/vfstab under Solaris). This file lists all the partitions that need to be mounted at boot time and the directory where they need to be mounted. Along with that information you can pass parameters to the mount command.
Each file system to be mounted is listed in the fstab file in the following format:
/dev/device /dir/to/mount ftype parameters fs_freq fs_passno
where:
/dev/device Is the device to be mounted, for instance, /dev/hda4.
/dir/to/mount Is the location at which the file system should be mounted on your directory tree.
ftype Is the file system type. This should be 4.2 under SunOS, ufs under Solaris, ext2 under Linux, efs or xfs in IRIX (depending on your version), nfs for NFS mounted file systems, swap for swap partitions, and proc for the /proc file system. Some operating systems, such as Linux, support additional filesystem types, although they are not as likely to be used.
parameters Are the parameters we passed to mount using the -o option. They follow the same comma-delimited format. An example entry would look like rw,intr,bg.
fs_freq Is used by dump to determine whether a file system needs to be dumped.
fs_passno Is used by the fsck program to determine the order to check disks at boot time.
Any lines in the fstab file that start with the pound symbol (#) are considered comments.
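To make the format concrete, the sketch below writes a small sample table in the six-field layout described above and summarizes it with awk. The device names, mount points, and option values are illustrative only, and list_fstab is a name made up for this example. Solaris /etc/vfstab uses a different column layout, so this sketch does not apply to it as written.

```shell
#!/bin/sh
# list_fstab is a hypothetical helper: for every non-comment entry in an
# fstab-style file it prints the mount point, device, type, and options.
# awk splits on whitespace, so a comment line is one whose first field
# starts with "#"; blank or short lines are skipped by the NF test.
list_fstab() {
    awk '$1 !~ /^#/ && NF >= 4 { print $2, $1, $3, $4 }' "$1"
}

# Write an illustrative table to a scratch file (sample values only).
sample=$(mktemp)
cat > "$sample" <<'EOF'
# device                 mount point      type  parameters  freq passno
/dev/hda3                /                ext2  rw          1    1
/dev/hda6                /var             ext2  rw          1    2
/dev/hda4                /usr             ext2  ro          1    2
server1:/var/spool/mail  /var/spool/mail  nfs   rw,bg,intr  0    0
EOF

list_fstab "$sample"
rm "$sample"
```

Pointing list_fstab at the real /etc/fstab is safe, since it only reads the file.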
If you need to mount a new file system while the machine is live, you must perform the mount by hand. If you wish to have this mount automatically active the next time the system is rebooted, you should be sure to add the appropriate entry to your fstab file.
There are two notable partitions that don't follow the same set of rules as normal partitions. They are the swap partition and /proc. (Note that SunOS does not use the /proc file system.)
Mounting the swap partition is not done using the mount command. It is instead managed by the swap command under Solaris and IRIX, and by the swapon command under SunOS and Linux. In order for a swap partition to be mounted, it must be listed in the appropriate fstab file. Once it's there, use the appropriate command (swap or swapon) with the -a parameter followed by the partition on which you've allocated swap space.
The /proc file system is even stranger because it really isn't a file system. It is an interface to the kernel abstracted into a file system style format. This should be listed in your fstab file with file system type proc.
TIP: If you need to remount a file system that already has an entry in the fstab file, you don't need to type in the mount command with all the parameters. Instead, simply pass the directory to mount as a parameter like this:
mount /dir/to/mount
mount automatically looks to the fstab file for all the details, such as which partition to mount and which options to use.
If you need to remount a large number of file systems that are already listed in the fstab file (in other words, you need to remount directories from a system that has gone down), you can use the -a option in the mount command to try and remount all the entries in the fstab file like this:
mount -a
If mount finds that a file system is already mounted, no action is performed on that file system. If, on the other hand, mount finds that an entry is not mounted, it automatically mounts it with the appropriate parameters.
In taking care of your system, you'll quickly find that you can use these commands and many of their parameters without having to look them up. This is because you're going to be using them all the time. I highly suggest you learn to love them.
NOTE: In reading this book you may have noticed that the terms program and command are used interchangeably. This is because there are no "built in" commands to the system; each one is invoked as an individual program. However, you will quickly find that both the people who use UNIX and UNIX-related texts (such as this one) use both terms to mean the same thing. Confusing? A bit. But it's tough to change 25+ years of history.
NOTE: At the end of each command description, I mention the GNU equivalent. Linux users shouldn't worry about getting them, because Linux ships with all GNU tools. If you are using another platform and aren't sure whether you're using the GNU version, try running the command with the --version option. If it is GNU, it will display its title and version number. If it isn't GNU, it'll most likely reject the parameter and give an error.
df
The df command summarizes the free disk space by file system. Running it without any parameters displays all the information about normally mounted and NFS mounted file systems. The output varies from vendor to vendor (under Solaris, use df -t) but should closely resemble this:
Filesystem         1024-blocks    Used Available Capacity Mounted on
/dev/hda3               247871  212909     22161     91%  /
/dev/hda6                50717   15507     32591     32%  /var
/dev/hda7               481998      15    457087      0%  /local
server1:/var/spool/mail
                        489702  222422    218310     50%  /var/spool/mail
The columns reported show:
Filesystem Which file system is being shown. File systems mounted using NFS are shown as hostname:/dir/that/is/mounted
1024-blocks The number of 1 KB blocks the file system consists of. (Its total size.)
Used The number of blocks used.
Available The number of blocks available for use.
Capacity Percentage of partition currently used.
Mounted on The location in the directory tree this partition has been mounted on.
Common parameters to this command are:
directory Show information only for the partition on which the specified directory exists.
-a Show all partitions including swap and /proc.
-i Show inode usage instead of block usage.
The GNU df program, which is part of the fileutils distribution, has some additional print formatting features you may find useful. You can download the latest fileutils package at ftp://ftp.cdrom.com/pub/gnu.
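Since watching disk space is a routine chore, it is worth scripting. The following is a minimal sketch that reads df-style output and reports every file system at or above a given percentage. The helper name over_threshold is invented for this example, and it assumes the six-column layout shown above (which df -P requests explicitly) and mount points without spaces.

```shell
#!/bin/sh
# over_threshold is a hypothetical helper. It reads df-style output on
# standard input, skips the header line, strips the "%" from the Capacity
# column, and prints the mount point and capacity of every file system
# whose usage is at or above the limit given as the first argument.
over_threshold() {
    awk -v limit="$1" '
        NR > 1 {
            pct = $5
            sub(/%/, "", pct)
            if (pct + 0 >= limit + 0) print $6, $5
        }'
}

# Report file systems that are at least 90 percent full.
# -P asks for the portable one-line-per-file-system format, -k for 1 KB blocks.
df -P -k | over_threshold 90
```

Run from cron, a script like this can mail you a warning before a partition fills up.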
du
The du command summarizes disk usage by directory. It recurses through all subdirectories and shows disk usage by each subdirectory with a final total at the end. Running it without any parameters shows the usage like so:
409     ./doc
945     ./lib
68      ./man
60      ./m4
391     ./src
141     ./intl
873     ./po
3402    .
The first column shows the blocks of disk used by the subdirectory, and the second column shows the subdirectory being evaluated. To see how many kilobytes each subdirectory consumes, use the -k option. Some common parameters to this command are
directory Show usage for the specified directory. The default is the current directory.
-a Show usage for all files, not just directories.
-s Show only the total disk usage.
Like the df program, this program is available as part of the GNU fileutils distribution. The GNU version has expanded on many of the parameters which you may find useful. The fileutils package can be downloaded from ftp://ftp.cdrom.com/pub/gnu.
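A common use of du is tracking down what is eating a partition. As a sketch, piping its output through sort and tail lists the largest directories last; the helper name biggest_dirs is made up for this example.

```shell
#!/bin/sh
# biggest_dirs is a hypothetical helper: it prints the N largest
# directories at or below DIR, smallest first, with sizes in kilobytes.
#   du -k    reports usage per directory in 1 KB units
#   sort -n  orders the lines numerically by that first column
#   tail -n  keeps only the last (largest) N lines
biggest_dirs() {
    du -k "$1" | sort -n | tail -n "$2"
}

# Show the five largest directories under the current directory.
biggest_dirs . 5
```

The directory you start from normally appears last, because its total includes everything beneath it.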
ln
The ln program is used to generate links between files. This is very useful for creating the illusion of a perfect file system in which everything is in the "right" place when, in reality, it isn't. This is done by making a link from the desired location to the actual location.
The usage of this program is
ln file_being_linked_to link_name
where file_being_linked_to is the file that already exists, and you wish to have another file point to it called link_name. The command above generates a hard link, meaning that the file link_name will be indistinguishable from the original file. Both files must exist on the same file system.
A popular parameter to ln is the -s option, which generates symbolic links instead of hard links. The format of the command remains the same:
ln -s file_being_linked_to link_name
the difference being that the link_name file is marked as a symbolic link in the file system. Symbolic links may span file systems and are given a special tag in the directory entry.
TIP: Unless there is an explicit reason not to, you should always use symbolic links by specifying the -s option to ln. This makes your links stand out and makes it easy to move them from one file system to another.
tar
The tar program is an immensely useful archiving utility. It can combine an entire directory tree into one large file suitable for transferring or compression.
The command line format of this program is:
tar parameters filelist
Common parameters are:
c Create an archive
x Extract the archive
v Be verbose
f Specify a tar file to work on
p Retain file permissions and ownerships
t View the contents of an archive.
Unlike most other UNIX commands, the parameters do not need to have a dash before them.
To create the tarfile myCode.tar, I could use tar in the following ways:
tar cf myCode.tar myCode
where myCode is a subdirectory relative to the current directory where the files I wish to archive are located.
tar cvf myCode.tar myCode
Same as the previous tar invocation, although this time it lists all the files added to the archive on the screen.
tar cf myCode.tar myCode/*.c
This archives all the files in the myCode directory that are suffixed by .c
tar cf myCode.tar myCode/*.c myCode/*.h
This archives all the files in the myCode directory that are suffixed by .c or .h
To view the contents of the myCode.tar file, use:
tar tf myCode.tar
To extract the files in the myCode.tar file, use:
tar xf myCode.tar
If the myCode directory doesn't exist, tar creates it. If the myCode directory does exist, any files in that directory are overwritten by the ones being untarred.
tar xvf myCode.tar
Same as the previous invocation of tar, but this lists the files as they are being extracted.
tar xpf myCode.tar
Same as the previous invocation of tar, but this attempts to set the permissions of the unarchived files to the values they had before archiving (very useful if you're untarring files as root).
While the stock tar that comes with your system works fine for most uses, you may find that the GNU version of tar has some nicer features. You can find the latest version of GNU tar at ftp://ftp.cdrom.com/pub/gnu.
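To tie the create, list, and extract steps together, here is a small sketch that can be run safely in a scratch directory. The myCode tree and its three files are invented for the example, and the archive is unpacked into a separate restore directory so that nothing is overwritten.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Build a small, made-up source tree to archive.
mkdir myCode
echo 'int main(void) { return 0; }' > myCode/main.c
echo '#define ANSWER 42' > myCode/defs.h
echo 'scratch notes' > myCode/notes.txt

# Archive only the .c and .h files, as in the examples above.
tar cf myCode.tar myCode/*.c myCode/*.h

# List the archive without extracting it.
listing=$(tar tf myCode.tar)
echo "$listing"

# Extract into a separate directory so the originals are not overwritten.
mkdir restore
cd restore || exit 1
tar xpf ../myCode.tar
restored=$(ls myCode | tr '\n' ' ')
echo "restored: $restored"
```

Listing with t before extracting with x is a cheap habit that shows exactly which paths the archive will write.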
find Of the commands that I've mentioned so far, you're likely to use find the most. Its purpose is to find files or patterns of files. The parameters for this tool are
find dir parameters
where dir is the directory where the search begins, and parameters define what is being searched for. The most common parameters you will use are:
-name Specify the filename or wildcards to look for. If you use any wildcards, be sure to place them inside of quotes so that the shell doesn't parse them before find does.
-print Typically turned on by default, it tells find to display the resulting file list.
-exec Executes the specified command on files found matching the -name criteria.
-atime n File was last accessed n days ago.
-mtime n File's data was last modified n days ago.
-size n[bckw] File uses n units of space where the units are specified by either b,c,k, or w. b is for 512 byte blocks, c is bytes, k is kilobytes, and w is two-byte words.
find / -name core -mtime +7 -print
This starts its search from the root directory and finds all files named core that have not been modified in seven days.
find / -xdev -atime +60 -a -mtime +60 -print
This searches all files, from the root directory down, on the local file system, which have not been accessed for at least 60 days and have not been modified for at least 60 days, and prints the list. This is useful for finding those files that people claim they need but, in reality, never use.
find /home -size +500k -print
This searches all files from /home down and lists them if they are greater than 500 KB in size. A handy way of finding large files in the system.
The GNU version of find, which comes with the findutils package, offers many additional features you will find useful. You can download the latest version from ftp://ftp.cdrom.com/pub/gnu.
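The -name and -size parameters can be tried without searching a real system. The sketch below builds a throwaway tree (all of the file names are made up) and runs find against it; it assumes a find that accepts the k size suffix described above.

```shell
workdir=$(mktemp -d)

# A made-up tree: two core files, one ordinary file, and one large file.
mkdir -p "$workdir/proj/sub"
echo x > "$workdir/proj/core"
echo x > "$workdir/proj/sub/core"
echo x > "$workdir/proj/readme.txt"
dd if=/dev/zero of="$workdir/proj/big.dat" bs=1024 count=600 2>/dev/null

# -name with a literal name: find every file called core under the tree.
cores=$(find "$workdir" -name core -print | wc -l | tr -d ' ')

# Quote wildcards so that find, not the shell, expands them.
texts=$(find "$workdir" -name '*.txt' -print)

# -size +500k: files larger than 500 KB.
big=$(find "$workdir" -size +500k -print)

echo "core files: $cores"
echo "text files: $texts"
echo "large files: $big"
```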
Repairing File Systems with fsck
Sooner or later, it happens: Someone turns off the power switch. The power outage lasts longer than your UPS's batteries and you didn't shut down the system. Someone presses the reset button. Someone overwrites part of your disk. A critical sector on the disk develops a flaw. If you run UNIX long enough, eventually the system halts before it has written (sync'ed) the remaining cached information to the disks.
When this happens, you need to verify the integrity of each of the file systems. This is necessary because if the structure is not correct, using the file systems could quickly damage them beyond repair. Over the years, UNIX has developed a very sophisticated file system integrity check that can usually recover from the problem. It's called fsck.
The fsck Utility
The fsck utility takes its understanding of the internals of the various UNIX file systems and attempts to verify that all the links and blocks are correctly tied together. It runs in five passes, each of which checks a different part of the linkage and each of which builds on the verifications and corrections of the prior passes.
fsck walks the file system, starting with the superblock. It then deals with the allocated disk blocks, pathnames, directory connectivity, link reference counts, and the free list of blocks and inodes.
The Superblock Every change to the file system affects the superblock, which is why it is cached in RAM. Periodically, at the sync interval, it is written to disk. If it is corrupted, fsck checks and corrects it. If it is so badly corrupted that fsck cannot do its work, find the paper you saved when you built the file system and use the -b option with fsck to give it an alternate superblock to use. The superblock is the head of each of the lists that make up the file system, and it maintains counts of free blocks and inodes.
Inodes fsck validates each of the inodes. It makes sure that each block in the block allocation list is not on the block allocation list in any other inode, that the size is correct, and that the link count is correct. If the inodes are correct, then the data is accessible. All that's left is to verify the pathnames.
What Is a Clean (Stable) File System?
Sometimes fsck responds
/opt: stable          (ufs file systems)
This means that the superblock is marked clean and that no changes have been made to the file system since it was marked clean. First, the system marks the superblock as dirty; then it starts modifying the rest of the file system. When the buffer cache is empty and all pending writes are complete, it goes back and marks the superblock as clean. If the superblock is marked clean, there is normally no reason to run fsck, so unless fsck is told to ignore the clean flag, it just prints this notice and skips over this file system.
Where Is fsck?
When you run fsck, you are running an executable in either the /usr/sbin or /bin directory called fsck, but this is not the real fsck. It is just a dispatcher that invokes a file system type-specific fsck utility.
When Should I Run fsck?
Normally, you do not have to run fsck. The system runs it automatically at boot time when it tries to mount a file system that is dirty. However, problems can creep up on you. Software and hardware glitches do occur from time to time. It wouldn't hurt to run fsck just after performing the monthly backups.
CAUTION: It is better to run fsck after the backups rather than before. If fsck finds major problems, it could leave the file system in worse shape than it was prior to running. Then you can just build an empty file system and reread your backup, which also cleans up the file system. If you did it in the other order, you would be left with no backup and no file system.
Because the system normally runs it for you, running fsck is not an everyday occurrence for you to remember. However, it is quite simple and mostly automatic.
First, to run fsck, the file system you intend to check must not be mounted. This is a bit hard to do if you are in multiuser mode most of the time, so to run a full system fsck you should bring the system down to single user mode.
In single user mode you need to invoke fsck, giving it the options to force a check of all file systems, even if they are already stable.
fsck -f          (SunOS)
fsck -o f        (Solaris)
fsck             (Linux and IRIX)
If you wish to check a single specific file system, type its character device name. (If you aren't sure what the device name is, see the section on adding a disk to the system for details on how to determine this information.) For example:
fsck /dev/hda1
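Because fsck must not be run on a mounted file system, it is worth checking the mount table first. The following is a sketch of such a check, not a standard utility: is_mounted is a hypothetical helper, the sample mount table is made up, and the default of /proc/mounts is a Linux assumption (other systems keep the table elsewhere).

```shell
# is_mounted DEVICE [TABLE]
# Succeeds if DEVICE appears in the first column of TABLE, a file with
# one "device mountpoint type options ..." entry per line.
# The default table, /proc/mounts, is a Linux assumption.
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# A made-up mount table for illustration.
table=$(mktemp)
cat > "$table" <<'EOF'
/dev/hda1 / ext2 rw 0 0
/dev/hda3 /usr ext2 rw 0 0
EOF

for dev in /dev/hda1 /dev/hda5; do
    if is_mounted "$dev" "$table"; then
        echo "$dev is mounted: unmount it (or go to single user mode) before running fsck"
    else
        echo "$dev is not mounted: fsck $dev can proceed"
    fi
done
```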
Stepping Through an Actual fsck fsck occurs in five to seven steps, depending on your operating system and what errors are found, if any. fsck can automatically correct most of these errors and does so if invoked at boot time to automatically check a dirty file system.
The fsck we are about to step through was done on a ufs file system. While there are some differences between the numbering of the phases for different file systems, the errors are mostly the same, requiring the same solutions. Apply common sense liberally to any invocation of fsck and you should be okay.
Checking ufs File Systems For ufs file systems, fsck is a five-phase process. At boot time fsck answers its own questions when it checks a dirty file system; when you run fsck manually, you are asked to answer those questions yourself.
CAUTION: Serious errors reported by the ufs fsck at the very beginning, especially before reporting the start of phase 1, indicate an invalid superblock. fsck should be terminated and restarted with the -b option specifying one of the alternate superblocks. Block 32 is always an alternate and can be tried first, but if the front of the file system was overwritten, it too may be damaged. Use the hard copy you saved from the mkfs to find an alternate later in the file system.
Phase 1: Check Blocks and Sizes This phase checks the inode list, looking for invalid inode entries. Errors requiring answers include
UNKNOWN FILE TYPE I=inode number (CLEAR)
The file type bits are invalid in the inode. Options are to leave the problem and attempt to recover the data by hand later or to erase the entry and its data by clearing the inode.
PARTIALLY TRUNCATED INODE I=inode number (SALVAGE)
The inode appears to point to less data than the file does. This is safely salvaged, because it indicates a crash while truncating the file to shorten it.
block BAD I=inode number
block DUP I=inode number
The disk block pointed to by the inode is either out of range for this inode or already in use by another file. This is an informational message. If a duplicate block is found, phase 1b is run to report the inode number of the file that originally used this block.
Phase 2: Check Pathnames This phase removes directory entries from bad inodes found in phase 1 and 1b and checks for directories with inode pointers that are out of range or pointing to bad inodes. You may have to handle
ROOT INODE NOT DIRECTORY (FIX?)
You can convert inode 2, the root directory, back into a directory, but this usually means there is major damage to the inode table.
I=OUT OF RANGE I=inode number NAME=file name (REMOVE?)
UNALLOCATED I=inode number OWNER=O MODE=M SIZE=S MTIME=T TYPE=F (REMOVE?)
BAD/DUP I=inode number OWNER=O MODE=M SIZE=S MTIME=T TYPE=F (REMOVE?)
A bad inode number was found, an unallocated inode was used in a directory, or an inode that had a bad or duplicate block number in it is referenced. You are given the choice to remove the file, losing the data, or to leave the error. If you leave the error, the file system is still damaged, but you have the chance to try to dump the file first and salvage part of the data before rerunning fsck to remove the entry.
fsck may return one of a variety of errors indicating an invalid directory length. You will be given the chance to have fsck fix or remove the directory as appropriate. These errors are all correctable with little chance of subsequent damage.
Phase 3: Check Connectivity This phase detects errors in unreferenced directories. It creates or expands the lost+found directory if needed and connects the misplaced directory entries into the lost+found directory. fsck prints status messages for all directories placed in lost+found.
Phase 4: Check Reference Counts This phase uses the information from phases 2 and 3 to check for unreferenced files and incorrect link counts on files, directories, or special files.
UNREF FILE I=inode number OWNER=O MODE=M SIZE=S MTIME=T (RECONNECT?)
The filename is not known (it is an unreferenced file), so it is reconnected into the lost+found directory with the inode number as its name. If you clear the file, its contents are lost. Unreferenced files that are empty are cleared automatically.
LINK COUNT FILE I=inode number OWNER=O MODE=M SIZE=S MTIME=T COUNT=X (ADJUST?)
LINK COUNT DIR I=inode number OWNER=O MODE=M SIZE=S MTIME=T COUNT=X (ADJUST?)
In both cases, an entry was found with a different number of references than what was listed in the inode. You should let fsck adjust the count.
BAD/DUP FILE I=inode number OWNER=O MODE=M SIZE=S MTIME=T (CLEAR)
A file or directory has a bad or duplicate block in it. If you clear it now, the data is lost. You can leave the error and attempt to recover the data, and rerun fsck later to clear the file.
Phase 5: Check Cylinder Groups This phase checks the free block and unused inode maps. It automatically corrects the free lists if necessary, although in manual mode it asks permission first.
What Do I Do After fsck Finishes?
First, relax, because fsck rarely finds anything seriously wrong, except in cases of hardware failure where the disk drive is failing or where you copied something on top of the file system. UNIX file systems are very robust.
However, if fsck finds major problems or makes a large number of corrections, rerun it to be sure the disk isn't undergoing hardware failure. It shouldn't find more errors in a second run. Then, recover any files that it may have deleted. If you keep a log of the inodes it clears, you can go to a backup tape and dump the list of inodes on the tape. Recover just those inodes to restore the files.
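If you saved the fsck session to a file (for example, by piping it through tee), the list of cleared inodes can be pulled out mechanically. This is only a sketch: the log lines below are invented to resemble the messages shown earlier, and the exact wording of real fsck output varies from system to system.

```shell
# A made-up excerpt of saved fsck output; real messages vary by system.
log=$(mktemp)
cat > "$log" <<'EOF'
UNREF FILE I=1047 OWNER=jdoe MODE=100644 SIZE=512 MTIME=Jan 3 10:15 1997 (RECONNECT?) yes
BAD/DUP FILE I=2210 OWNER=root MODE=100600 SIZE=8192 MTIME=Jan 2 08:00 1997 (CLEAR) yes
LINK COUNT FILE I=3001 OWNER=jdoe MODE=100644 SIZE=20 MTIME=Jan 1 09:30 1997 COUNT=2 (ADJUST?) yes
BAD/DUP FILE I=2215 OWNER=jdoe MODE=100644 SIZE=4096 MTIME=Jan 2 08:05 1997 (CLEAR) yes
EOF

# Keep only the messages that offered to clear an inode, then pull out
# the number that follows "I=".
cleared=$(grep '(CLEAR)' "$log" | sed 's/.*I=\([0-9][0-9]*\).*/\1/' | sort -n | tr '\n' ' ')
echo "inodes to look for on the backup tape: $cleared"
```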
Back up the system again, because there is no reason to have to do this all over again.
Dealing with What Is in lost+found
If fsck reconnects unreferenced entries, it places them in the lost+found directory. They are safe there, and the system should be backed up in case you lose them while trying to move them back to where they belong. Items in lost+found can be of any type: files, directories, special files (devices), and so on. If it is a named pipe or socket, you may as well delete it. The process which opened it is long since gone and will open a new one when it is run again.
For files, use the owner name to contact the user and have him look at the contents and see if the file is worth keeping. Often, it is a file that was deleted and is no longer needed, but the system crashed before it could be fully removed.
For directories, the files in the directory should help you and the owner determine where they belong. You can look on the backup tape lists for a directory with those contents if necessary. Then just remake the directory and move the files back. Then remove the directory entry in lost+found. This re-creation and move has the added benefit of cleaning up the directory.
Creating File Systems
Now that you understand the nuances of maintaining a file system, it's time to understand how they are created. This section walks you through the three steps of:
Picking the right kind of disk for your system
Creating partitions
Creating the file system
Disk Types
Although there are many different kinds of disks, UNIX systems have come to standardize on SCSI for workstations. Many PCs also sport SCSI interfaces, but because of their lower cost and abundance, you'll find a lot of IDE drives on UNIX PCs as well.
SCSI itself comes in a few different flavors now. There is regular SCSI, SCSI-2, SCSI-Wide, SCSI-Fast and Wide, and now SCSI-3. Although it is possible to mix and match these devices with converter cables, you may find it both easier on your sanity and your performance if you stick to one format. As of this writing, SCSI-2 is the most common interface.
When attaching your SCSI drive, there are many important points to remember.
Terminate your SCSI chain. Forgetting to do this causes all sorts of non-deterministic behavior (a pain to track down). SCSI-2 requires active termination, which is usually indicated by terminators with LEDs on them.
If a device claims to be self-terminating, you can take your chances, but you'll be less likely to encounter an error if you put a terminator on anyway.
There is a limit of eight devices on a SCSI chain with the SCSI card counting as a device. Some systems may have internal SCSI devices, so be sure to check for those.
Be sure all your devices have unique SCSI IDs. A common symptom of having two devices with the same ID is their tendency to frequently reset the SCSI chain. Of course, many devices simply won't work under those conditions.
When adding or removing a SCSI disk, be sure to power the system down first. There is power running through the SCSI cables, and failing to shut them down first may lead to problems in the future.
Although SCSI is king of the workstation, PCs have another choice: IDE. IDE devices tend to be cheaper and more readily available than SCSI devices, and many motherboards offer direct IDE support. They are also simpler and require less configuration on your part.
The down side to IDEs is that their simplicity comes at the cost of configurability and expandability. The IDE chain can only hold two devices, and not all motherboards come with more than one IDE chain. If your CD-ROM is IDE, you only have space for one disk. This is probably okay with a single person workstation, but as you can imagine, it's not going to fly well in a server environment. Another consideration is speed. SCSI was designed with the ability to perform I/O without the aid of the main CPU, which is one of the reasons it costs more. IDE, on the other hand, was designed with cost in mind. This resulted in a simplified controller; hence, the CPU takes the burden for working the drive.
While IDE did manage to simplify the PC arena, it came with the limitation of being unable to handle disks larger than 540 MB. Various tricks were devised to circumvent this; however, a clean solution is now widely available. Known as EIDE (Enhanced IDE), it is capable of supporting disks up to 8 GB and can support up to four devices on one chain.
In weighing the pros and cons of EIDE versus SCSI in the PC environment, don't forget to think about the cost-to-benefit ratio. Having a high speed SCSI controller in a single person's workstation may not be as necessary as the user is convinced it is. Plus, with disks being released in 2+ gigabyte configurations, there is ample room on the typical IDE disk.
Once you have decided on the disk subsystem to install, read the documentation that came with the machine for instructions on physically attaching the disk to the system.
What Are Partitions and Why Do I Need Them?
Partitions are UNIX's way of dividing the disk into usable pieces. UNIX requires that there be at least one partition; however, you'll find that creating multiple partitions, each with a specific function, is often necessary.
The most visible reason for creating separate partitions is to protect the system from the users. The one required partition mentioned earlier is called the root partition. It is here that critical system software and configuration files (the kernel and mount tables) must reside. This partition must be carefully watched so that it never fills up. If it fills up, your system may not be able to come back up in the event of a system crash. Because the root partition is not meant to hold the users' data, you must create separate partitions for the users' home directories, temporary files, and so forth. This enables their files to grow without the worry of crowding out the key system files.
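Watching a partition so that it never fills up is easy to automate. The sketch below filters output in the POSIX df -P column layout and reports any file system at or above a chosen percentage; full_filesystems is a hypothetical helper, and the sample figures are made up.

```shell
# full_filesystems LIMIT: read "df -P" style output on standard input and
# print the mount point and usage of every file system at or above LIMIT percent.
full_filesystems() {
    awk -v limit="$1" 'NR > 1 {
        pct = $5
        sub(/%/, "", pct)
        if (pct + 0 >= limit + 0) print $6, $5
    }'
}

# Made-up sample in the POSIX "df -P" column layout.
sample='Filesystem 1024-blocks Used Available Capacity Mounted on
/dev/hda1 50717 47310 788 99% /
/dev/hda3 495944 201320 269012 43% /usr
/dev/hda5 495912 120004 350296 26% /home'

warnings=$(printf '%s\n' "$sample" | full_filesystems 90)
echo "$warnings"

# On a live system the same filter can read real data:
# df -P | full_filesystems 90
```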
Dual boot configurations are becoming another common reason to partition, especially with the ever-growing popularity of Linux. You may find your users wanting to be able to boot to either Windows or Linux; therefore, you need to keep at least two partitions to enable them to do this.
The last, but certainly not least, reason to partition your disks is the issue of backups. Backup software often works by dumping entire partitions onto tape. By keeping the different types of data on separate partitions, you can be explicit about what gets backed up and what doesn't. For example, daily backup of the system software isn't necessary, but backups of home directories are. By keeping the two on separate partitions, you can be more concise in your selection of what gets backed up and what doesn't.
Another example relates more to company politics. It may be possible that one group does not want their data to be backed up to the same tape as another group's. (Note: common sense doesn't always apply to inter-group politics...) By keeping the two groups on separate partitions, you can exclude one from your normal backups and exclude the others during your special backups.
Which Partitions To Create As I mentioned earlier, the purpose of creating partitions is to separate the users from the system areas. So how many different partitions need to be created? While there is no right answer for every installation, here are some guidelines to take into account.
You always need a root partition. In this partition, you'll have your /bin, /etc, and /sbin directories at the very least. Depending on your version of UNIX, this could require anywhere from 30 to 100 megabytes.
/tmp The /tmp directory is where your users, as well as programs, store temporary files. The usage of this directory can quickly get out of hand, especially if you run a quota-based site. By keeping it a separate partition, you do not need to worry about its abuse interfering with the rest of the system. Many operating systems automatically clear the contents of /tmp on boot. Size /tmp to fit your site's needs. If you use quotas, you will want to make it a little larger, whereas sites without quotas may not need as much space.
Under Solaris, you have another option when setting up /tmp. Using the tmpfs filesystem, you can have your swap space and /tmp partition share the same physical location on disk. While it appears to be an interesting idea, you'll quickly find that it isn't a very good solution, especially on a busy system. This is because as more users do their work, more of /tmp will be used. Of course, if there are more users, there is a greater memory requirement to hold them all. The competition for free space can become very problematic.
/var The /var directory is where the system places its spool files (print spool, incoming/outgoing mail queue, and so on) as well as system log files. Because of this, these files constantly grow and shrink with no warning, especially the mail spool. Another possibility to keep in mind is the creation of a separate partition just for mail. This enables you to export the mail spool to all of your machines without having to worry about your print spools being exported as well. If you use a backup package that requires its own spool space, you may wish to keep this a separate partition as well.
/home The /home directory is where you place your users' account directories. You may need to use multiple partitions to keep your home directories (possibly broken up by department) and have each partition mount to /home/dept where dept is the name of the respective department.
/usr The /usr directory holds noncritical system software, such as editors and lesser used utilities. Many sites hold locally compiled software in the /usr/local directory where they either export it to other machines, or mount other machines' /usr/local to their own. This makes it easy for a site to maintain one /usr/local directory and share it amongst all of its machines. Keeping this a separate partition is a good idea since local software inevitably grows.
swap This isn't a partition you actually keep files on, but it is key to your system's performance. The swap partition should be allocated and swapped to instead of using swap files on your normal file system. This enables you to contain all of your swap space in one area that is out of your way. A good guideline for determining how much swap space to use is to double the amount of RAM installed on your system.
TIP: Several new versions of UNIX are now placing locally compiled software in the /opt directory. Like /usr/local, this should be made a separate partition as well. If your system does not use /opt by default, you should make a symbolic link from there to /usr/local. The reverse is true as well: if your system uses /opt, you should create a symbolic link from /usr/local to /opt.
To add to the confusion, the Red Hat distribution of Linux has introduced the practice of installing precompiled software (RPMs) in the /usr/bin directory. If you are using Red Hat, you may want to make your /usr directory larger since locally installed packages will consume space on that partition.
Most implementations of UNIX automatically create the correct device entry when you boot it with the new drive attached. Once this entry has been created, you should check it for permissions. Only root should be given read/write access to it. If your backups run as a nonroot user, you may need to give group read access to the backup group. Be sure that no one else is in the backup group. Allowing world read/write access to the disk is the easiest way to have your system hacked, destroyed, or both.
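A quick way to check an entry is to look at the last three characters of the mode string that ls -l prints. The sketch below does this on a scratch file that stands in for a device entry such as /dev/hda1; world_access is a hypothetical helper written for this illustration.

```shell
# world_access FILE: print the read and write permission characters that
# apply to "other" users, taken from columns 8 and 9 of the ls -l mode string.
world_access() {
    ls -ld "$1" | cut -c8-9
}

# Demonstrate on a scratch file standing in for a device entry such as /dev/hda1.
fake_dev=$(mktemp)

chmod 640 "$fake_dev"          # owner read/write, group read, nobody else
safe=$(world_access "$fake_dev")

chmod 666 "$fake_dev"          # the dangerous case: world read/write
unsafe=$(world_access "$fake_dev")

for mode in "$safe" "$unsafe"; do
    if [ "$mode" = "--" ]; then
        echo "other users have no read/write access"
    else
        echo "WARNING: other users have access ($mode)"
    fi
done
```

Mode 640 with the group set to your backup group matches the arrangement described above: read/write for root, read-only for the backup group, and nothing for anyone else.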
Device entries under Linux IDE disks under Linux use the following scheme to name the hard disks:
/dev/hd[drive][partition]
Each IDE drive is lettered starting from a. So the primary disk on the first chain is a; the slave on the first chain is b; the primary on the secondary chain is c; and so on. Each disk's partition is referenced by number. For example, the third partition of the slave drive on the first chain is /dev/hdb3.
SCSI disks use the same scheme except instead of using /dev/hd as the prefix, /dev/sd is used. So to refer to the second partition of the first disk on the SCSI chain, you would use /dev/sda2.
To refer to the entire disk, specify all the information except the partition. For example, to refer to the entire primary disk on the first IDE chain, you would use /dev/hda.
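The naming rule can be written down as a few lines of shell. This is only an illustration of the scheme just described; ide_device is a hypothetical helper, not a system command, and it assumes no more than two IDE chains.

```shell
# ide_device CHAIN POSITION [PARTITION]
# CHAIN is 1 or 2, POSITION is "primary" or "slave" (the terms used above).
# Omitting PARTITION names the whole disk.
ide_device() {
    case "$2" in
        primary) offset=1 ;;
        slave)   offset=2 ;;
        *) echo "position must be primary or slave" >&2; return 1 ;;
    esac
    index=$(( ($1 - 1) * 2 + offset ))
    letter=$(echo abcd | cut -c"$index")
    echo "/dev/hd${letter}${3:-}"
}

ide_device 1 slave 3      # third partition, slave drive, first chain
ide_device 2 primary      # whole primary disk on the second chain
ide_device 1 primary 1    # first partition of the first disk
```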
Device entries under IRIX SCSI disks under IRIX are referenced in either the /dev/dsk or /dev/rdsk directories. The following is the format:
/dev/[r]dsk/dksCdSP
where C is the controller number, S is the SCSI address, and P is the partition, s0,s1,s2, and so on. The partition name can also be vh for the volume header or vol to refer to the entire disk.
Device entries under Solaris The SCSI disks under Solaris are referenced in either the /dev/dsk or /dev/rdsk directories. The following is the format:
/dev/[r]dsk/cCtSd0sP
where C is the controller number, S is the SCSI address, and P is the partition number. Partition 2 always refers to the entire disk and label information. Partition 1 is typically used for swap.
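The same kind of sketch works for the Solaris format. solaris_device is a hypothetical helper, and the controller and address values are made up; the optional fourth argument selects the rdsk (raw) directory instead of dsk.

```shell
# solaris_device CONTROLLER TARGET PARTITION [raw]
# Builds a name in the /dev/[r]dsk/cCtSd0sP form described above.
solaris_device() {
    if [ "${4:-}" = "raw" ]; then dir=rdsk; else dir=dsk; fi
    printf '/dev/%s/c%dt%dd0s%d\n' "$dir" "$1" "$2" "$3"
}

solaris_device 0 3 0        # partition 0 on controller 0, SCSI address 3
solaris_device 0 3 2 raw    # partition 2 (the whole disk), through the raw device
```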
Device entries under SunOS Disks under SunOS are referenced in the /dev directory. The following is the format:
/dev/sdTP
where T is the target number and P is the partition. Typically, the root partition is a, the swap partition is b, and the entire disk is referred to as partition c. You can have partitions from a through f.
An important aspect to note is an oddity with the SCSI target and unit numbering: Devices that are target three need to be called target zero, and devices that are target zero need to be called target three.
A Note About Formatting Disks
"Back in the old days," disks needed to be formatted and checked for bad blocks. The procedure of formatting entailed writing the head, track, and sector numbers in a sector preamble and a checksum in the postamble to every sector on the disk. At the same time, any sectors that were unusable due to flaws in the disk surface were marked and, depending on the type of disk, an alternate sector mapped into its place.
Thankfully, we have moved on.
Both SCSI and IDE disks now come pre-formatted from the factory. Even better, they transparently handle bad blocks on the disk and remap them without any assistance from the operating system.
CAUTION: You should NEVER attempt to low level format an IDE disk.
Doing so will make your day very bad as you watch the drive quietly kill itself. Be prepared to throw the disk away should you feel the need to low level format it.
In this section, we cover the step-by-step procedure for partitioning disks under Linux, IRIX, SunOS, and Solaris. Since the principles are similar across all platforms, each platform's example also illustrates a different way of deciding how a disk should be partitioned, depending on its intended usage.
Linux To demonstrate how partitions are created under Linux, we will set up a disk with a single user workstation in mind. It will need space not only for system software but also for application software and the user's home directory.
Creating Partitions For this example, we'll create the partitions on a 1.6 GB IDE disk located on /dev/hda. This disk will become the boot device for a single user workstation. We will create the root, /usr, /var, /tmp, /home, and swap partitions.
During the actual partitioning, we don't name the partitions. Where the partitions are mounted is specified with the /etc/fstab file. Should we choose to mount them in different locations later on, we could very well do that. However, by keeping the function of each partition in mind, we have a better idea of how to size them.
A key thing to remember with the Linux fdisk command is that it does not commit any changes made to the partition table to disk until you explicitly do so with the w command.
With the drive installed, we begin by running the fdisk command:
# fdisk /dev/hda
This brings us to the fdisk command prompt. We start by using the p command to print what partitions are currently on the disk.
Command (m for help): p
Disk /dev/hda: 64 heads, 63 sectors, 786 cylinders
Units = cylinders of 4032 * 512 bytes
Device     Boot  Begin  Start    End   Blocks  Id  System
Command (m for help):
We see that there are no partitions on the disk. With 1.6 GB of space, we can be very liberal with allocating space to each partition. Keeping this policy in mind, we begin creating our partitions with the n command:
Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-786): 1
Last cylinder or +size or +sizeM or +sizeK ([1]-786): +50M
Command (m for help):
The 50 MB partition we just created becomes our root partition. Because it is the first partition, it is referred to as /dev/hda1. Using the p command, we see our new partition:
Command (m for help): p
Disk /dev/hda: 64 heads, 63 sectors, 786 cylinders
Units = cylinders of 4032 * 512 bytes
Device     Boot  Begin  Start    End   Blocks  Id  System
/dev/hda1            1      1     26   52384+  83  Linux native
Command (m for help):
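The End value of 26 follows from the disk geometry that fdisk prints. As a rough sketch, assuming fdisk rounds a +sizeM request up to a whole number of cylinders (an assumption on my part, though it agrees with the listing), the arithmetic looks like this; cylinders_for is a hypothetical helper.

```shell
heads=64
sectors=63
sector_bytes=512
bytes_per_cyl=$(( heads * sectors * sector_bytes ))   # 4032 * 512

# cylinders_for MB: whole cylinders needed to hold MB megabytes,
# rounding up (an assumption about how fdisk treats +sizeM).
cylinders_for() {
    bytes=$(( $1 * 1024 * 1024 ))
    echo $(( (bytes + bytes_per_cyl - 1) / bytes_per_cyl ))
}

first=1
count=$(cylinders_for 50)
last=$(( first + count - 1 ))
echo "bytes per cylinder: $bytes_per_cyl"
echo "+50M starting at cylinder $first needs $count cylinders, ending at $last"
```

Each cylinder holds about 2 MB on this disk, so partition sizes can only be set in steps of roughly that size.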
With the root partition out of the way, we will create the swap partition. Our sample machine has 32 MB of RAM and will be running X-Windows along with a host of development tools. It is unlikely that the machine will get a memory upgrade for a while, so we'll allocate 64 MB to swap.
Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 2
First cylinder (27-786): 27
Last cylinder or +size or +sizeM or +sizeK ([27]-786): +64M
Command (m for help):
Because this partition is going to be tagged as swap, we need to change its file system type to swap using the t command.
Command (m for help): t
Partition number (1-4): 2
Hex code (type L to list codes): 82
Changed system type of partition 2 to 82 (Linux swap)
Command (m for help):
Because of the nature of the user, we know that there will be a lot of local software installed on this machine. With that in mind, we'll create /usr with 500 MB of space.
Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 3
First cylinder (60-786): 60
Last cylinder or +size or +sizeM or +sizeK ([60]-786): +500M
If you've been keeping your eyes open, you've noticed that we can only have one more primary partition to use, but we want to have /home, /var, and /tmp to be in separate partitions. How do we do this?
Extended partitions.
The remainder of the disk is created as an extended partition. Within this partition, we can create more partitions for use. Let's create this extended partition:
Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
e
Partition number (1-4): 4
First cylinder (314-786): 314
Last cylinder or +size or +sizeM or +sizeK ([314]-786): 786
Command (m for help):
We can now create /home inside the extended partition. Our user is going to need a lot of space, so we'll create a 500 MB partition. Notice that we are no longer asked whether we want a primary or extended partition.
Command (m for help): n
First cylinder (314-786): 314
Last cylinder or +size or +sizeM or +sizeK ([314]-786): +500M
Command (m for help):
Using the same pattern, we create a 250 MB /tmp and a 180 MB /var partition.
Command (m for help): n
First cylinder (568-786): 568
Last cylinder or +size or +sizeM or +sizeK ([568]-786): +250M
Command (m for help): n
First cylinder (695-786): 695
Last cylinder or +size or +sizeM or +sizeK ([695]-786): 786
Command (m for help):
Notice that for the last partition we did not specify a size, but instead specified the last cylinder. This ensures that all of the disk is used.
Using the p command, we look at our final work:
Command (m for help): p
Disk /dev/hda: 64 heads, 63 sectors, 786 cylinders
Units = cylinders of 4032 * 512 bytes
Device     Boot  Begin  Start  End  Blocks   Id  System
/dev/hda1            1      1   26   52384+  83  Linux native
/dev/hda2           27     27   59   66528   82  Linux swap
/dev/hda3           60     60  313  512064   83  Linux native
/dev/hda4          314    314  786  953568    5  Extended
/dev/hda5          314    314  567  512032+  83  Linux native
/dev/hda6          568    568  694  256000+  83  Linux native
/dev/hda7          695    695  786  185440+  83  Linux native
Command (m for help):
Everything looks good. To commit this configuration to disk, we use the w command:
Command (m for help): w
The partition table has been altered!
Calling ioctl() to re-read partition table.
(Reboot to ensure the partition table has been updated.)
Syncing disks.
Reboot the machine to ensure that the partition table has been updated, and you're done creating the partitions.
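The cylinder ranges fdisk reported for each +sizeM request can be reproduced with a little arithmetic. The following is a minimal sketch, not a description of fdisk's internals: it assumes each request is rounded up to a whole number of cylinders, which happens to match every range in this example. The names cylinders_for and layout are hypothetical helpers.

```python
import math

# Geometry reported by fdisk for /dev/hda in this example.
HEADS = 64
SECTORS_PER_TRACK = 63
SECTOR_BYTES = 512
CYLINDER_BYTES = HEADS * SECTORS_PER_TRACK * SECTOR_BYTES  # 4032 * 512


def cylinders_for(size_mb):
    """Whole cylinders needed for size_mb megabytes (assumes rounding up)."""
    return math.ceil(size_mb * 1024 * 1024 / CYLINDER_BYTES)


def layout(first_cylinder, sizes_mb):
    """Return (start, end) cylinder pairs for consecutive +sizeM requests."""
    ranges = []
    start = first_cylinder
    for size in sizes_mb:
        end = start + cylinders_for(size) - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


# Root, swap, and /usr as primary partitions; /home and /tmp as logical ones.
primary = layout(1, [50, 64, 500])
logical = layout(314, [500, 250])
print(primary)
print(logical)
```

The last partition (/var) is not computed here because it was given an explicit end cylinder of 786 rather than a size.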
Creating File Systems in Linux
Creating a partition alone isn't very useful. In order to make it useful, we need to make a file system on top of it. Under Linux, this is done using the mke2fs command and the mkswap command.
To create the file system on the root partition, we use the following command:
mke2fs /dev/hda1
The program only takes a few seconds to run and generates output similar to this:
Writing superblocks and file system accounting information: done
You should make a note of these superblock backups and keep them in a safe place. Should the day arise that you need to use fsck to fix a superblock gone bad, you will want to know where the backups are.
Simply do this for all of the partitions, except for the swap partition.
To create the swap file system, you need to use the mkswap command like this:
mkswap /dev/hda2
Replace /dev/hda2 with the partition you chose to make your swap space.
The result of the command will be similar to:
Setting up swapspace, size = 35090432 bytes
And the swap space is ready.
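Deciding which command goes with which partition can be read straight off the Id column of the fdisk listing. The sketch below only builds the command strings and does not run them. The PARTITIONS list is copied from this example, and format_commands is a hypothetical helper. Note that the extended partition (Id 5) is only a container for the logical partitions and gets no file system of its own.

```python
# (device, hex Id) pairs copied from the fdisk listing in this example.
PARTITIONS = [
    ("/dev/hda1", "83"),
    ("/dev/hda2", "82"),
    ("/dev/hda3", "83"),
    ("/dev/hda4", "5"),
    ("/dev/hda5", "83"),
    ("/dev/hda6", "83"),
    ("/dev/hda7", "83"),
]


def format_commands(partitions):
    """Build one command per partition that needs a file system or swap."""
    commands = []
    for device, part_id in partitions:
        if part_id == "83":      # Linux native: ext2 file system
            commands.append("mke2fs " + device)
        elif part_id == "82":    # Linux swap
            commands.append("mkswap " + device)
        # Anything else (here the extended container, Id 5) is skipped.
    return commands


for command in format_commands(PARTITIONS):
    print(command)
```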
To make the root file system bootable, you need to install the lilo boot manager. This is part of all the standard Linux distributions, so you shouldn't need to hunt for it on the Internet.
Simply modify the /etc/lilo.conf file so that /dev/hda1 is set to be the boot disk and run:
lilo
The resulting output should look something like:
Added linux *
where linux is the name of the kernel to boot, as specified by the label= field in /etc/lilo.conf.
SunOS
In this example, we will be preparing a Seagate ST32550N as an auxiliary disk to an existing system. The disk will be divided into three partitions: one for use as a mail spool, one for use as /usr/local, and the third as an additional swap partition.
Once a disk has been attached to the machine, you should verify its connection and SCSI address by running the probe-scsi command from the PROM monitor if the disk is attached to the internal chain, or the probe-scsi-all command to see all the SCSI devices on the system. When you are sure the drive is properly attached and verified to be functioning, you're ready to start accessing the drive from the OS.
After the machine has booted, run the dmesg command to collect the system diagnostic messages. You may want to pipe the output to grep so that you can easily find the information on disks. For example:
dmesg | grep sd
On our system this generated the following output:
This result tells us that we have an installed disk on sd0 that the system is aware of and using. The information from the sd1 device is telling us that it found a disk, but it isn't usable because of a corrupt label. Don't worry about the error. Until we partition the disk and create file systems on it, the system doesn't know what to do with it, hence the error.
If you are using SCSI address 0 or 3, remember the oddity we mentioned earlier where device 0 needs to be referenced as 3 and device 3 needs to be referenced as 0.
Even though we do not have to actually format the disk, we do need to use the format program that comes with SunOS because it also creates the partitions and writes the label to the disk.
To invoke the format program, simply run:
format sd1
where sd1 is the name of the disk we are going to partition.
The format program displays the following menu:
FORMAT MENU:
        disk       - select a disk
        type       - select (define) a disk type
        partition  - select (define) a partition table
        current    - describe the current disk
        format     - format and analyze the disk
        repair     - repair a defective sector
        show       - translate a disk address
        label      - write label to the disk
        analyze    - surface analysis
        defect     - defect list management
        backup     - search for backup labels
        quit
format>
We need to enter type at the format> prompt so that we can tell SunOS the kind of disk we have. The resulting menu looks something like:
AVAILABLE DRIVE TYPES:
0. Quantum ProDrive 80S
1. Quantum ProDrive 105S
2. CDC Wren IV 94171-344
3. SUN0104
...
13. other
Specify disk type (enter its number):
Because we are adding a disk this machine has not seen before, we need to select option 13, other. This begins a series of prompts requesting the disk's geometry. Be sure to have this information from the manufacturer before starting this procedure.
The first question, Enter number of data cylinders: is actually a three-part question. After you enter the number of data cylinders, the program asks for the number of alternate cylinders and then the number of physical cylinders. The number of physical cylinders is the number your manufacturer provided you. Subtract two from that to get the number of data cylinders, and then just use the default value of 2 for the number of alternate cylinders. For our Seagate disk, we answered the questions as follows:
Enter number of data cylinders: 3508
Enter number of alternate cylinders [2]: 2
Enter number of physical cylinders [3510]: 3510
Enter number of heads: 11
Enter number of data sectors/track: 108
Enter rpm of drive [3600]:
Enter disk type name (remember quotes): "SEAGATE ST32550N"
selecting sd1:
[disk formatted, no defect list found]
No defined partition tables.
Note that even though our sample drive actually rotates at 7200 rpm, we stick with the default of 3600 rpm because the software will not accept entering a higher speed. Thankfully, this doesn't matter because the operating system doesn't use the information.
Even though format reported that the disk was formatted, it really wasn't. It only acquired information needed to later write the label.
Now we are ready to begin preparations to partition the disk.
These preparations entail computing the amount each cylinder holds and then approximating the number of cylinders we want in each partition.
With our sample disk, we know that each cylinder is composed of 108 sectors on a track, with 11 tracks composing the cylinder.
From the information we saw in dmesg, we know that each block is 512 bytes long. Hence, if we want our mail partition to be 1 GB in size, we perform the following math to compute the necessary blocks:
108 sectors x 11 heads = 1188 blocks per cylinder
1188 blocks x 512 bytes = 608256 bytes per cylinder
1073741824 bytes / 608256 bytes per cylinder = 1765.28 cylinders
Obviously, there are some rounding errors since the exact 1 GB mark occurs in the middle of a cylinder and we need to keep each partition on a cylinder boundary. 1,765 cylinders is more than close enough. The 1,765 cylinders translates to 2,096,820 blocks.
The new swap partition we want to make needs to be 64 MB in size. Using the same math as before, we find that our swap needs to be 130,680 blocks long. The last partition on the disk needs to fill the remainder of the disk. Knowing that we have a 2 GB disk, a 1 GB mail spool, and a 64 MB swap partition, this should leave us with about 960 MB for /usr/local.
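The cylinder and block figures for the mail and swap partitions can be checked with a short calculation. This is a minimal sketch using the geometry entered for the sample disk. The helper name partition_size is hypothetical, and it assumes we round down to the nearest cylinder boundary, as the text does.

```python
# Geometry entered for the sample Seagate disk.
SECTORS_PER_TRACK = 108
HEADS = 11
BLOCK_BYTES = 512

BLOCKS_PER_CYLINDER = SECTORS_PER_TRACK * HEADS
BYTES_PER_CYLINDER = BLOCKS_PER_CYLINDER * BLOCK_BYTES


def partition_size(target_bytes):
    """Return (cylinders, blocks), rounding down to a cylinder boundary."""
    cylinders = target_bytes // BYTES_PER_CYLINDER
    return cylinders, cylinders * BLOCKS_PER_CYLINDER


MB = 1024 * 1024
mail = partition_size(1024 * MB)   # 1 GB mail spool
swap = partition_size(64 * MB)     # 64 MB swap

print(mail)   # (1765, 2096820)
print(swap)   # (110, 130680)
```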
Armed with this information, we are ready to tackle the partitioning. From the format> prompt, type partition to start the partitioning menu. The resulting screen looks something like this:
format> partition
PARTITION MENU:
        a      - change 'a' partition
        b      - change 'b' partition
        c      - change 'c' partition
        d      - change 'd' partition
        e      - change 'e' partition
        f      - change 'f' partition
        g      - change 'g' partition
        h      - change 'h' partition
        select - select a predefined table
        name   - name the current table
        print  - display the current table
        label  - write partition map and label to the disk
        quit
partition>
To create our mail partition, we begin by changing partition a. At the partition> prompt, type a.
partition> a
This brings up a prompt for entering the starting cylinder and the number of blocks to allocate. Because this is going to be the first partition on the disk, we start at cylinder 0. Based on the math we did earlier, we know that we need 2,096,820 blocks.
partition a - starting cyl      0, # blocks       0 (0/0/0)
Enter new starting cyl [0]: 0
Enter new # blocks [0, 0/0/0]: 2096820
partition>
Now we want to create the b partition, which is traditionally used for swap space. We know how many blocks to use based on our calculations, but we don't know which cylinder to start from.
To solve this, we simply display the current partition information for the entire disk using the p command:
partition> p
Current partition table (unnamed):
partition a - starting cyl      0, # blocks 2096820 (1765/0/0)
partition b - starting cyl      0, # blocks       0 (0/0/0)
partition c - starting cyl      0, # blocks       0 (0/0/0)
partition d - starting cyl      0, # blocks       0 (0/0/0)
partition e - starting cyl      0, # blocks       0 (0/0/0)
partition f - starting cyl      0, # blocks       0 (0/0/0)
partition g - starting cyl      0, # blocks       0 (0/0/0)
partition h - starting cyl      0, # blocks       0 (0/0/0)
partition>
We can see that partition a is allocated with 2,096,820 blocks and is 1,765 cylinders long. Because we don't want to waste space on the disk, we start the swap partition on cylinder 1765.
(Remember to count from zero!)
partition> b
partition b - starting cyl      0, # blocks       0 (0/0/0)
Enter new starting cyl [0]: 1765
Enter new # blocks [0, 0/0/0]: 130680
partition>
Before we create our last partition, we need to take care of some tradition first, namely partition c. This is usually the partition that spans the entire disk. Before creating this partition, we need to do a little math.
108 sectors x 11 heads x 3508 data cylinders = 4167504 blocks
Notice that the number of blocks we compute here does not match the number actually on the disk. This number was computed based on the information we entered when giving the disk type information.
It is important that we remain consistent.
Since the c partition spans the entire disk, we specify the starting cylinder as 0. Creating this partition should look something like this:
partition> c
partition c - starting cyl      0, # blocks       0 (0/0/0)
Enter new starting cyl [0]: 0
Enter new # blocks [0, 0/0/0]: 4167504
partition>
We have only one partition left to create: /usr/local. Because we want to fill the remainder of the disk, we need to do one last bit of math to compute how many blocks are still free.
This is done by taking the size of partition c (the total disk) and subtracting the sizes of the existing partitions. For our example, this works out to be:
4167504 - 2096820 - 130680 = 1940004 blocks
Now we need to find out which cylinder to start from.
To do so, we run the p command again:
partition> p
Current partition table (unnamed):
partition a - starting cyl      0, # blocks 2096820 (1765/0/0)
partition b - starting cyl   1765, # blocks  130680 (110/0/0)
partition c - starting cyl      0, # blocks 4167504 (3508/0/0)
partition d - starting cyl      0, # blocks       0 (0/0/0)
partition e - starting cyl      0, # blocks       0 (0/0/0)
partition f - starting cyl      0, # blocks       0 (0/0/0)
partition g - starting cyl      0, # blocks       0 (0/0/0)
partition h - starting cyl      0, # blocks       0 (0/0/0)
partition>
To figure out which cylinder to start from, we add the number of cylinders used so far. Remember not to add the cylinders from partition c since it encompasses the entire disk.
1765 + 110 = 1875
Now that we know which cylinder to start from and how many blocks to make it, we create our last partition.
partition> d
partition d - starting cyl      0, # blocks       0 (0/0/0)
Enter new starting cyl [0]: 1875
Enter new # blocks [0, 0/0/0]: 1940004
partition>
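The bookkeeping behind the starting cylinders and the size of the last partition can be sketched in a few lines. This is only an illustration of the arithmetic used in this example. The variable names are hypothetical, and the block counts are the ones entered at the partition> prompt.

```python
# Geometry entered for the sample disk.
BLOCKS_PER_CYLINDER = 108 * 11   # sectors per track x heads
DATA_CYLINDERS = 3508

# Partition c traditionally spans the whole disk.
whole_disk_blocks = BLOCKS_PER_CYLINDER * DATA_CYLINDERS

# Block counts already assigned, in disk order.
allocated = [("a", 2096820), ("b", 130680)]

# Each partition starts where the previous one ended (counting from zero).
next_cylinder = 0
for name, blocks in allocated:
    print(name, "starts at cylinder", next_cylinder)
    next_cylinder += blocks // BLOCKS_PER_CYLINDER

# Whatever is left over goes to the last partition.
free_blocks = whole_disk_blocks - sum(blocks for _, blocks in allocated)
print("d starts at cylinder", next_cylinder, "with", free_blocks, "blocks")
```

Partition c is deliberately left out of the allocated list, since it overlaps every other partition.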
Congratulations! You've made it through the ugly part. Before we can truly claim victory, we need to commit these changes to disk using the label command. When given the prompt, Ready to label disk, continue? simply answer y.
partition> label
Ready to label disk, continue? y
partition>
To leave the format program, type quit at the partition> prompt, and then quit again at the format> prompt.
Creating File Systems
Now comes the easy part. Simply run the newfs command on all the partitions we created except for the swap partition and the entire disk partition. Your output should look similar to this:
# newfs sd1a
/dev/rsd1a:     2096820 sectors in 1765 cylinders of 11 tracks, 108 sectors
1073.6MB in 111 cyl groups (16 c/g, 9.73MB/g, 4480 i/g)
Be sure to note the superblock backups. This is critical information when fsck discovers heavy corruption in your file system. Remember to add your new entries into /etc/fstab if you want them to automatically mount on boot.
If you created the first partition with the intention of making it bootable, you have a few more steps to go. First, mount the new file system to /mnt.
# mount /dev/sd1a /mnt
Once the file system is mounted, you need to clone your existing boot partition using the dump command like this:
# cd /mnt
# dump 0f - / | restore -rf -
With the root partition cloned, use the installboot command to make it bootable:
Be sure to test your work by rebooting and making sure everything mounts correctly. If you created a bootable partition, be sure you can boot from it now. Don't wait for a disaster to find out whether or not you did it right.
Solaris
For this example, we are partitioning a disk that is destined to be a web server for an intranet. We need a minimal root partition, adequate swap, tmp, var, and usr space, and a really large partition, which we'll call /web. Because the web logs will remain on the /web partition, and there will be little or no user activity on the machine, /var and /tmp will be set to smaller values. /usr will be a little larger because it may be destined to house web development tools.
TIP: In another wondrous effort on its part to be just a little different, Sun has decided to call partitions "slices." With the number of documents regarding the file system so vast, you'll find that not all of them have been updated to use this new term, so don't be confused by the mix of "slices" with "partitions"--they are both the same.
Once a disk has been attached to the machine, you should verify its connection and SCSI address by running the probe-scsi command from the PROM monitor if the disk is attached to the internal SCSI chain, or the probe-scsi-all command to list all the SCSI devices on the system. Once this shows that the drive is properly attached and functioning, you're ready to start accessing the drive from the OS. Boot the machine and log in as root.
To find the device name to use, we again run the dmesg command.
From this message, we see that our new disk is device /dev/[r]dsk/c0t1d0s2. The disk hasn't been set up for use on a Solaris machine before, which is why we received the corrupt label error.
If you recall the layout of Solaris device names, you'll remember that the last digit on the device name is the partition number. Noting that, we see that Solaris refers to the entire disk as partition 2, much the same way SunOS refers to the entire disk as partition c.
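To make the naming concrete, here is a small sketch that pulls a Solaris device name apart. It assumes the usual reading of the cXtYdZsN form (controller, SCSI target, disk, slice). The function name parse_device is hypothetical.

```python
import re

# cXtYdZsN: controller, SCSI target, disk (LUN), slice.
DEVICE_NAME = re.compile(r"c(\d+)t(\d+)d(\d+)s(\d+)$")


def parse_device(path):
    """Split a /dev/dsk or /dev/rdsk path into its numbered components."""
    match = DEVICE_NAME.search(path)
    if match is None:
        raise ValueError("not a cXtYdZsN device name: " + path)
    controller, target, disk, slice_number = (int(g) for g in match.groups())
    return {
        "controller": controller,
        "target": target,
        "disk": disk,
        "slice": slice_number,
    }


print(parse_device("/dev/rdsk/c0t1d0s2"))
```

For the sample disk this reports slice 2, the slice that covers the entire disk.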
Before we can actually label and partition the disk, we need to create the device files. This is done with the drvconfig and disks commands. They should be invoked with no parameters:
# drvconfig ; disks
Now that the kernel is aware of the disk, we are ready to run the format command to partition the disk.
# format /dev/rdsk/c0t1d0s2
This brings up the format menu as follows:
FORMAT MENU:
        disk       - select a disk
        type       - select (define) a disk type
        partition  - select (define) a partition table
        current    - describe the current disk
        format     - format and analyze the disk
        repair     - repair a defective sector
        label      - write label to the disk
        analyze    - surface analysis
        defect     - defect list management
        backup     - search for backup labels
        verify     - read and display labels
        save       - save new disk/partition definitions
        inquiry    - show vendor, product and revision
        volname    - set 8-character volume name
        quit
format>
To help the format command with partitioning, we need to tell it the disk's geometry by invoking the type command at the format> prompt. We will then be asked to select what kind of disk we have. Because this is the first time this system is seeing this disk, we need to select other. This should look something like this:
format> type
AVAILABLE DRIVE TYPES:
0. Auto configure
1. Quantum ProDrive 80S
2. Quantum ProDrive 105S
3. CDC Wren IV 94171-344
. . .
16. other
Specify disk type (enter its number): 16
The system now prompts for the number of data cylinders. This is two less than the number of cylinders the vendor specifies because Solaris needs two cylinders for bad block mapping.
Enter number of data cylinders: 3508
Enter number of alternate cylinders[2]: 2
Enter number of physical cylinders[3510]: 3510
The next question can be answered from the vendor specs as well.
Enter number of heads: 11
The followup question about drive heads can be left as default.
Enter physical number of heads[default]:
The next question you must answer can be pulled from the vendor specs as well.
Enter number of data sectors/track: 108
The remaining questions should be left as default.
Enter number of physical sectors/track[default]:
Enter rpm of drive[3600]:
Enter format time[default]:
Enter cylinder skew[default]:
Enter track skew[default]:
Enter tracks per zone[default]:
Enter alternate tracks[default]:
Enter alternate sectors[default]:
Enter cache control[default]:
Enter prefetch threshold[default]:
Enter minimum prefetch[default]:
Enter maximum prefetch[default]:
The last question you must answer about the disk is its label information. Enter the vendor name and model number in double quotes for this question. For our sample disk, this would be:
Enter disk type name (remember quotes): "SEAGATE ST32550N"
With this information, Solaris makes creating partitions easy. Dare I say, fun?
After the last question from the type command, you will be placed at the format> prompt. Enter partition to start the partition menu.
format> partition
PARTITION MENU:
        0      - change '0' partition
        1      - change '1' partition
        2      - change '2' partition
        3      - change '3' partition
        4      - change '4' partition
        5      - change '5' partition
        6      - change '6' partition
        7      - change '7' partition
        select - select a predefined table
        modify - modify a predefined partition table
        name   - name the current table
        print  - display the current table
        label  - write partition map and label to the disk
        quit
partition>
At the partition> prompt, enter modify to begin creating the new partitions. This brings up a question about what template to use for partitioning. We want the All Free Hog method.
partition> modify
Select partitioning base:
0. Current partition table (unnamed)
1. All Free Hog
Choose base (enter number)[0]? 1
The All Free Hog method enables you to select one partition to receive the remainder of the disk once you have allocated a specific amount of space for the other partitions. For our example, the disk hog would be the /web partition because you want it to be as large as possible.
As soon as you select option 1, you should see the following screen:
Part  Tag         Flag  Cylinders    Size      Blocks
0     root        wm    0            0         (0/0/0)
1     swap        wu    0            0         (0/0/0)
2     backup      wu    0 - 3507     1.99GB    (3508/0/0)
3     unassigned  wm    0            0         (0/0/0)
4     unassigned  wm    0            0         (0/0/0)
5     unassigned  wm    0            0         (0/0/0)
6     usr         wm    0            0         (0/0/0)
7     unassigned  wm    0            0         (0/0/0)
Do you wish to continue creating a new partition
table based on above table [yes]? yes
Because the partition table appears reasonable, agree to use it as a base for your scheme. You will now be asked which partition should be the Free Hog Partition, the one that receives whatever is left of the disk when everything else has been allocated.
For our scheme, we'll make that partition number 5.
Free Hog Partition[6]? 5
Answering this question starts the list of questions asking how large to make the other partitions. For our web server, we need a root partition to be about 200 MB for the system software, a swap partition to be 64 MB, a /tmp partition to be 200 MB, a /var partition to be 200 MB, and a /usr partition to be 400 MB. Keeping in mind that partition 2 has already been tagged as the "entire disk" and that partition 5 will receive the remainder of the disk, you will be prompted as follows:
Enter size of partition '0' [0b, 0c, 0.00mb]: 200mb
Enter size of partition '1' [0b, 0c, 0.00mb]: 64mb
Enter size of partition '3' [0b, 0c, 0.00mb]: 200mb
Enter size of partition '4' [0b, 0c, 0.00mb]: 200mb
Enter size of partition '6' [0b, 0c, 0.00mb]: 400mb
Enter size of partition '7' [0b, 0c, 0.00mb]: 0
As soon as you finish answering these questions, the final view of all the partitions appears looking something like:
Part  Tag         Flag  Cylinders    Size      Blocks
0     root        wm    0 - 344      200.13mb  (345/0/0)
1     swap        wu    345 - 455    64.39mb   (111/0/0)
2     backup      wu    0 - 3507     1.99GB    (3508/0/0)
3     unassigned  wm    456 - 800    200.13mb  (345/0/0)
4     unassigned  wm    801 - 1145   200.13mb  (345/0/0)
5     unassigned  wm    1146 - 2817  969.89mb  (1672/0/0)
6     unassigned  wm    2818 - 3507  400.25mb  (690/0/0)
7     unassigned  wm    0            0         (0/0/0)
This is followed by the question:
Okay to make this the correct partition table [yes]? yes
Answer yes since the table appears reasonable. This brings up the question:
Enter table name (remember quotes): "SEAGATE ST32550N"
Answer with a description of the disk you are using for this example. Remember to include the quote symbols when answering. Given all of this information, the system is ready to commit this to disk. As one last check, you will be asked:
Ready to label disk, continue? y
As you might imagine, we answer yes to the question and let it commit the changes to disk. You have now created partitions and can quit the program by entering quit at the partition> prompt and again at the format> prompt.
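The All Free Hog arithmetic can be sketched as follows. This is an illustration under stated assumptions, not the format program's actual algorithm: it assumes the 11-head, 108-sector geometry used for this drive in the SunOS example (the geometry that reproduces the sizes printed in the final table), and it assumes each request is rounded up to a whole cylinder. The function name free_hog_layout is hypothetical.

```python
import math

# Assumed geometry: heads x sectors per track x bytes per sector.
BYTES_PER_CYLINDER = 11 * 108 * 512
DATA_CYLINDERS = 3508
MB = 1024 * 1024


def free_hog_layout(requests_mb, hog):
    """requests_mb maps partition number -> requested MB; hog gets the rest.

    Returns partition number -> (first cylinder, last cylinder, count).
    """
    cylinders = {part: math.ceil(mb * MB / BYTES_PER_CYLINDER)
                 for part, mb in requests_mb.items()}
    # The free hog partition receives every cylinder not yet claimed.
    cylinders[hog] = DATA_CYLINDERS - sum(cylinders.values())

    layout = {}
    start = 0
    for part in sorted(cylinders):
        count = cylinders[part]
        layout[part] = (start, start + count - 1, count)
        start += count
    return layout


# Sizes entered at the prompts; partition 2 (whole disk) is not allocated.
layout = free_hog_layout({0: 200, 1: 64, 3: 200, 4: 200, 6: 400}, hog=5)
for part, (first, last, count) in layout.items():
    print(part, first, "-", last, "(%d cylinders)" % count)
```

Because every request is rounded up to a cylinder boundary, the sizes reported in the table (200.13mb, 64.39mb, and so on) come out slightly larger than the values typed in.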
Creating file systems
To create a file system, simply run:
# newfs /dev/rdsk/c0t1d0s0
where /dev/rdsk/c0t1d0s0 is the partition on which to create the file system. Be sure to create a file system on all the partitions except for partitions 1 and 2, the swap and entire disk, respectively. Be sure to note the backup superblocks that were created. This information is very useful when fsck is attempting to repair a heavily damaged file system.
After you create the file systems, be sure to enter them into the /etc/vfstab file so that they are mounted the next time you reboot.
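A vfstab entry has seven whitespace-separated fields: device to mount, device to fsck, mount point, file system type, fsck pass, mount at boot, and mount options. The sketch below builds such a line for the /web slice of this example. The helper name vfstab_entry is hypothetical, and the fsck pass of 2 and the "-" for no options are illustrative defaults that you should adjust to your own system.

```python
def vfstab_entry(slice_name, mount_point, fsck_pass=2,
                 mount_at_boot="yes", options="-"):
    """Return one /etc/vfstab line for a ufs file system on the given slice."""
    fields = [
        "/dev/dsk/" + slice_name,    # device to mount (block device)
        "/dev/rdsk/" + slice_name,   # device to fsck (raw device)
        mount_point,
        "ufs",
        str(fsck_pass),
        mount_at_boot,
        options,
    ]
    return "\t".join(fields)


# Slice 5 is the free hog partition that we intend to mount as /web.
print(vfstab_entry("c0t1d0s5", "/web"))
```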
If you need to make the root partition bootable, you still have two more steps. The first is to clone the root partition from your existing system to the new root partition using:
# mount /dev/dsk/c0t1d0s0 /mnt
# cd /mnt
# ufsdump 0uf - / | ufsrestore -rf -
Once the root partition is cloned, you can run the installboot program like this:
Be sure to test your new file systems before you need to rely on them in a disaster situation.
IRIX
For this example, we are creating a large scratch partition for a user who does modeling and simulations. Although IRIX has many GUI-based tools to perform these tasks, it is always a good idea to learn the command line versions just in case you need to do any kind of remote administration.
Creating partitions
Once the drive is attached, run a program called hinv to take a "hardware inventory." On the sample system, we saw the following output:
...
Integral SCSI controller 1: Version WD33C93B, revision D
Disk drive: unit 6 on SCSI controller 1
Integral SCSI controller 0: Version WD33C93B, revision D
Disk drive: unit 1 on SCSI controller 0
...
Our new disk is external to the system, so we know it is residing on controller 1. Unit 6 is the only disk on that chain, so we know that it is the disk we just added to the system.
To partition the disk, run the fx command without any parameters. It prompts us for the device name, controller, and drive number. Choose the default device name and enter the appropriate information for the other two questions.
On our sample system, this would look like:
# fx
fx version 6.2, Mar 9, 1996
fx: "device-name" = (dksc)
fx: ctlr# = (0) 1
fx: drive# = (1) 6
fx: lun# = (0)
...opening dksc(1,6,0)
...controller test...OK
Scsi drive type == SEAGATE ST32550N0022
----- please choose one (? for help, .. to quit this menu)-----
[exi]t               [d]ebug/             [l]abel/
[b]adblock/          [exe]rcise/          [r]epartition/
fx>
We see that fx found our Seagate and is ready to work with it. From the menu we select r to repartition the disk. fx displays what it knows about the disk and then presents another menu specifically for partitioning the disk.
fx> r
----- partitions-----
part  type        cyls            blocks           Megabytes   (base+size)
  7: xfs        3 + 3521      3570 + 4189990        2 + 2046
  8: volhdr     0 + 3            0 + 3570           0 + 2
 10: volume     0 + 3524         0 + 4193560        0 + 2048
capacity is 4194058 blocks
----- please choose one (? for help, .. to quit this menu)-----
[ro]otdrive          [u]srrootdrive       [o]ptiondrive        [re]size
fx/repartition>
Looking at the result, we see that this disk has never been partitioned in IRIX before. Part 7 represents the amount of partitionable space, part 8 the volume header, and part 10 the entire disk.
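As a sanity check on the listing, the three entries should fit together: the data partition should start where the volume header ends and finish where the whole-volume partition finishes. The sketch below assumes each column is read as a base + size pair (for example, part 7 as cylinders 3 + 3521 and part 8 as 3 cylinders holding 3570 blocks). The names PARTS and end_cylinder are hypothetical.

```python
# (type, base cylinder, size in cylinders), as read from the fx listing.
PARTS = {
    7: ("xfs", 3, 3521),
    8: ("volhdr", 0, 3),
    10: ("volume", 0, 3524),
}

# Part 8 lists 3 cylinders as 3570 blocks, so one cylinder is 1190 blocks.
BLOCKS_PER_CYLINDER = 3570 // 3


def end_cylinder(part):
    """First cylinder past the end of the given partition."""
    _, base, size = PARTS[part]
    return base + size


# The data partition starts where the volume header ends...
header_then_data = end_cylinder(8) == PARTS[7][1]
# ...and ends where the whole-disk partition ends.
data_fills_disk = end_cylinder(7) == end_cylinder(10)

print(header_then_data, data_fills_disk)
print(PARTS[7][2] * BLOCKS_PER_CYLINDER)   # block count of part 7
```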
Because this disk is going to be used as a large scratch partition, we want to select the optiondrive option from the menu. After you select that, you are asked what kind of file system you want to use. IRIX 6 and above defaults to xfs, while IRIX 5 defaults to efs. Use the one appropriate for your version of IRIX.
Our sample system is running IRIX 6.3, so we accept the default of xfs:
fx/repartition> o
fx/repartition/optiondrive: type of data partition = (xfs)
Next we are asked whether we want to create a /usr log partition. Because our primary system already has a /usr partition, we don't need one here. Type no.
fx/repartition/optiondrive: create usr log partition? = (yes) no
The system is ready to partition the drive. Before it does, it gives one last warning allowing you to stop the partitioning before it completes the job. Because you know you are partitioning the correct disk, you can give it "the go-ahead":
Warning: you must reinstall all software and restore user data from backups after changing the partition layout. Changing partitions causes all data on the drive to be lost. Be sure you have the drive backed up if it contains any user data. Continue? y
The system takes a few seconds to create the new partitions on the disk. Once it is done, it reports what the current partition list looks like.
----- partitions-----
part  type        cyls            blocks           Megabytes   (base+size)
  7: xfs        3 + 3521      3570 + 4189990        2 + 2046
  8: volhdr     0 + 3            0 + 3570           0 + 2
 10: volume     0 + 3524         0 + 4193560        0 + 2048
capacity is 4194058 blocks
----- please choose one (? for help, .. to quit this menu)-----
[ro]otdrive          [u]srrootdrive       [o]ptiondrive        [re]size
fx/repartition>
Looks good. We can exit fx now by typing .. at the fx/repartition> prompt and exit at the fx> prompt.
Our one large scratch partition is now called /dev/dsk/dks1d6s7.
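The device name follows directly from the answers given to fx: controller 1, drive 6, and partition 7. A minimal sketch of that naming pattern, with irix_disk_device as a hypothetical helper name:

```python
def irix_disk_device(controller, unit, partition):
    """Build the block device path for a SCSI disk partition (dks naming)."""
    return "/dev/dsk/dks%dd%ds%d" % (controller, unit, partition)


# Controller 1, SCSI unit 6, partition 7 (the xfs data partition).
print(irix_disk_device(1, 6, 7))   # /dev/dsk/dks1d6s7
```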
Creating the file system
To create the file system, we use the mkfs command like this:
Remember to add this entry into the /etc/fstab file so that the system automatically mounts it the next time you reboot.
Summary
As you've seen in this chapter, creating, maintaining, and repairing file systems is not a trivial task. It is, however, a task that should be well understood. An unmaintained file system can quickly lead to trouble, and without a stable file system the remainder of the system is useless.
Let's make a quick rundown of the topics we covered:
Disks are broken into partitions (sometimes called slices).
Each partition has a file system.
A file system is the primary means of file storage in UNIX.
File systems are made of inodes and superblocks.
Some partitions are used for raw data such as swap.
The /proc file system really isn't a file system, but an abstraction to kernel data.
An inode maintains critical file information.
Superblocks track disk information as well as the location of the heads of various inode lists.
In order for you to use a file system, it must be mounted.
A file system can be unmounted only when no one is accessing it.
File systems can be mounted anywhere in the directory tree.
/etc/fstab (vfstab in Solaris) is used by the system to automatically mount file systems on boot.
The root file system should be kept away from users.
The root file system should never get filled.
Be sure to watch how much space is being used.
fsck is the tool to use to repair file systems.
Don't forget to terminate your SCSI chain!
In short, file systems administration is not a trivial task and should not be taken lightly. Good maintenance techniques not only help maintain your uptime, but your sanity as well.