
File Management

Files
Contents

Overview of Unix File System


Boot Block
Super Block
Inode Table
Data Block
Data structures used by Kernel to represent an open file
Links
Hard Link
Soft Link/Symbolic Link
System calls related to files: open, creat, read, write, lseek, dup, stat, fstat

The contents here are for Aricent internal training purposes only and do not carry any commercial value
File System in UNIX

Everything in UNIX is a File.


The file system is a collection of files and directories on a disk in standard UNIX file
system format.
Each UNIX file system contains four major parts:
Boot Block
Super block
Inode Table
Data Blocks

Boot Block

A boot block may span several physical blocks (512 bytes each).


A boot block contains a short loader program for booting (Bootstrap Loader)
and other initialization programs.

Super Block

Super block contains the key information of the file system.


Super block information:
Size of a file system and status
Label
Size
Date
Information of i-nodes
Number of i-nodes
Number of free i-nodes
Information of data blocks

INODE

An inode contains the following information about a file:


Mode
Type: file, directory, pipe, link
Access: read, write, execute
Owner: who owns the inode (file, directory, ...)
Timestamps: last access, last modification, and last inode (status) change
Size: the number of bytes
Block count: the number of data blocks
Direct blocks: pointers to the data blocks
Single indirect: pointer to a block that holds pointers to data blocks (128 data
blocks, with block size = 512 bytes and block address = 4 bytes)
Double indirect: 128 * 128 = 16384 data blocks
Triple indirect: 128 * 128 * 128 = 2097152 data blocks

Data Blocks

A data block is typically 512 bytes (the size depends on the UNIX implementation).

A data block may contain the data of a file or the data of a directory.

File is a stream of bytes.

File Addresses in an Inode

The inode of a file contains pointers to the disk blocks that hold the
file's data. In the layout described here (used, for example, by ext2) there are 15:
The first 12 point to direct blocks.
The next three point to indirect blocks.
The first indirect pointer is the address of a single indirect block,
an index block containing the addresses of blocks that do contain data.
The second is a double indirect pointer, the address of a block that contains the
addresses of blocks that contain pointers to the actual data blocks.
The third is a triple indirect pointer, which adds one more level of index blocks.
The classic System V inode has 13 such pointers instead: 10 direct and three indirect.

Direct and Indirect Blocks in an Inode

[Figure: an inode with ten direct pointers (direct0 to direct9) that lead straight to data blocks, followed by a single indirect, a double indirect, and a triple indirect pointer that reach data blocks through index blocks.]

File Addresses in an INODE

File size: with a block size of 1K and block addresses that take three bytes in the
disk-resident inode and four bytes in memory, one block holds 1024 / 4 = 256 block addresses:
10 direct block addresses: 10 * 1K = 10K
single indirect block: (1024/4) * 1K = 256K
double indirect block: (1024/4)^2 * 1K = 64M
triple indirect block: (1024/4)^3 * 1K = 16G

Data structures for open files

How does the kernel represent open files?


Three tables are maintained by the kernel:
Process Table
Open File Table
Inode Table

Process Table

Each process has an entry in the process table. Within each process table entry
is a table of open file descriptors. The information associated with each file
descriptor is as follows:
The file descriptor flags
A pointer to a file table entry

By convention, file descriptors 0, 1, and 2 are already open when a process
starts, and they represent the following:
Entry 0: standard input
Entry 1: standard output
Entry 2: standard error

File Table

The kernel maintains a file table for all open files. Each file table entry
contains:
The file status flags for the file
The current file offset
A pointer to the i-node table entry for the file

I-node Table

Each open file has an inode that records the type of the file and the other
attributes of the file.
Mapping of a File Descriptor to an Inode
System calls that refer to open files identify the file by passing a file descriptor as an
argument.
The file descriptor is used by the kernel to index a table of open files for the current
process.
Each entry of the table contains a pointer to a file structure.
This file structure in turn points to the inode.

Kernel data structures for open files

[Figure: a single process with two different files open, one on standard input (0) and one on
standard output (1).]
Links

Hard Links
An entry in a directory consisting of a name and an i-number is called a hard link.
A hard link is simply another name (an alias) for an existing file.
All hard links for a given inode refer to the same file, so any change made through one
name is visible through every other name.
To create a hard link:
ln file hardlink
For example: ln abcfile abclink
abclink is a hard link to abcfile.
The contents of abclink are the same as those of abcfile.
Removing abcfile does not free the inode, because abclink still refers to it. The inode
is freed only when its link count drops to zero and no process has the file open.

Links

Soft Link/Symbolic Link


A symbolic link is a separate file, with its own inode, that refers to the original file by name.
A symbolic link is a small file containing the text of a path to the object you want
to link to.
A symbolic link can reference another symbolic link; the kernel keeps
dereferencing symbolic links (up to a system-defined limit) until it finds something that isn't a symbolic link.
To create a symbolic link:
ln -s filename softlink
ln -s abcfile abclink
To remove a symbolic link:
unlink softlink
unlink abclink

stat/fstat

stat/fstat: A process may query the status of a file: its type, owner,
access permissions, size, number of links, inode number, and access times.
Syntax:
stat(pathname, &statbuffer);
fstat(fd, &statbuffer);

File Information

stat("/usr/stud/gb/d.h", &buf), where buf is a struct stat


The stat system call gives the following information:
dev_t st_dev; /* ID of device containing a directory entry for this file */
ino_t st_ino; /* inode number */
mode_t st_mode; /* file type and protection (mode) */
nlink_t st_nlink; /* number of hard links */
uid_t st_uid; /* user ID of owner */
gid_t st_gid; /* group ID of owner */
dev_t st_rdev; /* device ID (if the file is a device special file) */
off_t st_size; /* total size, in bytes */
blksize_t st_blksize; /* preferred block size for filesystem I/O */
blkcnt_t st_blocks; /* number of blocks allocated */
time_t st_atime; /* time of last access */
time_t st_mtime; /* time of last modification */
time_t st_ctime; /* time of last status (inode) change */

dup Functions

An existing file descriptor is duplicated by either of the following functions:


#include <unistd.h>
int dup(int filedes);
int dup2(int filedes, int filedes2);
Both return: new file descriptor if OK, -1 on error

The new file descriptor returned by dup is guaranteed to be the lowest-numbered available file
descriptor. With dup2, we specify the value of the new descriptor with the filedes2 argument. If
filedes2 is already open, it is first closed. If filedes equals filedes2, then dup2 returns filedes2
without closing it.
The new file descriptor returned by either function shares the same file table
entry as the filedes argument.

Kernel data structures after dup

[Figure: kernel data structures after the process executes newfd = dup(1); newfd and descriptor 1 point to the same file table entry.]

File Handling System calls

open()
creat()
read()
write()
lseek()
close()
dup()
stat()/fstat()

File Handling System Calls

A file is opened or created by calling the open function.


open(pathname, flag, mode)
Opens the specified file according to the value of flag and returns a file descriptor for
use in other system calls. The mode argument is used only when the file is created (O_CREAT).
read(fd, buf, count)
Reads up to count bytes from file descriptor fd into the user array buf and returns the
number of bytes read (0 at end of file, -1 on error).
write(fd, buf, count)
Writes up to count bytes of data from the user buffer buf to the file whose
descriptor is fd and returns the number of bytes written.
close(fd)
Closes the file descriptor fd obtained from a prior open, creat, dup, pipe, or fcntl
system call, or a file descriptor inherited across a fork call.

File Handling System calls

A file is created by calling the creat system call.


creat(filename, mode);
where filename is a pointer to a null-terminated character string that names the file
and mode specifies the file access permissions. The new file is opened for writing only.
lseek(filedes, offset, whence);
filedes identifies the open file, and offset and whence work together to
describe how to change the file offset.
whence is one of:
SEEK_SET = 0 offset is measured from the beginning of the file
SEEK_CUR = 1 offset is measured from the current offset of the file
SEEK_END = 2 offset is measured from the end of the file

dup(filedes);
Duplicates the existing file descriptor. The new file descriptor is guaranteed to be
the lowest-numbered available descriptor.

An Example of creat function

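/* Program to demonstrate how to create a file */

#include <stdio.h>
#include <stdlib.h>     /* exit */
#include <fcntl.h>      /* creat */
#include <unistd.h>     /* close */
#include <sys/types.h>
#include <sys/stat.h>
int main(void)
{
    int fd;

    /* creat() opens the new file for writing only;
       S_IRWXU gives the owner read, write, and execute permission */
    fd = creat("datafile.dat", S_IRWXU);
    if (fd == -1)
    {
        perror("datafile.dat");
        exit(1);
    }
    printf("datafile.dat created, opened for writing, and currently empty\n");
    close(fd);
    exit(0);
}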
An Example of lseek and open function

/* An example to demonstrate lseek() */


#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>     /* lseek, close */
#include <sys/types.h>  /* off_t */
int main(void)
{
    const char *path = "datafile.dat";
    int fd;
    off_t position;

    fd = open(path, O_RDONLY);
    if (fd != -1)
    {
        position = lseek(fd, 0L, SEEK_END); /* seek 0 bytes from end-of-file */
        if (position != -1)
            printf("The length of %s is %ld bytes.\n", path, (long) position);
        else
            perror("lseek error");
        close(fd); /* close only a descriptor that was actually opened */
    }
    else
        printf("can't open %s\n", path);
    return 0;
}
An Example of dup function

/* Program to demonstrate (dup) redirection of standard output to a file. */


#include <stdio.h>
#include <stdlib.h>     /* exit */
#include <unistd.h>     /* close, dup */
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
int main(void)
{
    int fd;

    fd = open("dup.txt", O_WRONLY | O_CREAT, S_IRUSR | S_IWUSR);
    if (fd == -1)
    {
        perror("dup.txt");
        exit(1);
    }
    close(1); /* close standard output */
    /* 1 is now the lowest free descriptor (assuming 0 is open),
       so fd is duplicated into standard output's slot */
    if (dup(fd) != 1)
        exit(1);
    close(fd); /* close the extra slot */
    printf("Hello, world!\n"); /* goes to file dup.txt */
    exit(0); /* exit() flushes stdio buffers and closes the files */
}
An Example of open, read, write, and lseek system calls

/* Example to demonstrate the open, write, lseek, and read system calls */


#include <stdio.h>
#include <stdlib.h>     /* exit */
#include <unistd.h>     /* read, write, lseek, close */
#include <sys/types.h>  /* defines types used by sys/stat.h */
#include <sys/stat.h>
#include <fcntl.h>
static char message[] = "Good morning";
int main(void)
{
    int fd;
    char buffer[80];

    /* open foo.bar for read/write access (O_RDWR)
       create foo.bar if it does not exist (O_CREAT)
       truncate foo.bar to zero length if it already exists (O_TRUNC)
       permit the owner read/write access to the file (S_IRUSR | S_IWUSR) */

    fd = open("foo.bar", O_RDWR | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR);

    if (fd != -1)
    {
        printf("\nfoo.bar opened for read/write access\n");
        if (write(fd, message, sizeof(message)) != (ssize_t) sizeof(message))
            perror("write");
        lseek(fd, 0L, SEEK_SET); /* go to beginning of file */

        /* read the data back into buffer, not over the original message;
           sizeof(message) includes the terminating '\0' */
        if (read(fd, buffer, sizeof(message)) == (ssize_t) sizeof(message))
            printf("\n%s was written to foo.bar\n", buffer);
        else
            printf("\nerror reading foo.bar\n");
        close(fd);
    }
    else
        perror("foo.bar"); /* with O_TRUNC, an existing file is not an error */
    exit(0);
}

An Example of fstat function

#include <sys/types.h>
#include <sys/stat.h>
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>     /* close */

/* Print selected stat fields for an open descriptor. The fields are cast
   to long because their exact types vary between systems. */
static void print_stat(const char *name, int fd)
{
    struct stat bufStat;

    if (fstat(fd, &bufStat) == -1)
    {
        perror(name);
        return;
    }
    printf("%s: fd=%d, inode no=%ld block size=%ld blocks=%ld total size=%ld device type=%ld\n",
           name, fd, (long) bufStat.st_ino, (long) bufStat.st_blksize,
           (long) bufStat.st_blocks, (long) bufStat.st_size, (long) bufStat.st_rdev);
    close(fd);
}

int main(void)
{
    int fd1, fd2;

    fd1 = open("/etc/passwd", O_RDONLY);
    fd2 = open("filefstat.c", O_RDONLY);
    if (fd1 != -1)
        print_stat("/etc/passwd", fd1);
    else
        perror("/etc/passwd");
    if (fd2 != -1)
        print_stat("filefstat.c", fd2);
    else
        perror("filefstat.c");
    return 0;
}

Disclaimer

Aricent makes no representations or warranties with respect to the contents of
these slides and the same are being provided "as is". The content/materials in
the slides are of a general nature and are not intended to address the specific
circumstances of any particular individual or an institution; any materials not
specifically acknowledged is purely unintentional
