
MODULE II

Data types - Specification of data types, Implementation of elementary data types, Declarations, Type checking and type conversion - Assignment and initialization - Structured data types - Specification of data structure types, Implementation of data structure types - Declarations and type checking for data structures.

2.1 DATA TYPES


A data type is a class of data objects together with a set of operations for creating and manipulating them. A program deals with particular data objects, such as an array A, an integer variable X, or a file F. A programming language, by contrast, deals more commonly with data types, such as the class of arrays, integers, or files, and with the operations provided for manipulating them. Every language has a set of primitive data types that are built into the language. In addition, a language may provide facilities that allow the programmer to define new data types.

The basic elements of a specification of a data type are:
1. The attributes that distinguish data objects of that type,
2. The values that data objects of that type may have, and
3. The operations that define the possible manipulations of data objects of that type.

The basic elements of the implementation of a data type are:
1. The storage representation used to represent data objects of that type in the storage of the computer during program execution, and
2. The manner in which the operations defined for the data type are represented in terms of algorithms or procedures that manipulate the chosen storage representation.

SPECIFICATION OF ELEMENTARY DATA TYPES

An elementary data object contains a single data value. A class of such data objects, over which various operations are defined, is termed an elementary data type. Its specification includes:
- Attributes
- Values
- Operations

Attributes
Attributes distinguish data objects of a given type. The basic attributes, data type and name, are invariant during the lifetime of the object.
Approaches:
- The attributes are stored in a descriptor and used during program execution.
- The attributes are used only to determine the storage representation and are not used explicitly during execution.

Values
The data type determines the values that a data object of that type may have.
Specification: usually an ordered set, i.e. it has a least and a greatest value.
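As a small illustration in C (the language is an assumption here, chosen because the notes use a C struct later on), the standard header <limits.h> exposes the least and greatest values of the built-in integer types. The helper names below are hypothetical.

```c
#include <limits.h>
#include <stdbool.h>

/* The values of type int form an ordered set bounded by INT_MIN and INT_MAX. */
static int least_int(void)    { return INT_MIN; }
static int greatest_int(void) { return INT_MAX; }

/* Hypothetical helper: does a wider value fall inside the value set of int? */
static bool in_int_range(long long v)
{
    return v >= INT_MIN && v <= INT_MAX;
}
```

The actual bounds are implementation-defined, which is why the sketch reads them from the header instead of hard-coding numbers.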

Operations
Operations define the possible manipulations of data objects of that type.
- Primitive: specified as part of the language definition.
- Programmer-defined: written as subprograms or class methods.
Operation signature: specifies the domain and the range of the operation, i.e.
- the number, order and data types of the arguments in the domain, and
- the number, order and data type of the resulting range.

Mathematical notation for the specification:
op_name: arg_type x arg_type x ... x arg_type --> result_type

The action is specified in the operation implementation.
Sources of ambiguity in operations:
- Undefined operations for certain inputs.
- Implicit arguments, e.g. use of global variables.
- Implicit results: the operation may modify its arguments.
- Self-modification: usually through change of local data between calls, e.g. random number generators change the seed.
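The four sources of ambiguity listed above can be sketched in C. All function names are hypothetical, and the random number generator is a toy linear congruential generator chosen only to show state that changes between calls.

```c
#include <limits.h>
#include <stdbool.h>

/* Undefined for some inputs: n / d has no result when d == 0, and
   INT_MIN / -1 overflows. This sketch reports failure in both cases. */
static bool safe_div(int n, int d, int *out)
{
    if (d == 0 || (n == INT_MIN && d == -1)) return false;
    *out = n / d;
    return true;
}

/* Implicit argument: the result depends on a global the signature hides. */
static int scale = 10;
static int scaled(int x) { return x * scale; }

/* Implicit result: the operation modifies its argument through a pointer. */
static int increment(int *x) { *x += 1; return *x; }

/* Self-modification: local state persists between calls, so two identical
   calls return different results. */
static unsigned next_random(void)
{
    static unsigned seed = 1u;
    seed = seed * 1103515245u + 12345u;   /* unsigned arithmetic wraps */
    return seed;
}
```

In each case the signature alone (for example scaled: int --> int) does not tell the whole story of what the operation reads or changes.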

IMPLEMENTATION OF ELEMENTARY DATA TYPES

The implementation of an elementary data type includes:
- Storage representation
- Implementation of operations

Storage representation
Influenced by the hardware.
Described in terms of:
- Size of the memory block required
- Layout of attributes and data values within the block

Implementation of operations

- Hardware operation: direct implementation, e.g. integer addition.
- Subprogram/function: e.g. a square root operation.
- In-line code: instead of using a subprogram, the code is copied into the program at the point where the subprogram would have been invoked.
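A minimal C sketch of the three options follows. The function names are hypothetical, and the integer square root is a deliberately simple software routine, not a recommended algorithm.

```c
/* Hardware operation: integer addition usually maps to one machine instruction. */
static int add(int a, int b) { return a + b; }

/* Subprogram: square root computed in software (integer part only).
   The comparison uses division so that (r + 1) * (r + 1) cannot overflow. */
static unsigned isqrt(unsigned n)
{
    unsigned r = 0;
    while (r + 1 <= n / (r + 1)) r++;
    return r;
}

/* In-line code: "static inline" allows the compiler to copy the body into
   the call site instead of emitting a call. */
static inline int square(int x) { return x * x; }
```

Whether the compiler actually inlines square is its own decision; the keyword only makes that choice available.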

DECLARATIONS
Declarations provide information about the name and type of the data objects needed during program execution.

- Explicit: programmer-defined.
- Implicit: system-defined.
Examples:
FORTRAN - the first letter in the name of the variable determines the type.
Perl - the variable is declared by assigning a value:
$abc = 'a string'    $abc is a string variable
$abc = 7             $abc is an integer variable

Declaration of Operations
Prototypes of the functions or subroutines that are programmer-defined.

Examples:
declaration: float Sub(int, float)
signature: Sub: int x float --> float
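In C, the prototype above lets the compiler check calls before it has seen the body. The sketch below assumes a body for Sub purely for illustration (the notes give only the prototype), and twice_sub is a hypothetical caller.

```c
/* Declaration (prototype): states the signature Sub: int x float --> float. */
float Sub(int, float);

/* Hypothetical caller, compiled against the prototype alone. */
static float twice_sub(int n, float x) { return 2.0f * Sub(n, x); }

/* Definition: this body is an assumption made for the example. */
float Sub(int n, float x) { return (float)n - x; }
```

A call such as Sub(1) with too few arguments would be rejected at compile time because of the declaration.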

Purpose of Declaration
- Choice of storage representation
- Storage management
- Polymorphic operations
- Static type checking

TYPE CHECKING AND TYPE CONVERSION

Type checking: checking that each operation executed by a program receives the proper number of arguments of the proper data types.
- Static type checking is done at compilation.
- Dynamic type checking is done at run time.
Strong typing: all type errors can be detected statically.
Type inference: data types are left implicit and inferred by the translator; used when the interpretation is unambiguous.
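C itself checks types statically, so dynamic type checking has to be simulated. The sketch below, with hypothetical names throughout, attaches a type tag (a minimal descriptor) to each value and checks it when the operation runs.

```c
#include <stdbool.h>

/* Each value carries a type tag that is inspected at run time. */
typedef enum { T_INT, T_REAL } Tag;

typedef struct {
    Tag tag;
    union { int i; double r; } as;
} Value;

static Value make_int(int i)     { Value v; v.tag = T_INT;  v.as.i = i; return v; }
static Value make_real(double r) { Value v; v.tag = T_REAL; v.as.r = r; return v; }

/* Dynamic type checking: integer addition accepts only T_INT operands.
   Returning false stands for a run-time type error. */
static bool add_ints(Value a, Value b, Value *out)
{
    if (a.tag != T_INT || b.tag != T_INT) return false;
    *out = make_int(a.as.i + b.as.i);
    return true;
}
```

By contrast, passing a Value where a plain int parameter is declared would be rejected by the compiler, which is static type checking.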

Type Conversion and Coercion
- Coercion: implicit type conversion, performed by the system.
- Explicit conversion: routines to change from one data type to another.
  Pascal: the function round converts a real value into an integer.
  C: a cast, e.g. (int)X for float X, converts the value of X to type int.

Coercion:
Two opposite approaches:
- No coercions: any type mismatch is considered an error (Pascal, Ada).
- Coercions are the rule: an error is reported only if no conversion is possible.
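The difference between coercion and explicit conversion can be seen in a few lines of C. The function names are hypothetical; round_to_int only imitates the spirit of Pascal's round and is valid only when the result fits in an int.

```c
/* Coercion: the int operand is implicitly converted to double before the addition. */
static double coerce_add(int i, double d) { return i + d; }

/* Explicit conversion: the cast discards the fractional part (truncates toward zero). */
static int truncate_to_int(float x) { return (int)x; }

/* Explicit conversion with rounding to the nearest integer (sketch). */
static int round_to_int(float x)
{
    return (int)(x >= 0 ? x + 0.5f : x - 0.5f);
}
```

Note that the C cast truncates while Pascal's round rounds, so the two explicit conversions named in the notes do not give the same result for a value such as 3.9.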

ASSIGNMENT AND INITIALIZATION

Assignment:
The basic operation for changing the binding of a value to a data object. The assignment operation can be defined using the concepts of L-value and R-value:
- L-value: the location of an object.
- R-value: the contents of that location.
"Value", by itself, generally means R-value.

Example: A = A + B;
Pick up contents of location A:   R-value of A
Add contents of location B:       R-value of B
Store result into address A:      L-value of A
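The same three steps can be written out in C with explicit locations. The function name assign_sum is hypothetical; pointers stand in for the L-values.

```c
/* A = A + B spelled out with explicit locations. */
static void assign_sum(int *a_loc, const int *b_loc)
{
    int a_val = *a_loc;      /* R-value of A: contents of location A */
    int b_val = *b_loc;      /* R-value of B: contents of location B */
    *a_loc = a_val + b_val;  /* L-value of A: the location that receives the result */
}
```

Only A needs an L-value here; B is used purely for its contents, which is why its parameter can be a pointer to const.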

Initialization:
Uninitialized data object: a data object has been created but no value has been assigned, i.e. only the allocation of a block of storage has been performed.
Initialization can be done in two ways: implicitly or explicitly.
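C happens to show both kinds of initialization, which makes it a convenient illustration (the choice of language and all names below are assumptions). Objects with static storage duration are implicitly set to zero, while local variables receive no initial value unless the programmer writes one.

```c
/* Implicit initialization: a static object starts at zero in C. */
static int counter;

/* Explicit initialization: the programmer supplies the initial value. */
static int limit = 100;

static int read_counter(void) { return counter; }
static int read_limit(void)   { return limit; }

/* A local (automatic) variable has no implicit initial value in C,
   so it is initialized explicitly before use. */
static int sum_to(int n)
{
    int total = 0;
    for (int i = 1; i <= n; i++) total += i;
    return total;
}
```

Omitting the "= 0" on total would leave an uninitialized data object in the sense described above: storage is allocated, but its contents are indeterminate.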

STRUCTURED DATA TYPES


A data structure is a data object that contains other data objects as its elements or components.

Specifications
- Number of components
    Fixed size: arrays.
    Variable size: stacks, lists. Pointers are used to link components.
- Type of each component
    Homogeneous: all components are of the same type.
    Heterogeneous: components are of different types.
- Selection mechanism to identify components: index, pointer.
    Two-step process: referencing the structure, then selecting a particular component.
- Maximum number of components
- Organization of the components
    Simple linear sequence.
    Multidimensional structures: separate types (Fortran), vector of vectors (C++).

Operations on data structures
- Component selection operations: sequential, random
- Insertion/deletion of components
- Whole-data structure operations
- Creation/destruction of data structures

Implementation of data structure types


Storage representation
Includes:
a. storage for the components
b. an optional descriptor, to contain some or all of the attributes

Sequential representation: the data structure is stored in a single contiguous block of storage that includes both descriptor and components. Used for fixed-size structures and homogeneous structures (arrays, character strings).
Linked representation: the data structure is stored in several noncontiguous blocks of storage, linked together through pointers. Used for variable-size structures (trees, lists).
Stacks, queues and lists can be represented in either way. Linked representation is more flexible and ensures true variable size; however, it has to be software simulated.

Implementation of operations on data structures
Component selection in sequential representation: base address plus offset calculation. Add the component size to the current location to move to the next component.
Component selection in linked representation: move from address location to address location, following the chain of pointers.
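Both selection mechanisms can be sketched in C. The names element_address, Node and nth_node are hypothetical; the first function writes out the base-plus-offset arithmetic that the expression base[i] performs implicitly.

```c
#include <stddef.h>

/* Sequential representation: element i lies at base + i * component size. */
static int *element_address(int *base, size_t i)
{
    return (int *)((char *)base + i * sizeof(int));
}

/* Linked representation: components live in separate blocks joined by pointers. */
typedef struct Node {
    int value;
    struct Node *next;
} Node;

/* Selection follows the chain of pointers; returns NULL if the list is too short. */
static const Node *nth_node(const Node *head, size_t i)
{
    while (head != NULL && i > 0) {
        head = head->next;
        i--;
    }
    return head;
}
```

The sequential version reaches any element with one multiplication and one addition, while the linked version needs i pointer steps to reach element i.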

Storage management
Access paths to a structured data object ensure access to the object for processing. An access path is created using a name or a pointer.
Two central problems:
- Garbage: the data object is still bound (allocated), but its access path has been destroyed, so the memory cannot be unbound.
- Dangling references: the data object has been destroyed, but the access path still exists.
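In C, where storage is managed by hand, both problems come from letting the object and its access path get out of step. The sketch below uses hypothetical helper names and shows one defensive convention, not the only one: destroy the object and clear the pointer in the same operation.

```c
#include <stdlib.h>

/* Create a data object on the heap; the returned pointer is its access path. */
static int *create_cell(int value)
{
    int *p = malloc(sizeof *p);
    if (p != NULL) *p = value;
    return p;
}

/* Destroy the object and clear the access path in one step.
   - Freeing without clearing would leave *path as a dangling reference.
   - Clearing without freeing would turn the object into garbage. */
static void destroy_cell(int **path)
{
    free(*path);
    *path = NULL;
}
```

This convention protects only the one pointer passed in; any other copy of the pointer would still dangle after the call.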

Declarations and type checking for data structures


What is to be checked:
- Existence of a selected component
- Type of a selected component

Vectors and arrays


- A vector: a one-dimensional array.
- A matrix: a two-dimensional array.
- Multidimensional arrays.
- A slice: a substructure in an array that is also an array, e.g. a column in a matrix.

Implementation of array operations:
a. Access: can be implemented efficiently if the length of the components of the array is known at compilation time. The address of each selected element can be computed using an arithmetic expression.
b. Whole-array operations, e.g. copying an array: may require much memory.

Associative arrays
Instead of using an integer index, elements are selected by a key value that is part of the element. Usually the elements are sorted by the key, and binary search is performed to find an element in the array.

Records

A record is a data structure composed of a fixed number of components of different types. The components may be heterogeneous, and they are named with symbolic names.

Specification of attributes of a record:
- Number of components
- Data type of each component
- Selector used to name each component

Implementation:
Storage: a single sequential block of memory in which the components are stored sequentially.
Selection: provided the type of each component is known, the location can be computed at translation time.

Note on efficiency of storage representation: for some data types, storage must begin on specific memory boundaries (required by the hardware organization). For example, integers must be allocated at word boundaries (e.g. addresses that are multiples of 4). When the structure of a record is designed, this fact has to be taken into consideration; otherwise the actual memory needed might be more than the sum of the lengths of the components in the record. Here is an example:

struct employee { char Division; int IdNumber; };
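The layout of this record can be inspected with the standard offsetof macro and sizeof. This is a sketch with hypothetical helper names; the actual numbers are implementation-defined, although on a typical machine with a 4-byte int one would expect IdNumber at offset 4 and three bytes of padding.

```c
#include <stddef.h>

struct employee {
    char Division;
    int  IdNumber;
};

/* Offset of each component, as computed by the compiler at translation time. */
static size_t division_offset(void) { return offsetof(struct employee, Division); }
static size_t id_offset(void)       { return offsetof(struct employee, IdNumber); }

/* Bytes of the record not occupied by any component (alignment padding). */
static size_t padding_bytes(void)
{
    return sizeof(struct employee) - sizeof(char) - sizeof(int);
}
```

Printing these three values on a given compiler shows directly how much storage the alignment requirement costs.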

The first component occupies only one byte. The next three bytes remain unused, and the second component is then allocated at a word boundary. Careless design may result in doubling the memory requirements.

Other structured data objects
Records and arrays with structured components: a record may have a component that is an array, and an array may be built out of components that are records.
Lists and sets: lists are usually considered to represent an ordered sequence of elements, sets an unordered collection of elements.

Executable data objects

In most languages, programs and data objects are separate structures (Ada, C, C++). Other languages, however, do not distinguish between programs and data, e.g. PROLOG. In such languages, data structures are considered to be a special type of program statement, and all are treated in the same way.
