
Multimedia

Perception, representation, presentation, storage, transmission

Media can be classified by the aspect of information they concern:

Perception media: how information is perceived by humans, e.g., what they see on monitors or paper and hear from speakers.
Representation media: how information is represented internally by the computer.
Presentation media: the means used to reproduce information to humans or to input it from them; paper and monitors are examples.
Storage media: the physical means used for storing information.
Transmission media: the physical means, such as cables or wireless links, used to transport information.
Information exchange media: all storage and transmission means used to exchange information.

A presentation space is where information is reproduced for humans; it has one or more dimensions (monitors, for example, have two). Presentation values represent the information within that space, and presentation dimensions determine how values are arranged in it.

Key properties of multimedia

Discrete & continuous media


Independent media, computer-controlled systems, and integration

A multimedia system is characterized by the computer-controlled, integrated processing of independent media:
The media used should be independent of one another.
The system must be capable of processing the media in a computer-controlled way.
The independent media streams are integrated so that, from a global view, they provide a certain function; a multimedia application should support at least one discrete and one continuous medium.

Characterizing Data Streams

Asynchronous transmission mode Synchronous transmission mode Isochronous transmission mode

Asynchronous transmission mode: the sender and the receiver do not need to coordinate before data can be transmitted.
Synchronous transmission mode: the relationship of two or more repetitive (periodic) signals that have simultaneous significant instants.
Isochronous transmission mode: pertaining to transmission in which the time interval separating any two corresponding significant instants is equal to the unit interval or to a multiple of the unit interval.

Characterizing continuous media data streams

Periodicity

Variation of consecutive packet size

Contiguous packets

Periodicity: a data stream is strongly periodic if the time interval between consecutive packets is constant; PCM-coded speech in telephony behaves this way. It is weakly periodic if the intervals follow a periodic pattern, and aperiodic otherwise; cooperative applications with shared windows are an example.
Variation of packet size: a stream is strongly regular if all packets have the same size, as in uncompressed digital data transmission. It is weakly regular if the sizes change periodically; MPEG can be seen as an example, with a standard I:P:B frame-size ratio of 10:1:2. Otherwise the stream is irregular.
Contiguity: a stream is connected if contiguous packets are transmitted directly one after another without any delay, which allows the best utilization of a resource; otherwise it is unconnected.
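As a minimal sketch of the periodicity criterion, the function below classifies a packet arrival trace by its inter-arrival intervals. The function name, tolerance, and sample traces are illustrative assumptions, not part of any standard API.

```python
# Sketch: classify a packet arrival trace by its inter-arrival intervals.
# "strongly periodic" = constant interval; "weakly periodic" = a short
# repeating pattern of intervals; everything else is treated as aperiodic.

def classify_periodicity(arrival_times):
    intervals = [b - a for a, b in zip(arrival_times, arrival_times[1:])]
    if not intervals:
        return "undefined"
    if all(abs(i - intervals[0]) < 1e-6 for i in intervals):
        return "strongly periodic"
    # look for a short repeating cycle of intervals (period up to 4 packets)
    for period in range(2, 5):
        if len(intervals) > period and all(
            abs(intervals[k] - intervals[k % period]) < 1e-6
            for k in range(len(intervals))
        ):
            return "weakly periodic"
    return "aperiodic"

# PCM telephony: one packet every 125 microseconds -> strongly periodic
pcm = [k * 0.000125 for k in range(10)]
print(classify_periodicity(pcm))   # strongly periodic

# MPEG-like stream: a repeating cycle of two gap lengths -> weakly periodic
mpeg = [0.0, 0.01, 0.03, 0.04, 0.06, 0.07, 0.09]
print(classify_periodicity(mpeg))  # weakly periodic
```

The same idea applies to the packet-size criterion by replacing arrival intervals with packet lengths.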

Audio Technology
Sound is caused by physical disturbances of air molecules. Frequency (f), measured in hertz (Hz), is the number of cycles a wave completes in one second (s). Amplitude (A) is the magnitude of the signal at a given instant in time (t). The scientific study of sound perception is, more specifically, the branch of science studying the psychological and physiological responses associated with sound (including speech and music).

Amplitude

Sound Perception & psychoacoustic


The Physical Acoustic Perspective

The Psychoacoustic Perspective

Audio Representation on computer


The digitization process takes place in circuits called analog-to-digital converters (ADCs)

Process of Digitization
Sampling Quantizing

Sampling

Quantizing
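The two digitization steps above can be sketched in code: sampling reads the signal's amplitude at evenly spaced instants, and quantizing maps each reading to one of a finite set of levels. The function names, sampling rate, and bit depth below are illustrative choices, not values fixed by any standard.

```python
import math

# Sketch of the two digitization steps performed by an ADC:
# sampling at a fixed rate, then quantizing to 2**bits levels.

def sample(signal, rate_hz, duration_s):
    """Take amplitude readings at evenly spaced instants."""
    n = int(rate_hz * duration_s)
    return [signal(k / rate_hz) for k in range(n)]

def quantize(samples, bits):
    """Map each sample in [-1.0, 1.0] to one of 2**bits integer levels."""
    levels = 2 ** bits
    out = []
    for s in samples:
        s = max(-1.0, min(1.0, s))
        # scale to 0..levels-1 and truncate to the containing level
        out.append(min(levels - 1, int((s + 1.0) / 2.0 * levels)))
    return out

tone = lambda t: math.sin(2 * math.pi * 440 * t)       # 440 Hz sine
samples = sample(tone, rate_hz=8000, duration_s=0.01)  # 80 samples
codes = quantize(samples, bits=8)                      # values 0..255
print(len(samples), min(codes), max(codes))
```

Higher sampling rates capture higher frequencies; more bits per sample reduce quantization error.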

WHAT IS THREE-DIMENSIONAL SOUND PROJECTION? (EXPLAIN WITH A NEAT DIAGRAM)


The first sound playback before an auditorium can be compared to the first movies. After some time experimenting with various sets of loudspeakers, it was found that the use of two-channel sound (stereo) produces the best hearing effect for most people.

Current multimedia and virtual-reality implementations that include such components show a considerable concentration on spatial sound effects and three-dimensional sound projection.

SPATIAL SOUND
An auditor perceives three-dimensional spatial sound in a closed room. As in the diagram, the walls and other objects in the room absorb and reflect the sound's dispersion paths.
The shortest path between the sound source and the auditor is called the direct sound path; this path carries the first sound waves towards the auditor's head. All other sound paths are reflected, which means that they are temporally delayed before they arrive at the auditor's ear.

These delays are related to the geometric length of the sound paths inside the room, and they usually occur because a reflecting sound path is longer than the direct one.

SOUND DISPERSION IN A CLOSED ROOM

DIRECT SOUND PATH

FIRST AND EARLY REFLECTION

SUBSEQUENT AND DISPERSED ECHOES

PULSE RESPONSE IN A CLOSED ROOM

The above diagram shows the energy of a sound wave arriving at an auditor's ear as a function of time. The source's sound stimulus has pulse form, so the energy of the direct sound path appears as a large swing in the figure; this swing is called the pulse response.

All subsequent parts of the pulse response refer to reflecting sound paths. Subsequent and dispersed echoes represent a bundle of sound paths that have been reflected several times.

These paths cannot be isolated in the pulse response because the pulse is diffuse and diverted both in the time and in the frequency range. All sound paths leading to the human ear are additionally influenced by the auditor's individual HRTF (head-related transfer function). The HRTF is a function of the path's direction (horizontal and vertical angles) to the auditor.
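The delay of a reflected path relative to the direct path follows from geometry, as a minimal sketch shows for a single wall reflection using the image-source idea (mirroring the source across the wall). The coordinates, wall position, and function names are illustrative assumptions.

```python
import math

# Sketch: extra arrival time of a sound path bounced off one wall,
# compared to the direct path, for a source and listener in 2D.

SPEED_OF_SOUND = 343.0  # m/s in air at about 20 degrees C

def reflection_delay(source, listener, wall_x):
    """Extra arrival time of a path reflected off a wall at x = wall_x."""
    # Mirror the source across the wall: the reflected path length equals
    # the straight-line distance from the mirrored source to the listener.
    mirrored = (2 * wall_x - source[0], source[1])
    direct = math.dist(source, listener)
    reflected = math.dist(mirrored, listener)
    return (reflected - direct) / SPEED_OF_SOUND

src, ear = (1.0, 2.0), (4.0, 2.0)
delay = reflection_delay(src, ear, wall_x=6.0)
print(f"{delay * 1000:.2f} ms")  # about 11.66 ms for this geometry
```

Summing many such delayed, attenuated copies of the source pulse is what produces the pulse response shown in the diagram.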

REFLECTION SYSTEMS

Spatial sound systems are used in many different applications, and each of them has different requirements. A rough classification groups these applications into a scientific and a consumer-oriented approach.

Approach           | Attributes                                   | Applications
Scientific         | Simulation, precise, complex, offline        | Research, architecture, computer music
Consumer-oriented  | Imitation, imprecise, impressive, real-time  | Cinema, music, home movies, computer games

REFLECTION SYSTEM APPLICATIONS

The scientific approach uses simulation options that enable professionals such as architects to predict the acoustics of a room when they plan a building on the basis of CAD models. The calculations required to generate the pulse response of a hearing situation from a database of CAD models are complex, and normally only experts hear the results output by such systems.

Consumer systems concentrate mainly on applications that create a spatial or a virtual environment. Special interactive environments for images and computer games, including sound or modern art projects, can be implemented, but in order to create spatial sound, they require state-of-the-art technologies.

List the four major categories of MIDI.

Music can be represented in a symbolic way. Computers and electronic musical instruments use similar techniques, and most of them employ the Musical Instrument Digital Interface (MIDI), a standard developed in the early 1980s. The MIDI standard defines how to code all the elements of musical scores, such as sequences of notes, timing conditions, and the instrument that is to play each note. MIDI represents a set of specifications used in instrument development so that instruments from different manufacturers can easily exchange musical information. The MIDI protocol is an entire music description language in binary form.

A MIDI interface is composed of two different components:
Hardware to connect the equipment. MIDI hardware specifies the physical connection of musical instruments: it adds a MIDI port to an instrument and specifies the MIDI cable.
A data format that encodes the information to be processed by the hardware. Unlike audio data formats, the MIDI data format does not include the encoding of individual sampling values.

MIDI Devices. An instrument that complies with both components defined by the MIDI standard is a MIDI device, able to communicate with other MIDI devices over channels. The MIDI standard specifies 16 channels, and a MIDI device is mapped onto a channel. The MIDI standard identifies 128 instruments by means of numbers, including noise effects.

MIDI and SMPTE Timing Standards. The MIDI clock is used by the receiver to synchronize itself to the sender's clock. To allow synchronization, 24 identifiers are transmitted for each quarter note. Alternatively, the SMPTE timing code can be sent to allow receiver-sender synchronization. SMPTE defines a frame format of hours:minutes:seconds:frames, for example at 30 frames/s. Transmitting the full SMPTE information would require a rate that exceeds the bandwidth of existing MIDI connections; for this reason the MIDI time code is normally used for synchronization, because it does not transmit the entire time representation of each frame.
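Because the MIDI clock sends a fixed 24 ticks per quarter note, the tick interval follows directly from the tempo. The function name below is our own illustrative choice; the 24-ticks-per-quarter-note figure is the one stated above.

```python
# Sketch: interval between MIDI timing-clock ticks at a given tempo.

TICKS_PER_QUARTER_NOTE = 24  # fixed by the MIDI timing-clock definition

def tick_interval_ms(bpm):
    """Milliseconds between MIDI clock ticks at a tempo of bpm beats/min."""
    quarter_note_ms = 60_000 / bpm           # one beat in milliseconds
    return quarter_note_ms / TICKS_PER_QUARTER_NOTE

print(tick_interval_ms(120))  # 500 ms per beat, divided into 24 ticks
```

At 120 bpm a tick arrives roughly every 21 ms, which is why the clock stream fits comfortably within MIDI's bandwidth while full SMPTE frames would not.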

Speech signals

Speech is based on spoken language, which means that it has a semantic content. Speech understanding requires efficient adaptation to speakers and their speaking habits. It is much more difficult for machines than for humans to filter the signals received by one ear only.

Speech signals have two important characteristics that can be used by speech processing applications. Voiced speech signals have an almost periodic structure over certain time intervals, so that these signals remain quasi-stationary for about 30 ms. The spectrum of some sounds has characteristic maxima that normally involve up to five frequencies.
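Speech processing exploits the ~30 ms quasi-stationarity by cutting the signal into short, usually overlapping frames before analysis. The frame and hop sizes below are typical illustrative choices, and the function name is ours.

```python
# Sketch: split a sampled speech signal into overlapping analysis frames,
# each short enough (~30 ms) that the signal is quasi-stationary within it.

def frame_signal(samples, rate_hz, frame_ms=30, hop_ms=10):
    """Return a list of frames of frame_ms length, advanced by hop_ms."""
    frame_len = int(rate_hz * frame_ms / 1000)
    hop_len = int(rate_hz * hop_ms / 1000)
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop_len):
        frames.append(samples[start:start + frame_len])
    return frames

signal = [0.0] * 800                       # 100 ms of audio at 8 kHz
frames = frame_signal(signal, rate_hz=8000)
print(len(frames), len(frames[0]))         # frames of 240 samples each
```

Each frame can then be analyzed for periodicity (voiced/unvoiced decision) or spectral maxima independently.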

SPEECH SYNTHESIS

Computers can translate an encoded description of a message into speech. This scheme is called speech synthesis.

SPEECH OUTPUT

Speech output deals with the machine generation of speech, and considerable work has been done in this field. A major challenge in speech output is how to generate these signals in real time, for instance so that a speech output system is able to convert text to speech automatically.

REPRODUCIBLE SPEECH PLAYOUT. Reproducible speech playout is a straightforward method of speech output. The speech is spoken by a human and recorded; to output the information, the stored sequence is played out. The speaker can always be recognized. The method uses a limited vocabulary or a limited set of sentences and produces an excellent output quality.

SOUND CONCATENATION IN THE TIME RANGE

Speech can also be output by concatenating sounds in the time range. This method puts together speech units like building blocks; such a composition can occur on various levels. In the simplest case, individual phonemes are understood as speech units. The figure represents the phonemes of the word stairs. It is possible to generate an unlimited vocabulary with few phonemes.

DIAGRAM (phonemes of "stairs"): s t e z

If we group two phonemes we obtain a diphone. The figure shows the word stairs again, but this time consisting of an ordered quantity of diphones.

DIAGRAM (diphones of "stairs"): -s st te e@ @z z-

To further mitigate problematic transitions between speech units, we can form carrier words. We see in the figure that speech is composed of a set of such carrier words.

Stairs

DIAGRAM:

ste@z (consonant-vowel-consonant). The best pronunciation of a word is achieved by storing the entire word. By previously storing a speech sequence as a whole entity, we move from playout into the area of speech synthesis.

All these cases share a common problem due to the transitions between speech sound units. This effect is called coarticulation. Coarticulation means mutual sound effects across several sounds; it is caused by the influence of the relevant sound environment or, more specifically, by the inertia of our speech organs.
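The building-block idea of time-range concatenation can be sketched in a few lines: each stored unit (phoneme or diphone) maps to a waveform, and output is built by joining them. The tiny unit inventory below is a made-up placeholder, not real speech data.

```python
# Toy sketch of time-range concatenation: look up each speech unit's
# stored waveform and join them end to end. Unit data is fabricated.

unit_waveforms = {
    "s":  [0.1, 0.2, 0.1],
    "t":  [0.5, -0.5],
    "e@": [0.3, 0.3, 0.2, 0.1],
    "z":  [0.05, -0.05],
}

def concatenate(units):
    """Join stored unit waveforms into one output signal."""
    signal = []
    for u in units:
        signal.extend(unit_waveforms[u])
    return signal

stairs = concatenate(["s", "t", "e@", "z"])
print(len(stairs))  # 3 + 2 + 4 + 2 = 11 samples
```

A real system would smooth the unit boundaries (e.g., by cross-fading) precisely because of the coarticulation effect described above.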

SOUND CONCATENATION IN THE FREQUENCY RANGE


As an alternative to concatenating sound in the time range, we can effect sound concatenation in the frequency range, for example by formant synthesis. Formant synthesis uses filters to simulate the vocal tract.

Characteristic values are the central filter frequencies and the central filter bandwidths.

SPEECH SYNTHESIS

Speech synthesis can be used to transform an existing text into an acoustic signal.

DIAGRAM: Speech synthesis

Text --(transcription, using a dictionary)--> sound script --(synthesis, with sound transfer)--> Speech

The first step involves a transcription, or translation of the text into the corresponding phonetic transcription. Most methods use a lexicon, containing a large quantity of words or only syllables or tone groups. The second step converts the phonetic transcription into an acoustic speech signal, where concatenation can be in the time or frequency range. While the first step normally has a software solution, the second step involves signal processors or dedicated processors.
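The two steps above can be sketched as code: a lexicon-based transcription followed by a synthesis stage. The tiny lexicon, the "?" marker for unknown words, and the stubbed synthesis step are all illustrative placeholders, not parts of a real synthesizer.

```python
# Sketch of the two-step synthesis pipeline: text -> phonetic
# transcription (via a lexicon), then transcription -> signal.

lexicon = {"stairs": ["s", "t", "e@", "z"]}   # word -> phonetic symbols

def transcribe(text):
    """Step 1: translate text into its phonetic transcription."""
    phones = []
    for word in text.lower().split():
        phones.extend(lexicon.get(word, ["?"]))  # "?" marks unknown words
    return phones

def synthesize(phones, samples_per_phone=160):
    """Step 2: convert the transcription into an acoustic signal (stubbed
    here as silence; a real system concatenates in time or frequency)."""
    return [0.0] * (len(phones) * samples_per_phone)

speech = synthesize(transcribe("stairs"))
print(len(speech))  # 4 phones * 160 samples
```

The split mirrors the observation in the text that step 1 is usually pure software while step 2 runs on signal processors.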

Briefly explain capturing graphics and images.

Graphics and images are non-textual information that can be displayed and printed. They may appear on screens or printers, but they cannot be displayed on devices only capable of handling characters. This section discusses graphics and images, their respective properties, and how they can be acquired and manipulated on computers. Graphics are normally created in a graphics application and represented internally as an assemblage of objects such as curves or circles.

Attributes such as style, line width, and color are defined for the objects of a graphic.

Capturing the graphic image. The process of capturing digital images depends initially on the image's origin, i.e., whether it is a real-world picture or an already digital image. A digital image consists of n lines with m pixels each.
The image format is specified by two parameters: spatial resolution, which is specified in pixels by pixels, and color encoding, which is specified in bits per pixel. Both parameter values depend on the software and hardware used for input and output of images.
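The raw storage cost of a capture follows directly from these two parameters, as a small sketch shows. The function name is our own, and the 320 x 240 resolution is the one cited for the capture card below.

```python
# Sketch: uncompressed image size from spatial resolution and color depth.

def raw_image_bytes(width, height, bits_per_pixel):
    """Uncompressed size in bytes of a width x height image."""
    total_bits = width * height * bits_per_pixel
    return (total_bits + 7) // 8   # round up to whole bytes

# A 320 x 240 capture at three color depths (1, 8, and 24 bits/pixel):
for bpp in (1, 8, 24):
    print(bpp, raw_image_bytes(320, 240, bpp))
```

The jump from 9,600 bytes at 1 bit/pixel to 230,400 bytes at 24 bits/pixel illustrates why capture cards pair digitization with compression.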

For image capturing on a SPARCstation, the VideoPix card and its software can be used. The spatial resolution is 320 x 240 pixels.
The color can be encoded with 1 bit, 8 bits, or 24 bits. The new multimedia kit for the newer SPARCstations includes the SunVideo card, a color video camera, and CDware for CD-ROM disks.

The SunVideo card is a capture-and-compression card; its technology captures and compresses 30 frames/s in real time under the Solaris 2.3 OS.

Explain the different kinds of image formats with examples.

Image formats: The literature describes many different image formats and normally distinguishes between image capturing formats and image storage formats, that is, the format in which an image is created during the digitizing process and the format in which images are stored.

Image capturing formats: The format of an image is defined by two parameters: the spatial resolution, indicated in pixels, and the color encoding, measured in bits per pixel.

The values of both parameters depend on the hardware and software used to input and output images.

Image Storage Formats. To store an image, the image is represented as a two-dimensional matrix in which each value corresponds to the data associated with one image pixel. In bitmaps these values are binary numbers. In color images a value can be one of the following:
Three numbers that specify the intensities of the red, green, and blue components.
Three numbers representing references to a table that contains the red, green, and blue intensities.
A single number that works as a reference to a table containing color triples.
An index pointing to another set of data structures, which represent colors.

Assuming that there is sufficient memory available, an image can be stored as uncompressed RGB triples. When storing an image, the information about each pixel, i.e., the value of each color channel in each pixel, has to be stored. Further information may concern the image as a whole, such as width and height, depth, or the name of the person who created the image. The necessity to store such image properties led to a number of flexible formats, such as RIFF or BRIM, which are often used in database systems. RIFF includes formats for bitmaps, vector drawings, animation, audio, and video. The most popular image storage formats include PostScript, GIF, XBM, JPEG, TIFF, PBM, and BMP.
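The contrast between the first and third storage variants (direct RGB triples versus a table of color triples with one index per pixel) can be sketched briefly. The function name and the sample pixels are illustrative.

```python
# Sketch: convert a list of per-pixel RGB triples into a color table
# (palette) plus one small index per pixel, as palette-based formats do.

def to_palette(rgb_pixels):
    """Replace repeated RGB triples with a table plus per-pixel indices."""
    table = []
    lookup = {}     # triple -> index in the table
    indices = []
    for px in rgb_pixels:
        if px not in lookup:
            lookup[px] = len(table)
            table.append(px)
        indices.append(lookup[px])
    return table, indices

pixels = [(255, 0, 0), (0, 0, 255), (255, 0, 0), (255, 0, 0)]
table, indices = to_palette(pixels)
print(table)    # the two distinct colors
print(indices)  # [0, 1, 0, 0]
```

When an image uses few distinct colors, storing one small index per pixel plus a short table is far cheaper than storing three full intensity values per pixel.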

PostScript. PostScript is a fully fledged programming language optimized for printing graphics and text. It was introduced by Adobe in 1985. The main purpose of PostScript was to provide a convenient language in which to describe images in a device-independent manner (e.g., independent of printer resolution). PostScript has been developed in levels, the most recent being Level 3.

Level 1 PostScript: The first generation was designed mainly as a page description language, introducing the concept of scalable fonts. Previously, a font was available in either 10 or 12 points, but not in an arbitrary intermediate size. This format, the so-called Adobe Type 1 font format, was the first to allow high-quality font scaling.

Level 2 PostScript: In contrast to the first generation, Level 2 PostScript made a huge step forward as it allowed filling of patterns and regions, though this normally goes unnoticed by the non-expert user. The improvements of this generation include better control of free storage areas in the interpreter, a larger number of graphics primitives, more efficient text processing, and a complete color concept for both device-dependent and device-independent color management.
Level 3 PostScript: Level 3 takes the PostScript standard beyond a page description language into a fully optimized printing system that addresses the broad range of new requirements in today's increasingly complex and distributed printing environments. It expands the previous generations' advanced features for modern digital document processing, as document creators draw on a variety of sources and increasingly rely on color to convey their messages.

Graphics Interchange Format. The Graphics Interchange Format (GIF) was developed by the CompuServe Information Service in 1987. Three variations of the GIF format are in use. The original specification, GIF87a, became a de facto standard because of its many advantages over other formats. GIF images are compressed to 20 to 25 percent of their original size with no loss in image quality, using a compression algorithm called LZW. While the original GIF specification supports only 256 colors, the GIF24 update supports true 24-bit color.
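The core of LZW, the algorithm GIF uses, can be sketched compactly: the dictionary starts with all single symbols and grows a new phrase on every output. This sketch deliberately omits GIF specifics such as variable code widths and clear codes.

```python
# Minimal LZW encoder sketch in the spirit of GIF's compressor.
# Starts with codes 0-255 for single symbols; each emitted code also
# teaches the table one new, longer phrase.

def lzw_encode(data):
    """Encode a string of 8-bit symbols into a list of dictionary codes."""
    table = {chr(i): i for i in range(256)}
    phrase = ""
    codes = []
    for ch in data:
        if phrase + ch in table:
            phrase += ch          # keep extending the current match
        else:
            codes.append(table[phrase])
            table[phrase + ch] = len(table)  # learn the new phrase
            phrase = ch
    if phrase:
        codes.append(table[phrase])
    return codes

print(lzw_encode("ABABABA"))  # [65, 66, 256, 258]
```

Repetitive image data produces long table phrases quickly, which is why GIF reaches the 20 to 25 percent sizes mentioned above on suitable images.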

Tagged Image File Format(TIFF)

The Tagged Image File Format was designed by Aldus Corporation and Microsoft in 1987 to allow portability and hardware independence for image encoding. It can save images in an almost infinite number of variations. TIFF documents consist of two components:

1. The baseline part describes the properties that display programs should support. 2. The extensions are used to define additional properties, such as the use of the CMYK color model to represent print colors. TIFF supports a broad range of compression methods, including run-length encoding.

Huffman encoding can also be used to reduce the image size. TIFF differs from other image formats in its generic nature: the TIFF format can be used to encode graphical content in different ways, e.g., to provide for quick review of images in image archives without the need to open the full image file.

X11 Bitmap (XBM) and X11 Pixmap (XPM). X11 Bitmap (XBM) and X11 Pixmap (XPM) are graphic formats frequently used in the UNIX world to store program icons or background images. These formats allow the definition of monochrome (XBM) or color (XPM) images inside program code. The two formats use no compression for image storage. In the monochrome XBM format, the pixels of an image are encoded and written to a list of byte values. In the XPM format, image data are encoded and written to a list of strings together with a header.

Portable Bitmap Plus (PBMplus). PBMplus is a software package that allows conversion of images between various image formats as well as their script-based modification. PBMplus includes four different image formats: 1. Portable Bitmap for binary images. 2. Portable Graymap for gray-value images. 3. Portable Pixmap for true-color images. 4. Portable Anymap for format-independent manipulation of images. These formats support both text and binary encoding. The contents of PBMplus files are the following: a magic number identifying the file type; blanks, tabs, carriage returns, and line feeds; decimal ASCII characters that define image width and height; and ASCII numbers plus blanks that specify the maximum value of color components and additional color information. Filter tools are used to manipulate the internal image formats.

Bitmap (BMP). BMP files are device-independent bitmap files most frequently used in Windows systems. The BMP format is based on the RGB color model. By default, BMP does not compress the original image. The BMP format defines a header and a data region: the header region contains information about size, color depth, color table, and compression method; the data region contains the value of each pixel in a line. The BMP format can use the run-length encoding algorithm to compress images with a color depth of 4 or 8 bits/pixel, where two bytes are handled as one information unit.
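The run-length idea BMP applies to 4- and 8-bit images can be sketched with (count, value) two-byte units. Real BMP RLE8 also defines escape codes for end-of-line and absolute runs, which this sketch leaves out; the function names are ours.

```python
# Sketch of run-length encoding for an 8-bit pixel line: each run of
# identical pixels becomes a (run length, pixel value) two-byte unit.

def rle8_encode(pixels):
    """Compress a pixel line into (run length, pixel value) pairs."""
    pairs = []
    i = 0
    while i < len(pixels):
        run = 1
        # extend the run while the next pixel matches (max count 255,
        # since the count must fit in one byte)
        while i + run < len(pixels) and pixels[i + run] == pixels[i] and run < 255:
            run += 1
        pairs.append((run, pixels[i]))
        i += run
    return pairs

def rle8_decode(pairs):
    """Expand (count, value) pairs back into the pixel line."""
    out = []
    for count, value in pairs:
        out.extend([value] * count)
    return out

line = [7, 7, 7, 7, 0, 0, 9]
encoded = rle8_encode(line)
print(encoded)                       # [(4, 7), (2, 0), (1, 9)]
print(rle8_decode(encoded) == line)  # lossless round trip
```

The scheme wins on long uniform runs (icons, screenshots) and can expand noisy images, where nearly every pair encodes a run of one.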
