
Block storage

Block storage is data storage typically used in storage-area network (SAN) environments, where data is stored in volumes, also referred to as blocks. Each block is assigned an arbitrary identifier by which it can be stored and retrieved, but carries no metadata providing further context. Database storage is a common use for block storage.
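As a toy illustration of the idea (illustrative names only, not any vendor's API), a block device can be modeled as a fixed-size array of blocks that are read and written purely by number, with no metadata attached:

```python
# Toy model of a block device: fixed-size blocks addressed only by number.
# BLOCK_SIZE and BlockDevice are illustrative names, not a real API.

BLOCK_SIZE = 512  # bytes per block, as on a classic disk sector

class BlockDevice:
    def __init__(self, num_blocks):
        self.blocks = [bytes(BLOCK_SIZE)] * num_blocks

    def write_block(self, block_id, data):
        # The device stores raw bytes; it knows nothing about what they mean.
        assert len(data) == BLOCK_SIZE
        self.blocks[block_id] = data

    def read_block(self, block_id):
        return self.blocks[block_id]

dev = BlockDevice(num_blocks=1024)
dev.write_block(7, b"hello".ljust(BLOCK_SIZE, b"\x00"))
print(dev.read_block(7)[:5])  # b'hello'
```

Any context about what block 7 contains lives in the software above the device, which is exactly the gap that file and object storage fill.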

Each block acts as an individual hard drive and is configured by the storage
administrator. These blocks are controlled by the server-based operating
system, and are generally accessed by Fibre Channel, iSCSI or Fibre
Channel over Ethernet protocols.

Because the volumes are treated as individual hard disks, block storage works well for a variety of applications. File systems and databases are common uses for block storage because they require consistently high performance. Email servers such as Microsoft Exchange often use block storage in lieu of file- or network-based storage systems.

RAID arrays are a prime use case for block storage as well. With RAID,
multiple independent disks are combined for data protection and performance.
The ability of block storage to create individually controlled storage volumes
makes it a good fit for RAID.

Virtual machine file systems are another common use for block-level storage. Virtualization vendors such as VMware support block storage protocols, which can improve migration performance and scalability. Using a SAN for block storage also aids virtual machine (VM) management by allowing non-standard SCSI commands to be written directly to storage volumes.

While there are benefits to using block storage, there are also alternatives that
may be better suited to certain organizations or uses. Two options stand out
when it comes to facing off with block-level storage: file storage and object
storage.

Block vs. file storage

If simplicity is the goal, file storage may win out over block-level storage. But while
block storage devices tend to be more complex and expensive than file storage, they
also tend to be more flexible and provide better performance.

File storage provides a centralized, highly accessible location for files, and generally
comes at a lower cost than block storage. File storage uses metadata and directories to
organize files, which makes it a convenient option for an organization looking to
simply store large amounts of data.

The relatively easy deployment of file storage makes it a viable tool for data
protection, and the low costs and simple organization can be helpful for local
archiving. File sharing within an organization is another common use for file storage.

The simplicity of file storage can also be its downfall. While it has a hierarchical organization, the more files that are added, the more difficult and tedious it becomes to sift through file storage. If performance is the deciding factor, object or block-level storage wins out over file storage.

Some products, such as Hewlett Packard Enterprise (HPE) 3PAR's File Persona
service, have converged file and block storage to provide the benefits of both
technologies.

Block vs. object storage

Rather than splitting files into raw data blocks, object storage clumps data together as
one object that contains data and metadata. Blocks of storage do not contain metadata,
so in that regard object storage can provide more context about the data, which can be
helpful in classifying and customizing the files. Each object also has a unique
identifier, which makes quicker work of locating and retrieving objects from storage.

Block storage can be expanded, but object storage is unmatched when it comes to scalability. Scaling out an object storage architecture only requires adding nodes to the storage cluster.

The flexibility and scalability of object storage may be appealing, but some
organizations may choose to prioritize performance and choose file or block storage.
While block storage allows for editing incremental parts of a file, object stores must
be edited as one unit. If one part of an object needs to be edited, the entire object must
be accessed and updated, then rewritten, which can negatively affect performance.
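To make the performance point concrete, here is a toy sketch (all names illustrative, not any product's API) contrasting an in-place block update with the whole-object read-modify-rewrite cycle an object store imposes:

```python
BLOCK_SIZE = 4096

# Block storage: a file is a list of blocks; an edit touches only one block.
file_blocks = [bytearray(BLOCK_SIZE) for _ in range(1000)]

def update_block(blocks, block_id, data):
    blocks[block_id][:len(data)] = data   # rewrite just one 4 KB block

# Object storage: the object is one opaque unit; any edit rewrites it all.
object_store = {}

def put_object(key, data, metadata):
    object_store[key] = {"data": bytes(data), "meta": metadata}

def edit_object(key, offset, data):
    obj = object_store[key]                       # read the whole object
    body = bytearray(obj["data"])
    body[offset:offset + len(data)] = data        # modify in memory
    put_object(key, body, obj["meta"])            # rewrite the whole object

update_block(file_blocks, 42, b"patch")          # ~4 KB of I/O
put_object("report", bytes(4096 * 1000), {"owner": "alice"})
edit_object("report", 42 * 4096, b"patch")       # ~4 MB read + ~4 MB write
```

For the same five-byte change, the block path rewrites one block while the object path re-reads and rewrites the whole ~4 MB object, which is the performance cost described above.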

Both object and block-level storage are used in the enterprise, but object storage use cases lean more toward scenarios dealing with large amounts of data, such as big data storage and backup archives. Because of this, modern data storage environments such as the cloud are arguably trending toward object-based storage over file and block storage options. However, individual needs will always be the determining factor for which form of storage is used.

Block storage vendors

Along with HPE, several larger and smaller storage vendors provide block storage.
The largest storage vendors are Dell EMC, HPE, Hitachi Vantara, IBM and NetApp.
Additional vendors include DataDirect Networks, Huawei, Infinidat, Kaminario,
Nutanix, Oracle, Pure Storage, Tintri and Western Digital. The largest vendors all
have several block storage platforms, as well as unified storage that runs block and
file on the same arrays.

OpenStack Block Storage (Cinder) is an open source form of block storage, which
provisions and manages storage blocks. It also provides basic storage capabilities such
as snapshot management and replication. OpenStack Block Storage is supported by
other vendors such as IBM, NetApp, Rackspace, Red Hat and VMware.

Amazon Elastic Block Store (EBS) is persistent block storage for Amazon Elastic Compute Cloud (EC2). EBS is scalable and designed for workloads such as big data analytics, NoSQL databases and data warehousing.

Why block storage is gaining momentum

With large vendors like Dell EMC and Amazon on board with block storage products,
it is clearly going to be a supported technology for the foreseeable future. While there
are pros and cons to its use, many of the negatives can be chalked up to features that
are better provided by a different storage system. These needs may vary by
organization, and while file or object storage may be better suited to some cases,
block storage will likely be the right choice for others.

If an organization is looking to incorporate the cloud, it will find block storage to be a common partner for cloud computing.

The main disadvantage to SAN environments, where block storage systems are most often found, is the cost and complexity associated with building and managing the environment. As long as organizations are willing to take on those obstacles, SAN environments will remain a viable option. With virtual and converged SAN options on the market today, SAN arrays, and block storage with them, are likely to continue to grow and meet consumer needs.

A block is a contiguous set of bits or bytes that forms an identifiable unit of data. The term is used in database management, word processing, and network communication.

1) In some databases, a block is the smallest amount of data that a program can request. It is a multiple of an operating system block, which is the smallest amount of data that can be retrieved from storage or memory. Multiple blocks in a database comprise an extent.
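The size relationships described above can be sketched numerically (the specific sizes are illustrative; real values vary by database and operating system):

```python
# Illustrative sizes only: an OS block, a database block that is a
# multiple of it, and an extent made up of multiple database blocks.
OS_BLOCK = 4 * 1024           # 4 KB: smallest unit the OS retrieves
DB_BLOCK = 2 * OS_BLOCK       # 8 KB: smallest unit a database program requests
EXTENT_BLOCKS = 8             # database blocks per extent in this sketch

extent_bytes = DB_BLOCK * EXTENT_BLOCKS
print(DB_BLOCK // OS_BLOCK)   # 2 OS blocks per database block
print(extent_bytes)           # 65536 bytes (64 KB) per extent
```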

2) In word processing, a block is a contiguous set of characters. Often it consists of a phrase, a sentence, a paragraph, or a set of paragraphs that is selected by the user for copying/pasting, cutting, or moving. But a block can consist of any contiguous set of characters, whether or not it forms a logical unit of text.

3) In network communication, a block is a group of data bits or bytes that is transferred as a standard unit. The size (or length) of such a block depends on the communications protocol.

Oracle extents

Posted by: Margaret Rouse, WhatIs.com
Contributor(s): Adam Hughes
Oracle extents are sets of logically contiguous blocks allocated in an Oracle database. Extents are units of database space distribution made up of data blocks.

In the Oracle database architecture, the first set of contiguous blocks, set up automatically when a database segment is created, is called the initial extent. After the initial extent has been filled, the Oracle Database software allocates more extents to the segment automatically. These are known as next extents. The total number of Oracle extents that can be allocated in a database is limited by the amount of storage space available or, in some cases, by the program used.

The term extent is also sometimes used in reference to any contiguous space
-- for example, a set of sectors -- on a hard drive that is reserved for a
particular file, folder or application.

Data blocks and extents

Oracle Database stores data in data blocks, which can also be called logical blocks. A data block is the smallest unit of data within a database. Each data block corresponds to a specific number of bytes of physical database space on disk.

Extents are made up of groups of data blocks. The next level of logical database storage above an extent is the database segment.

Extent allocation and deallocation

Oracle allocates extents differently depending on whether they are locally managed or dictionary managed. When there is free space within a locally managed tablespace, an extent can be allocated by first determining a candidate datafile and then searching the datafile's bitmap for the required number of free data blocks. If the datafile does not have enough free space, the database looks to another datafile.
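The bitmap search just described can be sketched in a few lines. This is a simplification, not Oracle's actual algorithm: each datafile keeps a bitmap with one flag per block, and allocation scans for a run of free blocks, falling back to the next datafile when the first lacks space.

```python
# Simplified sketch of locally managed extent allocation: each datafile
# has a bitmap (True = block in use); we search for `n` contiguous free
# blocks, trying the next datafile if the current one lacks space.

def find_free_run(bitmap, n):
    run_start, run_len = 0, 0
    for i, used in enumerate(bitmap):
        if used:
            run_start, run_len = i + 1, 0
        else:
            run_len += 1
            if run_len == n:
                return run_start
    return None

def allocate_extent(datafiles, n):
    for file_id, bitmap in enumerate(datafiles):
        start = find_free_run(bitmap, n)
        if start is not None:
            for b in range(start, start + n):
                bitmap[b] = True          # mark blocks as allocated
            return file_id, start
    raise RuntimeError("no datafile has enough free space")

datafiles = [[True, True, False, True, False, False],   # too fragmented
             [False] * 8]                               # mostly empty
print(allocate_extent(datafiles, 3))  # falls back to datafile 1
```

The first datafile has free blocks but no run of three, so the allocator moves on to the second, mirroring the fallback behaviour described above.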

Oracle's database provides a Segment Advisor for IT pros to determine whether an object has space available for deallocation, based on the amount of space fragmentation within the object. The extents of a segment do not return to the tablespace unless the schema object within the segment is dropped.

Extents and database segments

A database segment is a set of extents that contains all of the data necessary for a
logical storage structure within a tablespace.

For every table that is created, Oracle Database allocates extents to form the table's
data segment. Oracle provides space for segments within extents, so when a segment's
existing extents are full, the software provides another extent for that segment.
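A minimal sketch of the growth pattern described (illustrative names, not Oracle internals): a segment starts with its initial extent and acquires next extents as inserts fill it.

```python
# Toy segment that grows by allocating extents as it fills.
EXTENT_BLOCKS = 4   # blocks per extent in this sketch

class Segment:
    def __init__(self):
        self.extents = [EXTENT_BLOCKS]  # the initial extent
        self.used_blocks = 0

    def capacity(self):
        return sum(self.extents)

    def insert(self, blocks_needed):
        # When existing extents are full, allocate a next extent.
        while self.used_blocks + blocks_needed > self.capacity():
            self.extents.append(EXTENT_BLOCKS)
        self.used_blocks += blocks_needed

seg = Segment()
seg.insert(3)                 # fits in the initial extent
seg.insert(3)                 # overflows: a next extent is allocated
print(len(seg.extents))       # 2
```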

Leverage existing network-attached storage and block storage for better data storage management

Learn data storage management practices to better leverage your existing NAS and block storage and improve access to both types of storage.







Improving your enterprise data storage management – essentially, better
leveraging your network-attached storage (NAS) data or block storage data,
as well as access to both – is a challenge for many IT organizations.

When considering best practices for improving data storage management, Jeff Boles, senior analyst and director of validation services at Hopkinton, Mass.-based Taneja Group, explains how to better leverage NAS data storage systems via data classification and tiering tools and techniques. When it comes to block storage, Boles advises users to consider performance management tools and thin provisioning.

He also recommends virtualized I/O and next-generation fabrics to improve access to block storage, as well as wide-area data services and data optimization to improve access to NAS data storage systems. You can listen to our interview with Jeff on improving data storage management or read the transcript below.

Q. How can users better leverage their existing NAS data storage?

A. When it comes to [NAS data storage, or] file storage, there's always been data classification and tiering as a possibility for optimizing your storage. At the heart of the matter is how you understand where you're applying your storage resources -- if you're storing the right types of files in the right places, if you're using your storage for data that's actually important to the business.

Data classification with e-discovery behind it has been driven to a whole new level of maturity, and if you haven't looked in a while, there's a whole new set of tools out there that you can access to classify your data, figure out what's going on, and really move and optimize it. In fact, even tools like StorNext from Quantum Corp. have been working behind the scenes, and they're getting new capabilities over time because now they're at the heart of data deduplication. But StorNext was originally a data archiving platform. So, there are interesting possibilities.

Without talking specifically about storage technologies, let's talk about something that you might apply to move your data and optimize your storage better today. Those are typically data classification and/or file virtualization tools – things like F5 Networks Inc.'s Acopia or EMC Corp.'s Rainfinity. Or there's a relatively new company in that field of file virtualization, AutoVirt Inc., with solutions targeted more at the small- and medium-sized business (SMB).

In addition, a lot of the data classification vendors out there have some tools
that you can apply alongside of a NetApp filer, for instance, or EMC Celerra,
and use things like the file mover API to tier some of your data to other
storage systems.

But what you really want to dig into is a tool that can help you understand your
data without too much complexity. You don't want to get into "analysis
paralysis" when it comes to tagging stuff and getting all kinds of metadata
from your existing files. But you want something that understands who's using
that file, how often it's accessed and how important it is to the business.

Q. What advice do you have for end users looking to make better use of
their block data storage systems?

A. Let's talk about identifying how you're using your existing block storage with an eye toward performance. There are a lot of technologies out there, like thin provisioning; you either have that or you don't today. Certainly, if you're acquiring new equipment, you should never overlook thin provisioning, and make sure that the thin provisioning inside the system is built to deliver the performance capabilities that you expect of it.
But if you're not in that place today, let's talk about using your block storage a
little bit longer than normal and with an eye toward performance. Vendors like
Akorri are bringing performance management tools to the table that can help
you peer into your infrastructure and understand performance requirements in
various applications. [Companies] like Virtual Instruments can give you really
deep, packet-level insight that they can roll up into big dashboard data to help
you understand your environment.

There are even things out there like Performance Pack for EVA from Hewlett-Packard (HP) Co. that help you get this kind of visibility as well. These tools can help you understand, on a session basis or an application basis, what kind of performance resources you require. Maybe you can start differentiating a little bit better in how you provision storage in your environment, and do things like restricting bandwidth within your fabric if you have that type of intelligence within your switches. Or you can reconfigure your LUNs [logical unit numbers] on the back end so that you're not taking up as many resources; maybe your RAID volume constructions are a little bit different, your virtualized volumes are a little bit different, so you're not sucking down the same performance for every application when not every application needs it.

Q. How can users better their data storage management regarding access to both their NAS and block storage systems?

A. There are big opportunities in block [data storage] – virtualized I/O, for
instance, next-generation fabrics for select sets of equipment, things like Fibre
Channel over Ethernet (FCoE). When it comes to buying new equipment,
don't just keep provisioning host bus adapters [HBAs] and networks
separately and redundant connectivity all over your enterprise and consuming
your Fibre ports if that's a limited resource. [Consider] next-generation fabrics
and things like virtualized I/O, where you're doing I/O to your local servers with
InfiniBand or with Ethernet and running FCoE over it, where you can take this
out over a single wire or two wires from a server to a gateway that only
consumes a couple of ports from your fabric.

Then let's turn an eye toward [improving access to NAS data storage]. You
shouldn't overlook the opportunity to spread your use of wide-area data
services, wide-area data optimization for [NAS data storage]. This can let you
consolidate file storage in single locations. Maybe you have some of that in
your enterprise today, but you're not making full use of it.

Look at getting more of your data back into a central location where you can
apply your time and effort to the management of it better; [make sure] you
have the right wide-area data services/wide-area file services in place so that
users can still access it like a localized resource but keep it in a central
location. So, look for those types of opportunities – compressing bandwidth,
moving data back to a central place, not occupying as many resources, when
it comes to connecting into your existing fabric.

Block level vs. file level

For the pros, this question represents "Beginner's Storage 101." But the
storage tech literature always talks about "block level" versus "file level" data,
without ever clearly explaining the key differences and relevance. Can
someone please do so, once and for all, in layman's language so all us
"uninitiated" can understand? Thank you.

Any two devices communicating over a network have to agree on how they will communicate. Standard protocols are the implementations of those communications agreements. There can be, and are, many networking protocols.

Storage devices and subsystems typically are slaves to the filing systems that write data to them and read data from them. The filing systems are typically file systems or database systems. Examples of filing systems are the NTFS file system in Windows 2000 and NT, the FAT file system in DOS, the many flavors of the Unix File System (UFS), the Veritas File System (VxFS), and Oracle, Informix and Sybase databases.

Filing systems do two things. First, they represent data to end users and applications. This data is typically organized in directories or folders, usually in some hierarchical fashion; I talk about this in my new book as data representation. Second, filing systems organize where data is placed in storage. They have to scatter the data around the storage container to make sure that all data can be accessed with reasonable performance. They do this by directing the storage block addresses where the data is going to be placed. I refer to this as a data structure function. Today, these are actually all logical block addresses, as the disk drives keep their own internal block translation tables. That might be more than you need to know for now, but it could be useful for some.

So the filing system sends commands to "slave" storage to write data to certain blocks and retrieve it from certain blocks. This is what is commonly called block-level storage. In my new book I talk about this as storing. Storing functions are based on master/slave relationships, not client/server.

It is also possible for systems to request data using the user-level data representation interfaces (file-level storage). This is done by the client using the data's filename, its directory location, URL, or whatever. This is a client/server model of communicating. The server in this case receives the filing request, looks up the storage locations where the data is stored, and retrieves it using storing-level functions (block-level storage). The server does not send the file to the client as blocks, but as bytes of the file. File-level protocols do not have the capability of understanding block commands. Likewise, block protocols cannot convey file access requests and responses.
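The layering described above can be reduced to a toy two-layer stack (all names illustrative): a block store that understands only block numbers, and a filing layer on top that maps names to block lists and serves file requests as bytes.

```python
# Toy separation of "storing" (block level) from "filing" (file level).
BLOCK = 16  # tiny block size so the example is easy to follow

block_store = {}          # the "slave": block address -> raw bytes
catalog = {}              # the filing system: filename -> list of block ids
next_free = 0

def write_file(name, data):
    # The filing layer decides block placement, then issues block-level writes.
    global next_free
    ids = []
    for i in range(0, len(data), BLOCK):
        block_store[next_free] = data[i:i + BLOCK]
        ids.append(next_free)
        next_free += 1
    catalog[name] = ids

def read_file(name):
    # A file-level request: the client names the file; the server
    # translates to block reads and returns bytes, never block numbers.
    return b"".join(block_store[b] for b in catalog[name])

write_file("notes.txt", b"filing on top of storing, by name not block")
print(read_file("notes.txt"))
```

The client side never sees block addresses, and the block store never sees filenames, which is the independence of the two protocol levels the answer describes.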

One of the confusing things in this is that filing and storing are tightly
integrated. Neither can work without the other. But when it comes to
understanding how storing and filing traffic is transferred over a network; both
are independent of the wiring (networking or bus) that supports their
communications. In other words, storing and filing traffic can exist on the
same network using different storage application protocols.

===================

Block vs file for Hyper-V and VMware storage: Which is better?

Is block- or file-based storage better for Hyper-V and VMware virtual server
environments? The answer depends on the precise needs of your virtual
server environment.

When it comes to Hyper-V and VMware storage, which is better: block- or file-
based access? The rate of adoption of server virtualisation has accelerated
over recent years, and virtual server workloads now encompass many
production applications, including Tier 1 applications such as databases.

For that reason, it is now more important than ever that Hyper-V and VMware storage is well matched to the requirements of the environment. In this article we will discuss the basic requirements for Hyper-V and VMware storage and examine the key question of block vs file storage in such deployments.

Basic requirements for storage in virtual server environments

When selecting storage for virtual server environments, a basic set of requirements must be met, irrespective of the hypervisor or the storage protocol. These include:

 Shared access. Storage connected to hypervisors typically needs to provide access shared among hypervisor hosts. This enables redundant and high-availability configurations. Where shared storage is implemented for multiple hypervisors, guests can be load-balanced across the servers for performance and availability in the event of a server failure.

 Scalability. Virtual server environments can include hundreds of virtual machines. This means any storage solution needs to be scalable to cater for the large volume of data virtual guests create. In addition, scalability is required for shared connectivity, providing for multiple hosts with multiple redundant connections.

 High availability. Virtual server environments can contain hundreds of virtual servers or desktops. This represents a concentration of risk requiring high availability from the storage array. Availability can be quantified in terms of array uptime, but also of the components that connect the server to the array, such as network or Fibre Channel switching.

 Performance. Virtual environments create a different performance profile for I/O than that of individual servers. Typically, I/O is random in nature, but certain tasks, such as backup and guest cloning, can result in high sequential I/O demands.

Protocol choice: Block vs file?

Virtual servers can be deployed either to direct-attached storage (DAS) or networked storage (NAS or SAN). DAS does not provide the shared access required of highly available virtual clusters because it is physically associated with a single server. Enterprise-class solutions, therefore, use networked storage, and this means protocols such as NFS, CIFS, iSCSI, Fibre Channel and Fibre Channel over Ethernet (FCoE).

File-level access: NAS

Network-attached storage encompasses the NFS and CIFS protocols and refers specifically to the use of file-based storage to store virtual guests. VMware ESXi supports only NFS for file-level access; Hyper-V supports only CIFS for file access. This difference is perhaps explained by the fact that CIFS was developed by Microsoft from Server Message Block (SMB), while NFS was originally developed by Sun Microsystems for its Solaris operating system -- both Solaris and ESXi are Unix variants.

For VMware, NFS is a good choice of protocol, as it provides a number of distinct benefits.

 Virtual machines are stored in directories on NFS shares, making them easy to access without using the hypervisor. This is useful for taking virtual machine backups or cloning an individual virtual guest. VMware configuration files can also be directly created or edited.

 Virtual storage can easily be shared among multiple virtual servers; VMware uses a locking file on the share to ensure integrity in a clustered environment.

 No extra server hardware is required to access NFS shares, which can be achieved over standard network interface cards (NICs).

 Virtual guests can be thinly provisioned, if the underlying storage hardware supports it.

 Network shares can be expanded dynamically, if the storage filer supports it, without any impact on ESXi.

There are, however, some disadvantages when using NFS with VMware.

 Scalability is limited to eight NFS shares per VMware host (this can be
expanded to 64 but also requires TCP/IP heap size to be increased).

 Although these NFS shares can scale to the maximum size permitted by
the storage filer, the share is typically created from one group of disks with
one performance characteristic; therefore, all guests on the share will
experience the same I/O performance profile.

 NFS does not support multipathing, and so high availability needs to be managed at the physical network layer, with bonded networks on ESXi and virtual interfaces on the storage array -- if it supports it.

For Hyper-V, CIFS allows virtual machines (stored as virtual hard disk, or VHD, files) to be stored and accessed on CIFS shares specified by a Uniform Naming Convention (UNC) path or a share mapped to a drive letter. While this provides a certain degree of flexibility in storing virtual machines on Windows file servers, CIFS is an inefficient protocol for the block-based access required by Hyper-V and not a good choice. It is disappointing to note that Microsoft currently doesn’t support Hyper-V guests on NFS shares. This seems like a glaring omission.

Block-level access: Fibre Channel and iSCSI

Block protocols include iSCSI, Fibre Channel and FCoE. Fibre Channel and FCoE are delivered over dedicated host adapter cards (HBAs and CNAs, respectively). iSCSI can be delivered over standard NICs or using dedicated TOE (TCP/IP Offload Engine) HBAs. For both VMware and Hyper-V, the use of Fibre Channel or FCoE means additional cost for dedicated storage networking hardware. iSCSI doesn’t explicitly require additional hardware, but customers may find dedicated hardware necessary to gain better performance.

VMware supports all three block storage protocols. In each case, storage is
presented to the VMware host as a LUN. Block storage has the following
advantages.

 Each LUN is formatted with Virtual Machine File System, or VMFS, which
is specifically written for storing virtual machines.

 VMware supports multipath I/O for iSCSI and Fibre Channel/FCoE.

 Block protocols support hardware acceleration through vStorage APIs for Array Integration (VAAI). These hardware-based instructions improve the performance of data migration and locking to increase throughput and scalability.

 ESXi 4.x supports “boot from SAN” for all protocols, enabling stateless
deployments.

 SAN environments can use RDM (Raw Device Mapping), which enables
virtual guests to write non-standard SCSI commands to LUNs on the
storage array. This feature is useful on management servers.

For VMware, there are some disadvantages to using block storage.

 VMFS is proprietary to VMware, and data on a VMFS LUN can be accessed only through the hypervisor. This process is cumbersome and slow.

 Replication of SAN storage usually occurs at the LUN level; therefore, replicating a single VMware guest is more complex and wasteful of resources where multiple guests exist on the same VMFS LUN.

 iSCSI traffic cannot be encrypted and so passes across the network in plain view.
 iSCSI security is limited to CHAP (Challenge-Handshake Authentication Protocol), which isn’t centralised and has to be managed through the storage array and/or VMware host. In large deployments this may be a significant management overhead.

Hyper-V is deployed either as part of Windows Server 2008 or as Windows Hyper-V Server 2008, both of which are Windows Server variants. Therefore, virtual guests gain all the benefits of the underlying operating system, including multipathing support. Individual virtual machines are stored as VHD files on LUNs mapped to drive letters or Windows mount points, making them easy to back up or clone.

Summary

NFS storage is suitable only for VMware deployments and is not supported by Hyper-V. Typically, NAS filers are cheaper to deploy than Fibre Channel arrays, and NFS provides better out-of-band access to guest files without the need to use the hypervisor. In the past, NFS was widely used for supporting data like ISO installation files, but today it has wider deployments where the array architecture supports the random I/O nature of virtual workloads.

CIFS storage is supported by Hyper-V but is probably best avoided in favour of iSCSI, even in test environments; Microsoft has now made its iSCSI Software Target freely available.

Block-based storage works well on both virtualisation platforms but can require additional hardware. Directly accessing data is an issue for iSCSI/Fibre Channel/FCoE, making data cloning and backup more complex.

Overall, the choice of platform should be considered against your requirements. There are clearly pros and cons with either a file- or block-based approach, each of which can coexist in the same infrastructure. There’s no doubt that both will find homes with server virtualisation for many years to come.

Object storage vs block vs file

We recap the key attributes of file and block storage access, and the pros and cons of object storage, a method that offers key benefits but also drawbacks compared with SAN and NAS.

The emergence of object storage as a viable means of data retention upsets the existing methods – closely connected – of file and block storage, also known as NAS and SAN.

This article will recap the fundamentals of file and block, but with the purpose of highlighting the quite different characteristics of object storage; all three are forms of shared storage. In the final analysis, we will suggest the use cases most suited to object storage, as well as to file and block.

The trigger is the rise of object storage, which has become prominent in the
form of array-type products as well as being the basis for cloud-based
protocols such as Amazon’s S3.

To see how object storage differs significantly from SAN and NAS protocols,
let’s first look at those.

File and block are file system-based methods of storage access. In both cases, there is a file system. We are all familiar with them – FAT and NTFS in Windows, ext in Linux, and so on. They organise data into files and folders in a tree-like hierarchy and give a path to the file, while also retaining a small amount of metadata about the file.

That is the part we see. But under the bonnet, that file path and the file system
also handle addressing to the physical location of blocks of storage on the
media itself.

The key difference between file access/NAS and block access/SAN is that in
NAS, the file system resides on the array. Here, an application’s I/O requests
go via the file system resident on the NAS hardware, accessed as a volume or
drive. In a SAN, the file system is external to the array and I/O calls are
handled by the file system on the server, with only block-level information
required to access data from the SAN.

Key practical difference

From that distinction arises the key practical difference between NAS and SAN.

NAS is best suited to retention and access of entire files and has locking systems that
prevent simultaneous changes and corruption to files.

Meanwhile, SAN systems allow changes to blocks within entire files and so are
extremely well suited to database and transactional processing.

Both usually come as array products, even if software-defined, and – depending on how high-end or not – with features such as synchronous and asynchronous replication, snapshots, compression and deduplication, and storage tiering. Both can also take advantage of flash storage.

SAN and NAS are well suited to what they do, but have drawbacks.
For example, NAS can be limited by scale. Historically, organisations put in a NAS
box to service a department, but these proliferated and were unconnected, leading to
silos of data. This issue is overcome with scale-out NAS, where multiple NAS
instances operate a single, highly scalable parallel file system.

The tree-like file system hierarchy can handle millions of files quite easily, but once
you scale to billions, it can start to slow down.

Massive scalability
Object storage brings massive scalability. That is because it works differently from
the SAN and NAS protocols. It has no file system but, like NAS, changes are at the
file level.

Instead of a tree-like hierarchy, object storage organises files, or objects, in a flat
layout. Objects are just objects, with unique identifiers.

That means object storage is massively scalable, to billions of objects, because the file
organisation does not become unwieldy as it grows.

Objects also carry metadata, potentially lots of it, all definable by the customer.
That means any attribute can be associated with an object in its header metadata: the
application it is associated with, its data protection characteristics, tiering information,
when it should be deleted, as well as custom business- or organisation-related attributes.

So, object storage is eminently suited to analytics, being searchable in very large
datasets for potentially almost any attribute.
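A minimal in-memory sketch illustrates the idea: a flat namespace of unique identifiers, each object carrying arbitrary customer-defined metadata that can be searched on any attribute. The `put` and `search` functions here are hypothetical stand-ins, not a real object storage API.

```python
import uuid

# Hypothetical in-memory object store: a flat namespace of unique IDs,
# with no directory hierarchy, only objects plus arbitrary metadata.
store = {}

def put(data: bytes, **metadata) -> str:
    """Store an object under a generated unique ID; any attributes allowed."""
    object_id = str(uuid.uuid4())
    store[object_id] = {"data": data, "metadata": metadata}
    return object_id

def search(**criteria):
    """Return IDs of objects whose metadata matches every given attribute."""
    return [oid for oid, obj in store.items()
            if all(obj["metadata"].get(k) == v for k, v in criteria.items())]

oid = put(b"scan-0001", application="radiology", tier="archive",
          delete_after="2030-01-01")
put(b"scan-0002", application="radiology", tier="hot")

# Any metadata attribute is searchable, which is what suits analytics.
assert search(application="radiology", tier="archive") == [oid]
```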

Data protection is usually by erasure coding, sometimes by replication, although the
former is considered more efficient than the latter because it produces less overhead
data.
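To put the overhead comparison in concrete terms, consider three-way replication against a 10+4 erasure coding scheme (both scheme parameters are illustrative assumptions, not figures from any particular product):

```python
# Illustrative overhead arithmetic; the scheme parameters are assumptions.

# Three-way replication: every byte is stored three times over.
replication_copies = 3
replication_overhead = replication_copies - 1  # 200% extra data

# Erasure coding 10+4: data is split into 10 fragments plus 4 parity
# fragments; any 10 of the 14 suffice to rebuild, tolerating 4 losses.
data_fragments, parity_fragments = 10, 4
ec_overhead = parity_fragments / data_fragments  # 40% extra data

print(f"replication:    {replication_overhead:.0%} overhead")
print(f"erasure coding: {ec_overhead:.0%} overhead")
```

On these assumed parameters, erasure coding stores 40% extra data against replication's 200%, which is why it is usually considered the more efficient scheme.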
Almost always, however, object storage data is “eventually consistent”, which means
the multiple instances required for data protection schemes to work are not updated
instantaneously, or anywhere near it. They will eventually become consistent with each
other as erasure coding/replication works its way between locations.


But that multiple location attribute can also be an advantage, making object storage
well-suited to an organisation with multi-regional needs.

By contrast, SAN and NAS can be “strongly consistent”, with near real-time mirrors
of datasets possible.

Also, object storage cannot match the performance of SAN, and sometimes NAS, mainly
because of the large file header overheads it carries. Nor can it offer the sub-file,
block-level manipulation required for database and transactional work that SAN
access provides.

For those two key reasons, object storage is best suited to large datasets
of unstructured data in which objects do not change that often.

Outside the pros and cons of the technology per se, object storage has the advantage
of relative cheapness, often running on commodity hardware. That is in contrast to
potentially expensive packaged array-type products from storage box suppliers.

Having said that, costs can come in other areas, such as changes to your software
environment. Not all applications will necessarily be natively compatible with object
storage access calls. Built for NFS, SCSI, and so on, they will need adapting to deal with
the Get, Put, Delete and other commands of object storage.
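The shape of that adaptation can be sketched as follows. The same "save and reload" logic is shown first as path-based file calls, then as bucket-and-key Get/Put/Delete operations; `ObjectClient` is a hypothetical stand-in for an S3-style client, not a real library.

```python
import os
import tempfile

class ObjectClient:
    """Hypothetical minimal stand-in for an S3-style object storage client."""
    def __init__(self):
        self._objects = {}
    def put(self, bucket, key, body):
        self._objects[(bucket, key)] = body
    def get(self, bucket, key):
        return self._objects[(bucket, key)]
    def delete(self, bucket, key):
        del self._objects[(bucket, key)]

# File-style access: a path, mediated by the file system.
path = os.path.join(tempfile.mkdtemp(), "invoice.txt")
with open(path, "w") as fh:
    fh.write("amount due: 100")
with open(path) as fh:
    assert fh.read() == "amount due: 100"

# Object-style access: bucket + key, whole-object Get/Put/Delete commands.
client = ObjectClient()
client.put("invoices", "2024/invoice-001", b"amount due: 100")
assert client.get("invoices", "2024/invoice-001") == b"amount due: 100"
client.delete("invoices", "2024/invoice-001")
```

The point of the contrast is that an application built around open/read/write on paths has to be reworked to issue whole-object commands against keys, which is where adaptation costs arise.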

To sum up:

 NAS: Good at secure file sharing. Can become siloed and struggles at extreme scale,
although scale-out NAS is potentially good at scale.

 SAN: Good at transactional and database workloads. Can be expensive.

 SAN and NAS: Both can come with advanced storage features, such as replication.
Both can be relatively costly compared with object storage on commodity
hardware, although both SAN and NAS software-defined storage are available.
Both lack the rich metadata of object storage.

 Object storage: Very scalable, suited to unstructured data and large datasets,
potentially good for analytics via rich metadata. Lacks high-end performance and
data protection is slow across clusters. Can be very cost-efficient, hardware-wise.
