Where all of a company's data could once be stored on a single hard drive, any functional system now requires separate storage for each service, for instance email servers, DBMSs, domain services, and so on. Storage systems let you decentralize information by dispersing it across different storage facilities.
Increased competition requires an increasingly in-depth analysis of information about the market, customers, their preferences, orders and the actions of competitors. However, a single server can hold only a limited number of hard drives, which is often not enough to cover the capacity the system requires. This is where data storage systems come to the rescue.
Storing data is only one of the functions of modern storage systems. They also save storage space through compression and deduplication. Compression re-encodes data so that redundant information inside a file takes up less room, while deduplication keeps a single copy of identical files and replaces the duplicates with links to that copy.
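To make the two space-saving techniques concrete, here is a minimal Python sketch. It is only an illustration, not how any particular storage product works: the DedupStore class and its method names are made up for this example, deduplication is done per whole file using a SHA-256 content hash, and compression uses the standard zlib module.

```python
import hashlib
import zlib


class DedupStore:
    """Toy store (hypothetical): identical payloads are kept once, compressed."""

    def __init__(self):
        self.chunks = {}  # content hash -> compressed bytes, stored only once
        self.names = {}   # file name -> content hash (the "link" to the data)

    def put(self, name: str, data: bytes) -> None:
        digest = hashlib.sha256(data).hexdigest()
        # Deduplication: store the payload only if this content is new.
        if digest not in self.chunks:
            # Compression: shrink redundant information inside the payload.
            self.chunks[digest] = zlib.compress(data)
        self.names[name] = digest

    def get(self, name: str) -> bytes:
        return zlib.decompress(self.chunks[self.names[name]])

    def stored_bytes(self) -> int:
        return sum(len(chunk) for chunk in self.chunks.values())


store = DedupStore()
report = b"quarterly sales report\n" * 1000

store.put("report.txt", report)
store.put("report_copy.txt", report)  # same content, so no new chunk is stored

print("logical size:", 2 * len(report), "bytes")
print("physical size:", store.stored_bytes(), "bytes")
```

Both names resolve to the same stored chunk, so the second copy costs only a dictionary entry, and the highly repetitive payload compresses to a small fraction of its original size.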
So let’s make it easier for you. PS: Did you know that you can get these and GPU servers at hostkey.com?
What types of data storage systems are there?
Storage systems are classified as file, block, or object storage. The type determines the form in which data is stored and the method of accessing it, and, as a result, the ease of management and the speed of data access.
File
File storage systems store information in the form of files collected in directories (folders). Files are organized and retrieved using metadata that records where a particular file is located. Such a system can be pictured as a catalogue.
Block
Block storage systems split data into blocks that are stored independently of each other. Each block is assigned an identifier, which allows the system to place it wherever is most convenient. Unlike file storage, block storage does not rely on a single path to the data.
Object
Object storage systems split data into “objects” that are kept in one shared storage pool. The pool can be divided into volumes, and each object has a unique identifier and detailed metadata that allow it to be located quickly. This approach lends itself to distributed systems.
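The difference between the three models is mostly a difference in how data is addressed. The following Python sketch mimics each model with plain in-memory dictionaries. It is a simplified illustration under assumed names (file_store, block_store, object_store, put_object and the 4-byte block size are all invented for the example), not a real storage implementation.

```python
import uuid

# File storage: data is found by its path in a directory hierarchy.
file_store = {"/home/alice/photos/cat.jpg": b"jpeg bytes"}
photo = file_store["/home/alice/photos/cat.jpg"]

# Block storage: data is cut into blocks, each addressed by an identifier.
BLOCK_SIZE = 4  # unrealistically small, chosen only to keep the example readable


def split_into_blocks(data: bytes, size: int) -> list:
    return [data[i:i + size] for i in range(0, len(data), size)]


block_store = {}  # block id -> block contents
payload = b"ABCDEFGHIJ"
block_ids = []
for block_id, block in enumerate(split_into_blocks(payload, BLOCK_SIZE)):
    block_store[block_id] = block
    block_ids.append(block_id)

# The blocks can live anywhere; the id list is enough to rebuild the data.
reassembled = b"".join(block_store[i] for i in block_ids)

# Object storage: each object gets a unique id plus descriptive metadata.
object_store = {}  # object id -> {"data": ..., "metadata": ...}


def put_object(data: bytes, metadata: dict) -> str:
    object_id = str(uuid.uuid4())
    object_store[object_id] = {"data": data, "metadata": metadata}
    return object_id


invoice_id = put_object(b"%PDF...", {"type": "invoice", "year": 2024})

# Metadata makes it possible to find objects without knowing any path.
invoices = [oid for oid, obj in object_store.items()
            if obj["metadata"].get("type") == "invoice"]
```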
The principle of operation of storage systems – NAS, SAN and DAS
There are several hardware components, software, and protocols that ultimately give storage solutions their unique properties.
Based on the classification above, there are two main types of storage systems, which differ in the level at which data is stored, read and written.
- The first option works with data at the file level. This means that such storage functions as a server with its own file system. In practice, the client issues commands such as “write these bits to this file” or “read these bits from this file”. This type of storage is called NAS.
- The second option is block-level data access. This speeds up the exchange of data between the server and the storage because requests address blocks directly, e.g. “write block X” or “read block X”. Such storage devices are connected to each other and to the server either as a DAS or via a SAN.
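The contrast between the two access levels can be sketched in a few lines of Python. This is only an approximation under stated assumptions: an ordinary file named disk.img stands in for a raw block device, the 512-byte block size is assumed, and the read_block and write_block helpers are hypothetical names introduced for the example.

```python
import os
import tempfile

BLOCK_SIZE = 512  # assumed block size for this sketch


def write_block(device_path: str, block_number: int, data: bytes) -> None:
    """Block-level write: address the data by block number, not by file name."""
    if len(data) != BLOCK_SIZE:
        raise ValueError("a block write must cover exactly one block")
    with open(device_path, "r+b") as dev:
        dev.seek(block_number * BLOCK_SIZE)
        dev.write(data)


def read_block(device_path: str, block_number: int) -> bytes:
    """Block-level read: fetch one whole block by its number."""
    with open(device_path, "rb") as dev:
        dev.seek(block_number * BLOCK_SIZE)
        return dev.read(BLOCK_SIZE)


workdir = tempfile.mkdtemp()

# File-level access (what a NAS client does): name a file, read or write it.
file_path = os.path.join(workdir, "notes.txt")
with open(file_path, "wb") as f:
    f.write(b"hello from the file level")
with open(file_path, "rb") as f:
    file_data = f.read()

# Block-level access (DAS or SAN): a zero-filled image of 8 blocks plays
# the role of the device; there are no file names, only block numbers.
device_path = os.path.join(workdir, "disk.img")
with open(device_path, "wb") as dev:
    dev.write(bytes(8 * BLOCK_SIZE))

write_block(device_path, 3, b"X" * BLOCK_SIZE)
block_data = read_block(device_path, 3)
```

In the block-level case the storage knows nothing about files. Any file system on top of the blocks is the responsibility of the server that uses them.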
Let’s talk about each of them in more detail.
NAS
NAS stands for Network Attached Storage, or simply network storage. Because data is processed at the file level, a NAS appears to its clients as a network server with its own file system.
To explain it more simply, imagine a desktop computer connected to a home router. It stores photos, videos, documents and other data. Network access is allowed to all users – this is approximately what a NAS looks like.
NAS storage can take many forms. For example, a production server may be connected to other servers, virtual machines or so-called disk stations, each of which holds a varying number of removable hard drives.
NAS benefits:
- Availability and low cost.
- Easy to connect and manage.
- Flexibility: the ability to quickly increase data storage capacity.
- Universality of clients (a computer running any operating system can access files).
Disadvantages of NAS:
- Storing data only in the form of files.
- Slow access to information via network protocols (compared to the local system).
- Some applications are unable to work with network drives.
DAS
DAS stands for Direct Attached Storage: storage connected directly to a workstation or server. For example, connecting an external drive via USB can be called DAS.
The fundamental simplicity of the DAS architecture leads to its main advantages: affordable price and relative ease of implementation. In addition, such a configuration is easier to manage because the number of system elements is small.
Inside the system, there is a power supply, cooling and a RAID controller, which ensures the reliability and fault tolerance of the storage. The system is managed using its built-in operating system.
Advantages of DAS:
- Ease of deployment and administration.
- High data transfer speed.
- Low cost of equipment.
Disadvantages of DAS:
- Requires a dedicated server.
- Connection restrictions (no more than two servers).
SAN
SAN stands for Storage Area Network. As a rule, a SAN presents external storage to servers as several network block devices. SANs are typically implemented using the FC (Fibre Channel) or iSCSI (Internet Small Computer Systems Interface) protocol. This gives block access directly to a storage device: a disk or sets of disks in the form of RAID groups or logical devices.
By the way, the DAS described above can be powerful and is often cheaper than a SAN. At the same time, the disadvantage of DAS is that it cannot be easily expanded: the number of connected computers is limited by the physical number of SAS ports on the DAS (usually, there are only four). Therefore, many companies and institutions prefer block storage connected via a SAN.
SAN benefits:
- High speed, low latency.
- Flexibility and scalability.
- Storing data in blocks.
- High reliability of data exchange and storage.
- Unloading the subnet from service traffic.
Disadvantages of SAN:
- Design complexity.
- High price.
- Some applications and systems cannot work with the iSCSI protocol.
How do you choose a storage system?
First of all, you need to understand what problems it will solve. It is essential to decide on several basic parameters.
Data type
Different types of data require different access speeds, processing technologies, compression, and so on. For example, a virtual storage system for working with large media files differs from a system that will work with unstructured data for a neural network.
Data volume
The choice of disk drives depends on this. Sometimes, you can get by with a consumer-grade SSD – if you know that the storage capacity, even in the worst case, will be at most 300 GB, and access speed is not critical.
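A quick projection helps check whether the worst case really stays under such a limit. The Python sketch below is only a rough estimate with invented inputs: the starting volume of 100 GB, the 5% monthly growth and the 24-month horizon are assumptions, and the 300 GB ceiling is the figure from the paragraph above.

```python
def projected_volume_gb(current_gb: float, monthly_growth: float, months: int) -> float:
    """Compound growth: the volume grows by the same percentage every month."""
    return current_gb * (1 + monthly_growth) ** months


# Hypothetical inputs, chosen only for illustration.
current_gb = 100.0
monthly_growth = 0.05  # 5% per month
horizon_months = 24
limit_gb = 300.0       # capacity ceiling from the text

worst_case_gb = projected_volume_gb(current_gb, monthly_growth, horizon_months)
fits = worst_case_gb <= limit_gb

print(f"projected volume after {horizon_months} months: {worst_case_gb:.1f} GB")
print("fits within limit:", fits)
```

With these assumed numbers the projection ends up slightly above 300 GB, which shows how a volume that looks comfortably small today can outgrow a single consumer-grade drive within two years.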
Fault tolerance
You need to understand what losing the data from a given period would cost. This will help you calculate RPO (Recovery Point Objective) and RTO (Recovery Time Objective) and also avoid unnecessary backup costs. Backups, backups and more backups.
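As a rough illustration of how these two objectives translate into a backup schedule, consider the following Python sketch. All names and figures in it are hypothetical: the BackupPlan class, the cost of 500 per hour of lost data, and the targets of 4 hours for RPO and 8 hours for RTO are assumptions made for the example. It also assumes the simplest possible model, in which the worst-case data loss equals the interval between backups.

```python
from dataclasses import dataclass


@dataclass
class BackupPlan:
    backup_interval_hours: float  # time between two consecutive backups
    restore_hours: float          # time needed to restore from a backup


def worst_case_loss_cost(plan: BackupPlan, cost_per_lost_hour: float) -> float:
    # If the failure happens just before the next backup,
    # everything written since the previous backup is lost.
    return plan.backup_interval_hours * cost_per_lost_hour


def meets_objectives(plan: BackupPlan, rpo_hours: float, rto_hours: float) -> bool:
    # RPO limits how much recent data may be lost; RTO limits the downtime.
    return (plan.backup_interval_hours <= rpo_hours
            and plan.restore_hours <= rto_hours)


# Hypothetical figures, for illustration only.
COST_PER_LOST_HOUR = 500.0
RPO_HOURS = 4.0
RTO_HOURS = 8.0

nightly = BackupPlan(backup_interval_hours=24, restore_hours=4)
hourly = BackupPlan(backup_interval_hours=1, restore_hours=4)

for name, plan in (("nightly", nightly), ("hourly", hourly)):
    print(name,
          "worst-case loss cost:", worst_case_loss_cost(plan, COST_PER_LOST_HOUR),
          "meets objectives:", meets_objectives(plan, RPO_HOURS, RTO_HOURS))
```

Under these assumptions the nightly plan misses the RPO while the hourly plan satisfies both targets. Backing up more often than the RPO requires would only add cost.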
Performance
If the storage system is purchased for a new project (the load of which is difficult to predict), then it is better to talk with colleagues who have already solved this problem or test the storage system.
Summing it up
If you are now looking for a solution for working with data, you can rent a dedicated web server and data storage systems from hostkey. They will provide the server with a fast Internet connection, an uninterrupted power supply and 24/7 support.