A view on Cloud Storage Services and related protocols
The goal of this paper is to identify the common storage concerns in a cloud environment.
By listing the major cloud storage services and the protocols they use, we expect to get an idea of which types of connectors we should provide.
Cloud Storage Services list
| Name | Vendor | Description | Supported protocols |
|---|---|---|---|
| | SwiftStack | San Francisco-based developer of a commercial version of the open-source Swift object storage technology. | NFS, SMB, Swift, S3 |
| Amazon EFS | Amazon | Amazon EFS is a new, fully managed service for setting up and scaling file storage in the AWS Cloud. | NFS, on-premises connection (AWS Direct Connect) |
| | Amazon AWS | Provides storage through web services interfaces. Designed as a complete storage platform. | S3 API, local file server (AWS Storage Gateway) |
| | Amazon AWS | A secure, durable, and extremely low-cost cloud storage service for data archiving and long-term backup. | Proprietary API |
| Azure Storage | Microsoft | Cloud service that provides storage that is highly available, secure, durable, scalable, and redundant. Azure Storage consists of Blob storage, File storage, and Queue storage. | Developer SDK, REST API |
| | Ctera, USA | File sync and share services that encourage user adoption while ensuring total governance over all methods of data access and storage: any device, any service, any location. | Proprietary sync agent |
| | Oracle | Enterprise-proven object storage and archive services for cloud-based data storage, sharing, and protection. Secure, resilient, elastic, and simple to use, so that data is available when users need it from any environment connected to the Internet. | Swift REST API, Java client |
| OpenStack Object Storage (Swift) | OpenStack | OpenStack Object Storage (Swift) provides software that stores and retrieves data over HTTP. Objects (blobs of data) are stored in an organizational hierarchy that offers anonymous read-only access, ACL-defined access, or even temporary access. | Swift, NFS, CIFS, GlusterFS, HDFS |
| Cloud Files | Rackspace | Cloud Files provides online object storage for files and media, delivering them globally at blazing speeds over a worldwide content delivery network (CDN). | Swift, NFS, client API |
| ECS | Dell/EMC | ECS brings all the benefits of a public cloud to your own datacenter while keeping its cost under control. It can be used for a wide variety of workloads such as deep archive, geo protection of Hadoop, and Internet of Things. | HDFS, object storage API |
| | | A unified offering across the availability spectrum: from live data tapped by today’s most demanding applications, to the cloud archival solutions Nearline and Coldline. Featuring a consistent API, latency, and speed across storage classes. | Client API & libraries, Amazon S3 |
| | Commvault | Back up your databases, files, applications, endpoints and VMs with maximum efficiency according to data type and recovery profile. Integrate hardware snapshots. Optimize storage with deduplication. | Proprietary agent and API |
| Egnyte Connect | Egnyte | Egnyte Connect delivers Enterprise File Sync and Share (EFSS), designed with businesses in mind, so IT can focus on security & performance, while users can access all their content from their desktop, mobile and browser. | Proprietary clients (desktop, mobile, web) |
| | Box | Create, edit and review documents with others in real time from anywhere, on any device. | Proprietary clients, API & SDK |
| Dropbox | Dropbox | Dropbox creates a special folder on the user’s computer, the contents of which are then synchronized to Dropbox’s servers and to other computers and devices that the user has installed Dropbox on, keeping the same files up-to-date on all devices. | Proprietary clients, API & SDK |
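To get a first idea of which connectors would cover the most services, the table above can be tallied by protocol. The snippet below is only a rough sketch of that count. It rests on a few assumptions that are not part of the table itself: SMB and CIFS are counted as one family, only the standard protocols are tallied, and a row that names no standard protocol is counted as reachable through vendor-specific means only. The names `SERVICES` and `summarize` are illustrative.

```python
from collections import Counter

# (vendor, standard protocols) transcribed from the table above.
# Assumptions of this sketch: SMB and CIFS are one family; SDKs, agents,
# clients and vendor APIs are not listed here, so an empty list means the
# row names no standard protocol at all.
SERVICES = [
    ("SwiftStack", ["NFS", "SMB/CIFS", "Swift", "S3"]),
    ("Amazon", ["NFS"]),
    ("Amazon AWS", ["S3"]),
    ("Amazon AWS", []),
    ("Microsoft", []),
    ("Ctera", []),
    ("Oracle", ["Swift"]),
    ("OpenStack", ["Swift", "NFS", "SMB/CIFS", "GlusterFS", "HDFS"]),
    ("Rackspace", ["Swift", "NFS"]),
    ("Dell/EMC", ["HDFS"]),
    ("(vendor not listed)", ["S3"]),
    ("Commvault", []),
    ("Egnyte", []),
    ("Box", []),
    ("Dropbox", []),
]


def summarize(services):
    """Return (services per standard protocol, services with none)."""
    coverage = Counter(p for _, protocols in services for p in protocols)
    proprietary_only = sum(1 for _, protocols in services if not protocols)
    return coverage, proprietary_only


if __name__ == "__main__":
    coverage, proprietary_only = summarize(SERVICES)
    for protocol, count in coverage.most_common():
        print(f"{protocol}: {count}")
    print(f"vendor-specific access only: {proprietary_only}")
```

Under these assumptions, a file-protocol connector and connectors for the standard object APIs reach a good share of the list, while the remaining services each need their own client or a locally synchronized folder.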
Crawling use cases
File System (sync or file server)
In this case, the vendor solution exposes the files on a local computer (desktop or server) using one of the following two methods:
- A synchronization agent: a local daemon keeps the files synchronized to a local directory.
- A file sharing service: the files are served over a common file protocol (NFS, CIFS).
The crawling process is done by browsing the directories using the standard file crawler.
Who? Dropbox, Box, Google Drive, Microsoft OneDrive, ownCloud, …
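As an illustration of the file system case, the directory browsing step could look like the sketch below. It assumes the synchronized folder or the NFS/CIFS share is already available as a local path. The function name `crawl_directory` and the choice to report only path and size are assumptions made for this example and do not describe any particular crawler.

```python
import os
from typing import Iterator, Tuple


def crawl_directory(root: str) -> Iterator[Tuple[str, int]]:
    """Yield (path, size_in_bytes) for every regular file under root."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Sorting in place makes os.walk visit subdirectories in a stable order.
        dirnames.sort()
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            # Skip broken links and special files.
            if os.path.isfile(path):
                yield path, os.path.getsize(path)


if __name__ == "__main__":
    # "." stands in for the synchronized folder or mounted share.
    for path, size in crawl_directory("."):
        print(f"{size:>10}  {path}")
```

Because the vendor-specific part is handled by the sync agent or the file server, the same code applies to every service in this category.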
Object storage
The files are exposed through a documented API, and some vendors also provide client implementations. The API can be standard (S3, Swift, HDFS) or proprietary.
The crawling process differs from the standard file crawler only in that it calls an API to list the files and collect their content.
Who? Rackspace, OpenStack, Amazon S3, …
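The list-then-fetch loop described for object storage can be sketched as follows. The `ObjectStore` interface, its `list_keys` and `get` methods, and the `InMemoryStore` stand-in are hypothetical names introduced for this example; they are not the S3 or Swift client API. A real connector would implement the same two operations on top of the vendor's listing and download calls, which typically return keys page by page.

```python
from typing import Dict, Iterator, List, Optional, Protocol, Tuple


class ObjectStore(Protocol):
    """Hypothetical minimal interface a connector would implement."""

    def list_keys(self, marker: Optional[str], limit: int) -> List[str]: ...

    def get(self, key: str) -> bytes: ...


class InMemoryStore:
    """Stand-in for a real service, used only to make the sketch runnable."""

    def __init__(self, objects: Dict[str, bytes]) -> None:
        self._objects = dict(objects)

    def list_keys(self, marker: Optional[str], limit: int) -> List[str]:
        keys = sorted(self._objects)
        if marker is not None:
            # Return only the keys that come after the last one already seen.
            keys = [k for k in keys if k > marker]
        return keys[:limit]

    def get(self, key: str) -> bytes:
        return self._objects[key]


def crawl_store(store: ObjectStore, page_size: int = 2) -> Iterator[Tuple[str, bytes]]:
    """List keys page by page and fetch the content of each object."""
    marker: Optional[str] = None
    while True:
        page = store.list_keys(marker, page_size)
        if not page:
            break
        for key in page:
            yield key, store.get(key)
        marker = page[-1]


if __name__ == "__main__":
    demo = InMemoryStore({"docs/a.txt": b"alpha", "docs/b.txt": b"beta"})
    for key, content in crawl_store(demo):
        print(key, len(content))
```

Keeping the crawler behind such a small interface means the standard APIs and the proprietary ones differ only in the connector that implements it.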
Backup & vault storage
This kind of storage provides a very specific API. The files are not directly visible: what is exposed first are backup entities (archives).
Most of the time, accessing the files is a slow and complex process.
Crawling those archives may be useful for indexing old data. However, it would probably be more interesting to crawl the files before they enter the vault.
The files are exposed