A. Storage Types
Storage is typically divided into three
major types.
1. Block Storage
Data is stored in evenly sized blocks. When you update data, only the required blocks are updated, which makes writes fast.
It provides the lowest latency, the highest performance and high redundancy.
Block storage is suitable for transactional databases, random read/write workloads and structured databases.
2. File Storage
Data is stored as a single file of information in a folder, so if you want to update the information you need to overwrite the whole file. Files are stored in folders, and folders can be nested under other folders, so files are organized into directories and sub-directories. Each file has a limited set of metadata, such as name, creation date, modified date and created by.
File storage works fine when you have a limited number of files; as the number of files grows, performance degrades drastically, and you will face issues when searching for or updating files.
3. Object Storage
Object storage was introduced to overcome the limitations of file storage.
In object storage, each file is a separate object. Objects are stored in a flat address space, meaning there is no hierarchy concept: all objects are stored at the same level.
In object storage, a file is bundled into an object along with metadata tags and a unique identifier.
Because object storage keeps the object (file), its metadata and its unique identifier together, it is ideally suited for distributed storage architectures and scales easily on cheap hardware compared to block and file storage.
Object storage cannot be mounted as a drive on a virtual server; you can access the objects (files) only through an API or the command line.
B. Distributed architecture
Data processing and data storage are not kept on a single machine; instead they are distributed over several independent machines. The machines can be located in different locations and are connected via a network.
C. Simple Storage Service(S3)
S3 is the object-based storage service offering of AWS. S3 has a distributed architecture, where objects are stored in multiple locations on AWS infrastructure.
Using the API, the SDKs or the AWS console, you can store and retrieve any amount of data at any time, from anywhere, over the internet.
You can store an unlimited number of objects in S3, and the size of an object can be from 0 bytes to 5 TB.
There are two main building blocks/components of Simple Storage Service (S3).
1. Bucket
A Bucket is a flat container of objects; it does not provide any hierarchical structure. If you want to store objects in S3, you first need to create a Bucket. A Bucket is region-specific in AWS.
a. A Bucket is a flat container of objects, but you can create logical folders inside the Bucket.
b. You can store an unlimited number of objects in a Bucket.
c. You cannot create nested Buckets.
d. By default, you can create a maximum of 100 Buckets, but this is a soft limit and can be increased.
e. Ownership of a Bucket cannot be transferred.
f. A Bucket name is globally unique across all regions and accounts, meaning once someone chooses a Bucket name, you cannot use that name. However, if that Bucket is deleted, the same name becomes available for use again.
g. A Bucket name cannot be renamed/changed.
h. Bucket names have a specific naming convention that you have to follow while creating the Bucket.
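To make the naming convention concrete, the core published rules (3–63 characters; lowercase letters, digits, dots and hyphens; must start and end with a letter or digit) can be sketched as a small validator. This is a simplified check, not an exhaustive one — for example, it does not reject names formatted as IP addresses:

```python
import re

# Simplified sketch of the S3 bucket naming convention:
# 3-63 chars, lowercase letters/digits/dots/hyphens,
# must start and end with a letter or digit.
BUCKET_NAME_RE = re.compile(r"^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$")

def is_valid_bucket_name(name: str) -> bool:
    """Return True if `name` passes this simplified naming check."""
    return BUCKET_NAME_RE.match(name) is not None

print(is_valid_bucket_name("my-app-logs-2020"))  # True
print(is_valid_bucket_name("MyBucket"))          # False: uppercase not allowed
print(is_valid_bucket_name("ab"))                # False: shorter than 3 chars
```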
2. Object
a. The size of an object can be from 0 bytes to 5 TB.
b. Each object is accessible by a unique ID (name or key).
c. Using a combination of the properties below, you can uniquely identify an object in S3:
i. Service endpoint
ii. Bucket name
iii. Object key (name)
iv. Object version (optional)
d. A Bucket is region-specific, and an object is stored in a Bucket, so an object is also region-specific. Objects never leave that region unless you intentionally move them to another region or enable the CROSS REGION REPLICATION property. We will discuss CROSS REGION REPLICATION in a later section of this blog.
e. S3 provides high data durability: objects are redundantly stored in multiple facilities within the region where the Bucket is created.
3. Storage Classes
Now we have an idea of what Buckets and objects are: we store our files in the form of objects in a Bucket. When we upload objects to a Bucket, AWS asks for a storage class.
We have different types of data in our environments, and every dataset serves a different purpose, so each has different requirements in terms of durability and availability. AWS therefore provides different storage classes based on these needs. Every storage class has different durability, availability and cost; if you want higher availability and durability, you have to pay a higher cost.
Below are the different storage classes provided by AWS for S3 Buckets.
a. S3-Standard
b. S3-Standard Infrequent Access
c. S3 One Zone Infrequent Access
d. S3 Reduced Redundancy
e. S3 Intelligent Tiering
f. S3 Glacier
g. S3 Glacier Deep Archive
1. S3-Standard
If your data is frequently accessed and you need high durability along with high availability, then S3-Standard is your answer.
1. It provides eleven nines durability, i.e. 99.999999999%.
2. It provides four nines availability, i.e. 99.99% over a given year.
3. Three copies of each object are created in different Availability Zones.
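To make the availability figures quoted for the various classes concrete, a quick calculation shows how much downtime per year each availability level permits (a rough illustration; the AWS SLAs define the exact measurement):

```python
# Convert an availability percentage into allowed downtime per year.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes

def downtime_minutes_per_year(availability_percent: float) -> float:
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)

for pct in (99.99, 99.9, 99.5):
    print(f"{pct}% availability -> {downtime_minutes_per_year(pct):.1f} min/year")
```

So four nines allows roughly 53 minutes of downtime per year, three nines roughly 8.8 hours, and 99.5% almost two days.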
2. S3-Standard IA (Infrequent Access)
If you don't access your data frequently, high availability is not a requirement, but you still need high durability and reasonably good availability, then the S3-Standard IA storage class is a good option to store the data.
1. It provides eleven nines durability, i.e. 99.999999999%.
2. It provides three nines availability, i.e. 99.9% over a given year.
3. The minimum storage duration is 30 days, meaning if you put files in a Bucket with this storage class, you pay for at least 30 days' charges, even if you delete them after one day. This is also a reason it is good for long-lived objects.
4. The minimum billable object size is 128 KB, meaning if you upload an object smaller than 128 KB, AWS charges you for 128 KB.
5. Per-GB retrieval charges apply.
6. Three copies of each object are created in different Availability Zones.
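The two minimums above — 30 days of storage and 128 KB of billable size — can be sketched as a small helper (an illustration of the rounding rules only; real S3 billing has more dimensions, such as request and retrieval charges):

```python
# Illustrative sketch of the Standard-IA billing minimums.
MIN_BILLABLE_BYTES = 128 * 1024  # objects under 128 KB are billed as 128 KB
MIN_BILLABLE_DAYS = 30           # deleted earlier, still billed for 30 days

def billable_storage(size_bytes: int, days_stored: int) -> tuple:
    """Return (billable_bytes, billable_days) under the IA minimums."""
    return max(size_bytes, MIN_BILLABLE_BYTES), max(days_stored, MIN_BILLABLE_DAYS)

# A 4 KB object deleted after 1 day is still billed as 128 KB for 30 days.
print(billable_storage(4 * 1024, 1))        # (131072, 30)
print(billable_storage(1_000_000_000, 45))  # (1000000000, 45)
```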
3. S3 One Zone IA
This class is your option if your data is long-lived, infrequently accessed and non-critical (it can be reproduced in case of data loss).
1. It provides eleven nines durability, i.e. 99.999999999%.
2. It provides 99.5% availability over a given year.
3. The minimum storage duration is 30 days.
4. The minimum billable object size is 128 KB, meaning if you upload an object smaller than 128 KB, AWS charges you for 128 KB.
5. Per-GB retrieval charges apply.
6. Only one copy of each object is created, so in case of a Zone failure your data is completely lost. That is why AWS suggests using this class only for non-critical data that can be reproduced if lost.
4. S3 Reduced Redundancy (RRS)
This is not an AWS-recommended storage class. It was created for frequently accessed, non-critical data.
1. It provides four nines durability, i.e. 99.99%.
2. It provides four nines availability, i.e. 99.99% over a given year.
3. Three copies of each object are created in different Availability Zones.
S3-Standard is a more cost-effective storage class compared to the Reduced Redundancy storage class. This is the reason this class is not recommended for use, and it will be removed from the storage class lineup soon.
5. S3 Intelligent Tiering
If the data access pattern is not predictable and the data is long-lived, then S3 Intelligent Tiering is a good option.
Data resides in one of two storage classes. AWS moves data between the S3 Standard and S3 Standard IA storage classes according to how often each object is accessed.
If an object is infrequently accessed, AWS moves it from S3 Standard to S3 Standard IA, and vice versa. An object smaller than 128 KB cannot be moved to S3 Standard IA.
Because AWS applies this logic to change the access tier automatically, AWS charges for monitoring and automation.
1. It provides eleven nines durability, i.e. 99.999999999%.
2. It provides three nines availability, i.e. 99.9% over a given year.
3. The minimum storage duration is 30 days.
4. An extra fee applies for monitoring and automation of the objects.
5. Three copies of each object are created in different Availability Zones.
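The tiering rule described above — objects of at least 128 KB move to Standard IA when they go unaccessed for a while and move back on access — can be sketched as a toy decision function. The 30-day idle window here is an assumption for illustration; the actual monitoring logic is internal to AWS:

```python
MIN_TIERABLE_BYTES = 128 * 1024  # objects under 128 KB stay in S3 Standard
IDLE_DAYS_THRESHOLD = 30         # assumed idle window for this sketch

def target_tier(size_bytes: int, days_since_last_access: int) -> str:
    """Toy model of Intelligent Tiering's Standard <-> Standard IA decision."""
    if size_bytes < MIN_TIERABLE_BYTES:
        return "STANDARD"  # too small to ever move to IA
    if days_since_last_access >= IDLE_DAYS_THRESHOLD:
        return "STANDARD_IA"
    return "STANDARD"  # recently accessed objects sit in (or return to) Standard

print(target_tier(5 * 1024 * 1024, 90))  # STANDARD_IA
print(target_tier(5 * 1024 * 1024, 2))   # STANDARD
print(target_tier(64 * 1024, 365))       # STANDARD: under 128 KB
```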
6. S3 Glacier
It is a solution for long-term backup/archival of data; retrieval time is between minutes and hours.
1. It provides eleven nines durability, i.e. 99.999999999%.
2. It provides four nines availability, i.e. 99.99% over a given year.
3. The minimum storage duration is 90 days.
4. Per-GB retrieval charges apply, but 10 GB of retrieval per month is free with an account.
5. Three copies of each object are created in different Availability Zones.
7. S3 Glacier Deep Archive
It is a solution for archiving rarely accessed data; retrieval time is between 12 and 48 hours.
1. It provides eleven nines durability, i.e. 99.999999999%.
2. It provides four nines availability, i.e. 99.99% over a given year.
3. The minimum storage duration is 180 days.
4. Per-GB retrieval charges apply.
5. Three copies of each object are created in different Availability Zones.
6. The cost is approximately 75% less than the S3 Glacier storage class.
4. S3 Bucket Versioning
Bucket versioning helps you keep multiple versions of an object in the same Bucket. It helps keep your objects safe from accidental deletion or overwrite. It is also used for data retention and archiving older data.
1. When versioning is enabled and you overwrite an existing object, S3 automatically creates a new version of the object. You can also access the older versions whenever required.
2. When versioning is enabled and you try to delete an object, a delete marker is placed on the object. You can still view the object and the delete marker. If you want to recover the deleted object, you just need to delete the delete marker, and the object will be available again.
3. Versioning applies to all objects in the Bucket.
4. If we enable versioning on an existing Bucket that already contains objects, Bucket versioning will protect the existing and new objects and maintain their versions as they are updated.
5. Only the S3 Bucket owner can permanently delete an object once versioning is enabled.
6. Once you enable versioning on a Bucket, you cannot disable it; you can only suspend it. Below are the states of Bucket versioning:
a. Un-versioned
b. Enabled
c. Suspended
7. If you suspend versioning, existing object versions remain as they are; however, objects will not be versioned in future updates.
8. If you GET an object and do not pass a version, S3 returns the most recent version of the object by default.
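The overwrite and delete-marker behaviour can be illustrated with a toy in-memory model. This is a simplification for intuition only — real S3 assigns opaque version IDs rather than stacking bodies in a list:

```python
class ToyVersionedBucket:
    """Minimal in-memory sketch of S3 versioning semantics."""

    def __init__(self):
        self.versions = {}  # key -> list of versions, newest last

    def put(self, key, body):
        # An overwrite appends a new version; older versions are kept.
        self.versions.setdefault(key, []).append(body)

    def delete(self, key):
        # A delete places a marker; it does not remove older versions.
        self.versions.setdefault(key, []).append("<delete-marker>")

    def undelete(self, key):
        # Deleting the delete marker makes the object visible again.
        if self.versions.get(key, [])[-1:] == ["<delete-marker>"]:
            self.versions[key].pop()

    def get(self, key):
        # A GET without a version returns the most recent version,
        # or nothing if the latest entry is a delete marker.
        latest = self.versions.get(key, [None])[-1]
        return None if latest == "<delete-marker>" else latest

b = ToyVersionedBucket()
b.put("report.txt", "v1")
b.put("report.txt", "v2")   # overwrite creates a new version; v1 is kept
print(b.get("report.txt"))  # v2
b.delete("report.txt")
print(b.get("report.txt"))  # None: the delete marker hides the object
b.undelete("report.txt")
print(b.get("report.txt"))  # v2: the object is available again
```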
5. S3 Bucket MFA (Multi-Factor Authentication) Delete
MFA Delete is another level of security to protect an S3 Bucket. You cannot enable MFA Delete via the console; it can be enabled only via the command line or API.
6. S3 Consistency Level
Before discussing the S3 consistency level, we need to understand what data consistency is and the types of data consistency.
Data Consistency
In a distributed architecture, the same data is stored on multiple nodes, and the data may be read from a different node at the same time. The consistency level refers to how consistent the returned data is.
There are two types of consistency:
1. Strong/Immediate Consistency:
If we update an object on any node, it is updated on all other nodes before the object is available to read. This means that for some period the object is not available. It requires a blocking mechanism: the object is blocked from reads until the data has been updated on all nodes, so every consumer sees the same object.
2. Eventual Consistency:
No blocking mechanism is required for eventual consistency. If an object is updated on one node, an immediate read from a different node may not return the updated object.
1. S3 provides strong/immediate consistency for new objects (PUT), meaning if you upload a new object to S3, S3 provides a strong consistency level.
2. S3 provides eventual consistency for overwrites of existing objects.
3. S3 provides eventual consistency for deletes of existing objects.
7. S3 Encryption
You can encrypt S3 data at rest. There are two ways to encrypt data in S3.
1. Server-Side Encryption (SSE)
Data is encrypted by the S3 service before being stored on disk. There is no extra cost to use this feature. There are three ways to achieve server-side encryption.
a. SSE-S3
Data is encrypted by the S3 service using S3-managed encryption keys. S3 regularly rotates the master key and uses AES-256 encryption.
b. SSE-KMS
Data is encrypted by the S3 service using AWS KMS encryption keys.
c. SSE-C
Data is encrypted by the S3 service using a client-provided encryption key. AWS never stores the key, so if the client loses the key, the object can never be accessed again.
2. Client-Side Encryption:
The client encrypts the data on their side and then uploads/transfers the data to S3.
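When uploading through the SDK, the server-side encryption mode is selected per request. As a sketch, the extra parameters a PUT would carry for each SSE mode look roughly like this — shown as plain dictionaries for illustration (with boto3 they would be merged into the `put_object` keyword arguments; the KMS key ARN below is a made-up placeholder):

```python
import base64
import hashlib

def sse_s3_params() -> dict:
    # SSE-S3: S3-managed keys, AES-256.
    return {"ServerSideEncryption": "AES256"}

def sse_kms_params(kms_key_id: str) -> dict:
    # SSE-KMS: encryption with an AWS KMS key.
    return {"ServerSideEncryption": "aws:kms", "SSEKMSKeyId": kms_key_id}

def sse_c_params(key_bytes: bytes) -> dict:
    # SSE-C: the client supplies the key on every request; AWS never stores it.
    return {
        "SSECustomerAlgorithm": "AES256",
        "SSECustomerKey": base64.b64encode(key_bytes).decode(),
        "SSECustomerKeyMD5": base64.b64encode(hashlib.md5(key_bytes).digest()).decode(),
    }

print(sse_s3_params())
print(sse_kms_params("arn:aws:kms:us-east-1:111122223333:key/example"))  # placeholder ARN
```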
8. S3 Static Website
With the help of this feature, you can host your static-content websites using S3.
1. S3-hosted static websites are automatically scaled to meet demand; you don't need any load balancer to scale.
2. There are no extra charges to host static websites on S3.
3. You can use your own domain name with an S3-hosted static website.
4. An S3-hosted website does not support HTTPS; it works only over HTTP.
5. Below are the two formats of an S3-hosted static website URL:
a. Format 1
http://<Bucket Name>.s3-website-<AWS Region>.amazonaws.com
Example: http://mybucket.s3-website-eu-west-3.amazonaws.com
b. Format 2
http://<Bucket Name>.s3-website.<AWS Region>.amazonaws.com
Example: http://mybucket.s3-website.eu-west-3.amazonaws.com
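The two formats differ only in the separator between `s3-website` and the region (a dash or a dot, depending on the region). A small helper makes the pattern explicit:

```python
def website_endpoint(bucket: str, region: str, dot_style: bool = False) -> str:
    """Build an S3 static-website endpoint URL in either format."""
    sep = "." if dot_style else "-"
    return f"http://{bucket}.s3-website{sep}{region}.amazonaws.com"

print(website_endpoint("mybucket", "eu-west-3"))                  # Format 1
print(website_endpoint("mybucket", "eu-west-3", dot_style=True))  # Format 2
```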
9. Pre-Signed URLs
Pre-signed URLs are used to provide temporary access to specific objects for people who don't have AWS credentials.
An expiry date and time is associated with a pre-signed URL.
Pre-signed URLs can be used for downloading or uploading objects.
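The idea can be sketched with a toy signer: the URL carries an expiry timestamp and an HMAC signature over the object path and expiry, so the server can verify it without the caller holding credentials. This is a deliberately simplified illustration — real S3 pre-signed URLs use AWS Signature Version 4, which is considerably more involved, and in practice you would call an SDK helper such as boto3's `generate_presigned_url`:

```python
import hashlib
import hmac
import time

SECRET = b"demo-signing-key"  # stand-in for an AWS secret key

def toy_presign(path, expires_in, now=None):
    """Toy pre-signed URL: path + expiry + HMAC over both."""
    expires = (int(time.time()) if now is None else now) + expires_in
    sig = hmac.new(SECRET, f"{path}:{expires}".encode(), hashlib.sha256).hexdigest()
    return f"https://example-bucket.s3.amazonaws.com{path}?Expires={expires}&Signature={sig}"

def toy_verify(path, expires, sig, now):
    """Server side: recompute the signature and check the expiry."""
    expected = hmac.new(SECRET, f"{path}:{expires}".encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(sig, expected) and now < expires

url = toy_presign("/report.pdf", expires_in=3600, now=1_600_000_000)
print(url)  # valid until timestamp 1600003600
```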
10. CRR (CROSS REGION REPLICATION)
This is a Bucket-level replication feature, which enables automatic and asynchronous copying of objects to a Bucket in a different region, in the same or a different account.
Use cases:
a. Low latency to access the objects
b. Compliance requirements (keep the data in at least two regions)
1. To apply CROSS REGION REPLICATION, versioning must be enabled on both the source and destination Buckets.
2. Replication can happen to only one destination Bucket.
3. If you are setting up CROSS REGION REPLICATION across accounts, then the source Bucket owner must have permission to replicate objects into the destination Bucket.
4. Tags are replicated along with the objects, if any.
5. Objects encrypted with SSE-C (server-side encryption with a client key) or SSE-KMS (server-side encryption with a KMS key) cannot be replicated.
6. Objects in the source Bucket that are themselves replicas created by another CROSS REGION REPLICATION process are not replicated to the destination Bucket.
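As a sketch, the replication configuration applied to the source Bucket (for example via boto3's `put_bucket_replication`) has roughly this shape. The role ARN and Bucket names below are made-up placeholders, and versioning must already be enabled on both Buckets:

```python
def replication_config(role_arn: str, dest_bucket_arn: str) -> dict:
    """Rough shape of a CRR configuration (placeholder values)."""
    return {
        "Role": role_arn,  # IAM role S3 assumes to replicate objects
        "Rules": [
            {
                "ID": "replicate-everything",
                "Status": "Enabled",
                "Prefix": "",  # empty prefix = replicate all objects
                "Destination": {"Bucket": dest_bucket_arn},
            }
        ],
    }

cfg = replication_config(
    "arn:aws:iam::111122223333:role/replication-role",  # placeholder
    "arn:aws:s3:::my-destination-bucket",               # placeholder
)
print(cfg["Rules"][0]["Status"])  # Enabled
```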
11. S3 Transfer Acceleration
This is used to accelerate the process of uploading objects to an S3 Bucket from users over a long distance. If you have a Bucket in a USA region and you are trying to upload an object from India, the upload will take a long time; S3 Transfer Acceleration reduces this upload time.
1. Once S3 Transfer Acceleration is enabled, you cannot disable it; however, you can suspend it.
2. S3 Transfer Acceleration uses CloudFront edge locations: once the data arrives at the CloudFront edge location nearest to the user, it is copied to the destination Bucket over an optimized network path.
3. No data is stored at the CloudFront edge location.
4. It is not HIPAA compliant.
12. S3 Access
You can grant S3 Bucket/object permissions to:
1. Individual users
2. AWS accounts
3. Everyone (make the resource public)
4. All authenticated users (users with AWS credentials)
13. Bucket/Object Access path
To access Buckets/objects via the SDK/API, AWS provides a URL for each Bucket/object. AWS S3 provides two styles of paths for accessing S3 Buckets/objects.
1. Virtual-Hosted-Style URLs
Below is the format:
https://<Bucket Name>.s3.<Region Name>.amazonaws.com/<Object Name>
Example: https://mybucket.s3.us-east-2.amazonaws.com/object.txt
2. Path-Style URLs
Below is the format:
https://s3-<Region Name>.amazonaws.com/<Bucket Name>/<Object Name>
Example: https://s3-us-east-2.amazonaws.com/MyBucket/object.txt
Note:
1. If your region is "us-east-1", i.e. N. Virginia, then you don't need to pass the region name in the URL.
2. Support for the path-style model ends on September 30, 2020.
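Both URL styles can be generated with a small helper; the path-style case below reproduces the example above, and the virtual-hosted case follows the format shown in this section:

```python
def virtual_hosted_url(bucket: str, region: str, key: str) -> str:
    """Virtual-hosted style: the bucket name is part of the host."""
    return f"https://{bucket}.s3.{region}.amazonaws.com/{key}"

def path_style_url(bucket: str, region: str, key: str) -> str:
    """Path style: the bucket name is part of the path (deprecated style)."""
    return f"https://s3-{region}.amazonaws.com/{bucket}/{key}"

print(path_style_url("MyBucket", "us-east-2", "object.txt"))
# https://s3-us-east-2.amazonaws.com/MyBucket/object.txt
print(virtual_hosted_url("mybucket", "us-east-2", "object.txt"))
# https://mybucket.s3.us-east-2.amazonaws.com/object.txt
```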
14. S3 Object Multipart Upload
Multipart upload is used to upload an S3 object in parts; the parts are uploaded in parallel. You can upload an object of at most 5 GB in one PUT request; if the object is larger than 5 GB, multipart upload will help you upload it.
1. The recommended object size for using multipart upload is larger than 100 MB; however, you can use it for objects starting from 5 MB.
2. Use cases:
Higher throughput – we can upload parts in parallel.
Easier error recovery – we need to re-upload only the failed parts.
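The split into parts is simple arithmetic: pick a part size (at least 5 MiB for every part except the last, and at most 10,000 parts per upload) and walk the object in that stride. A sketch:

```python
MIB = 1024 * 1024
MIN_PART_SIZE = 5 * MIB  # every part except the last must be >= 5 MiB
MAX_PARTS = 10_000       # S3 limit on parts per multipart upload

def plan_parts(object_size: int, part_size: int = 100 * MIB) -> list:
    """Return (part_number, offset, length) tuples for a multipart upload."""
    if part_size < MIN_PART_SIZE:
        raise ValueError("part size must be at least 5 MiB")
    parts = []
    offset = 0
    while offset < object_size:
        length = min(part_size, object_size - offset)
        parts.append((len(parts) + 1, offset, length))
        offset += length
    if len(parts) > MAX_PARTS:
        raise ValueError("object needs a larger part size")
    return parts

# A 250 MiB object with 100 MiB parts -> two full parts and one 50 MiB part.
plan = plan_parts(250 * MIB)
print(len(plan), plan[-1])  # 3 (3, 209715200, 52428800)
```

Each (offset, length) pair would then become one `UploadPart` request, and the parts can be sent concurrently.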
15. S3 Server Access Logging
This S3 feature helps you record the requests made to access a Bucket and save these log records into an S3 Bucket. The Bucket on which you enable Server Access Logging is called the source Bucket, and the Bucket where you store the logs is called the destination Bucket. The source and destination Bucket can be the same.
1. These detailed logs provide information such as the requester name, Bucket name, request time, request action, response code and error, if applicable.
2. By default, Server Access Logging is disabled.
3. The destination Bucket (the Bucket where you store the logs) must exist in the same region as the source Bucket (the Bucket on which you enable Server Access Logging).
4. There can be a delay before the access logs appear in the destination Bucket.
5. Once enabled, you can disable it at any time when required.
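As a sketch, the configuration applied to the source Bucket (for example via boto3's `put_bucket_logging`) has roughly this shape — the Bucket name and prefix below are placeholders:

```python
def access_logging_config(target_bucket: str, prefix: str = "access-logs/") -> dict:
    """Rough shape of an S3 server access logging configuration."""
    return {
        "LoggingEnabled": {
            "TargetBucket": target_bucket,  # destination bucket for the logs
            "TargetPrefix": prefix,         # key prefix for delivered log files
        }
    }

print(access_logging_config("my-log-bucket"))  # placeholder bucket name
```

Passing an empty dict instead of `LoggingEnabled` is how logging would be turned off again.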
16. S3 Bucket Life Cycle Policy
With the help of a Life Cycle Policy, you can define actions on objects during their lifetime, e.g.:
a. Move an object to another storage class.
b. Archive objects.
c. Delete objects after a specific time period.
1. You can apply a life cycle policy to all the objects in a Bucket, or to a subset of objects based on versions or a name prefix.
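A lifecycle configuration combining the three kinds of action above might look roughly like this — the shape matches what boto3's `put_bucket_lifecycle_configuration` expects, and the prefix and day counts are example values:

```python
def lifecycle_config(prefix: str = "logs/") -> dict:
    """Example lifecycle rule: tier down, archive, then delete."""
    return {
        "Rules": [
            {
                "ID": "age-out-logs",
                "Filter": {"Prefix": prefix},  # subset of objects by name prefix
                "Status": "Enabled",
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},  # tier down
                    {"Days": 365, "StorageClass": "GLACIER"},     # archive
                ],
                "Expiration": {"Days": 730},   # delete after two years
            }
        ]
    }

rule = lifecycle_config()["Rules"][0]
print(rule["Transitions"][0]["StorageClass"])  # STANDARD_IA
```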
17. S3 and Event Notification
When certain events occur on a Bucket, you can configure automatic notifications to the AWS services below:
a. SNS (Amazon Simple Notification Service)
b. SQS (Amazon Simple Queue Service)
c. AWS Lambda functions
1. This is a Bucket-level configuration, and you can configure multiple events as required.
2. No extra charge applies to configure event notifications on S3.
3. You can configure notifications on events such as object creation, object deletion, etc.
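As a sketch, a notification configuration wiring object-created events to a Lambda function has roughly this shape — what boto3's `put_bucket_notification_configuration` expects, with a made-up function ARN as a placeholder:

```python
def notification_config(lambda_arn: str) -> dict:
    """Rough shape of an S3 event notification configuration."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "Id": "on-object-created",
                "LambdaFunctionArn": lambda_arn,
                "Events": ["s3:ObjectCreated:*"],  # fire on any object creation
            }
        ]
    }

cfg = notification_config(
    "arn:aws:lambda:us-east-1:111122223333:function:process-upload"  # placeholder
)
print(cfg["LambdaFunctionConfigurations"][0]["Events"])
```

SNS topics and SQS queues plug into the same structure via `TopicConfigurations` and `QueueConfigurations` entries.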