Table of Contents
api
#- api gateway throttling limits (sketch below)
- aws throttling limit (region level)
- per account
- per-api per-stage (methods)
- per-client (usage plan)
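- a minimal boto3 sketch of the per-client usage-plan throttle above (api id, key id, and limits are hypothetical):

```python
import boto3

apigw = boto3.client('apigateway')

# per-client throttling: a usage plan caps rate/burst for each api key
plan = apigw.create_usage_plan(
    name='basic-tier',                      # hypothetical plan name
    apiStages=[{
        'apiId': 'a1b2c3d4e5',              # hypothetical api id
        'stage': 'prod',
        # per-api per-stage method override, keyed as "resourcePath/httpMethod"
        'throttle': {'/orders/GET': {'rateLimit': 50.0, 'burstLimit': 100}},
    }],
    throttle={'rateLimit': 100.0, 'burstLimit': 200},   # plan-wide limit
)

# bind a client's api key to the plan
apigw.create_usage_plan_key(usagePlanId=plan['id'],
                            keyId='key-id-here', keyType='API_KEY')
```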
- three types of endpoints
- edge-optimized (default) - routes through the nearest cloudfront edge location
- regional
- private
application discovery service
#- for migration planning
- connection type
asg
#athena
#- performance tuning (sketch below)
- partition data
- compression (glue)
- optimise the file size (aws glue)
- use columnar (apache orc, parquet with spark or hive on EMR)
- avoid select *
- use LIMIT (see the columnar-format guidance)
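- several of these tips can be applied in one CTAS query; a sketch assuming a hypothetical logs_csv table and bucket:

```python
import boto3

athena = boto3.client('athena')

# CTAS: rewrite a csv table as partitioned, snappy-compressed parquet
# (partition columns must come last in the SELECT list)
athena.start_query_execution(
    QueryString="""
        CREATE TABLE logs_parquet
        WITH (format = 'PARQUET',
              parquet_compression = 'SNAPPY',
              external_location = 's3://my-data-lake/logs-parquet/',
              partitioned_by = ARRAY['dt'])
        AS SELECT request_id, status, latency_ms, dt FROM logs_csv
    """,
    QueryExecutionContext={'Database': 'analytics'},
    ResultConfiguration={'OutputLocation': 's3://my-data-lake/athena-results/'},
)
```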
billing
#- cost allocation tags - tags will show in the cost & usage report
- budget - create an alert if cost exceeds the budget
- setup for cost analysis
- enable cost allocation tags in billing
- allow users to access billing (activate IAM user access to billing)
- cost allocation tags => tags
- user-defined => user:XXXX
- aws generated => aws:XXXX
- cost explorer => ui for search and filtering
- cost category => filter in cost explorer (saved filter)
- cost budget => alarm on forecasted charges + filtering + linked accounts (sketch below); billing alert => only amounts already charged
- billing alerts include recurring fees (eg premium support) and ec2 instance-hours
- billing alerts exclude one-off fees, refunds, and forecasts
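- a sketch of a cost budget alerting on forecasted spend (account id, amount, and email are placeholders):

```python
import boto3

budgets = boto3.client('budgets')

budgets.create_budget(
    AccountId='111122223333',               # placeholder account id
    Budget={
        'BudgetName': 'monthly-cost',
        'BudgetType': 'COST',
        'TimeUnit': 'MONTHLY',
        'BudgetLimit': {'Amount': '500', 'Unit': 'USD'},
    },
    NotificationsWithSubscribers=[{
        # FORECASTED = alert before the money is spent (vs ACTUAL)
        'Notification': {'NotificationType': 'FORECASTED',
                         'ComparisonOperator': 'GREATER_THAN',
                         'Threshold': 80.0, 'ThresholdType': 'PERCENTAGE'},
        'Subscribers': [{'SubscriptionType': 'EMAIL',
                         'Address': 'ops@example.com'}],
    }],
)
```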
- access control
- reasons for cfn s3 access denied errors
- s3 block public access must be turned off when access relies on a public-read policy/acl - block public access overrides permissions that allow public reads
- if requester pays is turned on, the request must include the x-amz-request-payer header
- object cannot be kms encrypted
cfn
#cloudhsm
#cloudtrail
#- best practice to migrate to org trail
- create org trail in central account
- create bucket for org (need to set bucket policy to allow member account to write to it)
- enable cloudtrail feature in org
- create the org trail through the cli (boto3 sketch below)
- move old trail data from member accounts to org trail bucket
- stop cloudtrail in member accounts and remove the old trail buckets
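- the enable/create steps sketched with boto3, run from the central (management) account; trail and bucket names are hypothetical:

```python
import boto3

# enable the cloudtrail org feature (trusted access) first
boto3.client('organizations').enable_aws_service_access(
    ServicePrincipal='cloudtrail.amazonaws.com')

ct = boto3.client('cloudtrail')
ct.create_trail(
    Name='org-trail',
    S3BucketName='org-trail-logs',      # bucket policy must allow cloudtrail writes
    IsMultiRegionTrail=True,
    IsOrganizationTrail=True,           # applies the trail to all member accounts
)
ct.start_logging(Name='org-trail')
```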
codecommit
#- data protection
- use macie => can help protect data in s3
codedeploy
#- need to connect to s3 and codedeploy endpoints
- cw embedded metric format => can automatically create metrics from logs (sketch below)
- cw endpoints => monitoring.us-east-2.amazonaws.com
- treats each unique combination of dimensions as a separate metric, even if the metrics have the same metric name
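- a sketch of an emf log line (namespace and names are hypothetical); printing this json, eg from lambda, is enough for cloudwatch to create the metric with no PutMetricData call:

```python
import json
import time

def log_latency(latency_ms: float) -> None:
    # each unique value of the "Service" dimension becomes a separate metric,
    # even though the metric name "Latency" is the same
    print(json.dumps({
        "_aws": {
            "Timestamp": int(time.time() * 1000),
            "CloudWatchMetrics": [{
                "Namespace": "MyApp",
                "Dimensions": [["Service"]],
                "Metrics": [{"Name": "Latency", "Unit": "Milliseconds"}],
            }],
        },
        "Service": "checkout",
        "Latency": latency_ms,
    }))

log_latency(123.0)
```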
data pipeline
#data sync
#ddb
#eb
#- can stop/start an eb environment with a scheduled lambda
- doesn’t support HTTPS_PROXY
ebs
#- aws only recommends raid 0
- summary table for different volume types
- gp2
- range: 100-16k iops
- baseline: based on volume size (3 iops/gb, limited by burst credits)
- provision: no
- gp3
- range: 3k-16k iops
- baseline: consistent 3k iops
- provision: up to 500 iops/gb (sketch at the end of this section)
- io2, io1
- range: 100-32k iops; 32k-64k iops only on nitro-based instances
- provision: io1: 50iops/gb; io2: 500iops/gb
- io2 block express
- instance store - temporary block-level storage (physically attached to the host, so not a network drive) (only on specific instance types; included in the instance price)
- i/o performance is also capped by the ec2 instance type - raid 0 can raise volume iops, but the instance-level maximum still applies
- recommended queue length on ssd: 1 per 1000 iops
- default block size is 4kb
- use case of each volume type
- gp2, gp3 => boot, dev, test
- io1, io2 => db
- st1 (hdd) => large sequential workloads like data / log processing (EMR, ETL, data warehouse)
- sc1 (cold hdd) => lowest cost for infrequently accessed data
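- the gp3 numbers above as a volume-creation call (size/iops/throughput are arbitrary examples):

```python
import boto3

ec2 = boto3.client('ec2')

# gp3: consistent 3k iops baseline regardless of size; extra iops
# provisionable up to 500 iops per GiB and the 16k cap
ec2.create_volume(
    AvailabilityZone='us-east-2a',
    Size=200,            # GiB
    VolumeType='gp3',
    Iops=6000,           # within 200 GiB * 500 iops/GiB and the 16k cap
    Throughput=500,      # MiB/s, gp3 only
)
```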
ec2
#- use cases of dual-homed instances (multiple enis)
- separate the traffic by role (frontend, backend)
- ha (move the eni to another instance; sketch below)
- security appliance reason
- an eni is bound to a subnet (and therefore an az)
- when created, an eni inherits the public ipv4 addressing attribute from its subnet
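- a sketch of the ha pattern (moving an eni to a standby instance in the same az); ids are placeholders:

```python
import boto3

ec2 = boto3.client('ec2')

def fail_over_eni(eni_id: str, standby_instance_id: str) -> None:
    eni = ec2.describe_network_interfaces(
        NetworkInterfaceIds=[eni_id])['NetworkInterfaces'][0]
    attachment = eni.get('Attachment')
    if attachment:
        ec2.detach_network_interface(
            AttachmentId=attachment['AttachmentId'], Force=True)
        # wait until the eni is 'available' before re-attaching
        ec2.get_waiter('network_interface_available').wait(
            NetworkInterfaceIds=[eni_id])
    # attach as a secondary interface on the standby
    ec2.attach_network_interface(NetworkInterfaceId=eni_id,
                                 InstanceId=standby_instance_id,
                                 DeviceIndex=1)

fail_over_eni('eni-0123456789abcdef0', 'i-0123456789abcdef0')
```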
efs
#- HA - regional replication => note that this means multi-az, not multi-region
- cross region backup
- create a 1st lambda to back up data from efs to s3 in region a; turn on s3 cross-region replication; create a 2nd lambda to restore data from s3 to efs in region b (sketch after this list)
- data sync
- backup solution which does not work in cross region
- data pipeline => the backup instance cannot mount two efs file systems in different regions
- efs-to-efs => same as data pipeline solution but implemented by lambda function only
- aws backup => does not support cross region backup
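- a sketch of the 1st lambda from the flow above, assuming the function has an efs access point mounted at /mnt/efs and a hypothetical backup bucket with cross-region replication enabled (the 2nd lambda is the mirror image, downloading from s3 into the region-b mount):

```python
import os
import boto3

s3 = boto3.client('s3')
MOUNT = '/mnt/efs'                 # efs access point mounted into the function
BUCKET = 'efs-backup-region-a'     # hypothetical bucket, CRR to region b

def handler(event, context):
    # walk the file system and mirror every file into s3
    for root, _dirs, files in os.walk(MOUNT):
        for name in files:
            path = os.path.join(root, name)
            key = os.path.relpath(path, MOUNT)
            s3.upload_file(path, BUCKET, key)
```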
- dns name - file-system-id.efs.aws-region.amazonaws.com (region-based, like the cw endpoint, eg us-east-2)
- efs can deliver sub-millisecond to low single-digit millisecond latencies, with > 10gbps throughput and 500k iops
- ec2 instance launches are limited by the number of running vcpus per account per region
elasticache
#elb
#iam
#mTurk
#- submit a request to mTurk to outsource manual tasks like taking surveys, text recognition, or data migration to a public workforce
opsworks
#org
#- org features to enable
- scp is one of the aws organizations features
- default FullAWSAccess allows everything => so scps are usually written as a deny list
- to use an allow list => must remove FullAWSAccess (the default allow-all policy); sketch below
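- a deny-list scp sketched with boto3 (policy content and target ou id are examples):

```python
import json
import boto3

org = boto3.client('organizations')

# with FullAWSAccess left attached, everything is allowed; this policy
# carves out one explicit deny (deny-list style)
policy = org.create_policy(
    Name='deny-leave-org',
    Description='Stop member accounts from leaving the organization',
    Type='SERVICE_CONTROL_POLICY',
    Content=json.dumps({
        'Version': '2012-10-17',
        'Statement': [{'Effect': 'Deny',
                       'Action': 'organizations:LeaveOrganization',
                       'Resource': '*'}],
    }),
)
org.attach_policy(PolicyId=policy['Policy']['PolicySummary']['Id'],
                  TargetId='ou-ab12-cdef3456')   # example OU id
```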
other
#rds
#route53
#s3
#- access control: user a (account a) requests object c (owned by account c) via a bucket in account b
- check the iam role in account a
- check the bucket policy in account b
- check the object acl set by the object owner (account c)
- event notifications work at object and bucket level, but delivery can be duplicated and sometimes delayed, so people use cloudwatch events instead
- trusted advisor has a check for open-access s3 buckets but no remediation; to fix bucket permissions automatically, use lambda + cloudwatch events (sketch below)
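- a remediation sketch: a lambda that re-applies block public access; the event shape assumes a hypothetical cloudtrail-based cloudwatch/eventbridge rule forwarding the bucket name:

```python
import boto3

s3 = boto3.client('s3')

def handler(event, context):
    # hypothetical: a rule on cloudtrail PutBucketAcl / PutBucketPolicy
    # calls forwards the bucket name here
    bucket = event['detail']['requestParameters']['bucketName']
    s3.put_public_access_block(
        Bucket=bucket,
        PublicAccessBlockConfiguration={
            'BlockPublicAcls': True,
            'IgnorePublicAcls': True,
            'BlockPublicPolicy': True,
            'RestrictPublicBuckets': True,
        },
    )
```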
- cloudfront cannot cache objects larger than 30gb; use range requests to split a large file into smaller cacheable chunks
- requester pays doesn't support 1. anonymous requests 2. SOAP requests (sketch below)
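- the payer header from the authenticated side, as boto3 sets it (bucket/key are hypothetical):

```python
import boto3

s3 = boto3.client('s3')

# RequestPayer adds the x-amz-request-payer header; anonymous (unsigned)
# requests are rejected on requester-pays buckets
obj = s3.get_object(Bucket='shared-dataset',     # hypothetical bucket
                    Key='genomes/sample1.vcf',
                    RequestPayer='requester')
data = obj['Body'].read()
```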
- bucket quota: default 100, max 1000 per account
- genomics data processing use case
- sync data to s3 with data sync
- use s3 for data storage
- use storage gateway (on-premise access) / fsx (ec2 access)
- s3 encryption - supports symmetric kms keys only
- when downloading a client-side encrypted s3 object, the client downloads the encrypted object along with a cipher-blob version of the data key; it sends the cipher blob to kms to get the plaintext data key, then decrypts the object (sketch below)
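- the decrypt half of that flow, sketched for a client-side envelope scheme where the cipher-blob data key is stored next to the object (layout and names are hypothetical):

```python
import boto3

s3 = boto3.client('s3')
kms = boto3.client('kms')

# 1. download the encrypted object and the encrypted (cipher blob) data key
ciphertext = s3.get_object(Bucket='vault', Key='report.bin.enc')['Body'].read()
key_blob = s3.get_object(Bucket='vault', Key='report.bin.key')['Body'].read()

# 2. ask kms for the plaintext data key
plaintext_key = kms.decrypt(CiphertextBlob=key_blob)['Plaintext']

# 3. decrypt locally with the data key (eg AES-GCM via the `cryptography`
#    package) - symmetric keys only, per the note above
```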
- reduced redundancy is an s3 storage class but is not recommended by aws - there is a chance of losing objects
secrets manager
#- set a RotationSchedule to schedule auto-rotation of the rds password via a custom lambda (sketch below)
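- the same rotation configured via the api rather than cloudformation; secret id and lambda arn are placeholders:

```python
import boto3

sm = boto3.client('secretsmanager')

sm.rotate_secret(
    SecretId='prod/rds/master-password',     # placeholder secret
    RotationLambdaARN='arn:aws:lambda:us-east-2:111122223333:function:rds-rotator',
    RotationRules={'AutomaticallyAfterDays': 30},
)
```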
snowball
#- tips to increase performance
- batch small files (tar sketch below)
- run multiple copy operations at once (eg 2 terminals, 2 cp commands)
- connect multiple workstations (1 snowball can serve multiple workstations)
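- the small-file batching tip sketched as a tar step before the copy (paths are hypothetical):

```python
import tarfile
from pathlib import Path

# bundle many small files into one archive so the snowball client
# transfers a single large object instead of thousands of tiny ones
src = Path('/data/small-files')              # hypothetical source dir
with tarfile.open('/staging/batch-0001.tar', 'w') as tar:
    for f in src.rglob('*'):
        if f.is_file():
            tar.add(f, arcname=f.relative_to(src))
```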
- step for using snowball
- start the snowball
- set up a workstation by downloading the ova image and importing it into vmware
- use the snowball client's cp command (syntax similar to aws s3 cp) to copy files to the snowball
- can upload through gui / command line
- send the device back to aws. they will import your data to s3
- takes at least 1 week
sso
#- permission sets - 1 permission set holds multiple iam policies => associated with users / groups
- sso
- ad (identity provider) -> aws sso -> application (github, dropbox) / aws accounts
- sources of identity provider
- aws sso
- ad connector
- aws managed ad
- external ad (two way trust)
- server -> client
- server = adfs
- create an app
- configure the app's sign-in and sign-out urls
- client = integrated website
- trusted idp
- configure the idp's sign-in and sign-out urls + cert
- user logs in at ad's app endpoint => ad posts data to the app's sign-in url => the app receives and decrypts the data from ad and grants the user permissions
- aws iam federation => single account only
storage gateway
#- need to download an ova and import the vm to create an endpoint bridging on-premises and aws
- storage gateway type
- volume => mounted as a disk (iscsi), backed by s3, restorable as ebs snapshots
- cached => primary data in s3; frequently used data cached on the local vm
- stored => full dataset kept on premises; asynchronously backed up to s3
- file => smb / nfs
- tape => tape backup software
support
#swf
#vpc
#workspaces
#- use connection aliases for cross-region workspaces redirection (sketch below)
- create connection alias
- share to other account
- associate with directories in each region
- setup route53 for failover
- setup connection string
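- the create/associate steps sketched with boto3 (connection string and directory ids are placeholders):

```python
import boto3

CONN_STR = 'desktop.example.com'   # placeholder connection string

primary = boto3.client('workspaces', region_name='us-east-1')
failover = boto3.client('workspaces', region_name='us-west-2')

# create an alias with the same connection string in each region,
# then associate it with that region's directory
for client, directory in ((primary, 'd-9067xxxxxx'),     # placeholder ids
                          (failover, 'd-9267xxxxxx')):
    alias = client.create_connection_alias(ConnectionString=CONN_STR)
    client.associate_connection_alias(AliasId=alias['AliasId'],
                                      ResourceId=directory)

# cross-account sharing (if a directory lives in another account) uses
# update_connection_alias_permission; route53 failover records on CONN_STR
# complete the redirection setup
```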
- maintenance - supports regular maintenance windows (eg 15:00-16:00) or manual maintenance, but not custom schedules like patching on tue 03:00
- workspaces application manager - package manager that helps install software
- workspaces supports Windows 10 desktops but not Windows Server