AWS SA Professional
tags: cert, exam, aws, sa
api
- api gateway throttling limits
- aws throttling limit (region level)
- per account
- per-api per-stage (methods)
- per-client (usage plan)
- three types of endpoints
- edge-optimized (default) - routed through the nearest cloudfront edge location
- regional
- private
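Per-client throttling is configured through a usage plan. A minimal sketch of the `create_usage_plan` parameters, assuming hypothetical plan/api names and limits (the boto3 call is shown but not executed):

```python
# Sketch: per-client throttling via a usage plan.
# "gold-tier", "a1b2c3", and the limits are assumptions for illustration.

def usage_plan_request(plan_name, api_id, stage, rate, burst):
    """Build create_usage_plan parameters for per-client throttling."""
    return {
        "name": plan_name,
        "apiStages": [{"apiId": api_id, "stage": stage}],
        "throttle": {"rateLimit": rate, "burstLimit": burst},
    }

params = usage_plan_request("gold-tier", "a1b2c3", "prod", 100.0, 200)
# import boto3
# boto3.client("apigateway").create_usage_plan(**params)
```

Clients are then tied to the plan via api keys, which is what makes the limit per-client rather than per-stage.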
application discovery service
- for migration planning
- connection type
- application discovery agent -> installed on the server. supports vm / physical servers
- discovery connector -> installed on vCenter (an ova)
- Migration Hub import => import the details directly
asg
- will automatically tag the instances by default
- the cooldown starts after the last instance launches when multiple instances scale at the same time
athena
- performance tuning
- partition data
- compression (glue)
- optimise the file size (aws glue)
- use columnar (apache orc, parquet with spark or hive on EMR)
- avoid select *
- use limit (see the columnar guide)
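Partition pruning only works if objects are laid out with hive-style key=value prefixes. A minimal sketch of building such a key (the `logs` prefix and date columns are assumptions):

```python
import datetime

def partitioned_key(table_prefix, dt, filename):
    """Hive-style partition layout (year=/month=/day=) that Athena can
    prune: a WHERE clause on year/month/day skips non-matching prefixes."""
    return (f"{table_prefix}/year={dt.year}/month={dt.month:02d}/"
            f"day={dt.day:02d}/{filename}")

key = partitioned_key("logs", datetime.date(2024, 3, 7), "events.parquet")
# key == "logs/year=2024/month=03/day=07/events.parquet"
```

Combined with columnar formats (orc/parquet), this keeps the bytes scanned, and therefore the cost, low.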
billing
- cost allocation tags - tags will show in the cost & usage report
- budget - create an alert if cost exceeds the budget
- setup for cost analysis
- cost allocation tags => tags
- user-define => user:XXXX
- aws generated => aws:XXXX
- cost explorer => ui for search and filtering
- cost category => filter in cost explorer (saved filter)
- cost budget => billing alarm with forecasted charges + filtering + linked accounts; billing alert => amount already charged
- billing alert include
- recurring fee like premium support
- ec2 instance-hours but exclude
- one off fee
- refund
- forecast
cf
- access control
- cf + waf + elb, it should be cf (set custom header) > waf (validate the rule) > alb
- cf + s3, => cf with oai > s3 bucket policy
- cf + alb => cf with custom header > alb rule
- reasons for cf + s3 access denied errors
- s3 block public access must be turned off if no oai policy is set - it overrides the permissions that allow public read access
- if requester pays is turned on, the request must include the payer header
- the object cannot be kms-encrypted
cfn
- can use automatic deployment to auto-deploy an existing stackset to new accounts in the organisation
cloudhsm
- need tcp/3389 (windows) or tcp/22 (linux) to connect to the ec2 instance and install the cloudhsm client; tcp/2223-2225 for the client to communicate with the cluster
cloudtrail
- best practice to migrate to an org trail
- create a bucket for the org in the central account (set a bucket policy that allows member accounts to write to it)
- enable the cloudtrail feature in the org
- create the org trail in the central account through the cli
- move old trail data from member accounts to the org trail bucket
- stop cloudtrail in the member accounts and remove the old trail buckets
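The bucket-policy step above has a well-known shape: cloudtrail needs `GetBucketAcl` on the bucket and `PutObject` under the org's `AWSLogs/` prefix. A sketch with hypothetical bucket and org ids:

```python
# Sketch of the org-trail bucket policy.
# BUCKET and ORG_ID are assumptions - substitute your own values.
BUCKET = "org-trail-logs"
ORG_ID = "o-exampleorgid"

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Sid": "AclCheck",
         "Effect": "Allow",
         "Principal": {"Service": "cloudtrail.amazonaws.com"},
         "Action": "s3:GetBucketAcl",
         "Resource": f"arn:aws:s3:::{BUCKET}"},
        {"Sid": "OrgWrite",
         "Effect": "Allow",
         "Principal": {"Service": "cloudtrail.amazonaws.com"},
         "Action": "s3:PutObject",
         # member-account logs land under AWSLogs/<org-id>/<account-id>/
         "Resource": f"arn:aws:s3:::{BUCKET}/AWSLogs/{ORG_ID}/*",
         "Condition": {"StringEquals":
                       {"s3:x-amz-acl": "bucket-owner-full-control"}}},
    ],
}
```

The `bucket-owner-full-control` condition is what lets the central account own objects delivered on behalf of members.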
codecommit
- data protection
- use macie => can help protect data in s3
codedeploy
- instances need access to the s3 and codedeploy endpoints
cw
- cw embedded metric format => can automatically create metrics from logs
- cw endpoints => monitoring.us-east-2.amazonaws.com
- monitoring.XXXX
- no az
- treats each unique combination of dimensions as a separate metric, even if the metrics have the same metric name
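The embedded metric format is just a JSON log line with an `_aws` metadata block; cloudwatch extracts the metric when the line hits the log stream. A minimal sketch (namespace, metric name, and dimension are assumptions):

```python
import json
import time

def emf_record(namespace, metric, value, unit, dimensions):
    """Build a CloudWatch Embedded Metric Format record. Writing the
    JSON-serialized record to a log stream makes CloudWatch create the
    metric automatically - no PutMetricData call needed."""
    return {
        "_aws": {
            "Timestamp": int(time.time() * 1000),
            "CloudWatchMetrics": [{
                "Namespace": namespace,
                "Dimensions": [list(dimensions)],  # one dimension set
                "Metrics": [{"Name": metric, "Unit": unit}],
            }],
        },
        metric: value,       # the metric value itself
        **dimensions,        # dimension values as top-level fields
    }

rec = emf_record("MyApp", "Latency", 42, "Milliseconds", {"Service": "api"})
line = json.dumps(rec)  # write this line to the log stream
```

Note the dimensions point above: `{"Service": "api"}` and `{"Service": "web"}` become two separate metrics even though the metric name is the same.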
data pipeline
- components
- pipeline definition
- pipeline schedule
- task runner
- like swf, but specific to data engineering
- task runners can run on any compute resource (ec2 or on-premise servers)
- can use resources in multiple regions
- the service itself is only available in a limited set of regions
data sync
- used to transfer data between on-premise and aws, or between aws services
- supports cross-region sync (s3 <=> s3, efs <=> efs, fsx <=> fsx)
- source location
- destination location
ddb
- support atomic counter
- a local secondary index can only be created at table creation
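An atomic counter is an `UpdateItem` with an `ADD` expression, which increments server-side without a read-modify-write race. A sketch with hypothetical table and key names (the boto3 call is not executed here):

```python
# Sketch: DynamoDB atomic counter via UpdateItem + ADD.
# "PageViews" / the key are assumptions for illustration.

def atomic_increment_params(table, key, attr="hits", by=1):
    """Build UpdateItem parameters that atomically add `by` to `attr`."""
    return {
        "TableName": table,
        "Key": key,
        "UpdateExpression": "ADD #c :inc",
        "ExpressionAttributeNames": {"#c": attr},
        "ExpressionAttributeValues": {":inc": {"N": str(by)}},
        "ReturnValues": "UPDATED_NEW",  # returns the new counter value
    }

params = atomic_increment_params("PageViews", {"page": {"S": "/home"}})
# import boto3
# boto3.client("dynamodb").update_item(**params)
```

`ADD` creates the attribute if it does not exist, so the first increment needs no special casing.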
eb
- can stop/start an eb environment with a lambda at scheduled times
- doesn’t support HTTPS_PROXY
ebs
- aws only recommends raid 0
- summary table for different volume types
- gp2
- range: 100-16k iops
- baseline: based on volume size (small volumes can burst using burst credits)
- provision: no
- gp3
- range: 3k-16k iops
- baseline: consistent 3k iops
- provision: 500iops/gb
- io2, io1
- range: 100-32k iops; up to 64k iops only on nitro-based instances
- provision: io1: 50iops/gb; io2: 500iops/gb
- io2 block express
- only support with specific instance ( R5b, X2idn, and X2iedn)
- range: up to 256k iops
- provision: 1000iops/gb
- instance store - temporary block-level storage (physically attached to the host, so not a network drive) (supported on specific instance types; included in the instance price)
- i/o performance is limited by the ec2 instance type. raid 0 can increase iops, but there is still a per-instance maximum
- queue length on ssd: 1 per 1000 iops
- default block size is 4kb
- use case of each volume type
- gp2, gp3 => boot, dev, test
- io1, io2 => db
- st1 (hdd) => large sequential workloads like data / log processing (EMR, ETL, data warehouse)
- sc1 => lowest-cost hdd, for saving costs on infrequently accessed data
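The gp2 baseline figures above follow a simple formula: 3 iops per GiB, floored at 100 and capped at 16,000. A sketch that makes the arithmetic explicit:

```python
def gp2_baseline_iops(size_gib):
    """gp2 baseline IOPS: 3 IOPS per GiB, minimum 100, maximum 16,000.
    Volumes below the 3,000-IOPS line can burst to 3,000 via credits."""
    return min(max(3 * size_gib, 100), 16_000)

assert gp2_baseline_iops(20) == 100        # small volume: floored at 100
assert gp2_baseline_iops(1_000) == 3_000   # 1 TiB: baseline meets burst level
assert gp2_baseline_iops(6_000) == 16_000  # large volume: capped at 16k
```

This is why a ~1 TiB gp2 volume no longer depends on burst credits, and why gp3's flat 3k baseline plus provisioned iops is often simpler to reason about.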
ec2
- use cases of dual home
- separate the traffic by role (frontend, backend)
- ha (move the eni to another instance)
- security appliance reason
- an eni is bound to a subnet
- when created, an eni inherits the public ipv4 addressing attribute from its subnet
efs
- HA - regional replication => notice that it means multi az not multi regions
- cross region backup
- create a 1st lambda to back up data from efs to s3 in region a; turn on s3 cross-region replication; create a 2nd lambda to restore data from s3 to efs in region b
- data sync
- backup solution which does not work in cross region
- data pipeline => the backup instance cannot mount 2 efs in different regions
- efs-to-efs => same as data pipeline solution but implemented by lambda function only
- aws backup => does not support cross region backup
- dns name - file-system-id.efs.aws-region.amazonaws.com (region-specific, like cw, e.g. us-east-2)
- efs can deliver sub-millisecond to low single-digit millisecond latencies with > 10 gbps throughput and 500k iops
- instance launches are limited by the number of vcpus running per account per region
elasticache
- caching strategies
- lazy load - populate the cache when reading from the db (on a cache miss)
- write through - update the cache whenever writing to the db
- ttl - write through + lazy load, but with an expiry set on entries
- connection endpoints
- node ep - read and write
- primary ep for write; reader ep for read (cluster mode disabled)
- configuration ep for read and write like node ep (cluster mode enable)
- queries to rds, aurora and redshift can be cached in elasticache automatically via a caching proxy
- supports up to 500 nodes and shards
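The lazy-load and write-through strategies above can be sketched with plain dicts standing in for the cache and the database:

```python
# Minimal in-memory sketch of the two caching strategies.
# `cache` stands in for elasticache, `db` for the backing database.
cache, db = {}, {"user:1": "alice"}

def get_lazy(key):
    """Lazy load: populate the cache only on a read miss."""
    if key not in cache:
        cache[key] = db[key]      # miss -> read db, set cache
    return cache[key]

def put_write_through(key, value):
    """Write-through: update db and cache together on every write."""
    db[key] = value
    cache[key] = value

assert get_lazy("user:1") == "alice"   # first read misses, then caches
put_write_through("user:2", "bob")
assert cache["user:2"] == "bob"        # already cached at write time
```

The trade-off follows directly: lazy load risks serving stale data until the next miss, write-through wastes cache space on rarely read keys, and a ttl on entries bounds both problems.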
elb
- the classic load balancer supports at most one subnet per az
iam
- set SAML session tags for access control (add attributes to the idp metadata)
- policy to deny access to specific regions - deny all except the global services
- in the console, the instance profile is automatically created along with the iam role, using the same name
- ArnLike is case-sensitive but supports wildcards like * and ?
- group name limit is 128 characters
- temporary security credentials remain valid until they expire
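The "deny all except global services" pattern above is usually a Deny with `NotAction` plus an `aws:RequestedRegion` condition. A sketch where the allowed region and the list of exempt global services are assumptions to adjust:

```python
# Sketch: deny everything outside ap-southeast-1 except global services.
# The region and the NotAction list are assumptions - tailor both.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Deny",
        # global services are exempt, otherwise the policy locks out iam/sts
        "NotAction": ["iam:*", "sts:*", "cloudfront:*", "route53:*"],
        "Resource": "*",
        "Condition": {
            "StringNotEquals": {"aws:RequestedRegion": "ap-southeast-1"}
        },
    }],
}
```

Because iam evaluation is deny-wins, attaching this as an scp or boundary blocks regional api calls everywhere else regardless of other allows.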
mTurk
- submit a request to mTurk to outsource manual tasks (taking surveys, text recognition, data migration) to the public
opsworks
- set up a custom recipe to configure the application with other aws services' information => a solid example is adding redis cluster connection information to a rails application
org
- org features to enable
- all
- consolidated billing
- scp is one of the aws organisations features
- the default is allow all (FullAWSAccess) => can be used as a deny list
- to use an allow list => remove FullAWSAccess (the default allow-all policy) first
other
- Public Data Sets - data sets hosted for public access
- fileb:// is supported in
- kms (key)
- ec2 user data (gzip)
- s3 (encryption key)
- govcloud comparison
- billing and usage can be viewed in the standard account
- only us citizen employees can administer the govcloud
- authentication is isolated from amazon.com
- network is isolated from other region
- migrate IBM MQ to Amazon MQ
- migrate ibm db2 luw to rds (mysql postgresql)
- iot monitoring can check whether a rule has been executed
rds
- RMAN restore isn't supported for Amazon RDS for Oracle DB instances (RMAN is a backup and restore tool for oracle db)
route53
- health check targets must respond with 2xx or 3xx. supports tcp and http checks
- support DNSSEC
s3
- access control for a request from user a (account a) to object c (owned by account c) in an s3 bucket (account b)
- check the iam policy in account a
- check the bucket policy in account b
- check the object acl set by the object owner (account c)
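The account-b half of that chain is a bucket policy granting the external principal. A sketch with hypothetical account id and bucket name; note it still isn't sufficient alone, because the object acl of the account-c owner is evaluated too:

```python
# Sketch: account-b bucket policy allowing account-a principals to read.
# The account id and bucket name are assumptions for illustration.
bucket_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        # account a's root principal delegates to account a's iam policies
        "Principal": {"AWS": "arn:aws:iam::111122223333:root"},
        "Action": ["s3:GetObject"],
        "Resource": "arn:aws:s3:::account-b-bucket/*",
    }],
}
```

User a additionally needs an allow in their own account's iam policy; cross-account s3 access requires a "yes" at every hop.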
- event notifications work at object and bucket level, but delivery is at-least-once and can be delayed, so people use cloudwatch events instead
- trusted advisor has a check for open-access s3 buckets but offers no remediation; to fix bucket permissions automatically, use lambda + a cloudwatch event
- cf cannot cache objects larger than 30gb; use range requests to split a large file into smaller chunks
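Splitting a large object into range requests is simple byte arithmetic. A sketch producing the inclusive `Range` header values for each chunk (sizes here are toy numbers):

```python
def range_headers(total_size, chunk_size):
    """Split an object of `total_size` bytes into HTTP Range header
    values (inclusive byte ranges), e.g. for fetching a large file
    through CloudFront as cacheable parts."""
    return [f"bytes={start}-{min(start + chunk_size, total_size) - 1}"
            for start in range(0, total_size, chunk_size)]

assert range_headers(25, 10) == ["bytes=0-9", "bytes=10-19", "bytes=20-24"]
```

Each part is requested with its own `Range` header, so every piece stays under the cacheable size limit and can be cached independently.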
- requester pays doesn't support 1. anonymous requests 2. SOAP requests
- bucket limit: default 100, max 1000 per account
- genomics data processing use case
- sync data to s3 with data sync
- use s3 for data storage
- use storage gateway (on-premise access) / fsx (ec2 access)
- s3 encryption - only supports symmetric keys
- when downloading an encrypted s3 object, the client downloads the encrypted object along with a cipher-blob version of the data key, sends the cipher blob to kms to get the plaintext data key, then decrypts the object
- reduced redundancy is an s3 storage class but is not recommended by aws - there is a chance of losing objects
secrets manager
- set a RotationSchedule to auto-rotate the rds password on a schedule (runs a custom lambda)
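In api terms this is `rotate_secret` with a rotation lambda and a rule. A sketch with hypothetical secret and lambda arns (the boto3 call is not executed here):

```python
# Sketch: enable scheduled rotation via a custom Lambda.
# Secret id, lambda arn, and the 30-day interval are assumptions.

def rotation_params(secret_id, lambda_arn, days=30):
    """Build rotate_secret parameters for scheduled auto-rotation."""
    return {
        "SecretId": secret_id,
        "RotationLambdaARN": lambda_arn,
        "RotationRules": {"AutomaticallyAfterDays": days},
    }

params = rotation_params(
    "prod/rds/password",
    "arn:aws:lambda:us-east-1:111122223333:function:rotate-rds",
)
# import boto3
# boto3.client("secretsmanager").rotate_secret(**params)
```

For rds secrets, aws provides rotation lambda templates, so the custom function mostly implements the create/set/test/finish steps against the database.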
snowball
- tips to increase performance
- batch small files
- run multiple copy operations at once (e.g. 2 terminals, 2 cp commands)
- connect multiple workstations (1 snowball can connect to multiple workstations)
- step for using snowball
- start the snowball
- setup workstation by download ova image and import to the vmware
- use cp command (something like aws s3api cp) to copy file to snowball
- can upload through gui / command line
- send the device back to aws. they will import your data to s3
- takes at least 1 week
sso
- permission sets - 1 permission set has multiple iam policies => associate to user / group
- sso
- ad (identity provider) -> aws sso -> application (github, dropbox) / aws accounts
- sources of identity provider
- aws sso
- ad connector
- aws managed ad
- external ad (two way trust)
- server -> client
- server = adfs
- create an app
- config app sign-in and sign-out url
- client = integrated website
- trusted idp
- config the idp's sign-in and sign-out urls + cert
- the user logs in at the ad app endpoint => ad posts data to the app's sign-in url => the app receives and decrypts the data from ad and grants permission to the user
- aws iam federation => single account only
storage gateway
- need to download an ova and import the vm to create an endpoint that bridges on-premise and aws
- storage gateway type
- volume => mounted as a disk (iSCSI) => s3 => ebs
- cached => frequently used data cached on the vm; full data set in s3
- stored => full data set kept locally; async snapshot backups to s3
- file => smb / nfs
- tape => tape backup software
support
- paid support plans allow an unlimited number of users to open technical support cases
swf
- swf vs step functions: use step functions first; if it doesn't fit => swf
- use case: processing large product catalogue using Amazon Mechanical Turk
vpc
- 5 sg / eni
- for dx, route propagation needs to be enabled
- resource arns are supported in dx
- vpc tenancy determines the default instance tenancy
- for vpce, ensure the private dns option is enabled
workspaces
- use connection aliases for cross-region workspaces redirection
- create connection alias
- share to other account
- associate with directories in each region
- setup route53 for failover
- setup connection string
- maintenance - supports regular maintenance windows (eg 15:00-16:00) or manual maintenance, but cannot schedule something like patching on tue 3:00
- workspaces application manager - package manager to help installing software
- workspaces supports Windows 10 desktops but not Windows Server