
Istio Note

k8s service-mesh istio

Main concept #

Istio injects a proxy in front of each pod, so we can do traffic management:

  • Traffic Shifting
  • Fault Injection
  • Circuit Breaking
  • Mirror

and apply security features

  • mTLS between pods
  • security policies

or do some logging / auditing work

Imagine there is a proxy server in front of each pod.

VirtualService and DestinationRule are the configuration of that proxy server.

Actually, the official documentation is good enough. Here is a good course to learn about Istio.


Istio Settings #


Install Multicluster #

Setup Guide

The "Before you begin" section is important. For Multi-Primary Multicluster, we need to generate a root certificate to issue the certificates for cluster1 and cluster2. See Configure Trust (a sketch follows the list below).

  • Multi-primary: an istiod on each cluster
  • Primary-remote: only the primary has istiod - the remote shares the primary's istiod
  • On different networks: the pods of the 2 k8s clusters are in different networks. EKS supports VPC networking, so the pods can sit in VPC subnets and pods in different clusters can be configured on the same network. I set up 2 k3s clusters on AWS EC2, so my pods are on different networks.
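
To configure trust, the shared root CA is plugged into each cluster as a cacerts secret before installing Istio. A minimal sketch following Istio's certs Makefile (run from the Istio release directory; the kubectl context names are assumptions):

mkdir -p certs && cd certs

# root CA plus an intermediate CA for each cluster
make -f ../tools/certs/Makefile.selfsigned.mk root-ca
make -f ../tools/certs/Makefile.selfsigned.mk cluster1-cacerts
make -f ../tools/certs/Makefile.selfsigned.mk cluster2-cacerts

# plug the intermediate CA into cluster1 (repeat with the cluster2 files on cluster2)
kubectl --context=cluster1 create namespace istio-system
kubectl --context=cluster1 create secret generic cacerts -n istio-system \
  --from-file=cluster1/ca-cert.pem \
  --from-file=cluster1/ca-key.pem \
  --from-file=cluster1/root-cert.pem \
  --from-file=cluster1/cert-chain.pem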

Install VM Workload #

Istio can bring VMs into the service mesh, treating a VM like a k8s pod:

  • Workload Entry: Pod
  • Workload Group: Deployment
  • Service Entry: Service

I followed this guide to set this up.

There are a few options for the installation. I chose:

  • Multi-Network - as the VM and the k8s pods are in different networks (for example: VM 192.168.0.0/24, k8s pods 10.0.1.0/24). In AWS, pods can use the VPC network, which means that with a subnet 192.168.0.0/24 both the VM and the EKS pods can be in the same subnet.
  • Automated WorkloadEntry Creation - a WorkloadEntry is created automatically after the VM joins the mesh (see the WorkloadGroup sketch below)
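
A minimal WorkloadGroup sketch for the VM (name, namespace, service account and network are example values, not from the guide):

apiVersion: networking.istio.io/v1alpha3
kind: WorkloadGroup
metadata:
  name: myvm
  namespace: vm-workloads
spec:
  metadata:
    labels:
      app: myvm          # Services select the VM by these labels
  template:
    serviceAccount: vm-sa
    network: vm-network  # must match the network name used during installation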

dns #

Before the installation, we need to set up a DNS server first. The VM needs to resolve the k8s cluster names (*.cluster.local, e.g. httpbin.default.svc.cluster.local).

Here are some notes on setting up dnsmasq on EC2.

/etc/dnsmasq.conf

bind-interfaces
listen-address=127.0.0.1
server=169.254.169.253
address=/cluster.local/<k8s_cluster_ip>/
  • bind-interfaces + listen-address make the DNS server bind correctly. By default dnsmasq binds port 53 on all interfaces; listen-address restricts it to the specified address
  • other queries need to be forwarded to AWS's DNS (169.254.169.253)

/etc/resolv.conf

nameserver 127.0.0.1

Update resolv.conf to tell the VM to use dnsmasq.

On Ubuntu, resolv.conf is managed by systemd-resolved, so it is better to disable it:

systemctl disable systemd-resolved
systemctl stop systemd-resolved

systemd-resolved creates /etc/resolv.conf as a symbolic link. When systemd-resolved is disabled, resolv.conf may disappear after the server restarts, so be careful of that.
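
A minimal sketch of the whole step, assuming we simply replace the symlink with a static file:

systemctl disable --now systemd-resolved

# /etc/resolv.conf is a symlink to /run/systemd/resolve/stub-resolv.conf;
# replace it with a static file so it survives reboots
rm /etc/resolv.conf
echo "nameserver 127.0.0.1" > /etc/resolv.conf

systemctl restart dnsmasq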


Install with revision #

If Istio is installed with the canary method (i.e. with a revision), the istiod service name will be different:

  • without revision: istiod.istio-system.svc.cluster.local
  • with revision: istiod-<revision>.istio-system.svc.cluster.local

The templates at Expose services inside the cluster via the east-west gateway use istiod.istio-system.svc, so we have to change them manually.

We also have to add --revision when deploying the east-west gateway.
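
A sketch of deploying the east-west gateway with a revision, based on the gen-eastwest-gateway.sh script from the Istio multicluster samples (mesh, cluster and network names are example values):

samples/multicluster/gen-eastwest-gateway.sh \
  --mesh mesh1 --cluster cluster1 --network network1 \
  --revision 1-14-3 | istioctl install -y -f -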


Service vs ServiceEntry #

  • the VM can be selected with a plain Service - a ServiceEntry is not necessary (see the sketch after this list)
  • if we want to use a ServiceEntry, DNS proxying needs to be enabled
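
A sketch of selecting the VM with a plain Service, assuming the WorkloadEntry carries an app: myvm label (names and port are example values):

apiVersion: v1
kind: Service
metadata:
  name: myvm
  namespace: vm-workloads
spec:
  selector:
    app: myvm        # matches the labels on the WorkloadEntry
  ports:
  - name: http
    port: 8080
    targetPort: 8080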

Upgrade #


In-place #

Run istioctl upgrade with the updated istioctl binary.
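
A minimal sketch, assuming the control plane was installed without a revision:

# run the new release's istioctl binary against the cluster
istioctl upgrade -y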


Canary #

For example, we want to upgrade istio from 1.13.0 to 1.14.3.

We can

  • install the new version (let the 2 versions coexist)
  • switch to the new version
  • uninstall the old version

First, Istio has to be installed with --revision:

istioctl operator init --revision 1-13-0

and apply an IstioOperator config with the specific revision:

spec:
...
  revision: 1-13-0
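
For example, a minimal IstioOperator sketch with the revision set (name and profile are example values):

apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
  name: istio-1-13-0
  namespace: istio-system
spec:
  profile: default
  revision: 1-13-0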

To enable auto-injection in a namespace (default in this example), instead of

kubectl label ns default istio-injection=enabled

we have to specify the revision

kubectl label ns default istio.io/rev=1-13-0

To upgrade istio:

  • Download 1.14.3 istioctl binary
  • init 1.14.3 operator istioctl operator init --revision 1-14-3
  • apply the IstioOperator config with revision: 1-14-3
  • Update injection label - kubectl label ns default istio.io/rev=1-14-3 --overwrite
  • Uninstall the old version - istioctl x uninstall --revision 1-13-0

Sidecar injection #

  • cannot be enabled globally (unlike mTLS) - it needs to be applied per namespace / per pod
  • hostNetwork: true in a pod will stop the injection
  • injection can be overridden at the pod level (see the sketch below)
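
A sketch of the per-pod override, assuming the namespace has injection enabled and we want to opt a single pod out:

apiVersion: v1
kind: Pod
metadata:
  name: no-sidecar
  annotations:
    sidecar.istio.io/inject: "false"   # opt this pod out of injection
spec:
  containers:
  - name: app
    image: nginx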

Service Entry #

This confused me so much.


Why does the hostname resolution not work? #

My expectation: it is like a k8s Service but for external services - so we are adding an A record to the k8s DNS.

For example:

apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: mysql
spec:
  hosts:
  - db.hhuge9.com
  location: MESH_EXTERNAL
  ports:               # ports is a required field in ServiceEntry
  - number: 3306
    name: tcp
    protocol: TCP
  endpoints:
  - address: XXXXX     # each endpoint needs an address field

I expected that I could connect to my own MySQL service with mysql -h db.hhuge9.com from any pod in the mesh after applying this config.

But the result is that db.hhuge9.com cannot be resolved.

The reason is that a ServiceEntry does not do anything to DNS unless we enable the DNS proxying feature.

When we apply this ServiceEntry

apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: google
spec:
  hosts:
  - google.hhuge9.com
  location: MESH_EXTERNAL
  resolution: DNS
  ports:
  - number: 443
    name: https
    protocol: TLS
  endpoints:
  - address: 142.250.200.14

with DNS proxying enabled:

  • ISTIO_META_DNS_CAPTURE: true - the sidecar's DNS proxy captures DNS queries, so google.hhuge9.com resolves to the endpoint IP (142.250.200.14)
  • ISTIO_META_DNS_AUTO_ALLOCATE: true - an internal IP is auto-allocated (like a Service in k8s), and google.hhuge9.com resolves to that IP
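
These two settings go under meshConfig.defaultConfig.proxyMetadata; a minimal IstioOperator sketch:

apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  meshConfig:
    defaultConfig:
      proxyMetadata:
        # enable the sidecar DNS proxy
        ISTIO_META_DNS_CAPTURE: "true"
        # auto-allocate virtual IPs for ServiceEntry hosts
        ISTIO_META_DNS_AUTO_ALLOCATE: "true"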

We can leave ISTIO_META_DNS_AUTO_ALLOCATE: false and assign an internal IP in the addresses field:

...
  hosts:
  - google.hhuge9.com
  addresses:
  - 192.168.0.1
  ...
  endpoints:
  - address: 142.250.200.14

Therefore nslookup google.hhuge9.com will return 192.168.0.1

But for the hostname resolution to work, we still need ISTIO_META_DNS_CAPTURE: true.


Is it just a canonical name / A record? #

I wondered why a ServiceEntry is needed. We can access the service directly without applying any config, right? It looked useless.

Actually, it becomes useful when we want to control the egress traffic.

When the REGISTRY_ONLY outbound traffic policy is enabled, only registered services can be accessed from within the mesh. Therefore, a ServiceEntry for facebook.com is needed if we want to access Facebook from inside a pod.
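
A sketch of enabling REGISTRY_ONLY in the mesh config:

apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  meshConfig:
    outboundTrafficPolicy:
      mode: REGISTRY_ONLY   # block egress to hosts not in the service registry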


AuthorizationPolicy #

apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
 name: httpbin
 namespace: foo
spec:
 selector:
   matchLabels:
     app: httpbin
     version: v1
 action: ALLOW
 rules:
 - from:
   - source:
       principals: ["cluster.local/ns/default/sa/sleep"]
   - source:
       namespaces: ["dev"]
   to:
   - operation:
       methods: ["GET"]
   when:
   - key: request.header[foo]
     values: ["bar"]

This policy means: allow access to the pods in namespace foo with labels app: httpbin and version: v1 when the request

  • is a GET request and
  • has a foo: bar header, and the source
  • is from namespace dev or
  • has the principal cluster.local/ns/default/sa/sleep

The principal cluster.local/ns/default/sa/sleep refers to the sleep service account in the default namespace - i.e. the workload identity of sleep.default.svc.cluster.local when mTLS is enabled.

The service account name can be confirmed by printing out the environment variables with env inside the pod.


Ingress Gateway #


Egress Gateway #


Traffic Management #

Most of the examples can be found here.

  • Fault Injection - e.g. return 500 / add a delay to the requests
  • Traffic Shifting - e.g. 50% to v1, 50% to v2 - good for canary / blue-green deployments / A/B testing (see the sketch below)
  • Circuit Breaking - eject a pod after consecutive 5XX errors / set max connections
  • Mirror - mirror the traffic; good for logging / testing / monitoring
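
For example, a minimal 50/50 traffic-shifting sketch for a reviews service with v1 / v2 subsets (names are example values):

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: reviews
spec:
  host: reviews
  subsets:
  - name: v1
    labels:
      version: v1
  - name: v2
    labels:
      version: v2
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: reviews
spec:
  hosts:
  - reviews
  http:
  - route:
    - destination:
        host: reviews
        subset: v1
      weight: 50
    - destination:
        host: reviews
        subset: v2
      weight: 50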