4/17/2018

Step-by-step guide to setting up a multi-region Kubernetes app

Who doesn't want to spread their SaaS across continents? I finally had the opportunity to do it for my Image-Charts SaaS!

Image-Charts was at first hosted on a single Kubernetes cluster in Google Cloud's europe-west region, which was kind of an issue considering where our real users were (screenshot courtesy of Cloudflare DNS traffic):

As we can see, an important part of our traffic comes from the US. Good news: it was time to try multi-region Kubernetes. Here is a step-by-step guide on how to do it.

New Kubernetes cluster region on GCE

First things first, let's create a new Kubernetes cluster on the US west coast:

You should really (really) enable the preemptible nodes feature. Why? Because you get chaos engineering for free and continuously test your application architecture and configuration for robustness! Preemptible VMs mean that cluster nodes won't last more than 24 hours, and good news: I observed that it cut the cluster cost in half.
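
If you prefer the CLI to the console, the equivalent gcloud command looks roughly like this (machine type and node count are illustrative, tune them to your workload):

gcloud container clusters create image-charts-us-west1 \
  --zone us-west1-b \
  --num-nodes 3 \
  --machine-type n1-standard-2 \
  --preemptible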

Get a static IP from GCE

Next we need to reserve a new IP address on GCE. We will associate this IP with our Kubernetes service so that it is always exposed on the same static address.

gcloud compute addresses create image-charts-us-west1 --region us-west1

Wait for it...

gcloud compute addresses list
NAME                       REGION        ADDRESS         STATUS
image-charts-europe-west1  europe-west1  35.180.80.101   RESERVED
image-charts-us-west1      us-west1      35.180.80.100   RESERVED

Kubernetes YAML

The app YAML below defines multiple Kubernetes objects: a namespace (app-ns), a service that exposes our app to the outside world (my-app-service), our app deployment (my-app-deployment), auto-scaling through a horizontal pod autoscaler and finally a pod disruption budget for our app.

# deploy/k8s.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: app-ns
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: my-app
    zone: __ZONE__
  name: my-app-service
  namespace: app-ns
spec:
  ports:
  - name: "80"
    port: 80
    targetPort: 8080
    protocol: TCP
  selector:
    app: my-app
  type: LoadBalancer
  loadBalancerIP: __IP__
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  namespace: app-ns
  name: my-app-deployment
  labels:
    app: my-app
spec:
  replicas: 3
  # how much revision history of this deployment you want to keep
  revisionHistoryLimit: 2
  strategy:
    type: RollingUpdate
    rollingUpdate:
      # specifies the maximum number of Pods that can be unavailable during the update
      maxUnavailable: 25%
      # specifies the maximum number of Pods that can be created above the desired number of Pods
      maxSurge: 200%
  template:
    metadata:
      labels:
        app: my-app
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              topologyKey: "kubernetes.io/hostname"
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - my-app
      restartPolicy: Always
      containers:
      - image: __IMAGE__
        name: my-app-runner
        resources:
          requests:
            memory: "100Mi"
            cpu: "1"
          limits:
            memory: "380Mi"
            cpu: "2"
        # send traffic when:
        readinessProbe:
          httpGet:
            path: /_ready
            port: 8081
          initialDelaySeconds: 20
          timeoutSeconds: 4
          periodSeconds: 5
          failureThreshold: 1
          successThreshold: 2
        # restart container when:
        livenessProbe:
          httpGet:
            path: /healthz
            port: 8081
          initialDelaySeconds: 20
          timeoutSeconds: 2
          periodSeconds: 2
          # if we just started the pod, we need to be sure it works
          # so it may take some time and we don't want to rush things up (thx to rolling update)
          failureThreshold: 5
          successThreshold: 1
        ports:
        - name: containerport
          containerPort: 8080
        - name: monitoring
          containerPort: 8081
---
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
  namespace: app-ns
spec:
  scaleTargetRef:
    apiVersion: extensions/v1beta1
    kind: Deployment
    name: my-app-deployment
  minReplicas: 3
  maxReplicas: 5
  targetCPUUtilizationPercentage: 200
---
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb
  namespace: app-ns
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: my-app

Some things of note:

__IMAGE__, __IP__ and __ZONE__ will be replaced during the gitlab-ci deploy stage (see the Continuous Delivery Pipeline section below).

  kind: Service
  # [...]
  type: LoadBalancer
  loadBalancerIP: __IP__

type: LoadBalancer and loadBalancerIP tell Kubernetes to create a TCP Network Load Balancer (the IP must be a regional one, which is exactly what we reserved above).

It's always good practice to split monitoring and production traffic; that's why monitoring (liveness and readiness probes) is on port 8081 and production traffic is on port 8080.
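
If you want to check the probe endpoints by hand, port-forwarding the monitoring port is enough (kubectl 1.10+ can port-forward directly to a deployment):

kubectl port-forward --namespace=app-ns deploy/my-app-deployment 8081:8081 &
curl http://localhost:8081/_ready
curl http://localhost:8081/healthz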

Another good practice is to set up podAntiAffinity to tell the Kubernetes scheduler to spread our app's pod replicas across the available nodes instead of packing them all onto the same node.

Each container should also declare how much CPU and memory it needs, and the good news is that this principle is easily applied in Kubernetes thanks to resource requests and limits.

Finally, we use a PodDisruptionBudget to tell the Kubernetes scheduler how many pods must stay available when disruptions occur (e.g. when one of our preemptible nodes dies).
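
Once everything is applied (see the Continuous Delivery Pipeline section below), a few kubectl commands are enough to verify that these objects behave as expected:

kubectl get pods --namespace=app-ns -o wide   # pods should be spread across nodes (anti-affinity)
kubectl get hpa --namespace=app-ns            # current vs target CPU utilization
kubectl get pdb --namespace=app-ns            # how many disruptions are currently allowed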

Continuous Delivery Pipeline

I used GitLab CI here because that's what powers Image-Charts & Redsmin deployments, but it will work with other alternatives as well.

# gitlab.yaml

# https://docs.gitlab.com/ce/ci/yaml/
image: docker

# fetch is faster as it re-uses the project workspace (falling back to clone if it doesn't exist).
variables:
  GIT_STRATEGY: fetch
  # We use overlay for performance reasons
  # https://docs.gitlab.com/ce/ci/docker/using_docker_build.html#using-the-overlayfs-driver
  DOCKER_DRIVER: overlay

services:
  - docker:dind

stages:
  - build
  - test
  - deploy

# template that we can reuse
.before_script_template: &setup_gcloud
  image: lakoo/node-gcloud-docker:latest
  retry: 2
  variables:
    # default variables
    ZONE: us-west1-b
    CLUSTER_NAME: image-charts-us-west1
  only:
    # only run on the master branch
    - /master/
  before_script:
    - echo "Setting up gcloud..."
    - echo $GCLOUD_SERVICE_ACCOUNT | base64 -d > /tmp/$CI_PIPELINE_ID.json
    - gcloud auth activate-service-account --key-file /tmp/$CI_PIPELINE_ID.json
    - gcloud config set compute/zone $ZONE
    - gcloud config set project $GCLOUD_PROJECT
    - gcloud config set container/use_client_certificate False
    - gcloud container clusters get-credentials $CLUSTER_NAME
    - gcloud auth configure-docker --quiet
  after_script:
    - rm -f /tmp/$CI_PIPELINE_ID.json

build:
  <<: *setup_gcloud
  stage: build
  retry: 2
  script:
    - docker build --rm -t ${DOCKER_REGISTRY}/${PACKAGE_NAME}:${CI_COMMIT_REF_NAME}-${CI_COMMIT_SHA} .
    - mkdir release
    - docker push ${DOCKER_REGISTRY}/${PACKAGE_NAME}:${CI_COMMIT_REF_NAME}-${CI_COMMIT_SHA}

test:
  <<: *setup_gcloud
  stage: test
  coverage: '/Lines[^:]+\:\s+(\d+\.\d+)\%/'
  retry: 2
  artifacts:
    untracked: true
    expire_in: 1 week
    name: "coverage-${CI_COMMIT_REF_NAME}"
    paths:
     - coverage/lcov-report/
  script:
    - echo 'edit me'

.job_template: &deploy
  <<: *setup_gcloud
  stage: deploy
  script:
    - export IMAGE=${DOCKER_REGISTRY}/${PACKAGE_NAME}:${CI_COMMIT_REF_NAME}-${CI_COMMIT_SHA}
    - cat deploy/k8s.yaml | sed s#'__IMAGE__'#$IMAGE#g | sed s#'__ZONE__'#$ZONE#g | sed s#'__IP__'#$IP#g > deploy/k8s-generated.yaml
    - kubectl apply -f deploy/k8s-generated.yaml
    - echo "Waiting for deployment..."
    - (while grep -v "successfully" <<<"$A"; do A=`kubectl rollout status --namespace=app-ns deploy/my-app-deployment`; echo "$A"; sleep 1; done);
  artifacts:
    untracked: true
    expire_in: 1 week
    name: "deploy-yaml-${CI_COMMIT_REF_NAME}"
    paths:
     - deploy/

deploy-us:
  <<: *deploy
  variables:
    ZONE: us-west1
    CLUSTER_NAME: image-charts-us-west1
    IP: 35.180.80.100

deploy-europe:
  <<: *deploy
  variables:
    ZONE: europe-west1
    CLUSTER_NAME: image-charts-europe-west1
    IP: 35.180.80.101

Here are the environment variables to setup in Gitlab-CI pipeline settings UI:

DOCKER_REGISTRY (e.g. eu.gcr.io/project-100110): the container registry to use; the easiest way is to leverage the GCE container registry.

GCLOUD_PROJECT (e.g. project-100110): google cloud project name.

GCLOUD_SERVICE_ACCOUNT: a base64-encoded service account JSON key. When creating a new service account in the GCE console you will get a JSON file like this:

{
  "type": "service_account",
  "project_id": "project-100110",
  "private_key_id": "1000000003900000ba007128b770d831b2b00000",
  "private_key": "-----BEGIN PRIVATE KEY-----\nMIIEvgIBADANBgkqhkiG9w0BAQE\nhOnLxQa7qPrZFLP+2S3RaSudsbuocVo4byZH\n5e9gsD7NzsD/7ECJDyInbH8+MEJxFBW/yYUX6XHM/d\n5OijyIdA4+NPo6KpkJa2WV8I/KPtoNLSK7d6oRdEAZ\n7ECJDyInbH8+MEJxFBW/yYUX6XHM/d\n5OijyIdA4+NPo6KpkJa2WV8I/KPtoNLSK7d6oRdEAZ\n7ECJDyInbH8+MEJxFBW/yYUX6XHM/d\n5OijyIdA4+NPo6KpkJa2WV8I/KPtoNLSK7d6oRdEAZ\n7ECJDyInbH8+MEJxFBW/yYUX6XHM/d\n5OijyIdA4+NPo6KpkJa2WV8I/KPtoNLSK7d6oRdEAZ\n7ECJDyInbH8+MEJxFBW/yYUX6XHM/d\n5OijyIdA4+NPo6KpkJa2WV8I/KPtoNLSK7d6oRdEAZ\n7ECJDyInbH8+MEJxFBW/yYUX6XHM/d\n5OijyIdA4+NPo6KpkJa2WV8I/KPtoNLSK7d6oRdEAZ\n7ECJDyInbH8+MEJxFBW/yYUX6XHM/d\n5OijyIdA4+NPo6KpkJa2WV8I/KPtoNLSK7d6oRdEAZ\n7ECJDyInbH8+MEJxFBW/yYUX6XHM/d\n5OijyIdA4+NPo6KpkJa2WV8I/KPtoNLSK7d6oRdEAZ\n7ECJDyInbH8+MEJxFBW/yYUX6XHM/d\n5OijyIdA4+NPo6KpkJa2WV8I/KPtoNLSK7d6oRdEAZ\n7ECJDyInbH8+MEJxFBW/yYUX6XHM/d\n5OijyIdA4+NPo6KpkJa2WV8I/KPtoNLSK7d6oRdEAZ\n7ECJDyInbH8+MEJxFBW/yYUX6XHM/d\n5OijyIdA4+NPo6KpkJa2WV8I/KPtoNLSK7d6oRdEAZ\n\n-----END PRIVATE KEY-----\n",
  "client_email": "gitlab@project-100110.iam.gserviceaccount.com",
  "client_id": "118000107442952000000",
  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
  "token_uri": "https://accounts.google.com/o/oauth2/token",
  "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
  "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/gitlab%40project-100110.iam.gserviceaccount.com"
}
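
If you prefer the CLI over the console, the key file can also be generated with gcloud (the service account name and project below come from this example, yours will differ):

# assumes a "gitlab" service account already exists with the roles it needs (GKE + Container Registry)
gcloud iam service-accounts keys create service_account.json \
  --iam-account gitlab@project-100110.iam.gserviceaccount.com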

To make it work on our CI, we need to transform it to base64 in order to inject it as an environment variable.

cat service_account.json | base64
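
Depending on your base64 implementation the output may be wrapped every 76 characters, which does not play well with a single-line environment variable; piping through tr strips the line breaks:

cat service_account.json | base64 | tr -d '\n'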

PACKAGE_NAME (e.g. my-app): container image name.

The YAML syntax below defines a template...

.before_script_template: &setup_gcloud

... that we can then reuse with:

<<: *setup_gcloud

I'm particularly fond of the one-liner below that makes the GitLab CI job wait for deployment completion:

(while grep -v "successfully" <<<"$A"; do A=`kubectl rollout status --namespace=app-ns deploy/my-app-deployment`; echo "$A"; sleep 1; done);

It's so simple and does the job.
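
For reference, kubectl rollout status already blocks until the rollout finishes or fails, so if you don't need the retry behaviour a single call can be enough:

kubectl rollout status --namespace=app-ns deploy/my-app-deployment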

Note: the above pipeline only runs when commits are pushed to master; I removed the code for other environments for simplicity.

The important part is:

deploy-us:
  <<: *deploy
  variables:
    ZONE: us-west1
    CLUSTER_NAME: image-charts-us-west1
    IP: 35.180.80.100

This is where the magic happens: this job includes the shared deploy job template (which itself includes the setup_gcloud template) and specifies three variables (ZONE, CLUSTER_NAME and the IP address to expose). That alone is enough to make our pipeline deploy to multiple Kubernetes clusters on GCE. Bonus: we store the generated YAML as an artifact for future inspection.

After pushing our code you should see something like this:

Now let's connect to one of our Kubernetes clusters. Tip: to easily switch between kubectl contexts I use kubectx.
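
kubectx lists the contexts created by gcloud container clusters get-credentials and lets you jump between them (the context name below is illustrative):

kubectx                                                        # list available contexts
kubectx gke_project-100110_us-west1-b_image-charts-us-west1    # switch to the us-west1 cluster
kubectx -                                                      # switch back to the previous context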

# on us-west1 cluster
kubectl get service/my-app-service -w --namespace=app-ns
NAME                          TYPE           CLUSTER-IP      EXTERNAL-IP      PORT(S)        AGE
my-app-service                LoadBalancer   10.59.250.180   35.180.80.100    80:31065/TCP   10s

From this point you should be able to reach both public static IPs with curl.
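
For example, a quick check against the two reserved IPs from earlier:

curl -sI http://35.180.80.100/ | head -n 1
curl -sI http://35.180.80.101/ | head -n 1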

Configuring Geo-DNS

Same warning as before: I used Cloudflare for the geo-DNS part but you could use another provider. It costs $25/mo to get basic geo-DNS load balancing between regions: basic load balancing ($5/mo), up to 2 origin servers ($0/mo), checks every 60 seconds ($0/mo), checks from 4 regions ($10/mo), geo routing ($10/mo).

Once activated, create two pools (one per region, using the static IPs we reserved earlier):

... then create a monitor and finally the load balancer that distributes load based on the visitor's geolocation.

And... we're done 🎉

4/15/2018

Sync/backup entire directories to personal Gitlab projects

Since ~2000 I have always set up a "/labs" folder on new computers. I use it to quickly start new projects and experiments.

Even though I have continuous backup in place through Backblaze, I wanted to be able to quickly sync/backup my non-public & non-open-source projects to Gitlab... and that's what this script does.

Check out sync-to-gitlab on GitHub

4/07/2018

Migrated the Image-Charts website from gulp to parcel, the pictures say it all

I spent only 2 hours tonight migrating the Image-Charts website from gulp to the amazing Parcel.

Before:

After:

Before:

After:

WAY less code: one CLI that automatically built the dependency tree starting from my main pug file, installed the build packages it needed (pug and node-sass) and produced a state-of-the-art dist directory. Fast by design.
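
For reference, with Parcel 1 the whole build boils down to something like this (entry file and output directory names are illustrative):

# dev server with hot reload, then a production build
parcel index.pug
parcel build index.pug --out-dir dist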

3/07/2018

cargo publish - invalid pack file - invalid packfile type in header error

I had this weird cargo error while trying to publish a new Mailchecker Rust crate version and could not find anything about it on Google.

Updating registry `https://github.com/rust-lang/crates.io-index`
error: failed to update registry `https://github.com/rust-lang/crates.io-index`

Caused by:
  failed to fetch `https://github.com/rust-lang/crates.io-index`

Caused by:
  invalid pack file - invalid packfile type in header; class=Odb (9)

I removed the local registry copy and it worked:

rm -rf ~/.cargo/registry
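
A less drastic option that usually suffices is to drop only the registry index and keep the already-downloaded crates in registry/cache:

rm -rf ~/.cargo/registry/index
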
3/03/2018

[Recap] 2017 through my Github repositories and social media posts

I know it's a little late, but I wondered this morning: wouldn't it be nice to make a recap of 2017 using only my Github, Instagram and Twitter accounts? So here we are!

I started 2017 playing with HTML Audio API, building real-time sound analysis and visualizations.

But this was mostly for entertainment; on the business side I was working on the architecture of the upcoming KillBug SaaS...

... and the user workflow with @EmmanuelJulliot.

I finally bought a LaMetric and started to play with it...

Later that year, during (another!) breakfast I built rss-to-lametric, which displays the latest RSS news directly on the LaMetric!

The NodeJS version was great but not good enough. I rewrote it in Rust two days later, released it on Github and published multiple LaMetric apps with it!

... without knowing I would receive a cease-and-desist letter from 20 Minutes because I used the 20 Minutes logo; copyright stupidity.

In March I redesigned fgribreau.com so it could share testimonials about my advisory activities. The site source-code is now hosted on Github and served through Netlify.

Still in March, we deployed KillBug on a Google Cloud Kubernetes cluster and I made a little cost-parsing tool for it (now I use ReOptimize for that).

In April I sold the RedisWeekly newsletter to Redis Labs, a newsletter I started in 2013 that reached thousands of subscribers.

I also started to get back into woodworking...

Had a lot fun building a laptop stand this weekend! It's built entirely out of raw pallet wood (pine), tailor-made, very robust and way better by design than industry alternatives (weak, plastic, not enough space for ventilation). Some followers on twitter are interested to buy some, would you? What you see here is version 1 usable for 13" and 15" macbook pro and I plan to remove every piece of steel so it will only be made of recycled pallet wood. With models for 13" 15" and 17" laptop and various colors (raw, dark oak dye, light oak dye). Anyway, what a perfect way to change my mind between the preparation of 2 new talks for #BreizhCamp this week (#Swagger #ImageCharts #automation #PostgreSQL #Postgrest and #GraphQL, more on that later). Oh! One last thing, the new SaaS I talked about in the previous shots is going well thanks to our amazing team, beta registration will be available in 3 weeks top!

A post shared by François-Guillaume Ribreau (@fgribreau)

At that time I did not know that I would build my very own table in August!

In the same month I left iAdvize to join Ouest-France as head of digital development and architect. Elon Musk's biography was an awesome read.

... and I also read that

... and this one ...

... and this one...

... and this one...

... and these ones (and lots of other books that I did not share on Instagram 😅)...

Ten days after my arrival at Ouest-France I "hacked into" Active Directory to map the organisational chart because I could not find an up-to-date one, and here is the result I got...

Another thing that bothered me was that some internal software at Ouest-France (e.g. Redmine) was not available programmatically from outside, because an old version of Microsoft ISA Server redirected to an auth form and thus blocked requests. So I wrote a little tool that proxies requests and rewrites them to pass through the Microsoft ISA Server gateway, finally letting us automate things on top of internal software like Redmine.

In the last three months of every year since 2014, I always work on improving HeyBe. HeyBe is a SaaS that lets companies generate personalized videos at scale for their customers. This year, I migrated the HeyBe video generation cluster to Google Cloud and wrote my own Google Cloud instance group auto-scaler in Rust for our custom needs.

I released a simple Google Cloud API client in Rust...

... and a NodeJS stringify equivalent in Rust.

The HeyBe 2017 release was then ready for prime time!

Moving the cluster to Google Cloud largely reduced HeyBe's fixed costs as well.

For the past 5 years I have kept following the hacker philosophy and sharing what I learn; here I gave a lecture at Nantes University about NoSQL.

... and there, one at Ouest-France about the Kubernetes philosophy and containers.

I worked on KillBug and wrote code for the animated background...

... and discovered that Google Cloud was blocking outbound connections to ports 25, 465 and 587....

... so I wrote a little SMTP proxy to fix it...

... and also built a PostgreSQL to AMQP tool that forwards some pg_notify events to AMQP (RabbitMQ) for KillBug.

Some weeks later we opened KillBug's first private beta.

I kept improvising on the piano as well...

... and really enjoyed it ...

In June, like every year (!), I went to Web2Day to give a talk about SaaS industrialization with OpenAPI/Swagger.

Tesla X ! #soonTesla3 #tesla

A post shared by François-Guillaume Ribreau (@fgribreau)

Speaking of talks, I had the chance to give one (very unprepared) at Maia Matter in September, and two at Breizh Camp, my first time attending.

During a trip to Paris I rewrote my little NodeJS kubectx Kubernetes helper in Rust and saw impressive performance improvements.

One August night I made my own clap recognition script with NodeJS and Philips Hue!

2017 was also the year I went back to electronics, bringing back childhood memories 😇

In October I started eating Huel every morning; as I'm having my breakfast writing these lines, I still do it today. It was a very impressive lifestyle change and I have felt way more connected every day since.

In October I also added custom domain name support to the Image-Charts enterprise plan!

Oh! And I also bought some Raspberry Pi and hacked a Magic Mirror!

In the meantime, during that year I also wrote some code in Rust, like this Rust spinner library...

While waiting for my Tesla 3, I bought a BMW X3 and asked them to send me the source code and explain how to register as a developer to build widgets for the onboard computer (but failed miserably!)

However, I got the source code on a DVD AND pushed it to gitlab.com/fgribreau/bmw_oss!

During the December holidays I hacked on Ouest-France platform tooling...

... wrote a JSON-schema documentation-generator

... wrote some BlockProvider examples (I will give a talk at Breizh Camp about what they are, how they are starting to help bring change to a large organization like Ouest-France, and what an agnostic CMS is 😉)

And a validator-cli for the upcoming Ouest-France platform

Lots of other things were released, updated or experimented with during 2017. Let's make 2018 even better than 2017!
