Notes on using the AWS CDK with Clojure for CloudFormation deployments

July 18, 2022

The AWS CDK is ostensibly an Infrastructure as Code (IaC) tool for “defining cloud infrastructure and provisioning it through AWS CloudFormation”, but it’s really two things:

A TypeScript library with Python, Java, and C# bindings, which represents cloud resources as objects and defines relationships among them. This library is used to construct a tree of resources, which can be “synthesized” into a cloud assembly. The cloud assembly is a self-contained directory that includes the CloudFormation templates and supporting assets, which collectively represent your infrastructure.
A CLI tool for working with the cloud assembly directory. The CLI tool can deploy a cloud assembly to AWS and update or destroy deployments, all via AWS CloudFormation. It also invokes the program that you wrote to build the cloud assembly, although this function is ancillary.

Since there are Java bindings (albeit generated from the TypeScript source), you can use Clojure to build the cloud assembly that the CLI tool then uses as input.

Alternatives to the CDK

I considered some other approaches for orchestrating non-trivial AWS deployments, and came away with the following impressions:

The CDK is not the only — or necessarily the best — approach, though it seems to be a perfectly valid choice. Many use Terraform for IaC (especially for more involved builds), and it seems that AWS CDK users predominantly use the TypeScript bindings, which makes sense because this is the language AWS uses to develop the CDK, and code samples abound.
You could also use raw CloudFormation templates in YAML or JSON, which might be a good choice for less involved infrastructure, since with anything complicated it’s easy to accumulate hundreds of lines of template code. One thing that’s nice about the CDK in comparison is that all the tools already present in your coding environment (auto-complete, code / interface discovery, etc.) transfer.
In any case, cloud infrastructure should really always be defined and provisioned through either a template spec (like raw CloudFormation) or IaC. Provisioning resources imperatively, whether in a build script that calls the AWS API at each stage or by hand in the AWS web console, is strictly further from the efficient frontier of features versus complexity.
Incidentally, the pattern of computing a data structure describing some set of side effects — instead of performing each side effect one by one — is familiar, idiomatic Clojure, and brings with it the familiar benefits.

Project setup

Clojure usage is broadly similar to the Java usage described in the docs:

Set up a new CDK Java project

$ mkdir cdk-app
$ cd cdk-app
$ cdk init --language java

Get rid of Maven / Java
```
$ rm -rf src/ pom.xml
```

Add a Clojure deps.edn file referencing the Java AWS CDK

{:paths ["src/"]
 :deps  {org.clojure/clojure                {:mvn/version "1.11.1"}
         software.amazon.awscdk/aws-cdk-lib {:mvn/version "2.33.0"}}}

Add a Clojure source file at src/core.clj

(ns core)

(defn synth
  "Synthesize cloud assembly to the dir at the CDK_OUTDIR env var."
  [& _])

Update the “app” property in the cdk.json file to invoke the Clojure synth function, replacing the auto-generated Maven command with a clj invocation
```
{"app": "clj -X core/synth", ...}
```

Now if you run cdk synth, you’ll see an error that there’s nothing in the cdk.out/ directory. At this point, just follow the AWS CDK Java documentation to fill in the body of core/synth with some code that will generate cdk.out, and you’re good to go.

See this code sample [github gist] for more information. It’s from my project, and out of context, so you might have to read between the lines a bit.

From here, I followed along with this tutorial from Nathan Peck, translating his TypeScript examples to Clojure while filling out the synth function. The final version of his TypeScript CDK code is here.

Note on translating TypeScript CDK examples to Clojure

Note that, while one could translate literally from TypeScript and subclass the Java CDK classes with proxy …

(proxy
  [software.amazon.awscdk.App]
  [(-> (software.amazon.awscdk.AppProps/builder) (.build))])

the recommended Java idioms will of course feel nicer.

(ns core
  (:import (software.amazon.awscdk App Stack)))

(defn synth [_]
  (let [app (App.)]
    ;; Register a stack under the App root
    (Stack. app "MyStackId")

    ;; ... register more things under the created stack ...

    ;; Synthesize the app (the root of the tree of constructs) to cdk.out
    (println "Synthesized to:" (.getDirectory (.synth app)))))
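As a rough sketch of what "register more things" might look like with the builder idiom, the following adds an S3 bucket under the stack. The function name synth-with-bucket, the construct ID "MyBucket", and the choice of a versioned bucket are illustrative assumptions, not taken from the original project; the sketch also assumes the aws-cdk-lib dependency from deps.edn above.

```clojure
(ns core
  (:import (software.amazon.awscdk App Stack)
           (software.amazon.awscdk.services.s3 Bucket$Builder)))

(defn synth-with-bucket
  "Hypothetical variant of synth that registers one bucket under the stack.
  Returns the directory the cloud assembly was written to."
  [_]
  (let [app   (App.)
        stack (Stack. app "MyStackId")]
    ;; Nested Java classes are referenced with $, so Bucket.Builder becomes
    ;; Bucket$Builder. Passing the stack as the scope places the bucket
    ;; beneath it in the construct tree.
    (-> (Bucket$Builder/create stack "MyBucket")
        (.versioned true)
        (.build))
    (let [dir (.getDirectory (.synth app))]
      (println "Synthesized to:" dir)
      dir)))
```

The construct registers itself with its scope when it is built, which is why the return value of the builder chain can be discarded.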

Using the REPL

Instead of cdk synth, you can build a cloud assembly from the REPL

(import '(software.amazon.awscdk App AppProps Stack))

;; In this case, it's helpful to set the output dir explicitly
(let [app (App. (-> (AppProps/builder) (.outdir "cdk.out") (.build)))]
  (Stack. app "MyStackId")
  (println "Synthesized to:" (.getDirectory (.synth app))))

and use the --app flag to point the CDK CLI at an existing directory and skip the synth stage.

$ cdk list --app cdk.out/
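Because the synthesized cloud assembly is just data, the REPL is also a convenient place to look at the generated template before handing the directory to the CLI. A minimal sketch, assuming the same aws-cdk-lib dependency; the helper name stack-template is hypothetical:

```clojure
(import '(software.amazon.awscdk App AppProps Stack))

(defn stack-template
  "Synthesizes app and returns the CloudFormation template generated for
  the stack with the given ID (hypothetical helper, not from the post)."
  [app stack-id]
  (-> (.synth app)
      (.getStackByName stack-id)
      (.getTemplate)))

(let [app (App. (-> (AppProps/builder) (.outdir "cdk.out") (.build)))]
  (Stack. app "MyStackId")
  ;; Evaluates to the template for MyStackId, ready to inspect at the REPL
  (stack-template app "MyStackId"))
```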

I think the best case would be the ability to do this from Babashka. This might require a custom pod, though.

Core concepts

After playing around a bit, I found it extremely helpful to read the documentation on the CDK’s core concepts.

Notes from the documentation

The AWS CDK uses the IDs of all constructs in the path from the tree’s root to each child construct to generate the unique IDs required by AWS CloudFormation. This approach means that construct IDs need be unique only within their scope, rather than within the entire stack as in native AWS CloudFormation. It does, however, mean that if you move a construct to a different scope, its generated stack-unique ID will change, and AWS CloudFormation will no longer consider it the same resource.
Resources that maintain persistent data, such as databases and Amazon S3 buckets and even Amazon ECR registries, have a removal policy that indicates whether to delete persistent objects when the AWS CDK stack that contains them is destroyed.
In general, we recommend against using AWS CloudFormation parameters with the AWS CDK.
The whole treatment of assets is pretty cool.
To specify a Java jar as an asset for a lambda, place the jar inside a directory and specify the directory as the asset. This is because the jar format is the zip format, so a jar that isn’t nested in a directory would be extracted.
You can modify the raw CloudFormation that will be generated, if necessary.
Consider using the AwsCustomResource. This makes it possible to perform arbitrary SDK calls during an AWS CloudFormation deployment. Otherwise, you should write your own Lambda function to perform the work you need to get done.
Be careful about any refactor of your AWS CDK code that could cause the ID to change, and write unit tests that assert that the logical IDs of your stateful resources remain static.
Consider keeping stateful resources (like databases) in a separate stack from stateless resources. You can then turn on termination protection on the stateful stack, and can freely destroy or create multiple copies of the stateless stack without risk of data loss.
Credentials for production environments should be short-lived. After bootstrapping and initial provisioning, there is no need for developers to have account credentials at all; changes can be deployed through the pipeline. Eliminate the possibility of credentials leaking by not needing them in the first place!
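
Several of these notes (removal policies, termination protection on a separate stateful stack, and tests that pin logical IDs) can be sketched together in Clojure. This is one possible arrangement under stated assumptions, not code from the documentation: the namespace, the function names, and the IDs "StatefulStack" and "DataBucket" are all hypothetical, and the logical-ID lookup assumes the assertions module that ships with aws-cdk-lib.

```clojure
(ns stateful-sketch
  (:import (software.amazon.awscdk App RemovalPolicy Stack StackProps)
           (software.amazon.awscdk.assertions Template)
           (software.amazon.awscdk.services.s3 Bucket$Builder)))

(defn stateful-stack
  "Builds a hypothetical stack that holds only stateful resources."
  [app]
  (let [props (-> (StackProps/builder)
                  ;; Guard the stack itself against accidental deletion
                  (.terminationProtection true)
                  (.build))
        stack (Stack. app "StatefulStack" props)]
    ;; RETAIN asks CloudFormation to keep the bucket when the stack
    ;; that contains it is destroyed
    (-> (Bucket$Builder/create stack "DataBucket")
        (.removalPolicy RemovalPolicy/RETAIN)
        (.build))
    stack))

(defn bucket-logical-ids
  "Returns the set of logical IDs of all S3 buckets in the stack's template."
  [stack]
  (-> (Template/fromStack stack)
      (.findResources "AWS::S3::Bucket")
      (keys)
      (set)))
```

A unit test could then call (bucket-logical-ids (stateful-stack (App.))) and compare the result with a set recorded from a known-good synth, so that a refactor which moves the bucket to a different scope fails the test instead of silently replacing the resource.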

A mishmash. Algo trading & market making @ sixtant.io. I also practice parkour.

mjdowney@protonmail.ch | public key