Experimenting with Compute@Edge
Since its inception, I’ve been serving this website from a single DigitalOcean droplet in the San Francisco region. This approach has served me well, providing a simple way to run this site alongside my personal Nextcloud instance and any other services I want to experiment with.
However, I’ve recently been reviewing how to modernize this relatively simple static site for better performance and security. Serving out of a single virtual server is suboptimal for a few obvious reasons:
- It’s a special snowflake: it can go down at any time and become unavailable
- I need to consistently stay on top of security configuration and updates
- Users requesting static content from the site get poor performance anywhere far from the US West Coast
A New Future
Although many of these problems can be solved with a legacy CDN, I’ve recently been interested in Fastly’s Compute@Edge product. Compute@Edge is one of several offerings in the serverless application space that let you run code as close to the user as possible (with some limitations) without building and maintaining your own edge infrastructure. Unlike competitors such as Cloudflare Workers or CloudFront Functions, Compute@Edge offers tighter integration with WebAssembly and a better developer experience, powering portable, secure, and highly performant applications at the edge.
These approaches can combine the performance and resiliency benefits of a CDN with the ability to run an application without an origin, meaning no more virtual servers or traditional web hosting providers are necessary to serve my site. I also no longer need to worry about security patching or maintenance beyond that of my application code.
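To make that concrete, here is a minimal sketch of an origin-less application in the style of Fastly’s JavaScript SDK, @fastly/js-compute (illustrative only, not my actual site code):

```javascript
/// <reference types="@fastly/js-compute" />

// Run at the edge for every incoming request.
addEventListener("fetch", (event) => event.respondWith(handleRequest(event)));

async function handleRequest(event) {
  // The response is synthesized entirely at the edge: no origin server,
  // no virtual machine, nothing for me to patch.
  return new Response("Hello from the edge!", {
    status: 200,
    headers: { "Content-Type": "text/plain; charset=utf-8" },
  });
}
```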
As for cost, I expect my site to easily fit within the free trial for Compute@Edge, and I’m completely happy in the unlikely event that it is unavailable due to a spike in usage.
Fastly provides a really neat utility, compute-js-static-publish, which bundles any set of static files into a Compute@Edge application that can be published on their platform. Setting up my Hugo site was extremely easy, and I was able to automatically provision a free, auto-renewing Let’s Encrypt TLS certificate for my application within Fastly’s web interface.
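Conceptually, the generated application embeds the site’s files into the Wasm package and serves them from a fetch handler, along these lines (a sketch only: the module and names such as `staticAssets` are hypothetical stand-ins for what compute-js-static-publish actually generates):

```javascript
// Illustrative only: compute-js-static-publish emits the real bundling and
// lookup code; this module and its exports are hypothetical stand-ins.
import { staticAssets } from "./statics.js";

addEventListener("fetch", (event) => event.respondWith(handleRequest(event)));

async function handleRequest(event) {
  const path = new URL(event.request.url).pathname;
  // Map "/" and other directory paths to their index documents.
  const key = path.endsWith("/") ? `${path}index.html` : path;
  const asset = staticAssets[key];
  if (asset == null) {
    return new Response("Not found", { status: 404 });
  }
  return new Response(asset.body, {
    status: 200,
    headers: { "Content-Type": asset.contentType },
  });
}
```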
I could test my application locally and, once I had validated that it worked as expected, deploy it to Fastly’s edge network within seconds. The power of Wasm means this is possible without caring about the runtime environment, including the hardware the application was built on or the hardware it ultimately runs on.
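For what it’s worth, that loop is just two Fastly CLI commands: `fastly compute serve` to build and run the Wasm package locally, and `fastly compute publish` to build and deploy it to the edge network.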
I have yet to profile the performance of the new site against the old, but I expect the difference for my own usage to be marginal, with the real improvements coming for users far from the old California-based origin.
A Future Future
Although this use case is quite simple, I am excited about the potential of Wasm-based compute in general. Serverless solutions built on interpreted languages (e.g. JavaScript and Python) or on containers and microVMs (e.g. Docker, Firecracker) were first to market, but they are somewhat limited in the performance, security, and operability departments. Wasm mostly lacks portability and developer tooling at the moment, though that is rapidly improving. The toolchain that gives Wasm its particular power in this department is the WebAssembly System Interface (WASI), which provides an abstraction over system-level interfaces; on top of it, an API for ML inference and training, wasi-nn, is currently being explored.
In the ML space, there is a rise in bespoke hardware accelerators which currently incentivize vendor lock-in. Wasm has the potential to provide an amazing abstraction between these accelerators and the ecosystem of frameworks that enable ML developers to be productive.
I’m also excited about the potential of combining Wasm with Kubernetes. At some point I expect the combination of wasi-nn and Kubernetes to become very interesting for ML infrastructure applications, as well as for the repeatable execution of workloads across increasingly heterogeneous architectures in general.