Streaming with Remix on Google Cloud Platform (App Engine Flex & Cloud Run)

Leejjon
11 min read · Jun 9, 2024

There are many blog posts that talk about the benefits that HTTP Streaming brings to React.js apps. But I have not seen one that shows how to get a Remix app to stream responses on Google Cloud Platform. How do you know if your host even supports streaming? How do you see that your deployed app is streaming instead of sending a buffered response? If you’re looking for that information, read on!

Why am I researching this?

After I figured out how to migrate my React app, which was created with the Create React App tool, to a Remix app (which I wrote about in this blog post), I ran into a “problem” that forced me to reconsider which service I use to host my React app.

I always want to be able to use the latest and greatest features. One of the new features in React 18 was that the React frameworks could use HTTP streaming on the server to send data to the client and use Suspense to use this data in your React components (here’s a good link if you don’t fully follow what I mean in the line above).

The Remix framework uses HTTP streaming to prevent the request waterfalls that Single Page Apps tend to have when fetching data. This older “When to Fetch” talk from Ryan Florence (he works on React Router and Remix) does an amazing job of explaining the request waterfall problem and how Remix simplifies data loading and makes it very fast.

That talk really sold me on using Remix. But this warning in the streaming documentation from Remix made me wonder if my host even supported it:

How do you know which hosting providers support HTTP streaming?

There isn’t really a list of supported platforms in the Remix documentation. That makes sense: there is an insane number of possible hosting providers around the world, and it’s not their job to keep track of who supports HTTP streaming and who doesn’t. You will have to read the documentation of your hosting provider. Or if it is unclear, test it!

My hosting provider is Google Cloud Platform, and I’m running my React application on the Node.js runtime of App Engine Standard.

Unfortunately, the App Engine Standard docs write:

My alternatives in Google Cloud Platform

Call me a vendor locked in loser if you want, but I’m looking to stay on GCP if possible. Otherwise I would have to migrate my Google Datastore database and domains to something else and I’m used to the nice Logging and CI/CD tools that GCP offers out of the box.

  • App Engine Flexible is a more custom version of App Engine. Their documentation does mention a setting that could make it “send bytes directly to the client”. After testing I found it does do HTTP streaming. This is great because I can run App Engine Flex services next to my App Engine Standard services behind the same domain. The downside is that you always have to have one instance of App Engine Flex running 24/7. App Engine Flex is a lot more expensive than App Engine Standard for this reason. App Engine Standard only boots up instances if there is traffic (right now I pay between 10–15$ each month for a hobby app with very low traffic).
  • Cloud Run is another serverless option in Google Cloud Platform. The streaming capabilities of Cloud Run were a bit unclear. There were some StackOverflow posts that said that Cloud Run could not do HTTP streaming. Then in this announcement from Google the title seems to suggest that HTTP and gRPC streaming are both supported, but then the article only talks about using gRPC. After testing I found that it actually does support HTTP streaming.
  • Firebase App Hosting. Last month Google announced Angular and Next.js hosting in Firebase. The blog post actually mentions streaming: App Hosting also supports advanced framework features that optimize your app’s performance, like streaming for Next.js and defer (lazy loading) for Angular.
    This sounds great but it is unfortunate that they don’t support Remix. I could use Next.js instead. But I am a bit skeptical here because the whole Firebase part of Google Cloud always felt like they bought Firebase and never properly integrated it with the rest of Google Cloud. It feels like a mess (and apparently is one) and I’ve been burned in similar fashion by AWS that promised to support Next.js on AWS Amplify before (it took them over a year to support Next.js 12). So I am a bit scared to use Next.js on any place that is not Vercel. [Out of scope]
  • GKE is Google’s Managed Kubernetes service. It could definitely do HTTP streaming if I managed to set things up to do so. I don’t even consider it because I think Kubernetes is too much for my use case.
    [Out of scope]
    It reminds me of this tweet:

Other developer friendly options would be Vercel and Fly.io (where Kent C Dodds runs his Epic stack), but for the scope of this blog post I’ll leave those out. I would not recommend AWS (I say that as an AWS Certified Developer) unless you’re very experienced with stuff like AWS ECS.

TLDR: We are only going to cover App Engine Flex and Cloud Run in this post.

Test with an example app that streams!

Ryan Florence didn’t put the Pokemon streaming example from his talk on his public GitHub account. So let’s build our own Pokemon streaming project (view the result on GitHub).

Requirements:

  • A bash terminal (I am on Ubuntu 22.04.4)
  • Git (probably)
  • Modern versions of Node.js and NPM (I’m on Node 20.9.0)

Requirements if you want to deploy the App we are building to Google Cloud Platform

  • Install the Google Cloud CLI
  • A Google Cloud account set up (go to cloud.google.com and click on start free).
  • Create a Google Cloud project. You need to have the following services enabled in API & Services: App Engine, Cloud Build API, Cloud Run Admin API, Cloud Logging API & Cloud Monitoring API.
  • VS Code with the Gemini Code Assist + Google Cloud Code extension
  • Docker (only if you want to deploy to Cloud Run and don’t want to leave the building to Cloud Build)
  • It might be useful to follow the interactive tutorials for App Engine Standard, App Engine Flex and Cloud Run so that you get an idea of how they work.

Step one: Create a Remix project:

In a terminal, run the following command and follow the prompts (giving it a name etc). I named my project remix-vite-with-streaming:

npx create-remix@latest

Step two: Collect all Pokemon and put them in a file:

  • Go into the browser and visit this url:
    https://pokeapi.co/api/v2/pokemon?limit=151
  • Copy the json string with all 151 Pokemon. In Firefox I click on raw data and then press the copy button.
  • Create a file pokemon.json in the app folder of your project and paste the JSON you just copied into it.
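If you would rather not copy and paste by hand, the same file can be produced with a small script. This is only a sketch, assuming Node 18 or newer (for the built-in fetch) and that it is run from the root of the project. The names toPokemonFile and savePokemon are made up for illustration, and the script keeps only the results field because that is the only part the examples in this post read.

```typescript
import { writeFile } from "node:fs/promises";

type PokemonNameAndUrl = { name: string; url: string };
type PokemonJsonType = { results: Array<PokemonNameAndUrl> };

// Keep only the name and url of every Pokemon and format the result as JSON.
function toPokemonFile(apiResponse: PokemonJsonType): string {
  const results = apiResponse.results.map(({ name, url }) => ({ name, url }));
  return JSON.stringify({ results }, null, 2);
}

// Hypothetical helper: downloads the first 151 Pokemon and writes app/pokemon.json.
async function savePokemon(path = "app/pokemon.json"): Promise<void> {
  const response = await fetch("https://pokeapi.co/api/v2/pokemon?limit=151");
  if (!response.ok) {
    throw new Error(`PokeAPI responded with status ${response.status}`);
  }
  const body = (await response.json()) as PokemonJsonType;
  await writeFile(path, toPokemonFile(body));
}

// Nothing runs until savePokemon() is called, for example from a one-off script.
```

Calling savePokemon() once is enough; after that the app folder contains a pokemon.json with the same names and urls you would get from the browser.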

Step three: Streaming the Pokemon into your React component:

According to the Remix documentation, streaming is enabled by default. So any loader that responds with JSON is being streamed, even if your code isn’t optimized for it. Here is some code that loads the contents of our pokemon.json file, sends it to the client and displays the Pokemon names in the JSX. Paste the snippet below in the _index.tsx file:

import { json } from "@remix-run/node";
import { useLoaderData } from "@remix-run/react";

import pokemonJsonAsAny from "../pokemon.json";

type PokemonNameAndUrl = {name: string, url: string};
type PokemonJsonType = {results: Array<PokemonNameAndUrl>};

// Runs on the server and returns the whole Pokemon list as one JSON response.
export async function loader() {
  const pokemonJson = pokemonJsonAsAny as PokemonJsonType;
  return json(pokemonJson);
}

export default function Pokemon() {
  const data = useLoaderData<typeof loader>();

  return (
    <>
      {data.results.map((pokemon, index) => {
        return (
          <p key={index}>{index + 1} - {pokemon.name}</p>
        );
      })}
    </>
  );
}

If you now test this in the browser, the data shows up immediately:

This is not how streaming works. Even if this response is streamed using HTTP Streaming, the React Component will just wait for the loader to be done before rendering.

To make sure the React Component can already start rendering before the response is streamed, we need to use defer and Suspense:

import { Await, useLoaderData } from "@remix-run/react";
import { defer } from "@remix-run/node";
import { Suspense } from "react";

type PokemonNameAndUrl = { name: string, url: string };
type PokemonJsonType = { results: Array<PokemonNameAndUrl> };

async function awaitPokemon(): Promise<PokemonJsonType> {
  const pokemonJsonAsAny = await import("../pokemon.json");
  const pokemonJson = pokemonJsonAsAny as PokemonJsonType;
  return pokemonJson;
}

export async function loader() {
  // Not awaited on purpose: defer() streams the result once the promise resolves.
  const pokemonResult = awaitPokemon();
  return defer({ pokemonResult });
}

export default function Pokemon() {
  const { pokemonResult } = useLoaderData<typeof loader>();

  return (
    <Suspense fallback={<p>loading</p>}>
      <Await resolve={pokemonResult}>
        {(pokemonResult) => (
          <ul>
            {pokemonResult.results.map((pokemon, index) => {
              return (
                <li key={index}>{index + 1} - {pokemon.name}</li>
              );
            })}
          </ul>
        )}
      </Await>
    </Suspense>
  );
}

Now our client component isn’t blocked anymore! Even though the React Component could set itself up before the stream was done, you probably wouldn’t notice a difference.

If this file contained 151 billion Pokemon instead of 151, our screen would probably say <p>loading</p> for quite a while before in one *POP* all data would be displayed.

But is it Streaming?

I thought that I would be able to see whether it was streamed or not in the Response headers:

At first I thought this was proof that it was streaming. My non streaming React App didn’t have these headers. In my head it made sense that:

  • The Transfer-Encoding is chunked, because the streamed response would be received in chunks rather than one big file.
  • The Connection had the keep-alive value, as with streaming I would expect it to reuse the same connection to transfer the chunks.

After testing I found out that a server can also stream a response without sending these response headers. So I decided to give my example project streaming behavior so visible that it is impossible to miss whether a response is streamed or not.
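Since the headers are not conclusive, another way to check is to time when the chunks of the response body arrive. Below is a minimal sketch, assuming Node 18 or newer for the built-in fetch. The function names and the one second threshold are my own choices for illustration, not something from Remix or Google Cloud. A buffered response tends to arrive in one burst at the end, while a streamed one is spread out over time.

```typescript
// Decide whether chunk arrival times (ms since the request started) look streamed:
// more than one chunk, spread out over at least minSpreadMs.
function looksStreamed(arrivalTimesMs: number[], minSpreadMs = 1000): boolean {
  if (arrivalTimesMs.length < 2) return false;
  const spread = Math.max(...arrivalTimesMs) - Math.min(...arrivalTimesMs);
  return spread >= minSpreadMs;
}

// Read the response body chunk by chunk and record when each chunk arrives.
async function measureChunks(url: string): Promise<number[]> {
  const start = Date.now();
  const response = await fetch(url);
  if (!response.body) throw new Error("Response has no body");

  const reader = response.body.getReader();
  const arrivals: number[] = [];
  while (true) {
    const { done } = await reader.read();
    if (done) break;
    arrivals.push(Date.now() - start);
  }
  return arrivals;
}

// Example usage (replace the URL with your own deployment):
// measureChunks("http://localhost:5173/").then((times) => {
//   console.log(times, looksStreamed(times) ? "streamed" : "probably buffered");
// });
```

This only becomes meaningful once the app takes a noticeable amount of time to produce its data, which is what the artificial delays further down are for.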

To make the streaming effects more visible we can do two things:

  • Load more data. For example more Pokemon and maybe more attributes of them like move sets in the game.
  • Hack in some artificial delays with code.

Loading each Pokemon a little more delayed

We are going to add some delays to make it look like we are streaming data. Open up the _index.tsx file and paste the following code:

import { defer } from "@remix-run/node";
import { Await, useLoaderData } from "@remix-run/react";
import pokemonJsonAsAny from "../pokemon.json";
import { Suspense } from "react";

type PokemonResponse = {id: number, name: string};
type PokemonNameAndUrl = {name: string, url: string};
type PokemonJsonType = {results: Array<PokemonNameAndUrl>};

function delay(ms: number) {
  return new Promise( resolve => setTimeout(resolve, ms) );
}

async function loadPokemon(pokemonId: number, pokemon: PokemonNameAndUrl): Promise<PokemonResponse> {
  // This way every pokemon loads a little longer.
  await delay(pokemonId * 100);

  return {id: pokemonId, name: pokemon.name} as PokemonResponse;
}

export async function loader() {
  const pokemonJson = pokemonJsonAsAny as PokemonJsonType;
  // One promise per Pokemon, numbered from 1.
  const pokemonList = pokemonJson.results.map((value, index) => {
    return loadPokemon(index + 1, value);
  });

  // Spreading the array gives defer() an object with the keys "0", "1", "2", ...
  return defer({...pokemonList});
}

export default function Pokemon() {
  const data = useLoaderData<typeof loader>();

  // Every Pokemon gets its own Suspense boundary, so each one can appear on its own.
  const suspenseList = Object.values(data).map((value, index) => {
    return (
      <Suspense key={index} fallback={<p>loading</p>}>
        <Await resolve={value} errorElement={<p>Error</p>}>
          {(response) => (
            <p>{response.id} - {response.name}</p>
          )}
        </Await>
      </Suspense>
    );
  });

  return (
    <>
      <div>All first gen pokemon:</div>
      {suspenseList}
    </>
  );
}

But when you try to run this code you’ll see it streams but crashes after the 50th Pokemon due to a timeout:

We can solve this by changing a line in the entry.server.tsx file. By default this file is hidden in Remix. You can reveal it by running this command in the terminal (at the root of your project):

npx remix reveal

Now that you see the entry.server.tsx file, change the value of the ABORT_DELAY const from 5_000 to 20_000.
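The numbers line up with the delay in loadPokemon. Here is a small back-of-the-envelope sketch, assuming Pokemon number n resolves after n * 100 ms as in the code above (the function names are made up, and which side wins exactly on the boundary depends on timer ordering):

```typescript
// Assumption: Pokemon number n resolves after n * stepMs, as in loadPokemon above.
function lastPokemonBeforeAbort(abortDelayMs: number, stepMs: number, total: number): number {
  return Math.min(total, Math.floor(abortDelayMs / stepMs));
}

// Time until the slowest Pokemon (the last one) has resolved.
function timeToLoadAll(total: number, stepMs: number): number {
  return total * stepMs;
}

console.log(lastPokemonBeforeAbort(5_000, 100, 151));  // 50
console.log(timeToLoadAll(151, 100));                  // 15100
console.log(lastPokemonBeforeAbort(20_000, 100, 151)); // 151
```

With the default of 5 seconds the render is aborted around Pokemon 50, while all 151 need roughly 15.1 seconds, so 20_000 leaves some headroom.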

Streaming from App Engine Flex

Deploying to Flex is just as easy as deploying to Standard. You don’t need to create a Dockerfile. In the root of your project, create a new file called app.yaml and fill it with the following configuration:

runtime: nodejs
runtime_config:
  operating_system: "ubuntu22"
  runtime_version: "22"
service: default
env: flex
instance_class: F1
entrypoint: npm run start
handlers:
  - url: '.*'
    script: auto
    secure: always
automatic_scaling:
  min_num_instances: 1
  max_num_instances: 2

That is (almost) all you need to deploy a Remix app. If you have already run gcloud init and selected your Google Cloud project, you can deploy with:

gcloud app deploy

Unfortunately App Engine does HTTP response buffering by default, so you’ll notice that the request waits for 15 seconds until the last Pokemon is loaded (this GIF is a little sped up to reduce the size) before the entire response is sent.

A moderator on the Remix Discord channel told me that hosts that don’t support streaming will try to buffer the whole response and send it all at once.

If we want to stream responses on App Engine Flex, we need to disable buffering by adding the following response header in the entry.server.tsx on line 125:

responseHeaders.set("X-Accel-Buffering", "no");
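For context, here is roughly what happens to the headers around that line. This is a sketch and not the exact contents of the generated file: the helper name prepareStreamingHeaders is made up, and I am assuming that the revealed entry.server.tsx sets the Content-Type on responseHeaders in its onShellReady callback right before it creates the Response. Line numbers may differ between Remix versions.

```typescript
// Hypothetical helper that mirrors what the onShellReady callback does to the headers.
function prepareStreamingHeaders(responseHeaders: Headers): Headers {
  responseHeaders.set("Content-Type", "text/html");
  // Tell the buffering proxy in front of the app not to hold back the response.
  // X-Accel-Buffering is an nginx convention.
  responseHeaders.set("X-Accel-Buffering", "no");
  return responseHeaders;
}

const headers = prepareStreamingHeaders(new Headers());
console.log(headers.get("X-Accel-Buffering")); // prints: no
```

In the real file you only need the single responseHeaders.set line from above; the helper just makes the idea testable in isolation.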

Deploy again and you’ll see that it streams fine (this GIF looks a little laggy, but on my screen it looked smoother):

You probably want to delete this version if you actually deployed, otherwise you’re going to get expensive bills. Either delete the entire Google Cloud Project or:

  • Deploy a version to App Engine Standard.
  • Migrate traffic to this version.
  • Delete the App Engine Flexible version.

Streaming from Cloud Run

Cloud Run is even simpler in the sense that you don’t need an app.yaml and you don’t need to enable any settings to disable buffering. Another great thing is that Cloud Run doesn’t require you to have a server up 24/7. It will just start an instance when needed, and kill it when it is not in use for a while. Their docs state:

In Cloud Run, each revision is automatically scaled to the number of instances needed to handle all incoming requests, events, or CPU utilization.

When a revision does not receive any traffic, by default it is scaled in to zero instances.

Open your Remix project in VS Code. If you have set up the Cloud Code plugin and are logged in with your Google Account, you can click on the project in the bottom left of your screen and execute the Deploy to Cloud Run action:

You can just use all default settings and click on deploy. It will create a new service in Cloud Run:

And as you’ll notice Cloud Run streams just fine as well:

Happy streaming! Now obviously this is a very artificial (but hopefully very clear) example.

Update: I found out that Ryan Florence uploaded an even better video to explain how to use streaming to make calls effective.

Follow me for more!

The final source code can be found in this GitHub project. If you enjoyed this blog post, please follow me here on Medium, X or LinkedIn.

Written by Leejjon

Java/TypeScript Developer. Interested in web/mobile/backend/database/cloud. Freelancing, only interested in job offers from employers directly. No middle men.