Pushing the limit with tilemap rendering

Enterprises trust Teamflow to host their large virtual offices. That means painting huge office layouts without sacrificing app responsiveness. We’ve seen workspaces that span more than 500,000 floor tiles. To sustain 60fps in these environments, our engineering team built a specialized pipeline for tilemap rendering. Let’s dive deep into how we did that!

Tilemaps are a technique borrowed from game development. We use them to render the floor map in workspaces. Each tile is aligned on a 128px grid that spans infinitely across a workspace.

Teamflow’s level editor showing where a tile can be placed in a 128px tilemap grid

The data architecture

Unlike traditional tilemaps, Teamflow’s floor plans are dynamic and can be changed at any time by users. When you edit your floor plan, other people see those changes immediately. Larger floor plans also can’t be viewed on one screen. Our requirements for storing the tile data were:

  1. Compact, to reduce loading time
  2. Easily editable, so people can change their floor plan
  3. Randomly accessible, so you can view a small portion of the entire floor

To facilitate random access, we divided up the tilemap into 32x32 tile blocks and encoded each block independently. Each block is then indexed by position into a single document in MongoDB. This lets us get a block at, say, (32, 32) in the tile grid by hashing its x, y position and using that as a key into this document.
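
As a sketch, the block lookup described above might look like the following. `BLOCK_SIZE` matches the 32x32 blocks from the article, but the `"x,y"` key format and function names are assumptions for illustration, not Teamflow’s actual scheme:

```typescript
// Hypothetical sketch of mapping a tile coordinate to its block key.
const BLOCK_SIZE = 32; // tiles per block side

// Returns the key of the block containing the tile at (tileX, tileY).
// Math.floor handles negative coordinates correctly, since the grid
// spans infinitely in all directions.
function blockKey(tileX: number, tileY: number): string {
  const blockX = Math.floor(tileX / BLOCK_SIZE);
  const blockY = Math.floor(tileY / BLOCK_SIZE);
  return `${blockX},${blockY}`;
}

// The tile at (32, 32) falls in block (1, 1).
```

Looking up a block is then a single keyed read into the MongoDB document, with no need to load the rest of the floor plan.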

Each tile is a tuple of “position” and “image.” We use a binary format to encode the tiles in each block. Since the tiles of each block are listed together, the x, y position of a tile is implicit in its index in an array. We then encode each tile image as a 16-bit number. Instead of storing image URLs for each tile, we enumerate all the different tile images used in the workspace (ideally fewer than 65,535 of them!) – and give each image an “asset index.” These 16-bit asset indices guarantee that the floor plan is stored at an amortized 16 bits per tile.
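
A minimal sketch of this per-block layout, assuming a dense 32x32 array where a tile’s local position is implicit in its index and a sentinel value marks empty cells (the `EMPTY` sentinel and function names are illustrative, not Teamflow’s code):

```typescript
// Hypothetical binary layout: one 16-bit asset index per tile cell.
const BLOCK_SIZE = 32;
const EMPTY = 0xffff; // sentinel for "no tile here" (an assumption)

// Encode a sparse map of tiles (local index -> asset index) at 16 bits/tile.
function encodeBlock(tiles: Map<number, number>): Uint16Array {
  const buf = new Uint16Array(BLOCK_SIZE * BLOCK_SIZE).fill(EMPTY);
  for (const [index, assetIndex] of tiles) buf[index] = assetIndex;
  return buf;
}

// Decode a block back into a sparse map; the tile at array index i sits at
// (i % BLOCK_SIZE, Math.floor(i / BLOCK_SIZE)) within the block.
function decodeBlock(buf: Uint16Array): Map<number, number> {
  const tiles = new Map<number, number>();
  buf.forEach((assetIndex, i) => {
    if (assetIndex !== EMPTY) tiles.set(i, assetIndex);
  });
  return tiles;
}
```

Because positions are implicit, a full block is exactly 2,048 bytes regardless of how it is laid out.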

The tradeoff here is ease of editing. To alter the floor plan of a workspace, the blocks being edited need to be decoded, modified, and then re-encoded. This works very well in practice, however, because floor plans are relatively static over time.

Putting data on the GPU

Since Teamflow’s floor plans are big, we don’t want to re-fetch tilemap data each time you open the app. Instead, we hash each tile block and cache it in IndexedDB. This lets us skip downloading tilemaps when they haven’t changed since last time.

Moreover, we don’t want to keep huge amounts of data sitting in JavaScript memory. That often causes V8 to slow down because of garbage collection. Teamflow uses WebGL for graphics rendering. Since floor tiles are rendered by the GPU, we want the floor tile data to live in graphics memory instead.

We use a custom mesh implementation that takes in fetched tile data, decodes it, converts it into a mesh geometry, then dereferences the tile data so it can be garbage collected from JavaScript memory.
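
The geometry-building step can be sketched as follows: each tile’s square becomes two triangles, or six vertices, packed into one flat buffer. Only x/y positions are shown; the real geometry also carries atlas coordinates, and the function names here are illustrative:

```typescript
const TILE_SIZE = 128; // the grid size from earlier in the article

// Six vertices (two triangles) covering the square of the tile at
// (tileX, tileY) in tile coordinates.
function tileToVertices(tileX: number, tileY: number): number[] {
  const x0 = tileX * TILE_SIZE, y0 = tileY * TILE_SIZE;
  const x1 = x0 + TILE_SIZE, y1 = y0 + TILE_SIZE;
  // Triangle 1: (x0,y0) (x1,y0) (x1,y1); triangle 2: (x0,y0) (x1,y1) (x0,y1).
  return [x0, y0, x1, y0, x1, y1, x0, y0, x1, y1, x0, y1];
}

// Pack every tile's vertices into a single Float32Array, ready to upload
// to the GPU as one vertex buffer.
function buildGeometry(tiles: Array<[number, number]>): Float32Array {
  const verts = new Float32Array(tiles.length * 12);
  tiles.forEach(([x, y], i) => verts.set(tileToVertices(x, y), i * 12));
  return verts;
}
```

Once the `Float32Array` is uploaded to a GPU buffer, the decoded tile data on the JavaScript side is no longer needed.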

Mesh rendering

WebGL is a low-level graphics interface that can draw and shade triangle primitives. In our tilemap, each tile is a square that is drawn as two triangles. Each tile triangle is encoded as three vertices, each carrying an x- and y-coordinate, a reference to a tile atlas, and the position of the tile image within that atlas.

Since each tile is a small 128px image and GPUs have a limit to how many different images they can render at once, we combine tile images into texture atlases with up to 16 tiles. This reduces the number of times we bind and unbind textures when rendering the floor plan.
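
The slot-assignment idea behind this can be sketched as a simple allocator that hands out stable positions and opens a new atlas once the current one holds 16 tiles. This illustrates the concept only; it is not the API of the actual allocation library:

```typescript
const SLOTS_PER_ATLAS = 16; // e.g. a 4x4 grid of 128px tiles

interface AtlasSlot { atlas: number; slot: number }

// Hypothetical allocator: assigns each distinct image a stable slot,
// spilling into a new atlas when the current one is full.
class AtlasAllocator {
  private slots = new Map<string, AtlasSlot>();

  allocate(imageUrl: string): AtlasSlot {
    let slot = this.slots.get(imageUrl);
    if (!slot) {
      const n = this.slots.size;
      slot = { atlas: Math.floor(n / SLOTS_PER_ATLAS), slot: n % SLOTS_PER_ATLAS };
      this.slots.set(imageUrl, slot);
    }
    return slot;
  }
}
```

Grouping up to 16 tiles per texture means a floor plan using, say, 100 distinct tile images needs only 7 texture binds instead of 100.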

A Teamflow texture atlas seen with Spector.js

Since our tilemaps are not predefined, we cannot generate these texture atlases beforehand. These are generated on the user device dynamically based on which tile images are needed for rendering. We use the @pixi-essentials/texture-allocator library I developed to do this, and you can take advantage of dynamic texture atlasing by using it too!

Rendering tile blocks

WebGL requires two shaders to be implemented before it can draw a scene:

  1. Vertex shader: The vertices of each triangle being drawn are passed to the GPU here, which uses them to find each pixel to be colored.
  2. Fragment shader: Each pixel “fragment” is then calculated in the fragment shader. For the floor tilemap, we sample colors from the texture atlas and copy them onto the screen. We clamp where we sample the texture atlas to prevent mipmapping artifacts from ruining the tile rendering in a zoomed-out view.
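
The UV clamp mentioned in step 2 can be illustrated by computing, per tile cell, a sample range inset by half a texel so that filtered sampling never bleeds into a neighboring tile. The 4x4 layout and 512px atlas size are assumptions for illustration:

```typescript
const ATLAS_SIZE = 512; // assumed: a 4x4 grid of 128px tiles
const CELL = 128;

// Returns [uMin, vMin, uMax, vMax] for the atlas cell at (col, row),
// inset by half a texel on each side. Sampling is clamped to this range
// in the fragment shader so linear/mipmap filtering cannot pull in
// texels from adjacent tiles.
function clampedUvRange(col: number, row: number): [number, number, number, number] {
  const halfTexel = 0.5 / ATLAS_SIZE;
  return [
    (col * CELL) / ATLAS_SIZE + halfTexel,
    (row * CELL) / ATLAS_SIZE + halfTexel,
    ((col + 1) * CELL) / ATLAS_SIZE - halfTexel,
    ((row + 1) * CELL) / ATLAS_SIZE - halfTexel,
  ];
}
```

In the shader itself this amounts to a `clamp()` of the interpolated UVs against these per-tile bounds before the texture sample.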

We use a single mesh geometry to render all the triangles in a tile block. This is a form of batching to reduce the number of draw calls needed to paint a floor map. By batching tiles in predefined blocks, we are able to reuse mesh geometries across frames. To draw each frame, we find all the blocks intersecting the visible space and render only them. Since each block is a 4096px square, there are almost always fewer than 16 draw calls per frame.
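
The per-frame culling step above can be sketched as follows: with 32-tile blocks on a 128px grid, each block covers a 4096px square, and we collect every block whose square intersects the viewport. Names are illustrative:

```typescript
const BLOCK_PX = 4096; // 32 tiles x 128px per tile

interface Viewport { x: number; y: number; width: number; height: number }

// Returns the (blockX, blockY) coordinates of every block intersecting
// the viewport, by snapping the viewport's corners to block boundaries.
function visibleBlocks(view: Viewport): Array<[number, number]> {
  const x0 = Math.floor(view.x / BLOCK_PX);
  const y0 = Math.floor(view.y / BLOCK_PX);
  const x1 = Math.floor((view.x + view.width - 1) / BLOCK_PX);
  const y1 = Math.floor((view.y + view.height - 1) / BLOCK_PX);
  const blocks: Array<[number, number]> = [];
  for (let by = y0; by <= y1; by++)
    for (let bx = x0; bx <= x1; bx++) blocks.push([bx, by]);
  return blocks;
}
```

A 1920x1080 viewport is smaller than one block in each axis, so it can touch at most four blocks at once, which is where the low per-frame draw-call count comes from.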

The results

We’ve tested Teamflow with up to one million tiles in a workspace, and our mesh renderer is able to scale up to that kind of demand smoothly! The frame rate drops temporarily when you pan over a new part of the office, because the mesh renderer has to upload mesh geometries for tile blocks. But when you pan over those areas again, Teamflow is able to sustain near-60 fps performance even with 4x CPU slowdown on my Mac.

Teamflow sustaining 55 fps while panning over an oceanic office at 4x CPU slowdown

Teamflow is the best place for remote and hybrid teams to collaborate and work together. Start your 30-day free trial today.