I am still puzzling through the right way to think about some ideas in compression, so let’s take a break from that and talk about generally cleaning up the source code.

To ease the maintainability of the game going into the future, there are many things that could be done. I’ll detail some of those here. I just spent a few hours trying the most obvious of these; I’ll list the completed tasks first, along with their impact on the source code.

Before this process, the Braid source code was 95,366 lines, counted using:

cloc --no3 --by-file-by-lang

As always, the reported number excludes comments and blank lines. This is just counting source from the game, not including third-party libraries, and in fact missing a few pieces of the game like the shaders, scripts for building and installing on various OSs, etc. But it’s a pretty okay measure of the size of the program.

Things Done So Far.

Delete a bunch of unused code from the game: 10,254 lines removed. Back at the time, Braid was built using the body of code that had been ‘my game library’; I never knew what code Braid might and might not use, so I just left everything in there in case I might want it. At this point, all that code is old enough that even if I did want to reuse some piece of it, it should be rewritten anyway or at least given a good looking-at.

Code deleted: Priority queue, timed-average and timed-average-variance, float quantization and encoding, arithmetic coder, networking, byte packing for network packets, mesh and skeletal animation code, triangle intersection, Jenkins hash, crc32, cubic spline fitting and editing, view frustum, two different proximity grids, Fridge memory allocator, redundant directory navigation code, bit array.

Remove Xbox-360-specific code: 3,694 lines removed. Remove a bunch of special-case code surrounded by #ifdef XBOX360. Examples: annoying cert requirements about user profiles, menus, leaderboards, Xbox Live achievements, special cases for launching Speed Run from the Xbox interface. The game was also ported to the PlayStation 3, but this was done by an outside party in a different branch of the source, so I did not have to worry about that now.

Get rid of Demo functionality / upgrading to ‘full game’: 319 lines removed. Demos are complicated to manage (on systems that don’t support demos well, which is most of them, it multiplies the number of builds you have to manage by 2 and adds lots of complicated details). Also, demos are not really necessary today; if people want to find out about a game, they can look on YouTube or read what many many people have to say about the game on the internet. On platforms like Steam, players can return a game (that’s been played for less than two hours) for a full refund, so a large part of the reason for a demo has been removed (you could view a demo as a way of not getting screwed by paying for something that’s terrible, but refunds do that too). Empirically, on Steam, very few people download the demo any more.

This only removed 319 lines, but those 319 lines contained a lot of ‘if’ statements and complications around key moments of game flow, so this is a pretty good win.

Data-wise, this also allowed me to remove 3 game levels that were only used in the demo. There are still some complications hanging out in the game data that could be removed (such as member variables on various entities that help control game flow in the demo). There are also a bunch of strings that can now be deleted.

Get rid of non-Steam PC online publishers that have folded: 31 lines removed. Removed ifdefs for GREENHOUSE, IMPULSE, GAMERSGATE. Not a lot of code, nor was it very complicated, but hey, it’s gone now.

That’s everything I have done so far. We’ve dropped the number of lines of code from 95,366 to 81,068, which is a nice start.

Here’s the list of clean-ups that are pending. I’ll cover the impact of each of these in future postings as they happen. I’ll certainly also think of more things to clean up.

Use C++11 initializers to initialize class members; eliminate constructors where appropriate. Prior to C++11, you had to initialize values in a constructor separately from where they are declared. These tend to live in separate files if you care about compile time and readable headers. There’s no good reason for this, and I had long considered it a basic problem in C++, but it was addressed in C++11. In some cases, last I checked, you still can’t do the “right thing” (for example if you want to initialize values in a base class). This should reduce the number of lines of code in the program by a fair amount, and reduce error-proneness (since propensity for failing to initialize things will be reduced) but it will also be very tedious, so I may do it in parts.
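As a minimal sketch of the change described above (Monster_Old and Monster are hypothetical types made up for illustration, not taken from the Braid source), here is the same set of defaults written the pre-C++11 way and with C++11 default member initializers:

```cpp
// Hypothetical entity types, for illustration only; not from the Braid source.

// Pre-C++11 style: members are declared here...
struct Monster_Old {
    float x;
    float y;
    int   health;
    bool  facing_left;

    Monster_Old();
};

// ...and initialized somewhere else, typically in a separate .cpp file.
Monster_Old::Monster_Old() {
    x = 0.0f;
    y = 0.0f;
    health = 3;
    facing_left = false;
}

// C++11 style: each default sits next to its declaration,
// and the hand-written constructor disappears entirely.
struct Monster {
    float x = 0.0f;
    float y = 0.0f;
    int   health = 3;
    bool  facing_left = false;
};
```

With the second form, adding a member without giving it a value is at least visible at the point of declaration, instead of depending on someone remembering to update a constructor in another file.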

Eliminate Bitmap_Loader_Module system. When I first put together my ‘engine’, I knew I would want to load bitmaps of varying types. So I built a system where there is a Bitmap_Loader, and you tell it what filename you want to load, and it parses off the extension, then looks at each Bitmap_Loader_Module to determine whether it handles that extension. The ‘nice’ thing about this was supposed to be that I could implement support for new formats just by dropping new files into the project, without any other part of the project knowing about them (since the new Bitmap_Loader_Modules would register themselves with the Bitmap_Loader via a constructor on a global variable). This is a very ‘abstracted engine’ way of thinking, and you know what, meh. By which I mean after years of experience this way of structuring the code has turned out to be no more useful than just editing a hardcoded procedure that looks at a file extension and calls the appropriate code. (And while being no more useful, it takes more effort to maintain, and took longer to write – there’s boilerplate code there, and an API that each Bitmap_Loader_Module must conform to, and it’s just not necessary).
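A rough sketch of what "a hardcoded procedure that looks at a file extension" could look like. Everything here (Bitmap_Format, get_extension, format_from_filename, and the particular set of extensions) is an assumption made for illustration, not the game's actual API; a real version would call the appropriate loader rather than return an enum.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical names throughout; stand-ins, not Braid's actual API.
enum class Bitmap_Format { UNKNOWN, PNG, JPEG, DDS };

// Return the lowercased text after the last '.', or "" if there is no dot.
std::string get_extension(const std::string &filename) {
    std::size_t dot = filename.find_last_of('.');
    if (dot == std::string::npos) return "";

    std::string ext = filename.substr(dot + 1);
    std::transform(ext.begin(), ext.end(), ext.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return ext;
}

// The whole dispatch is one function. Supporting a new format means
// adding one line here, instead of writing a self-registering module.
Bitmap_Format format_from_filename(const std::string &filename) {
    std::string ext = get_extension(filename);

    if (ext == "png")                  return Bitmap_Format::PNG;
    if (ext == "jpg" || ext == "jpeg") return Bitmap_Format::JPEG;
    if (ext == "dds")                  return Bitmap_Format::DDS;

    return Bitmap_Format::UNKNOWN;
}
```

The trade-off is that every format is now listed in one central place, which is exactly the thing the module system was trying to avoid, and which turns out not to matter in practice.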

Drastically simplify bitmap loading. Originally, bitmap loading in Braid was pretty simple. But then, toward the end of the project, it became clear that the game would not fit in memory on the Xbox 360 (which had 512MB of physical RAM, but of course a game could not use all of that), even if all the bitmaps were compressed as DDS. One option would have been to introduce dynamic bitmap streaming, and pause as you enter each level until bitmaps are loaded. But the Xbox 360 was slow, and the file format that Xbox Live games had to use was extra slow, and this just seemed like it would complicate a lot of things and negatively impact the player’s experience. Instead, I chose to make the game fit into RAM. JPEG gets much better compression than DDS, so I stored most bitmaps as JPEG (in two different files, one for color and one for alpha, because JPEG did not handle 4-channel images in practice at that time). But we also needed to keep things small on the GPU, so the JPEGs would be decoded and uploaded to the GPU as two different textures, one for the Y channel, and one for the CbCr channels, with CbCr downsampled just as it is in JPEG. But we kept the character animation sheets in DXT because these needed sharp edges. So all shaders needed to have two variants, one for a DXT texture and one for a combination of Y/CbCr textures; all game-level logic about setting textures and applying shaders needed to understand this too. It complicated things by a lot. (Fortunately, Sean Barrett did all this stuff as we were getting close to ship time, so I was able to focus on meeting the Xbox 360’s certification requirements). Finally, because JPEG decompression was not particularly fast on an Xbox 360, the game asynchronously decompresses JPEGs with a primitive job system; and because none of this was foreseen by the ‘Bitmap Loader Module’ system mentioned above, ability to work with this was hacked into that system in an ugly way.

It all works, but it is not very nice going into the future. Furthermore, there are performance implications (the Y and CbCr textures are stored on the GPU with LINEAR layouts, which becomes increasingly less fast with each generation of GPUs.) So I plan to simplify back to a scheme where we use DXT or whatever other natively compressed format the GPU understands, possibly employing a transcoder library such as the one from binomial.info.
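To make the cost of the Y/CbCr scheme concrete: any shader variant that samples the two textures has to recombine them into RGB per pixel. The sketch below does that math on the CPU, assuming the standard full-range JFIF (BT.601) conversion with all channels normalized to [0, 1]. The post does not say which constants the game's shaders use, so the coefficients and the names here (RGB, clamp01, ycbcr_to_rgb) are assumptions for illustration.

```cpp
#include <algorithm>

struct RGB { float r, g, b; };

float clamp01(float v) { return std::min(1.0f, std::max(0.0f, v)); }

// Assumed: JFIF full-range conversion, inputs in [0, 1], chroma centered at 0.5.
// y comes from the full-resolution luma texture; cb and cr come from the
// downsampled chroma texture (sampled with filtering, so it stretches to fit).
RGB ycbcr_to_rgb(float y, float cb, float cr) {
    float pb = cb - 0.5f;
    float pr = cr - 0.5f;

    RGB out;
    out.r = clamp01(y + 1.402f * pr);
    out.g = clamp01(y - 0.344136f * pb - 0.714136f * pr);
    out.b = clamp01(y + 1.772f * pb);
    return out;
}
```

A DXT texture needs none of this: one sample gives RGB directly, which is why going back to a single natively compressed format removes a whole axis of shader variants.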

Frame rate and fullscreen-mode stuff. This one is going to be interesting. During development, I was very conscious that I wanted Braid to have consistent physics, regardless of platform or frame rate (on a console, varying frame rate was not really a problem, but on PCs broadly, you have no idea how fast you are going to be able to run on some random unknown machine.) It seemed reasonable to say, 60Hz is the target frame rate, and we design for that, and we won’t ever run faster. (“60Hz should be enough for anyone”). I wrote a physics routine that was meant to run every 1/60 sec, which I figured was enough to prevent tunneling. On consoles, we would always run at 60; on PCs, each frame would cover an integer multiple of 1/60 sec, the idea being that we would just run the same physics step multiple times per frame, which gives results consistent with having run it once per frame for many frames. This restricted us on the PC to frame rates that evenly divide 60: 60Hz, 30Hz, 20Hz, 15Hz, 12Hz, 10Hz, etc.
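The idea of running one fixed 1/60 sec step several times per frame can be sketched like this. Body, GRAVITY, physics_step and simulate_frame are invented for illustration (a single falling point, not the game's actual physics); the point is only that the step size never changes, so the sequence of states is the same no matter how the steps are grouped into frames.

```cpp
#include <cassert>

// Hypothetical one-dimensional state; the real game state is far larger.
struct Body {
    double position = 0.0;
    double velocity = 0.0;
};

const double PHYSICS_DT = 1.0 / 60.0;  // The only timestep physics ever sees.
const double GRAVITY    = -9.8;        // Arbitrary constant for the example.

// One fixed physics step.
void physics_step(Body &b) {
    b.velocity += GRAVITY * PHYSICS_DT;
    b.position += b.velocity * PHYSICS_DT;
}

// steps_per_frame is 1 at 60Hz, 2 at 30Hz, 3 at 20Hz, and so on.
void simulate_frame(Body &b, int steps_per_frame) {
    assert(steps_per_frame >= 1);
    for (int i = 0; i < steps_per_frame; i++) {
        physics_step(b);
    }
}
```

Simulating one second as 60 frames of 1 step, 30 frames of 2 steps, or 20 frames of 3 steps performs the same 60 calls to physics_step in the same order, so all three end in the same state. A rate like 144Hz does not fit, because a frame there covers less than one step.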

On the PC, I needed to pick one of these frame rates and stick to it, so at startup, the game spends 0.7 seconds rendering a moderately heavy test scene, and seeing how quickly it is able to do that (throwing away the first few frames to discard any effects of non-resident texture maps or cold memory). The game then sets its frame rate to the highest rate it thinks it can consistently hit. If during gameplay it ever starts missing that frame rate, it then drops to the next stable frame rate.
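A minimal sketch of the selection step, assuming the benchmark boils down to a single measured time per frame. The function name, the max_steps cap, and the absence of any safety margin are all assumptions; the post does not describe how the real code makes this decision.

```cpp
// Given how long the test scene took per frame, pick the number of
// 1/60 sec physics steps each frame should cover. The resulting frame
// rate is 60 / n Hz: n = 1 is 60Hz, n = 2 is 30Hz, n = 3 is 20Hz, ...
//
// Hypothetical helper; max_steps caps how low the frame rate may go.
int choose_steps_per_frame(double measured_frame_seconds, int max_steps) {
    for (int n = 1; n <= max_steps; n++) {
        double budget = n / 60.0;  // Time available per frame at 60/n Hz.
        if (measured_frame_seconds <= budget) return n;
    }
    return max_steps;  // Slower than the slowest supported rate.
}
```

For example, a machine that needs 20 ms per frame misses the 16.7 ms budget for 60Hz but fits the 33.3 ms budget for 30Hz, so it gets n = 2. Dropping to "the next stable frame rate" during gameplay is then just n + 1.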

It’s actually more complicated than this, because there is an option to remove visual effects to gain speed, so this startup sequence tries both quality levels, too. And there’s an option to drop resolution to half, so the startup sequence tries that, too (if necessary).

On Windows, because the scheduler is coarse and not meant for what we would today consider solid multimedia applications, it is pretty hard to page-flip your game at exactly 60Hz unless you are synced to a 60Hz display signal. Because Braid uses DirectX9, syncing the display at 60Hz meant going into fullscreen exclusive mode (because you couldn’t vsync in windowed mode). And the problem there is that fullscreen exclusive mode in Windows has always, always been a complete disaster, and very high up the list of support issues. It triggers driver bugs, it makes your game look distorted and ugly (because for some reason monitor and TV vendors decided that if you switch to an aspect ratio different than the one supported by the display, the right thing to do is to STRETCH THE IMAGE NON-PROPORTIONALLY TO FILL EVERY PIXEL OF THE DISPLAY, FFFFFUUUUUUUUUCK YOOUUUUUUUUUUUU you assholes). It causes Windows to lock up sometimes. It makes your monitor go black and flail for an awkward few seconds. Whatever happens, it is almost never good.

Because all of this is so terrible, games these days tend to render at whatever frame rate and resolution the desktop was already in – just not monkeying with the video mode at all. This often requires rendering many more pixels than before, or else rendering to an offscreen surface and upscaling, but GPUs are fast these days, so it’s not that big a deal. However, Braid can’t necessarily do this in a straightforward way (even after switching to a newer DirectX that allows us to vsync in windowed mode), because if the player’s desktop refresh rate is not one of the frame rates listed above, we can’t vsync! So we would have to tear (which looks bad) or change the display frequency (which may cause a lot of problems).

The assumption that “60Hz should be enough for anyone” also, of course, is no longer valid. 144Hz monitors are very popular, G-SYNC is sort of a thing (even though nobody I know understands how to properly render on a G-SYNC monitor) and who knows what is going to happen in the future. So, the following stuff needs to happen:

  • Rewrite the physics system to produce consistent results at arbitrary frame rates
  • Change the display code to run at desktop resolution and frame rate
  • Change the DX version to support vsync in windowed mode (maybe by using DXGI, though that is scary and problem-causing, so maybe by upgrading to DX11, which, also yuck).

All this said, it’s worth pointing out that in world 4 (where time moves as you move) we had to simulate for arbitrary amounts of time based on how quickly you were moving, and whereas I did not do anything super dumb here (for example, I subdivide the timestep to try and produce somewhat-consistent results), as the code is structured now, there’s no way to make the results fully consistent with the rest of the game, because you are by necessity seeing results in-between the 1/60 sec official physics samples. This doesn’t seem to have caused problems that anyone notices, so it might be that so long as the physics timestep is below a certain size, nobody will ever complain.
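Subdividing an arbitrary timestep can be sketched as follows. This is one common way to do it, not necessarily what the world 4 code does; Body, MAX_STEP, substep_count, integrate and simulate_for are made-up names, and handling negative dt (time running backward) via fabs is an assumption.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical one-dimensional state, for illustration only.
struct Body {
    double position = 0.0;
    double velocity = 0.0;
};

const double MAX_STEP = 1.0 / 60.0;  // No substep is allowed to exceed this.

// How many equal pieces to cut dt into so that each piece is <= max_step.
int substep_count(double dt, double max_step) {
    int n = static_cast<int>(std::ceil(std::fabs(dt) / max_step));
    return std::max(1, n);
}

// One integration step of length h (h may be negative).
void integrate(Body &b, double acceleration, double h) {
    b.velocity += acceleration * h;
    b.position += b.velocity * h;
}

// Advance by an arbitrary dt using equal substeps no longer than MAX_STEP.
void simulate_for(Body &b, double acceleration, double dt) {
    int n = substep_count(dt, MAX_STEP);
    double h = dt / n;
    for (int i = 0; i < n; i++) {
        integrate(b, acceleration, h);
    }
}
```

The substeps here have length dt / n rather than exactly 1/60 sec, which is the source of the inconsistency described above: the results are close to the fixed-step simulation but are not sampled at the same instants.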

Complete upgrade to Visual Studio 2015; fix header file mess. Yesterday I switched to Visual Studio 2015. The launcher no longer compiles, for mysterious reasons, but everything else works fine. Need to fix the launcher. Need to set up the Visual Studio 2015 redist for the Steam install. Need to figure out whatever timebombs Microsoft has put into Visual Studio 2015 this time that make software harder to reliably distribute (figure out Windows XP compatibility situation, figure out whatever other options I need to turn off to disable telemetry, prevent weird undesired dependencies). Fix header files that have been missing from the project file forever, which makes global-search-and-replace in Visual Studio not effective.

In my newer work the goal is to get away from Visual Studio forever, but that won’t happen for Braid any time too soon, so I might as well keep it nice.

Get rid of Pool, use Isolated_Heap_Pool. Pool is a simple fast allocator, the default flavor of which just uses the C runtime’s malloc on the back end. Isolated_Heap_Pool is a variant of this that creates a separate heap per Pool, which means you can create a bunch of Pools and then destroy them and not worry about causing fragmentation. This was done because on consoles, fragmentation had been a big problem. I might as well keep things this way, since it’s cleaner, but the old Pool is still used by my font code. So if I change the font code (should be a trivial change) I can remove pool.cpp and pool.h.

Make Entities_By_Type use Array and not List. Remove List. List is a linked-list data structure. In Braid, every kind of entity has a list of all instances of that entity type. Often gameplay code iterates over such a list (“for every monster in the level, do the following.”) The problem is that linked lists are not very good for cache behavior in this kind of use case (where the collection is seldom modified, and traversed very frequently). So, I want to change this to use an array structure. It’ll be tedious since a lot of sites traverse the lists, and even though they use macros to do it, the list-ness pokes its head into user code sometimes. But it won’t be too bad. We already made this change on The Witness, and it was very helpful.
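A small sketch of the array-backed version. std::vector stands in for the game's own Array type here, and Monster, Monsters_By_Type and count_alive are hypothetical names. The unordered swap-with-last removal is also an assumption; it only works if nothing depends on the order of entities within a type.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical entity type, for illustration only.
struct Monster {
    int health = 3;
};

// Per-type collection backed by contiguous storage.
// std::vector is a stand-in for the game's own Array.
struct Monsters_By_Type {
    std::vector<Monster *> items;

    void add(Monster *m) { items.push_back(m); }

    // Unordered removal: overwrite the slot with the last element.
    // Removal is rare, so the linear search is not a concern.
    bool remove(Monster *m) {
        for (std::size_t i = 0; i < items.size(); i++) {
            if (items[i] == m) {
                items[i] = items.back();
                items.pop_back();
                return true;
            }
        }
        return false;
    }
};

// "For every monster in the level, do the following."
// The pointers being walked sit next to each other in memory,
// instead of being reached by chasing a next-pointer per node.
int count_alive(const Monsters_By_Type &monsters) {
    int alive = 0;
    for (const Monster *m : monsters.items) {
        if (m->health > 0) alive++;
    }
    return alive;
}
```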

Remove add_color and weird use of multiply_color. Back during development, I knew I wanted the screen to fade to black sometimes, and to white at other times. However, GPUs were slow enough that it was not clear we could afford a postpass on most machines. So fade-to-black is achieved by a per-entity multiplicative color that goes down to 0, and fade-to-white is achieved by a per-entity additive color that goes up to 1. This does not give exactly the same effect as a full-screen fade but it is ‘close enough’ and nobody has ever noticed or complained. However, this prevents us from doing more-sophisticated transitions, and passing those extra colors around has speed implications. The code will get simpler if I just enact all these fades in a post-pass. Then, after that, we can work on doing something more interesting with them.
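The difference between the two approaches can be sketched on a single color value. Color, per_entity_fade and post_pass_fade are made-up names, and the clamped multiply-then-add form is an assumption about how the per-entity colors are combined; the actual shader math is not given in the post.

```cpp
#include <algorithm>

struct Color { float r, g, b; };

float clamp01(float v) { return std::min(1.0f, std::max(0.0f, v)); }

// Per-entity scheme (assumed form): scale the entity's color, then add.
// Fade-to-black drives multiply toward 0; fade-to-white drives add toward 1.
Color per_entity_fade(Color c, float multiply, float add) {
    Color out;
    out.r = clamp01(c.r * multiply + add);
    out.g = clamp01(c.g * multiply + add);
    out.b = clamp01(c.b * multiply + add);
    return out;
}

// Post-pass scheme: blend the finished pixel toward a target color.
// t = 0 leaves the pixel alone; t = 1 replaces it with the target.
Color post_pass_fade(Color c, Color target, float t) {
    Color out;
    out.r = c.r + (target.r - c.r) * t;
    out.g = c.g + (target.g - c.g) * t;
    out.b = c.b + (target.b - c.b) * t;
    return out;
}
```

For fade-to-black the two agree: multiplying by (1 - t) is the same as blending toward black by t. For fade-to-white they do not: halfway through, a mid-gray pixel (0.5) is already pure white under the additive scheme (0.5 + 0.5, clamped), but only 0.75 under the blend. That is the "close enough" difference, and the post-pass version takes an arbitrary target color, which is what opens the door to other transitions.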