Comments (74)
- psanchez: This reminds me of a story from 15 years ago, when I was developing a technology to download games on demand by hooking into OS calls. There was a particular game that was super slow when this tech was applied. The original game loading took around 15-20 seconds, whereas once the tech was applied it easily took 3-5 min, even with all data already downloaded. When I started digging into it, I realized the reason was that the game was using something like fread(data, 1, 65536, fptr); instead of fread(data, 65536, 1, fptr); which basically expanded back in the day to 65k reads of 1 byte for a file of several MB. Each fread translated to 65k calls to the ReadFile Windows API. Since my code was hooking the ReadFile system call, and my call was heavier than ReadFile, the game loading felt really slow. Unusable. It would not have been fun for players. The easy fix was to swap the arguments for certain calls. The long fix required an internal cache to account for these cases, so that the hooked ReadFile was faster when the data was already on disk. The funny thing is that as we started rolling out the tech and applying it to more and more games, we realized lots of games did this. We went for the cache fix and games ended up loading faster than before. Honestly, games could have loaded all the data in a couple of seconds by just swapping the args. I'm guessing developers did this on purpose so that games seemed like they were loading a lot of stuff, although you never know.
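
For reference, here is a minimal C sketch of the two fread call shapes from the comment above. Standard C only guarantees that the two forms report their result in different units (a count of 1-byte items versus a count of 65536-byte items). Whether a given runtime turns either form into many small OS reads is implementation-specific and is not demonstrated here. The helper name compare_call_shapes and the use of a temporary file are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define CHUNK 65536

/* Hypothetical helper: writes CHUNK bytes to a temporary file, reads them
 * back with both argument orders, and reports what each fread returned.
 * Returns 1 if both reads produced the original bytes, 0 if not,
 * and -1 if the temporary file could not be set up. */
static int compare_call_shapes(size_t *as_bytes, size_t *as_block)
{
    static unsigned char src[CHUNK], a[CHUNK], b[CHUNK];
    FILE *fp;
    size_t i;

    assert(as_bytes != NULL && as_block != NULL);

    fp = tmpfile();
    if (fp == NULL)
        return -1;

    /* Fill the file with a recognisable byte pattern. */
    for (i = 0; i < CHUNK; i++)
        src[i] = (unsigned char)(i * 31u);
    if (fwrite(src, 1, CHUNK, fp) != CHUNK) {
        fclose(fp);
        return -1;
    }

    rewind(fp);
    *as_bytes = fread(a, 1, CHUNK, fp);   /* 65536 items of 1 byte */
    rewind(fp);
    *as_block = fread(b, CHUNK, 1, fp);   /* 1 item of 65536 bytes */
    fclose(fp);

    return memcmp(a, src, CHUNK) == 0 && memcmp(b, src, CHUNK) == 0;
}
```

On a successful run the first call reports 65536 items and the second reports 1 item, with identical buffer contents. One practical difference is that with the single-block form a short read returns 0, so the caller cannot tell how many bytes actually arrived.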
- dlcarrier: SimCity had a read-after-free bug that Microsoft patched in Windows 95. That was a lot easier for customers than having Maxis fix it, which could have required exchanging copies of the game.
- hodgehog11: I think we're starting to see more of this sort of thing happening now with Proton and Wine gaining prominence in the Linux community. Some games (Elden Ring comes to mind) have bad enough PC ports when they come out that the compatibility layer can incorporate a hotfix to improve performance, while users of the software on the original platform still had to suffer.
- selcuka: To be fair, it is possible that the developer enabled a special "unroll all loops, no matter what" optimisation flag during compilation. I agree it would be stupid for a compiler to even support such a flag, but those were the 1980s/90s.
- kazinator:
  > Anyway, my colleague found that there was one program that needed to allocate around 64KB of memory on the stack and initialize it. The standard way of doing this is to perform a stack probe to ensure that 64KB of memory is available, then subtracting 65536 from the stack pointer, and then initializing the memory in a small, tight loop.

  Actually, the standard way of allocating 64 kB of memory on the stack is to just assume you can do it, subtract 64k from the stack pointer, and hope for the best. Most stack allocations in the wild are not checked.
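
At the source level, both descriptions above correspond to the same thing: a large automatic array followed by an initialization loop. A minimal C sketch follows. Whether the compiler emits a stack probe before moving the stack pointer depends on the compiler and platform and cannot be observed from portable C. The function name fill_stack_buffer is hypothetical, and the sketch assumes the calling thread's stack is comfortably larger than 64 KB.

```c
#include <assert.h>
#include <stddef.h>

#define BUF_SIZE 65536

/* Hypothetical example: reserves 64 KB on the stack, initializes it in a
 * tight loop, and returns the sum of all bytes as a simple check. */
static unsigned long fill_stack_buffer(unsigned char fill)
{
    /* Automatic (stack) array; volatile keeps the stores from being
     * optimized away. */
    volatile unsigned char buf[BUF_SIZE];
    unsigned long sum = 0;
    size_t i;

    for (i = 0; i < BUF_SIZE; i++)   /* the small, tight initialization loop */
        buf[i] = fill;

    for (i = 0; i < BUF_SIZE; i++)   /* read back so the result depends on buf */
        sum += buf[i];

    /* Sanity check: every byte must hold the fill value. */
    assert(sum == (unsigned long)fill * BUF_SIZE);
    return sum;
}
```

Nothing in this code checks that the 64 KB is actually available, which is the point made in the comment: the check, if there is one, is inserted by the toolchain rather than written by the programmer.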
- ashdnazg: I worked on a transpiler from Nand2tetris assembly to WebAssembly, and had a really annoying memory corruption bug that I just couldn't solve. That is, until I checked the program I used for testing (which I didn't write), and found the following code:

      dealloc(this)
      return this->field

  With the original allocator, this worked fine, since the deallocation didn't touch the memory. My allocator, however, overwrote the field during the deallocation with bookkeeping stuff, which meant the returned value was not what the programmer intended, and after a short while the program crashed. Unlike TFA, I had the luxury of just fixing the test program.
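
The failure mode described above can be sketched in C with a toy allocator. This is not the commenter's allocator or the Nand2tetris one. It is a hypothetical fixed-size pool (names Obj, pool_alloc, pool_free and the two getters are all made up for illustration) that threads its free list through the freed object's own field, so that the read-after-free is well-defined C and the overwritten value can be observed safely.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SLOTS 4

typedef struct { long field; } Obj;

static Obj pool[POOL_SLOTS];
static long free_head = -1;   /* index of the first free slot, -1 if none */

/* Links every slot into the free list; a free slot's field holds the
 * index of the next free slot. */
static void pool_init(void)
{
    long i;
    for (i = 0; i < POOL_SLOTS; i++)
        pool[i].field = (i + 1 < POOL_SLOTS) ? i + 1 : -1;
    free_head = 0;
}

static Obj *pool_alloc(long value)
{
    Obj *o;
    if (free_head < 0)
        return NULL;
    o = &pool[free_head];
    free_head = o->field;     /* next free index was stored in the slot */
    o->field = value;
    return o;
}

static void pool_free(Obj *o)
{
    assert(o >= pool && o < pool + POOL_SLOTS);
    o->field = free_head;     /* bookkeeping overwrites the payload */
    free_head = (long)(o - pool);
}

/* Same order as the code in the comment: free first, then read. */
static long get_field_buggy(Obj *o)
{
    pool_free(o);
    return o->field;          /* now reads allocator bookkeeping */
}

/* Fixed order: copy the value out before releasing the object. */
static long get_field_fixed(Obj *o)
{
    long value = o->field;
    pool_free(o);
    return value;
}
```

With an object holding 42, get_field_fixed returns 42, while get_field_buggy returns the pool's next-free index instead. An allocator that leaves freed memory untouched would hide the difference, which matches the behaviour reported with the original allocator.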
- classichasclass: Betting Alpha was the native architecture in question. It seemed to have the best support.
- electroglyph: heh, when Raymond Chen dunks on the MSVC team =)
- jeffbee: People from Transmeta told me stories about how their translators were full of special case optimizations to fix horrors they discovered in Microsoft Windows itself.
- notorandit:
  > they fixed it during emulation

  It means the fix was applied to run during the emulation loop execution, not that the fix was found and applied while the emulation loop was running. Which would have made it an emulation code escape.
- m1r: Couldn't they just turn the optimization off for this loop?
- ant6n: Arguably more of an optimization, rather than a fix. Looks like un-unrolling a loop, or better, rolling a loop. Or rolling straight line code?
- yieldcrv:
  > All in all, it took this program 256 kilobytes of code to initialize 64 kilobytes of data.

  solidity sweating profusely