Comments (138)
- zahlman:
  > Accidental O(n²) with Streams Inside Loops

  Man, that code looks awful. Really reminds me of why I drifted away from Java over time. Not just the algorithm, of course; the repetitiveness, the hoops that you have to jump through in order to do pretty "stream processing"... and then it's not even an FP algorithm in the end, either way!

  Honestly the only time I can imagine the "process the whole [collection] per iteration" thing coming up is where either you really do need to compare (or at least really are intentionally comparing) each element to each other element, or else this exact problem of building a histogram. And for the latter I honestly haven't seen people fully fall into this trap very often. More commonly people will try to iterate over the possible buckets (here, hour values), sometimes with a first pass to figure out what those might be. That's still extra work, but at least it's O(kn) instead of O(n²).

  You can do this sort of thing in an elegant, "functional" looking way if you sort the data first and then group it by the same key. That first pass is O(n lg n) if you use a classical sort; making a histogram like this in the first place is basically equivalent to radix sort, but it's nice to not have to write that yourself. I just want to show off what it can look like, e.g. in Python:

      import datetime
      import itertools

      def local_hour(order):
          return datetime.datetime.fromtimestamp(order.timestamp).hour

      groups = itertools.groupby(sorted(orders, key=local_hour), key=local_hour)
      orders_by_hour = {hour: list(group) for (hour, group) in groups}

  Anyway, overall I feel like these kinds of things are mostly done by people who don't need to have the problem explained, who have simply been lazy or careless and simply need to be made to look in the right place to see the problem. Cf. Dan Luu's anecdotes https://danluu.com/algorithms-interviews/, and I can't seem to find it right now but the story about saving a company millions of dollars finding Java code that was IIRC resizing an array one element at a time.
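For comparison, a minimal Java sketch of the single-pass histogram discussed above. It assumes a hypothetical `Order` class that carries only an `Instant` timestamp (the article's real type is not shown in this thread), and it counts orders per hour rather than collecting them. `Collectors.groupingBy` with `Collectors.counting()` visits each order once instead of rescanning the list per bucket.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

final class OrdersByHour {
    // Hypothetical stand-in for the article's Order type; only the timestamp matters here.
    static final class Order {
        final Instant timestamp;

        Order(Instant timestamp) {
            this.timestamp = timestamp;
        }
    }

    // Single pass: every order is classified once and its hour bucket is incremented.
    static Map<Integer, Long> countByHour(List<Order> orders, ZoneId zone) {
        return orders.stream().collect(Collectors.groupingBy(
                order -> order.timestamp.atZone(zone).getHour(),
                TreeMap::new,              // sorted keys, purely for readable output
                Collectors.counting()));
    }

    public static void main(String[] args) {
        List<Order> orders = List.of(
                new Order(Instant.parse("2024-01-01T09:15:00Z")),
                new Order(Instant.parse("2024-01-01T09:45:00Z")),
                new Order(Instant.parse("2024-01-01T17:05:00Z")));
        System.out.println(countByHour(orders, ZoneOffset.UTC)); // {9=2, 17=1}
    }
}
```

Hours with no orders are simply absent from the map, so callers that need all 24 buckets would have to fill in zeros themselves.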
- liampulles: Understanding algorithmic complexity (in particular, avoiding rework in loops) is useful in any language, and is sage advice.

  In practice though, for most enterprise web services, a lot of real-world performance comes down to how efficiently you are calling external services (including the database). Just converting a loop of queries into bulk ones can help loads (and then tweaking the query to make good use of indexes, doing upserts, removing unneeded data, etc.).

  I'm hopeful that improvements in LLMs mean we can ditch ORMs (under the guise that they are quicker to write queries and the in-between mapping code with) and instead make good use of SQL to harness the powers that modern databases provide.
- layer8: For #5, the “fix” [0] is incomplete, because you will still get a NumberFormatException when the value is out of range. For int, you could check if there are more or less than 10 digits, and use parseLong() when there are exactly 10 digits. For long, you can use BigInteger when there are exactly 19 digits. Or you could just replicate the parsing implementation and change the part where it throws NumberFormatException (at the possible cost of forgoing JIT intrinsics).

  A second bug is that Character.isDigit() returns true for non-ASCII Unicode digits as well, while Integer.parseInt() only supports ASCII digits.

  Another bug is that the code will fail on "-".

  [0]

      public int parseOrDefault(String value, int defaultValue) {
          if (value == null || value.isBlank()) return defaultValue;
          for (int i = 0; i < value.length(); i++) {
              char c = value.charAt(i);
              if (i == 0 && c == '-') continue;
              if (!Character.isDigit(c)) return defaultValue;
          }
          return Integer.parseInt(value);
      }
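One way to address the three problems listed above (overflow, non-ASCII digits, a lone "-") is to replicate the parsing loop so that no exception is thrown at all. The following is a minimal sketch under that assumption, not the article's code; the class name `SafeParse` is made up for illustration, and accepting a leading `+` is a choice made here rather than something the thread specifies.

```java
final class SafeParse {
    // Returns defaultValue for null, empty, malformed, or out-of-range input
    // without ever throwing NumberFormatException.
    static int parseOrDefault(String value, int defaultValue) {
        if (value == null || value.isEmpty()) return defaultValue;

        int i = 0;
        boolean negative = false;
        char first = value.charAt(0);
        if (first == '-' || first == '+') {
            if (value.length() == 1) return defaultValue; // a lone sign is not a number
            negative = first == '-';
            i = 1;
        }

        // Largest allowed magnitude: 2147483648 for negatives, 2147483647 otherwise.
        long limit = negative ? -(long) Integer.MIN_VALUE : Integer.MAX_VALUE;
        long magnitude = 0;
        for (; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c < '0' || c > '9') return defaultValue; // ASCII digits only
            magnitude = magnitude * 10 + (c - '0');
            if (magnitude > limit) return defaultValue;  // out of int range
        }
        return (int) (negative ? -magnitude : magnitude);
    }

    public static void main(String[] args) {
        System.out.println(parseOrDefault("42", -1));          // 42
        System.out.println(parseOrDefault("-", -1));           // -1
        System.out.println(parseOrDefault("99999999999", -1)); // -1
    }
}
```

Accumulating in a `long` and checking against the limit after every digit keeps the intermediate value small, so the loop itself cannot overflow even for very long inputs.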
- cmovq: When you're using a programming language that naturally steers you to write slow code, you can't only blame the programmer.

  I was listening to someone say they write fast code in Java by avoiding allocations with a PoolAllocator that would "cache" small objects with poolAllocator.alloc(), poolAllocator.release(). So just manual memory management with extra steps. At that point why not use a better language for the task?
- spankalee: Avoiding Java's string footguns is an interesting problem in programming language design.

  The String.format() problem is most immediately a bad compiler and bad implementation, IMO. It's not difficult to special-case literal strings as the first argument, do parsing at compile time, and pass in a structured representation. The method could also do runtime caching. Even a very small LRU cache would fix a lot of common cases. At the very least they should let you make a formatter from a specific format string and reuse it, like you can with regexes, to explicitly opt into better performance.

  But ultimately the string templates proposal should come back and fix this at the language level. Better syntax and guaranteed compile-time construction of the template. The language should help the developer do the fast thing.

  String concatenation is a little trickier. In a JIT'ed language you have a lot of options for making a hierarchy of string implementations that optimize different usage patterns, and still be fast - and what you really want for concatenation is a RopeString, like JS VMs have, that simply references the other strings. The issue is that you don't want virtual calls for hot-path string method calls.

  Java chose a single final class so all calls are direct. But they should have been able to have a very small sealed class hierarchy where most methods are final and directly callable, and the virtual methods for accessing storage are devirtualized in optimized methods that only ever see one or two classes through a call site.

  To me, that's a small complexity cost to make common string patterns fast, instead of requiring StringBuilder.
- cogman10: Nitpick just because.

  Orders by hour could be made faster. The issue with it is it's using a map when an array works both faster and just fine.

  On top of that, the map boxes the "hour", which is undesirable.

  This is how I'd write it:

      long[] ordersByHour = new long[24];
      var defaultTimezone = ZoneId.systemDefault();
      for (Order order : orders) {
          int hour = order.timestamp().atZone(defaultTimezone).getHour();
          ordersByHour[hour]++;
      }

  If you know the bound of an array, it's not large, and you are directly indexing in it, you really can't do any better performance-wise.

  It's also not less readable, just less familiar, as Java devs don't tend to use arrays that much.
- kyrra: First-request latency also can really suck in Java before hotpathed code gets through the C2 compiler. You can warm up hotpaths by running that code during startup, but it's really annoying having to do that. Using C++, Go, or Rust gets you around that problem without having to jump through the hoops of code path warmup.

  I wish Java had a proper compiler.
- kpw94: The autoboxing example, IMO, is a case of "Java isn't so fast". Why can't this be optimized behind the scenes by the compiler?

  The rest of the advice is great: things compilers can't really catch but a good code reviewer should point out.
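For readers who have not seen the pattern being discussed, here is a minimal sketch of autoboxing in a loop. It is an illustration written for this thread, not the article's example. With a `Long` accumulator, each `+=` unboxes the current value, adds, and boxes the result into a `Long` again; whether the JIT manages to eliminate those allocations depends on the JVM and is not guaranteed.

```java
final class BoxingDemo {
    // Boxed accumulator: "sum += i" compiles to unbox, add, box (Long.valueOf) on every iteration.
    static long sumBoxed(int n) {
        Long sum = 0L;
        for (long i = 0; i < n; i++) {
            sum += i;
        }
        return sum;
    }

    // Primitive accumulator: same arithmetic, no wrapper objects involved.
    static long sumPrimitive(int n) {
        long sum = 0L;
        for (long i = 0; i < n; i++) {
            sum += i;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumBoxed(1000));     // 499500
        System.out.println(sumPrimitive(1000)); // 499500
    }
}
```

The two methods differ by a single capital letter in the declaration, which is part of why this tends to slip through review.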
- wood_spirit: A subject close to my heart. I write a lot of heavily optimised code, including a lot of hot data pipelines in Java.

  And aside from algorithms, it usually comes down to avoiding memory allocations.

  I have my go-to zero-alloc gRPC and Parquet and JSON and time libs etc., and they make everything fast.

  It's mostly how idiomatic Java uses objects for everything that makes it slow overall.

  But eventually, after making a JVM app that keeps data in something like data frames etc. and feels a long way from J2EE beans, you can finally bump up against the limits that only C/C++/Rust/etc. can get you past.
- Okx: The code:

      public int parseOrDefault(String value, int defaultValue) {
          if (value == null || value.isBlank()) return defaultValue;
          for (int i = 0; i < value.length(); i++) {
              char c = value.charAt(i);
              if (i == 0 && c == '-') continue;
              if (!Character.isDigit(c)) return defaultValue;
          }
          return Integer.parseInt(value);
      }

  is probably worse than Integer.parseInt alone, since it can still throw NumberFormatExceptions for values that overflow (which is no longer handled!). Would maybe fix that. Unfortunately this is a major flaw in the Java standard library; parsing numbers shouldn't throw expensive exceptions.
- titzer: For fillInStackTrace, another trick is to define your own Exception subclass and override the method to be empty. I learned this trick 15+ years ago.

  It doesn't excuse the "use exceptions for control flow" anti-pattern, but it is a quick patch.
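The trick described above can be sketched as follows. The class name `StacklessException` is made up for illustration. `Throwable.fillInStackTrace()` is what walks and records the stack when an exception is constructed, so overriding it to return `this` skips that work.

```java
final class StacklessException extends RuntimeException {
    StacklessException(String message) {
        super(message);
    }

    // Skip the stack walk that normally happens in the Throwable constructor.
    @Override
    public synchronized Throwable fillInStackTrace() {
        return this;
    }

    public static void main(String[] args) {
        try {
            throw new StacklessException("not a number");
        } catch (StacklessException e) {
            // The message is intact, but no stack frames were recorded.
            System.out.println(e.getMessage() + ", frames: " + e.getStackTrace().length);
        }
    }
}
```

`RuntimeException` also has a protected four-argument constructor whose `writableStackTrace` flag can be set to `false` for a similar effect. Either way the exception becomes much harder to debug, so this is probably best kept to exceptions that are known to be used as signals.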
- sgbeal: Slight correction:

  > StringBuilder works off a single mutable character buffer. One allocation.

  It's one allocation to instantiate the builder and _any_ number of allocations after that (noting that it's optimized to reduce allocations, so it's not allocating on every append() unless they're huge).
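When the final length is known or cheap to compute, passing it to the `StringBuilder(int capacity)` constructor sizes the buffer once so that later appends should not need to grow it. A minimal sketch, with a made-up `join` helper for illustration:

```java
import java.util.List;

final class BuilderCapacity {
    // Concatenates the parts using a builder sized for the total length up front.
    static String join(List<String> parts) {
        int total = 0;
        for (String part : parts) {
            total += part.length();
        }
        StringBuilder sb = new StringBuilder(total); // buffer sized once
        for (String part : parts) {
            sb.append(part);                         // fits within the initial capacity
        }
        return sb.toString();                        // still copies into the new String
    }

    public static void main(String[] args) {
        System.out.println(join(List.of("order-", "42", "-shipped"))); // order-42-shipped
        System.out.println(new StringBuilder().capacity());            // 16 (documented default)
    }
}
```

This does not make it "one allocation" either: the builder, its buffer, and the resulting `String` are all separate objects. It only avoids the intermediate regrowth.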
- hiyer: I ran into 5 and 7 in a Flink app recently. It was parsing a timestamp as a number first and then falling back to an ISO 8601 string, which is what it actually was. The flamegraph showed 10% for the exception handling bit. While fixing that, I also found repeated creation of a DateTimeFormatter. Neither was in a loop, but both were being done for every event, for tens of thousands of events every second.
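A hedged sketch of how both problems described above might be avoided: decide the format by inspecting the characters instead of catching an exception, and build the formatter once as a constant. The class name, the epoch-milliseconds assumption for numeric input, and the exact pattern are all assumptions made for illustration; the actual Flink code is not shown in the thread.

```java
import java.time.Instant;
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;

final class TimestampParser {
    // Created once and shared; DateTimeFormatter is immutable and thread-safe.
    private static final DateTimeFormatter ISO_WITH_OFFSET =
            DateTimeFormatter.ofPattern("uuuu-MM-dd'T'HH:mm:ssXXX");

    // Assumes numeric input means epoch milliseconds; anything else is treated as ISO 8601.
    static Instant parse(String raw) {
        if (isEpochMillis(raw)) {
            return Instant.ofEpochMilli(Long.parseLong(raw));
        }
        return OffsetDateTime.parse(raw, ISO_WITH_OFFSET).toInstant();
    }

    // Cheap pre-check: 1 to 18 ASCII digits always fits in a long, so parseLong cannot throw.
    private static boolean isEpochMillis(String raw) {
        if (raw.isEmpty() || raw.length() > 18) return false;
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (c < '0' || c > '9') return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(parse("1000"));                 // 1970-01-01T00:00:01Z
        System.out.println(parse("1970-01-01T00:00:01Z")); // 1970-01-01T00:00:01Z
    }
}
```

Input that is neither form still throws `DateTimeParseException`, which seems reasonable as long as it is the rare case rather than the per-event path.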
- zvqcMMV6Zcr:
  > Exceptions for Control Flow

  This one is so prevalent that the JVM has an optimization where it gives up on filling in the stack trace for an exception if it is thrown over and over in the exact same place.
- Izkata: "Java is slow" is a reputation it earned in the 90s/2000s because the JVM startup (at least on Windows) was extremely slow, like several seconds, with a Java-branded splash screen during that time. Even non-technical people made the association.
- jerf: Any non-trivial program that has never had an optimizer run on it has a minimal-effort 50+% speedup in it.
- uraura: I thought those were common sense until I worked on a program written by my colleague recently.
- comrade1234: Also, finding the right garbage collector and settings that work best for your project can help a lot.
- taspeotis: Knock knock.
  Who's there?
  (long pause)
  Java.
- jandrewrogers: You can write many of the bad examples in the article in any language. It is just far more common to see them in Java code than some other languages.

  Java is only fast-ish even on its best day. The more typical performance is much worse because the culture around the language usually doesn't consider performance or efficiency to be a priority. Historically it was even a bit hostile to it.
- ww520: The autoboxing-in-a-loop case can be handled by the compiler.
- bearjaws: JavaScript can be fast too; it's just the ecosystem and the decisions devs make that slow it down.

  Same for Java. In my entire career I have yet to see enterprise Java be performant and not memory-intensive.

  At the end of the day, if you care about performance at the app layer, you will use a language better suited to that.
- latchkey: When they say that AI will replace programmers, I think of this article and come to terms with my own job security.

  Most of this stuff is just central knowledge of the language that you pick up over time. Certainly, AI can also pick this stuff up instantly, but will it always pick the most efficient path when generating code for you?

  Probably not, until we get benchmarks into the hot path of our test suite. That is something someone should work on.
- victor106: This is great, so practical!!! Any other resources like this?
- spwa4: Java IS fast. The time between deciding to use Java and Oracle's lawyers breaking down your door is measured in just weeks these days.
- tripple6: Do good, don't do bad. Okay.
- koakuma-chan: As much as I love Java, everybody should just be using Rust. That way you are actually in control, know what's going on, etc. Another reason specifically against Java is that the tooling, both Maven and Gradle, still sucks.
- seu: I'm a bit surprised to see those examples, because there's nothing really new here. These are typical beginner pitfalls and have been there for at least a decade or more. Or maybe it's because I learned Java in the late 90s and later used it for J2ME, where using things like StringBuilder (StringBuffer in the old days) was almost mandatory, and you would be very careful to avoid unnecessary object allocations.