1. 9

    No, you don’t need C aliasing to obtain vector optimization for this sort of code. You can do it with standards-conforming code via memcpy(): https://godbolt.org/g/55pxUS
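A minimal sketch of the technique (the `sum_words` helper and buffer layout are made up for illustration, not taken from the linked example):

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Sum a byte buffer as 32-bit words. The memcpy() into a local is
   fully defined C (no strict-aliasing or alignment issues), and
   gcc/clang typically compile each call down to a plain load. */
uint32_t sum_words(const unsigned char *buf, size_t len)
{
    uint32_t sum = 0, w;
    for (size_t i = 0; i + sizeof w <= len; i += sizeof w) {
        memcpy(&w, buf + i, sizeof w);  /* replaces *(uint32_t *)(buf + i) */
        sum += w;
    }
    return sum;
}
```

With optimization on, both compilers vectorize this loop much like the aliasing version.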

    1. 2

      Wow, it’s actually completely optimizing out the memcpy()? While awesome, that’s the kind of optimization I hate to depend on. One little seemingly inconsequential nudge and the optimizer might not be able to prove that’s safe, and suddenly there’s an additional O(n) copy silently going on.

      1. 2

        memset/memcpy get optimized out a lot, hence libraries making things like this: https://monocypher.org/manual/wipe

        1. 1

          Actually it’s not optimizing it out, it’s simply allocating the auto array into SIMD registers. You must always copy data into SIMD registers before performing SIMD operations. The memcpy() code resembles a SIMD implementation more than the aliasing version does.

        2. 1

          You can - and thanks for the illustration - but the memcpy is antithetical to the C design paradigm, in my always humble opinion. And my point was not that you needed aliasing to get the vector optimization, but that aliasing does not interfere with the vector optimization.

          1. 8

            I’m sorry, but the justifications for your opinion no longer hold. memcpy() is the only unambiguous and well-defined way to do this. It also works across all architectures and input pointer values without having to worry about crashes due to misaligned accesses, while your code doesn’t. Both gcc and clang are now able to optimize away memcpy() and auto vars. An opinion here is simply not relevant; invoking undefined behavior when it increases risk for no benefit is irrational.

            1. -2

              Au contraire. As I showed, the C standard does not need to graft on a clumsy and painful anti-alias mechanism, and programmers don’t need to go through stupid contortions with allocation of buffers that disappear under optimization, because the compiler does not need it. My code doesn’t have alignment problems. The justification for pointer alias rules is false. The end.

              1. 10

                There are plenty of structs that only contain shorts and char, and in those cases employing aliasing as a rule would have alignment problems while the well-defined version wouldn’t. It’s not the end, you’re just in denial.

                1. -2

                  In those cases, you need to use an alignment modifier or sizeof. No magic needed. There is a reason that both gcc and clang have been forced to support -fno-strict-aliasing, and now both support may_alias. The memcpy trick is a stupid hack that can easily go wrong - e.g. one is not guaranteed that the compiler will optimize away the buffer, and a large buffer could overflow the stack. You’re solving a non-problem by introducing complexity and opacity.
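For reference, a sketch of what the alignment-modifier approach can look like using C11’s standard alignas (the struct and its fields are hypothetical, invented for illustration):

```c
#include <stdalign.h>
#include <stdint.h>

/* A struct containing only chars, forced up to uint32_t alignment so
   that word-sized access to it is at least correctly aligned on
   implementations that define such aliasing. */
struct msg {
    alignas(uint32_t) unsigned char tag;
    unsigned char body[7];   /* total size is a multiple of 4 */
};
```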

                  1. 10

                    In what world is memcpy() magic and alignment modifiers aren’t? memcpy() is an old standard library function, alignment modifiers are compiler-specific syntax extensions.

                    memcpy() isn’t a hack, it’s always well-defined, while aliasing can never be well-defined in all cases. Promoting aliasing as a rule is like promoting using the equality operator between floats – it can never work in all cases, though it may be possible to define meaningful behavior in specific cases. Promoting aliasing as a rule is promoting the false idea that C is a thin layer above contemporary architectures; it isn’t. Struct memory is not necessarily the same as array memory, not every machine that C supports can dereference an int32 inside of an int64, and not every machine can dereference an int32 at any offset. Do you want C to die with x86_64 or do you want C to live?

                    Optimizations don’t need to be guaranteed when the code isn’t even correct in the first place. First make sure your code is correct, then worry about optimizing. You talk about alignment modifiers but they are rarely used, and usually they are used after a bug has already occurred. Code should be correct first, and memcpy() is the rule we should be promoting since it is always correct. Optimizers can meticulously add aliasing for specific cases once a bottleneck has been demonstrated. You’re solving a non-problem by indulging in premature optimization.

                    1. 3

                      Do you want C to die with x86_64 or do you want C to live?

                      Heh I bet you’d get quite varied answers to this one here

                      1. -1

                        The memcpy hack is a hack because the programmer is supposed to write a copy of A to B and then back to A and rely on the optimizer to skip the copy and delete the buffer. So unoptimized, the code may fault on stack overflows for data structures that exist only to make the compiler writers happier. And with a novel architecture, if the programmer wants to take advantage of a new capability - say 512-bit SIMD instructions - she can wait until the compiler has added it to its toolset and be happy with how it is used.

                        As for this not working in all cases: Big deal. C is not supposed to hide those things. In fact, the compiler has no idea if the memory is device memory with restrictions on how it can be addressed, or memory with copy-on-write semantics, or …. You want C to be Pascal or Java and then announce that making C look like Pascal or Java can only be solved at the expense of making C unusable for low-level programming. Which programming communities are asking for such insulation? None. C works fine on many architectures. C programmers know the difference between portable and non-portable constructs. C compilers can take advantage of SIMD instructions without requiring C programmers to give up low-level memory access - one of the key advantages of programming in C. Basically, people who don’t like C are trying to turn C into something else and are offended that few are grateful.

                        1. 4

                          You aren’t writing a copy of a buffer back and forth. In your example, you are reducing an encoding of a buffer into a checksum. You are only copying one way, and that is for the sake of normalization. All SIMD code works that way; you must always copy into SIMD registers before doing SIMD operations. In your example, the aliasing code doesn’t resemble SIMD code, either syntactically or semantically, as much as the memcpy() code does, and in fact requires a smarter compiler to transform.

                          The chance of overflowing the stack is remote, since stacks now automatically grow and structs tend to be < 512 bytes, but if that is a legitimate concern you can do what you already do to avoid that situation, either use a static buffer (jeopardizing reentrancy) or use malloc().

                          By liberally using aliasing, you are assuming a specific implementation or underlying architecture. My point is that in general you cannot assume arbitrary internal addresses of a struct can always be dereferenced as int32s, so in general that should not be practiced. In specific cases you can alias, but those are the exceptions not the rule.

                          1. 1

                            The chance of overflowing the stack is remote, since stacks now automatically grow and structs tend to be < 512 bytes, but if that is a legitimate concern you can

                            … just copy the ints out one at a time :) https://godbolt.org/g/g8s1vQ

                            The compiler largely sees this as a (legal) version of the OP’s code, so there’s basically zero chance it won’t be optimised in exactly the same way.
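A sketch of that one-int-at-a-time idea (the struct and its field names are hypothetical, not the code behind the link):

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 16-byte struct; the field names are invented. */
struct pkt { uint32_t a, b, c, d; };

/* Pull each 32-bit word out individually: the only temporary is a
   single 4-byte local, so no large stack buffer exists even in an
   unoptimized build, and each memcpy() is just a load after -O2. */
uint32_t pkt_sum(const struct pkt *p)
{
    const unsigned char *raw = (const unsigned char *)p;
    uint32_t w, sum = 0;
    for (size_t i = 0; i < sizeof *p; i += sizeof w) {
        memcpy(&w, raw + i, sizeof w);
        sum += w;
    }
    return sum;
}
```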

                            1. 0

                              All copies on some architectures reduce to: load into register, store from register. So what? That is why we have a high level language which can translate *x = *y efficiently. The pointer alias code directly shows programmer intent. The memcpy code does not. The “sake of normalization” is just another way of saying “in order to cooperate with the fiction that the inconsistency in the standard produces”.

                              In many contexts, stacks do NOT automatically grow. Again, C is not Java. OS code, drivers, embedded code, even many applications for large systems - all need control over stack size. Triggering stack growth may even turn out to be a security failure for encryption, which is almost universally written in C because in C you can assure time invariance (or you could until the language lawyers decided to improve it). Your proposal that programmers not only use a buffer, but use a malloc’d buffer, in order to allow the optimizer (they hope) not to use it, is ridiculous and is a direct violation of the C model.

                              “3. C code can be non-portable. Although it strove to give programmers the opportunity to write truly portable programs, the Committee did not want to force programmers into writing portably, to preclude the use of C as a “high-level assembler;” the ability to write machine-specific code is one of the strengths of C. It is this principle which largely motivates drawing the distinction between strictly conforming program and conforming program.” ( http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2021.htm)

                              Give me an example of an architecture where a properly aligned structure with sizeof(struct x) % sizeof(int32) == 0 cannot be accessed by int32s? Maybe the Itanium, but I doubt it. Again: every major OS turns off strict aliasing in the compilers and they seem to work. Furthermore, the standard itself permits aliasing via char* (as another hack). In practice, more architectures have trouble addressing individual bytes than addressing int32s.

                              I’d really like to see more alias analysis optimization in C code (and more optimization from static analysis) but this poorly designed, badly thought through approach we have currently is not going to get us there. To solve any software engineering problem, you have to first understand the use cases instead of imposing some synthetic design.

                              Anyways, off to the airport. Later. vy

                              1. 2

                                I’m willing to agree with you that the aliasing version more clearly shows intent in this specific case but then I ask, what do you do when the code aliases a struct that isn’t properly aligned? There are a lot of solutions but in the spirit of C, I think the right answer is that it is undefined.

                                So I think what you want is the standard to define one specific instance of previously undefined behavior. I think in this specific case, it’s fair to ask for locally aliasing an int32-aligned struct pointer to an int32 pointer to be explicitly defined by the standards committee. What I think you’re ignoring, however, is all the work the standards committee has already done to weigh the implications of defining behavior like that. At the very least, it’s not unlikely that there will be machines in the future where implementing the behavior you want will be non-trivial. Couple that with the burden of a more complex standard. So maybe the right answer to maximize global utility is to leave it undefined and to let optimization-focused coders use implementation-defined behavior when it matters but, as I’m arguing, use memcpy() by default. I tend to defer to the standards committees because I have read many of their feature proposals and accompanying rationales and they are usually pretty thorough and rarely miss things that I don’t miss.

                                Everybody arguing here loves C. You shouldn’t assume the standards committee is dumb or that anyone here wants C to be something it’s not. As much as you may think otherwise, I think C is good as it is and I don’t want it to be like other languages. I want C to be a maximally portable implementation language. We are all arguing in good faith and want the best for C, we just have different ideas about how that should happen.

                                1. 1

                                  what do you do when the code aliases a struct that isn’t properly aligned? There are a lot of solutions but in the spirit of C, I think the right answer is that it is undefined.

                                  Implementation dependent.

                                  Couple that with the burden of a more complex standard.

                                  The current standard on when an lvalue works is complex and murky. WG14 discussion on how it applies shows that it’s not even clear to them. The exception for char pointers was hurriedly added when they realized they had made memcpy impossible to implement. It seems as if malloc can’t be implemented in conforming C (there is no method of changing storage type to reallocate it).

                                  C would benefit from more clarity on many issues. I am very sympathetic to making pointer validity more transparent and well-defined. I just think the current approach has failed, and the C89 error has not been fixed but made worse. Also, restrict has been fumbled away.

                        2. 2

                          You don’t need a large buffer. You can memcpy the integers used for the calculation out one at a time, rather than memcpy’ing the entire struct at once.

                          Your designation of using memcpy as a “stupid hack” is pretty biased. The code you posted can go wrong, legitimately, because of course it invokes undefined behaviour, and is more of a hack than using memcpy is. You’ve made it clear that you think the aliasing rules should be changed (or shouldn’t exist) but this “evidence” you’ve given has clearly been debunked.

                          1. 0

                            Funny use of “debunked”. You are using circular logic. My point was that this aliasing method is clearly amenable to optimization and vectorization - as seen. Therefore the argument for strict alias in the standard seems even weaker than it might. Your point seems to be that the standard makes aliasing undefined so aliasing is bad. Ok. I like your hack around the hack. The question is: why should C programmers have to jump through hoops to avoid triggering dangerous “optimizations”? The answer: because it’s in the standard, is not an answer.

                            1. 3

                              Funny use of “debunked”. You are using circular logic. My point was that this aliasing method is clearly amenable to optimization and vectorization - as seen

                              You have shown a case where, if the strict aliasing rule did not exist, some code could [edit] still [/edit] be optimised and vectorised. That I agree with, though nobody claimed that the existence of the strict aliasing rule was necessary for all optimisation and vectorisation, so it’s not clear what you think this proves. Your title says that the optimisation is BECAUSE of aliasing, which is demonstrably false. Hence, debunked. Why is that “funny”? And how is your logic any less circular than mine?

                              The question is: why should C programmers have to jump through hoops to avoid triggering dangerous “optimizations”?

                              Characterising optimisations as “dangerous” already implies that the code was correct before the optimisation was applied and that the optimisation can somehow make it incorrect. The logic you are using relies on the code (such as what you’ve posted) being correct - which it isn’t, according to the rules of the language (which, yes, are written in a standard). But why is using memcpy “jumping through hoops” whereas casting a pointer to a different type of pointer and then de-referencing it not? The answer is, as far as I can see, because you like doing the latter but you don’t like doing the former.

                      2. 1

                        The end.

                        The internet has no end.

                1. -1

                  Yes.

                  1. 1

                    Do you think people who don’t enjoy coding have a place in the industry?

                    1. 5

                      Yes, assuming you don’t outright loathe it. How long have you been a professional software engineer?

                      1. 1

                        Going on five years. My ability to tolerate it varies with the work environment, tbh.

                  1. 2

                    Maybe until a year or so ago I would have answered yes without hesitation, but recently I’ve realized that the actual act of coding has lost its luster for me. I like making stuff, and coding is now just a means to that end. One might think that’s a distinction without a difference, but the upshot is that dealing with uninteresting grunt work is that much harder.

                    I’ve found myself drawn to other creative endeavors in my personal time (emphasis on “create”) such as woodworking and gardening. Both have filled that make-something niche in my life quite nicely.

                    1. 1

                      I feel similarly. I think it was novel at first but now I want to make different things.

                    1. 4

                      I was interested in what he was saying just up until he said

                      Some may even be lucky enough to find themselves doing Extreme Programming, also known as ‘The Scrum That Actually Works’.

                      My experience with XP was that it was extremely heavyweight and did not work well at all. It created the greatest developer dissatisfaction of any of the versions of Agile I’ve encountered.

                      1. 5

                        Couldn’t disagree more – the most successful team I was on was heavily into XP. When people say it’s heavyweight, they’re usually talking about pair programming. I’m not sure what people have against it; I’ve found it’s a great way to train junior developers, awesome for tricky problems, and generally a great way to avoid the problem of, “Oh this PR looks fine but redo it because you misunderstood the requirements.”

                          1. 2

                            I don’t want to discount your experience, but it sounds like the issues you’ve had with pair programming are more with the odd choices your employer imposed.

                            Both people have specialized editor configs? Sure, switch off computers or whatever too; no need to work in an unfamiliar environment.

                            And if one person is significantly less experienced than the other, that person should be at the keyboard more often than not – watching the master at work will largely be useless.

                        1. 3

                          Why I like XP over anything else is the focus on development practices rather than business practices. Pairing, TDD, CI, <10 minute builds, WIP, whole team estimation, etc are all used to produce a better product, faster.

                          The weekly retrospective offers a way to adjust practices that aren’t working and bolster those that are.

                          1. 2

                            Agreed 100%. It turned my head a bit when he thought Agile was too prescriptive, but then was considering an even more prescriptive methodology.

                            1. 1

                              What was your experience with XP? Also, scrum is heavyweight as well in my experience and doesn’t work excellently in an actually agile environment like a startup. Feels like it could work in corp. though.

                            1. 3

                              Huh, I didn’t realize the 8086 only had a 20-bit address bus. Makes everything seem a bit more sane, although why make the segments overlap? I can’t see the benefit, and it makes expanding the address space that much harder.

                              1. 3

                                I guess one (tiny) benefit of having segments overlap every 16 bytes is that a malloc() implementation could return pointers of XXXX:0000 format, i.e. only concern itself with segments? And then, if you want to index into such an array, you can put the array element’s index/offset in a register without having to add a base address offset, since the array always starts at 0000 (within the given segment).
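The arithmetic under discussion, as a small sketch (real-mode address formation; the 20-bit mask models the wrapped address bus of the original 8086):

```c
#include <stdint.h>

/* 8086 real mode: physical = segment * 16 + offset, truncated to 20
   bits. Because segments step in 16-byte units, many different
   segment:offset pairs name the same physical byte. */
uint32_t phys(uint16_t seg, uint16_t off)
{
    return (((uint32_t)seg << 4) + (uint32_t)off) & 0xFFFFFu;
}
```

So 1234:0005 and 1230:0045 both address physical 0x12345, which is the overlap being asked about.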

                                1. 3

                                  Overlapping makes a lot of sense if you take into account that a non-trivial number of programs only ever needed one segment, so you could use “near” pointers and shorter jump instructions that only deal with offsets.

                                2. 2

                                  More silly trivia: All Wintel PCs boot with the A20 line disabled, in order to default to 8086 mode. And to turn it on, you talk to the keyboard controller. Some quick googling led me to an example here: https://github.com/Clann24/jos/blob/master/lab2/code/obj/boot/boot.asm#L29

                                  Of course these days all these devices exist on-die, but back in the day they would have been discrete ASICs.

                                1. 2

                                  We’re considering introducing this at work, because apparently maintaining docs and client libs is too difficult for today’s engineers. Am curious what experiences other lobsters have had.

                                  1. 10

                                    My experiences have been pretty positive. At the very least, it’s way better than what most organizations use instead (i.e. nothing).

                                    My first time using it, I was grafting it onto an existing API and was pleasantly surprised to see that it was expressive enough to bend to the existing functionality. All the stuff it enables is great – automated conformance testing being the big one, in my book.

                                    1. 1

                                      Ah, neat. I already had been running boring documentation and updates (wiki ops, wheee), and am resistant to change if something works.

                                      But, if it’s been working well for other folks, it’s worth investigating.

                                      1. 3

                                        maintaining docs and client libs

                                        Have you thought about gRPC, or twirp at all? If the use-case is internal-facing systems then I think what a lot of people really want is RPC. Swagger seems great for external-facing systems, though.

                                        (Disclaimer: I haven’t used any of the tools I just mentioned :D)

                                    2. 1

                                      We have also used it quite a bit at work, getting the most benefit from automatic documentation of endpoints. The automatic client lib generation didn’t work very well for us, and hand written clients were our approach.

                                      Depending on what your API is written in, you may have quite good support for generating pretty detailed docs.

                                      Also, this opens the door for some kind of automated testing against the API, based on the Swagger definition, but, I never got around to doing that.

                                      1. 1
                                      1. 4

                                        What’s the meaning of the last line, “I showed up for him”?

                                        1. 13

                                          As used here, “him” implicitly means more than just “Steve Jobs.” It alludes to the essence of Jobs, the things that make him who he is. Carmack is saying he dropped everything he was working on when Jobs asked for him because of his admiration and respect for Jobs as a person, not just for Jobs’ fame or influence. Sentences like these are frequently written with “him” italicized, and spoken with strong emphasis on “him.”

                                          That’s how I read it anyway.

                                          Sorry if my explanation seems patronizing, I just went ahead and assumed you’re a non-native English speaker.

                                          1. 6

                                            Thanks – not patronizing at all, even though I am in fact a native speaker :) Just not familiar with that phrase and unsure if I should take it literally.

                                            1. 8

                                              As a non-native speaker, that’s quite reassuring to know that English subtleties are deep, even for a native speaker :-)

                                            2. 6

                                              I read it as John showing up for Jobs’ funeral :-)

                                          1. 2

                                            There’s something about Scala that my brain just can’t handle; I think the designers just have fundamentally different tastes from mine. The “implicit” keyword for example strikes me as crazy – the name alone says to me it’s a bad idea!

                                            1. 1

                                              This is pretty damn cool – has anyone used this or had a need? I’m not sure I’ve ever needed to grab a piece of Java and stick it in my python, but maybe now that I know it’s straightforward it’ll come up.

                                              1. 5

                                                Another “quirks” question: did you find any unexpected quirks of Go that made writing this emulator harder or easier?

                                                1. 5

                                                  In this particular case, it feels like the code isn’t too far from what C code would be: here are some basic data structures and here are some functions that operate on them, mostly at the bit level. No fancy concurrency models nor exciting constructs. Given that this is an inherently low-level program, most niceties from Go weren’t immediately needed.

                                                  I did use some inner functions/closures and hash maps, but could’ve just as well done without them. The bottom line is that the language didn’t get in the way, but I didn’t feel like it was enormously helpful, other than making it easier to declare dependencies and handling the build process for me.

                                                  1. 4

                                                    Did you run into any issues with gc pauses? That’s one of the things people worry about building latency sensitive applications in go.

                                                    1. 3

                                                      Not the OP, but I would assume this kind of application generates very little garbage in normal operation.

                                                      1. 2

                                                        The gc pauses are so minuscule now, for the latest releases of Go, that there should be no latency issues even for realtime use. And it’s always possible to allocate a blob of memory at the start of the program and just use that, to avoid gc in the first place.

                                                        1. 2

                                                          The garbage collector hasn’t been an issue either. Out of the box, I had to add artificial delays to slow things down and maintain the frame rate, so I haven’t done much performance tuning/profiling. I am interested in scenarios where this would be critical though.

                                                          1. 1

                                                            Go’s GC pauses are sub-millisecond so it’s not an issue.

                                                        2. 3

                                                          Interested in this as well. I’ve been toying with the idea of writing a CHIP-8 emulator in Go and would love to hear what the experience of writing emulators is like.

                                                          1. 3

                                                            I did exactly this as a project to learn Go! I used channels in order to control the clock speed and the timer frequency and it ended up being a really nice solution. The only real hangup I had was fighting with the compiler with respect to types and casting, but having type checking overall was a good thing.

                                                        1. 0

                                                          This is a good step. But I personally disagree with the way most of today’s languages break backward compatibility.

                                                          1. 7

                                                            I respectfully disagree. I think as advancements are made in languages it’s only natural that you’re going to reach a point where additions or changes will force incompatibilities. It’s a natural, albeit sometimes painful, part of progress.

                                                            1. 7

                                                              Otherwise you get a few billion lines of COBOL powering all the critical stuff. ;)

                                                            2. 3

                                                              What are these languages without backward compatibility? From what I can tell, Go and Rust both seem to maintain backward compatibility pretty well.

                                                              1. 4

                                                                Frankly, python’s backwards compatibility isn’t bad in my opinion. Outside of the python 2 and 3 differences, there isn’t really much to complain about.

                                                                1. 3

                                                                  I’m mostly annoyed that one interpreter can’t handle both 2 and 3 code. The changes are small enough that this seems totally reasonable.

                                                                  1. 1

                                                                    In terms of syntax I might agree with you, but under the hood it changed enough that it’s acceptable.

                                                                2. 2

                                                                  Go and Rust are both very young, and neither has even had a major version increase yet. Combined they have a much smaller installed base than Python and therefore fewer people driving new changes. They’re also tightly controlled by corporations who are likely to take a conservative stance on compatibility.

                                                                  Most older languages have had backward compatibility issues. C++, for example, has added keywords, deprecated and removed auto_ptr, made changes to how lambdas behave, etc. Ada made major changes between Ada83, 95, and 2005, which are mostly compatible, but incompatible in some corner cases.

                                                                  Nobody likes breaking compatibility, but refusing to do so implies the language is perfect or that the users must live with mistakes forever.

                                                              1. 3

                                                                My main complaint with celery is its deployment model – it’s a non-trivial amount of operations work to keep the cluster of workers updated as the code evolves. This is in contrast to systems that ship code from the master, where the worker nodes are very dumb.

                                                                My other complaint is that the software itself seems to encourage coupling the worker code very tightly with the celery job system, making it difficult to run locally or test. While you can explicitly run a job in the current thread, that falls apart once you do anything interesting with chaining jobs and other fun control flow related operations.

                                                                Does Dramatiq solve either of those pain points? As far as I can see in the docs the answer is “no.” I don’t mean to come off sounding like a curmudgeon, it looks like a cool project! I think I just need something fundamentally different.

                                                                1. 3

                                                                  Hey-o! This is how we do it at my company, a startup with about ten software engineers:

                                                                  Provision with terraform, deploy with ansible. Ansible did very little: it just pulled down our packaged software, Python virtualenvs, via apt-get. Since each virtualenv was completely self-contained, it didn’t need to deal with dependencies at all – the idea of installing application-level dependencies via pip or whatever is insane to me.

                                                                  We supposedly adhered to the “devops” mantra of “devops is a shared responsibility,” but in practice no one wanted to deal with it so it usually fell on one or two people to keep things sane.

                                                                  1. 1

                                                                    This is pretty similar to what I’ve done in the past although I’d like to have better answers than Terraform or Ansible. Ansible especially turns into a ball-ache once you’re trying to do more than just rsync a binary.

                                                                    I’ve been thinking about writing a CLI around https://github.com/frontrowed/stratosphere that uses cloudformation changesets to give me diffs like what Terraform does.

                                                                    1. 1

                                                                      Yeah, agreed RE: ansible – as soon as you’re doing something complicated with it, you’re doing it wrong. And it’s very tempting to do so simply because it has so much functionality built-in.

                                                                      Our infrastructure was designed to be immutable once provisioned, so it really would have made sense to go the kubernetes / ECS route.

                                                                  1. 3

                                                                    Are there any implications of moving the storage layer to RocksDB? Or does the fact that it stores data sorted by key basically make it the same as a Cassandra sstable?

                                                                    1. 4

                                                                      The article mentions some. For example, “The existing streaming implementation was based on the details in the current storage engine”. As I understand the article, Instagram still uses SSTable for streaming.

                                                                    1. 3

                                                                      The linked opinion poll by Gasarch is a pretty fun read, and has a section with various academics’ thoughts on the matter:

                                                                      Only a few people will follow the proof. Whoever does will spend the rest of his life convincing people it is correct.

                                                                      Even though P≠NP, people should still work on trying to prove P=NP to see what goes wrong. I think the P=BPP question is almost as interesting. I am appalled that people take for granted that P=BPP.

                                                                      1. 1

                                                                        That’s a cool one to read. I like seeing respectable mathematicians’ and computer scientists’ two cents on the issue. As an aside, the professor who taught me Calc 2 is quoted in this list, which was a nice surprise :)

                                                                      1. 2

                                                                        Has anyone here tried using Godot in anger yet? I’m tempted to use this instead of Unity for things, but a bit unsure of how difficult it might be to use Godot for simple prototyping (namely whether docs are complete enough).

                                                                        Would love to hear the pros of this in terms of usability.

                                                                        1. 2

                                                                          Godot is great. I’ve used Unity a bit but honestly I don’t think I’d ever choose it over Godot (except for maybe non-technical reasons, like the asset store).

                                                                          As for the docs, I found them to be pretty good, and things are reasonably discoverable in-editor too.

                                                                          1. 2

                                                                            I quit Unity years ago and switched to Godot. The docs aren’t great (they seem to be working on that) but the builtin stuff is amazing, it’s kinda weird at times but it always seems like it has the one specific thing you want. I spent days implementing things that I usually don’t find in editors just to find out that they were already implemented in Godot, just a bit hidden.

                                                                            GDScript is an acquired taste, and I still don’t love it, but it’s grown on me enough that I can use it comfortably. For all its quirkiness (it tries to be Python but fails), it hasn’t given me a single issue or unwanted behavior, unlike the spotty C# in Unity (although I’ve heard that’s getting better?).

                                                                            1. 4

                                                                              Looks like they support C# now via Mono. Looking forward to someone writing a wrapper for F#.

                                                                              1. 2

                                                                                The main reason I never took Godot seriously is that they decided to invent their own language because “none of the existing ones were good enough” which tells me that the project leadership at the time were not very sensible. It’s a good sign that they’ve realized that was a mistake.

                                                                                1. 2

                                                                                  I think that was nearly ubiquitous in game engines prior to, say, 2009–2010 or so. Unreal has its own language, UnrealScript, based on a previous in-house language, ZZT-oop. And Unity started with a DIY language, UnityScript, which had Javascript-like syntax and was sometimes called “Javascript” in the docs, but wasn’t really JS, and was finally axed just a few months ago. So it’s not that surprising Godot would also have one, even if it started a little later than those two engines.

                                                                                  I’m not 100% sure on the timeline, but I think Lua was one of the first third-party languages to be widely picked up as a game scripting language. At the time it was seen as necessary for game-scripting languages to be lightweight, small implementations that are easily embeddable, and ideally permissively licensed, and Lua fit that bill. Though now things have moved on to where embedding Mono isn’t a dealbreaker.

                                                                          1. 1

                                                                            Looks pretty cool! I hate to be “that guy” but what is something like this commonly used for?

                                                                            1. 1

                                                                              If you’re doing 3D printing or experimenting with parametric architecture (let’s add two stories to that building), this is very useful. Also, there’s a big community of people experimenting with generative design, see Nervous System https://n-e-r-v-o-u-s.com/ or Marius Watz http://mariuswatz.com/ – or just browse Creative Applications http://www.creativeapplications.net/

                                                                            1. 1

                                                                              I’m having a hard time discerning exactly what the Kappa architecture is from this article. It sounds like it’s just, “Serve your real-time and historic information with the same backend.” Does that really deserve a name? If so, the vast majority of all APIs are “kappa architecture”?

                                                                              1. 25

                                                                                I used to do the things listed in this article, but very recently I’ve changed my mind.

                                                                                The answer to reviewing code you don’t understand is you say “I don’t understand this” and you send it back until the author makes you understand in the code.

                                                                                I’ve experienced too much pain from essentially rubber-stamping with an “I don’t understand this. I guess you know what you’re doing.” And then again. And again. And then I have to go and maintain that code and, guess what, I don’t understand it. I can’t fix it. I either have to have the original author help me, or I have to throw it out. This is not how a software engineering team can work in the long-term.

                                                                                More succinctly: any software engineering team is upper-bounded architecturally by its single strongest team member (you only need one person to get the design right) and upper-bounded code-wise by its single weakest/least experienced team member. If you can’t understand the code now, you can bet dollars to donuts that any new team member or new hire isn’t going to either (the whole team must be able to read the code, because you don’t know what the team churn is going to be). And that’s poison to your development velocity. The big mistake people make in code review is to think the team is bound by the strongest team member code-wise too and defer to their experience, rather than digging in their heels and saying “I don’t understand this.”

                                                                                The solution to “I don’t understand this” is plain old code health. More functions with better names. More tests. Smaller diffs to review. Comments about the edge cases and gotchas that are being worked around but you wouldn’t know about. Not thinking that the code review is the place to convince the reviewer to accept the commit because no-one will ever go back to the review if they don’t understand the code as an artifact that stands by itself. If you don’t understand it as a reviewer in less than 5 minutes, you punt it back and say “You gotta do this better.” And that’s hard. It’s a hard thing to say. I’m beginning to come into conflict about it with other team members who are used to getting their ungrokkable code rubber stamped.

                                                                                But code that isn’t understandable is a failure of the author, not the reviewer.
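
                                                                                As a rough sketch of what “more functions with better names” buys a reviewer (Python chosen arbitrarily; the order schema here is made up for illustration):

                                                                                ```python
                                                                                # Before: the reviewer has to reverse-engineer the intent from one opaque block.
                                                                                def f(orders):
                                                                                    return [o for o in orders if o["total"] > 100 and not o.get("cancelled")]

                                                                                # After: the names carry the domain knowledge, and each piece is testable.
                                                                                def is_active(order):
                                                                                    return not order.get("cancelled")

                                                                                def is_large(order, threshold=100):
                                                                                    return order["total"] > threshold

                                                                                def large_active_orders(orders):
                                                                                    return [o for o in orders if is_active(o) and is_large(o)]

                                                                                orders = [{"total": 250}, {"total": 50}, {"total": 300, "cancelled": True}]
                                                                                assert f(orders) == large_active_orders(orders) == [{"total": 250}]
                                                                                ```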

                                                                                1. 7

                                                                                  More succinctly: any software engineering team is upper-bounded architecturally by its single strongest team member (you only need one person to get the design right) and upper-bounded code-wise by its single weakest/least experienced team member.

                                                                                  Well put – hearing you type that out loud makes it incredibly apparent.

                                                                                  Anywhoo, I think your conclusion isn’t unreasonable (sometimes you gotta be the jerk) but the real problem is upstream. It’s a huge waste when bad code makes it all the way to review and then needs to be written again; much better would be to head it off at the pass. Pairing up the weaker / more junior software engineers with the more experienced works well, but is easier said than done.

                                                                                  1. 4

                                                                                    hmm, you make a good point and I don’t disagree. Do you think the mandate on the author to write understandable code becomes weaker when the confusing part is the domain, and not the code itself? (Although I do acknowledge that expressive, well-structured and well-commented code should strive to bring complicated aspects of the problem domain into the picture, and not leave it up to assumed understanding.)

                                                                                    1. 3

                                                                                      I think your point is very much applicable. Sometimes it takes a very long time to fully understand the domain, and until you do, the code will suffer. But you have competing interests. For example, at some point, you need to ship something.

                                                                                      1. 2

                                                                                        Do you think the mandate on the author to write understandable code becomes weaker when the confusing part is the domain, and not the code itself?

                                                                                        That’s a good question.

                                                                                        In the very day-to-day, I don’t personally find that code reviews have a problem from the domain level. Usually I would expect/hope that there’s a design doc, or package doc, or something, that explains things. I don’t think we should expect software engineers to know how a carburetor works in order to create models for a car company, the onus is on the car company to provide the means to find out how the carburetor works.

                                                                                        I think it gets much trickier when the domain is actually computer-science based, as we’ve kind of all resolved that there are people who know how networks work and they write networking code, and people who know how kernels work and they write kernel code, etc. We don’t take the time to do the training, and assume that if someone wants to know about it, they’ll learn it themselves. In that instance I would hope the reviewer is also a domain expert, but on small teams that probably isn’t viable.

                                                                                        And like @burntsushi said, you gotta ship sometimes and trust people. But I think the pressure eases as the company grows.

                                                                                        1. 1

                                                                                          That makes sense. I think you’ve surfaced an assumption baked into the article which I wasn’t aware of, having only worked at small companies with lots of surface area. But I see how it comes across as particularly troublesome advice outside of that context.

                                                                                      2. 4

                                                                                        I’m beginning to come into conflict about it with other team members

                                                                                        How do you resolve those conflicts? In my experience, everyone who opens a PR review finds their code to be obvious and self-documenting. It’s not uncommon to meet developers lacking the self-awareness required to improve their code along the lines of your objections. For those developers, I usually focus on quantifiable metrics like “it doesn’t break anything”, “it’s performant”, and “it does what it’s meant to do”. Submitting feedback about code quality often seems to regress to a debate over first principles. The result is that you burn social capital with the entire team, especially when working on teams without a junior-senior hierarchy, where no one is a clear authority.

                                                                                        1. 2

                                                                                          Not well. I don’t have a good answer for you. If someone knows, tell me how. If I knew how to simply resolve the conflicts I would. My hope is that after a while the entire team begins to internalize writing for the lowest common denominator, and it just happens and/or the team backs up the reviewer when there is further conflict.

                                                                                          But that’s a hope.

                                                                                          1. 2

                                                                                            It’s not uncommon to meet developers lacking the self-awareness required to improve their code along the lines of your objections. For those developers, I usually focus on quantifiable metrics like “it doesn’t break anything”, “it’s performant”, and “it does what it’s meant to do”. Submitting feedback about code quality often seems to regress to a debate over first principles.

                                                                                            Require sign-off from at least one other developer before they can merge, and don’t budge on it – readability and understandability are the most important issues. In 5 years people will give precisely no shits that it ran fast 5 years ago, and 100% care that the code can be read and modified by usually completely different authors to meet changing business needs. It requires a culture shift. You may well need to remove intransigent developers to establish a healthier culture.

                                                                                            The result is that you burn social capital with the entire team, especially when working on teams without a junior-senior hierarchy, where no one is a clear authority.

                                                                                            This is a bit beyond the topic at hand, but I’ve never had a good experience in that kind of environment. If the buck doesn’t stop somewhere, you end up burning a lot of time arguing and the end result is often very muddled code. Even if it’s completely arbitrary, for a given project somebody should have a final say.

                                                                                            1. 1

                                                                                              The result is that you burn social capital with the entire team, especially when working on teams without a junior-senior hierarchy, where no one is a clear authority.

                                                                                              This is a bit beyond the topic at hand, but I’ve never had a good experience in that kind of environment. If the buck doesn’t stop somewhere, you end up burning a lot of time arguing and the end result is often very muddled code. Even if it’s completely arbitrary, for a given project somebody should have a final say.

                                                                                              I’m not sure.

                                                                                              At the very least, when no agreement is found, the authorities should document very carefully and clearly why they took a certain decision. When this happens, everything goes smoothly.

                                                                                              In a few cases, I saw a really seasoned authority change his mind while writing such a document, and finally choose the most junior dev’s proposal. And I’ve also seen a younger authority tank a LARGE project just because he took any objection as a personal attack. When the doom came (with literally hundreds of thousands of euros wasted) he kindly left the company.

                                                                                              Also, I’ve seen a team of 5 people working very well together for a few years despite daily debates. All the debates were respectful and technically rooted. I was junior back then, but my opinions were treated on par with more senior colleagues’. And we were always looking for syntheses, not compromises.

                                                                                          2. 2

                                                                                            I agree with the sentiment to an extent, but there’s something to be said for learning a language or domain’s idioms, and honestly some things just aren’t obvious at first sight.

                                                                                            There’s “ungrokkable” code as you put it (god knows I’ve written my share of that) but there’s also code you don’t understand because you have had less exposure to certain idioms, so at first glance it is ungrokkable, until it no longer is.

                                                                                            If the reviewer doesn’t know how to map over an array, no amount of them telling me they don’t understand will make me push to a new array inside a for-loop. I would rather spend the time sitting down with people and trying to level everyone up.

                                                                                            To give a concrete personal example, there are still plenty of usages of spreading and de-structuring in JavaScript that trip me up when I read them quickly. But I’ll build up a tolerance to them, and soon they won’t.
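
                                                                                            The map-over-an-array point, sketched in Python for the sake of a single example language (the JavaScript version would use Array.prototype.map):

                                                                                            ```python
                                                                                            # Pushing to a new array inside a for-loop: explicit but noisy.
                                                                                            doubled_loop = []
                                                                                            for n in [1, 2, 3, 4]:
                                                                                                doubled_loop.append(n * 2)

                                                                                            # The "map" idioms: terser once the idiom is familiar.
                                                                                            doubled_comp = [n * 2 for n in [1, 2, 3, 4]]
                                                                                            doubled_map = list(map(lambda n: n * 2, [1, 2, 3, 4]))

                                                                                            assert doubled_loop == doubled_comp == doubled_map == [2, 4, 6, 8]
                                                                                            ```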

                                                                                          1. 1

                                                                                            This seems like a considerably simpler way to do dimensionality reduction as compared to say PCA or SVD. I take it the drawback must be that it doesn’t preserve enough structure? Or that the real value in the other algorithms is that they’re much better at removing dimensions of lesser variance?

                                                                                            1. 2

                                                                                              Out of my gut:

                                                                                              This method uses the theoretical upper bound of dimensions required to keep distortion under the given epsilon (though is it really a hard upper bound?). It works for any inputs (random or not). It doesn’t care about structure, it doesn’t try to fit the data. Really cool trick.

                                                                                              But yeah, if you have structured data, and you care about that structure, you probably want to do something like PCA anyway (which is all about fitting your data). Even if you only care about reducing dimensionality, you want something smarter than this random projection, because the number of dimensions you need for your structured data is far less than the theoretical upper bound. So yeah, those methods do better, as they will find the dimensions with little variance, which you can then drop.

                                                                                              Since machine learning was mentioned, I wonder how feasible it would be to take one of these random matrices and train it to better match the structure (and then reduce dimensions).

                                                                                              Disclaimer: I know nothing about maths.
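
                                                                                              For anyone curious, a rough pure-Python sketch of the random-projection trick being discussed (the target dimension k follows the usual Johnson-Lindenstrauss bound; this is an illustration, not a tuned implementation):

                                                                                              ```python
                                                                                              import math
                                                                                              import random
                                                                                              from itertools import combinations

                                                                                              random.seed(0)

                                                                                              n, d, eps = 20, 500, 0.5  # 20 points in 500 dims, 50% distortion budget
                                                                                              # Johnson-Lindenstrauss bound on the target dimension k
                                                                                              k = math.ceil(4 * math.log(n) / (eps ** 2 / 2 - eps ** 3 / 3))

                                                                                              points = [[random.gauss(0, 1) for _ in range(d)] for _ in range(n)]
                                                                                              # Random Gaussian projection matrix, scaled by 1/sqrt(k)
                                                                                              proj = [[random.gauss(0, 1) / math.sqrt(k) for _ in range(k)]
                                                                                                      for _ in range(d)]

                                                                                              def project(p):
                                                                                                  return [sum(p[i] * proj[i][j] for i in range(d)) for j in range(k)]

                                                                                              def sqdist(a, b):
                                                                                                  return sum((x - y) ** 2 for x, y in zip(a, b))

                                                                                              projected = [project(p) for p in points]

                                                                                              # Ratio of projected to original squared distance, for every pair
                                                                                              ratios = [sqdist(projected[i], projected[j]) / sqdist(points[i], points[j])
                                                                                                        for i, j in combinations(range(n), 2)]
                                                                                              within = sum(1 for r in ratios if abs(r - 1) <= eps) / len(ratios)
                                                                                              print(f"k = {k}, fraction of pairs within 1 +/- {eps}: {within:.3f}")
                                                                                              ```

                                                                                              Note that no structure in the data is used at all, which is exactly the point made above; PCA would instead fit the projection to the data.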