1. 4

    I’ll probably continue to play with optimal compression of various formats.

    After releasing the new version of BriefLZ (compression library) with an optimal compression level, I used that in an example of the same for LZ4 (see discussion over at Encode’s Forum if interested).

    I made one for CRUSH and Snappy as well, but they are all dreadfully slow, so mostly of theoretical interest.

    1. 5

      I wrote a very simple data compression library called BriefLZ many years ago, which recently appears to have found some use in a few other projects.

      I’ve been implementing some algorithms to get better compression out of it without changing the data format. I am hoping to get a few of these polished enough to release.

      1. 3

        I haven’t used Solaris since university, so the thing that excites me the most about this is the potential for dropping support for CMake 2.8.6 (Solaris 11.4 appears to include CMake 3.9).

        I think the next is then RHEL/CentOS7 at CMake 2.8.11 (which is a big step up from 2.8.6). Once they update we might actually have CMake 3.0 and “modern CMake”.

        1. 2

          I think that we will see RHEL8 this fall with newer CMake.

        1. 1

          A story related to the second nugget – MSVC has a file listing.inc containing padding macros, which is included in the assembly output the compiler produces.

          The first 64-bit versions did not include an updated listing.inc. For instance, an NPAD 2 would result in the 2-byte instruction mov edi, edi, which on AMD64 has the side effect of clearing the high 32 bits of rdi.

          So you could have code that worked fine, but if you produced an assembly listing and ran MASM on it, it would crash.

          1. 2

            I must admit I never noticed that Knuth names this variant Fibonacci hashing in TAOCP; I’ve always just seen it referred to as Knuth’s multiplicative hashing. It appears in a number of data compression algorithms – for instance, you can try searching for the constant 2654435761, a prime close to 2^32/phi that is often used.

            Somehow it seems unlikely that the developers of STL implementations were not aware of this.
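
            As a hedged sketch (the function name and parameters are mine, not from any particular library), the multiplicative hashing scheme with that constant can look like this in C:

            ```c
            #include <stdint.h>

            /* Knuth-style multiplicative hashing: multiply by a constant close to
             * 2^32/phi and keep the top bits. 2654435761 is the prime variant often
             * seen in compressors; hash_bits selects the size of the hash table. */
            static uint32_t mul_hash(uint32_t key, unsigned hash_bits)
            {
                return (key * 2654435761u) >> (32 - hash_bits);
            }
            ```

            Taking the top bits (rather than the bottom ones) is what lets the golden-ratio multiplier scatter nearby keys, which is the property the “Fibonacci” name refers to.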

            1. 3

              After running a Debian VM in Windows for years, it’s wild how quickly WSL became normal for me. ConEmu plus bash is such a nice convenience.

              1. 2

                I’ve tried several times to make ConEmu palatable and have failed every time. I find it beyond ugly as an app, and I can’t get the colours for Solarized (light) to look right. If you have any tips, I’d appreciate it.

                As for WSL, it’s the only reason to use Windows, frankly.

                1. 1

                  I am using Cmder as a frontend for ConEmu, it provides a slightly more polished setup out of the box. Just for the record, I am using Cmder mini version 1.1.4, which was the last release before some larger changes which (for some reason I cannot remember) did not work for me.

              1. 2

                Shame about the camera angle and the sound; it seemed interesting, but it was hard to watch for an hour.

                1. 1

                  We didn’t have a good recording device this time, unfortunately. I did my best to clean up the sound, but it’s still not great. Sorry about that :(

                1. 8

                  Problem one: there is no such thing as an “internal change that fixes incorrect behavior” that is “backwards compatible”. If a library has a function f() in its public API, I could be relying on any observable behaviour of f() (potentially but pathologically including its running time or memory use, but here I’ll only consider return values or environment changes for given inputs)

                  I don’t think that is much of a problem tbh. I wouldn’t consider it breaking backwards compatibility as long as the function still adheres to the API contract set forth by the documentation and general expectations. The function String() should return a string, the formatting of that string may change if it is not explicitly documented. The function ParseFile() will continue to parse the input file and return a value containing the parsed data. As long as the same set of input files is parsable according to the documentation there is no breakage in my view.

                  However, the limit of that would be reached if some behaviour ends up being widely relied upon, or was documented poorly.

                  What would be breaking backwards compatibility would be any change that would result in having to change documentation or function signatures. Those would require a more major release IMO.

                  Of course, as long as majorVer=0 I will break backwards compatibility like the Hulk after being asked to write an essay on “smashing considered harmful” since it’s (IMO) not ready for production yet.

                  1. 5

                    Indeed, the “backwards compatible” in semver is with respect to the public API. This is not directly specified in the quoted part, but if you read the whole semver spec, I think it is pretty clear.

                    So I would say, if you depend on observable behavior that is not part of the public API (including the documentation), then it is your own responsibility to have tests in place that ensure a new version exhibits that behavior.

                    1. 0

                      Of course, as long as majorVer=0 I will break backwards compatibility like the Hulk after being asked to write an essay on “smashing considered harmful” since it’s (IMO) not ready for production yet.

                      I really hate this rule in semver. It seems designed to keep the major number small, people seem to have some lizard brain fear of large major versions, but it defeats the purpose of semver, IMO.

                      1. 3

                        Well, it’s a thing because majorVer=0, for me, implies the product is not ready for production. I’m still working out the kinks. I’m not going to make any promises.

                        Once I have everything set, I put down majorVer>=1, since I’m now more confident in the problem domain I also have less of a problem with upgrading major versions if it becomes necessary.

                        I don’t really fear large major versions, I tend to avoid it since most of the time you can bolt on the new functionality and provide shims for the old functions.

                        I don’t think it defeats the purpose of semVer as long as you eventually go majorVer=1.

                        1. 1

                          Well, it’s a thing because majorVer=0, for me, implies the product is not ready for production

                          I think the difference is versions and production are orthogonal to me. My workflow is deploying packages from branch builds until I feel it’s ready, so versionless.

                    1. 17

                      The problem likely goes a bit more like this:

                      Joe buys one month of GitHub premium, creates a private repo, and invites his 100 best friends to it. They all clone the repo, and he cancels his account. Now they each have a free private repo they can use forever.

                      1. 4

                        Another potential bug many might miss is the multiply by 2. On a 64-bit system where int is 32-bit (Windows for instance), it would be possible for this operation to overflow, resulting in undefined behavior.

                        1. 2

                          That would mean they had to allocate 2.1GiB * sizeof(int) (or at least 1.05GiB * sizeof(int)). Even so, calling malloc on a negative integer would just have it return NULL in new_data, so the assert would fail and the program would be terminated before anything bad happens.

                          1. 2

                            I think “that would require unlikely input” and “undefined behavior does nothing bad on my platform” can be dangerous when it comes to C.

                            For instance, given the compiler knows the capacity starts at 1 (if we fix that bug) and is always multiplied by 2, since overflowing would be undefined behavior, it can assume that will never happen and generate a shift left for the multiply. That would result in overflowing to 0 (which could trap), which when passed to malloc could (implementation defined) return a non-NULL pointer that cannot be dereferenced.

                            I know that’s all highly unlikely, but likely isn’t safe.
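
                            As a hedged sketch of one way to guard against that (the function name and the clamping policy are mine), checking before multiplying means the doubling can never overflow:

                            ```c
                            #include <limits.h>

                            /* Grow a capacity by doubling, but clamp at INT_MAX instead of letting
                             * `capacity * 2` overflow – signed overflow is undefined behavior in C. */
                            static int grow_capacity(int capacity)
                            {
                                if (capacity > INT_MAX / 2)
                                    return INT_MAX;
                                return capacity * 2;
                            }
                            ```

                            The allocation size would then also want to be computed in size_t, with a similar check before multiplying by sizeof(int).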

                        1. 4

                          Nice work, some thoughts:

                          • Print line number where assertion failed
                          • Way to compare doubles, possibly with an optional precision
                          • Way to compare blocks of memory
                          • Consider renaming to snow.h
                          1. 3

                            I really would have liked to print the line number where the assertion fails, but I’m not sure if that’s possible. Because of the use of macros, everything ends up on the same line after the preprocessor, so __LINE__ will be the same for everything. If you know of a way to fix that, I’d love to hear it. (The "in example.c:files" message was originally supposed to be "in example.c:<line number>")

                            More different asserts is a good idea, and so is renaming the header - the thing was under the temporary name “testfw” until right before I made this post.

                            1. 2

                              Looks neat! I feel that the line number of the end-of-block would still be useful, but don’t quite see how to word that without seeming incorrect.

                              1. 2

                                It’s not just the end of the it block in which the error occurs; it’s the end of the entire invocation of the describe macro. In the example.c file, for example, __LINE__ inside any of those it blocks will, as of the linked commit, be 62.

                          1. 6

                            I really like the idea behind the theme and the general feel.

                            For my personal taste, as a text editor theme, the background is a little too light and the comments are barely visible. And I say that as someone who uses Solarized.

                            These things are highly subjective of course, and also depend on what room you’re in, what time of day, and what display.

                            1. 2

                              For my personal taste, as a text editor theme, the background is a little too light and the comments are barely visible.

                              The emacs theme deals with this by adding nord-comment-brightness.

                              I think I’m finally going to replace Zenburn (my previous all-time favourite theme)!

                              1. 1

                                Personally, I’ve tweaked my nofrils-like theme for emacs using these colors.

                            1. 12

                              I never understood the popularity of solarized. It lacks contrast and makes my eyes hurt.

                              1. 12

                                There was a blog post which said it was made with science or whatever. Science can’t be wrong.

                                1. 4
                                  1. 3

                                    The implication that the goodness of something so subjective can be quantified really irks me. However, I think a lot of people ate this up, as I’ve seen people non-ironically citing this as a reason it is good.

                                    1. 2

                                      I hear it’s Cave Johnson’s favorite IDE color scheme.

                                    2. 5

                                      I’m more and more in favour of highlighting comments more than the individual parts in the code (variables, strings, …) – and I find that comments often have the least contrast :(

                                      1. 3

                                        In Visual Studio Code you can quite easily try this out since you can add your own customizations to the highlighting in the settings. For instance, you could add

                                            "editor.tokenColorCustomizations": {
                                                "comments": "#e1a866"
                                            }
                                        

                                        to change the color of all comments.

                                      2. 1

                                        I think it depends a lot on lighting. I use the dark theme at evening/night, and don’t have a lot of light in the room. More contrast rich themes like Monokai hurt my eyes in that setting.

                                        The Solarized theme that comes with Visual Studio Code actually uses a base color with more contrast than the original design. But I find that rather annoying in the light theme, especially since they also use bold.

                                        1. 2

                                          More contrast rich themes like Monokai hurt my eyes in that setting.

                                          That makes sense. It’s funny, at night I will continue using typical white-on-black high-contrast color schemes but just drop the monitor brightness a lot if I happen to be hacking away in the dark. Usually I just turn the lights on, though.

                                          1. 1

                                            For me both variants of Solarized are difficult to read in the daytime on a nice display and borderline unusable at any time of day on a low-end display. On the other hand, I find high-contrast dark themes too harsh, so I tend to use dark themes that are somewhere in the middle (~#999 on ~#222) and higher-contrast light themes (~#222 on ~#f5f5f5).

                                            1. 1

                                              gruvbox dark works well for me :)

                                              1. 2

                                                I think the red is perhaps, well, a bit too red in gruvbox. The (over-)use of red/orange/pink in many Solarized themes was part of the reason I made this variant.

                                                Darktooth is another interesting gruvbox-like theme.

                                          2. 1

                                            Agree on the importance of contrast. Lots of color themes are happy to use tons of different colors on things that aren’t completely semantically different (a numeric literal doesn’t always need to stand out a lot) while ignoring the more subtle details such as contrast.

                                            I want the attention to detail Solarized has, but with more contrast, and something besides an ocean or a piece of parchment as the background. I’ve been using a version of GitHub’s color scheme in my editor for a while, but have yet to find a color theme that I really like.

                                          1. 1

                                            Some years back I made a syntax theme based on the Solarized color scheme for Sublime Text, and I’ve been working on making it available for Visual Studio Code and Atom. So I will probably spend some time fixing little issues and fine-tuning colors.

                                            You might wonder: why yet another Solarized theme (Yast)? There are plenty, and most editors even come with one. I find many of them look too busy for my taste; they assign colors to every possible syntax entity. Some also choose colors that are hard to see for things like selection highlighting (which is not directly specified by the original scheme). So I attempted to make a version that, as far as possible, only assigns colors to the root groups specified in the TextMate documentation (which highlighting in both editors originated from).

                                            Besides that I am also in the process of moving some of my personal programming projects to meson.

                                            1. 1

                                              I am always happy to see alternatives to gitflow (which I think is overly complicated for many projects). This is a nice idea, but perhaps it works best with specific types of development. A few thoughts:

                                              If two developers are working on separate features that affect the same piece of code

                                              if (featureA) {
                                                // changes from developer A
                                              } else if (featureB) {
                                                // changes from developer B
                                              } else {
                                                // old code
                                              }
                                              

                                              How do you rename or remove a variable as part of refactoring, in a way that makes all four combinations of feature flags still work?

                                              I guess whether this is easier than feature branches (where conflict resolution happens deterministically at the end, and diff works), or whether it leads to feature-flag spaghetti and time wasted adapting your changes to run alongside another developer’s changes (which might end up never being merged), will depend on the types of changes and on how the developers communicate.

                                              Also, what if you add a file? Then I guess your build system will need feature flags. What if your build system uses globbing and you remove or rename a file? Some changes can’t both be there and not be there.

                                              1. 1

                                                I know people are going to feel differently about this, but I lean heavily toward explicit being better than implicit, and the presence of magic should be minimal, even if it means some redundant work. Redundant work can be verified and semi-automated to keep explicit things up-to-date.

                                                How do you rename or remove a variable as part of refactoring, in a way that makes all four combinations of feature flags still work?

                                                A feature branch just delays this question to the big bang merge conflict. Forcing you to do this work upfront means you talk about this with the other people working in the same code region.

                                                Like you say, the other side of the coin is that the feature branch might never be merged. Early merging optimizes for the happy path. But then again, if you merge your work early and discuss with others, it can’t remain in the twilight state of not merged, not discarded, which may improve coder efficiency.

                                                Also, what if you add a file? Then I guess your build system will need feature flags.

                                                Just adding a file shouldn’t affect anything unless some other file references it.

                                                What if your build system uses globbing

                                                Please don’t, especially if you also automatically siphon those files into some automatic extension of functionality.

                                                Merge conflicts are annoying, but clean merges that have semantic conflicts are even worse.

                                                Of course plugin systems are super useful – when they are user accessible and are used for deployment. But then the API would be well-defined, restricted and conservative. Probably the plugins would even be in separate repos and the whole branch vs flag point is moot.

                                                Testing plugin interactions is probably worth an article series of its own.

                                              1. 5

                                                Several people here are recommending CMake as an alternative. I’ve only interacted with CMake at a fairly surface level, but found it pretty unwieldy and overcomplicated (failed the “simple things should be simple” test). Does it have merits that I wasn’t seeing?

                                                1. 3

                                                  CMake can generate output both for Unix and for Windows systems. That’s one (good) reason lots of C++ libraries use CMake.

                                                  1. 2

                                                    CMake is pretty nice and has nice documentation. You can also pick stuff up from reading other people’s CMakeLists. For simple projects the CMake file can be pretty compact.

                                                    1. 3

                                                      I actually found the CMake documentation to be quite terrible for new users. The up-to-date documentation factually describes what the different functions do, but has very few examples of how to actually write real-world CMake scripts. There are a few official tutorials that try to do this, but they are made for ancient versions like CMake 2.6. So in order to learn how to use CMake, you are stuck reading through tons of other people’s scripts, trying to deduce some common best practices.

                                                      While modern CMake is not terrible, you often have to restrict yourself to some ancient version (2.8.6 I believe is common) in order to support certain versions of CentOS/RHEL/Ubuntu LTS (and there were some big changes in CMake around 2.8.12/3.0).

                                                      Also, having string as the only data type has led to some absurd corner cases.

                                                  1. 1

                                                    I liked the introduction with the explanation of why we have the gamma functions, but I think it is important to note that sRGB is a color space. Having a section titled “What is sRGB?” and then only explaining the associated gamma function is a bit misleading (I know there is a comment at the end of the section about sRGB containing some other stuff).

                                                    I assume this is written from the viewpoint of DirectX, because sRGB is not limited to 8 bits per channel; you can use whatever precision you like for the components (PNG, for instance, supports both 8 and 16 bits per channel).
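
                                                    As a hedged sketch (standard constants, my function names), the sRGB transfer functions under discussion operate on normalized values, independent of storage precision:

                                                    ```c
                                                    #include <math.h>

                                                    /* sRGB transfer functions on normalized [0,1] values: a linear segment
                                                     * near black plus a 2.4-power curve (the effective gamma is roughly 2.2). */
                                                    static double srgb_encode(double linear)
                                                    {
                                                        if (linear <= 0.0031308)
                                                            return 12.92 * linear;
                                                        return 1.055 * pow(linear, 1.0 / 2.4) - 0.055;
                                                    }

                                                    static double srgb_decode(double encoded)
                                                    {
                                                        if (encoded <= 0.04045)
                                                            return encoded / 12.92;
                                                        return pow((encoded + 0.055) / 1.055, 2.4);
                                                    }
                                                    ```

                                                    Quantizing the result to 8 or 16 bits per channel is then a separate step, which is exactly why sRGB is not tied to 8 bits.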

                                                    1. 5

                                                      Titus Winters was asked about build times at this year’s CppCon:

                                                      https://youtu.be/tISy7EJQPzI?t=1h23m10s

                                                      1. 3

                                                        He never considers the opportunity cost of using that massive build farm for C++. Think of all the bitcoin that could be mined.

                                                        1. 4

                                                          Opportunity cost? More like opportunity; without C++’s compile times, no such build farm would’ve been built. Because the build farm is built, and presumably doesn’t have to compile code 24/7, there’s now an opportunity to automatically mine bitcoin while the CPUs aren’t busy compiling!

                                                          1. 3

                                                            Bitcoin mining with CPUs hasn’t been profitable for years.

                                                        1. 1

                                                          Anyone else feeling this trend of hand-drawn illustrations is slowly getting out of hand?

                                                          1. 12

                                                            My apologies. All my approaches to diagrams were getting out of hand and nothing was cooperating, so I dropped these in more as “artistic sketches” than informational diagrams. I will probably update them at some point. (Also, I repeatedly redrew these but only had really bleedy paper and my fountain pen, and the scanner wasn’t cooperating, et cetera… excuses, excuses, I know.)

                                                            1. 11

                                                              I’ve gotten the feeling that it’s done for production speed rather than style, but they always give me a warm reminder of the 60s-mid 80s DIY manuals I grew up reading.

                                                              1. 1

                                                                There is also the problem of what to use for such drawings. I remember using xfig for such things a long time ago; I have no idea whether there is anything more suitable these days.

                                                                1. 6

                                                                  I often make diagrams in ipe or inkscape, but in this case I was thinking the best thing would be through tikz, but it’s a fair bit of work. ipe in particular is really nice for quickly throwing together figures.

                                                                  1. 3

                                                                    I draw them on my tablet and take a screenshot. It’s a lot faster than trying to wrangle something in graphviz or drag and drop shapes around.

                                                                    1. 2

                                                                      I still sometimes use Xfig + transfig. Depending on the use case, I might use VUE (Visual Understanding Environment — it is nice for drawing graphs, even if it is only a part of its supposed purpose) or GraphViz. I think I haven’t used Dia for a long time; I do use Kig when I want to draw something geometrical. If I want to draw something complicated precisely, I generate it using Asymptote.

                                                                  2. 4

                                                                    I used to do them because I could sketch things quickly with my hand versus software. I was often writing things in notebooks and such.

                                                                    1. 3

                                                                      Normally I find them charming, but I admit these ones are rather fuzzy.

                                                                      1. 2

                                                                        Usually I don’t care either way as long as they get the message across, but several of these are unreadable.

                                                                      1. 1

                                                                        I think it is perhaps worse that it seems that in Ruby:

                                                                        • "-".split("-") is []
                                                                        • "-x".split("-") is ["", "x"]
                                                                        • "x-".split("-") is ["x"]

                                                                        Looking at the documentation, trailing fields are suppressed unless you supply a negative limit parameter. Python and AWK return two elements for each of these.

                                                                        1. 1

                          Perl acts like Ruby in this case, unless you supply a negative LENGTH parameter.

                          My limited knowledge of AWK indicates it acts in the same way as Perl and Ruby, but this could be because I’m writing it as

                          $ awk -F: '{split($0,a,"-"); print a[1],":",a[2]}' <<< "x-"

                          so I’m implicitly assuming the array a will have 2 elements.

                          Edit: the manual for my version of AWK (GNU Awk 4.1.3) states:

                          If string is null, the array has no elements. (So this is a portable way to delete an entire array with one statement. See Delete.)

                          Edit edit: AWK does return 2 elements; it’s only Perl and Ruby in the weird corner :D