Threads for Claudius

  1. 2

    To search for a sub-string in another one, we often use strstr in C or std::string::find in C++. The least we can say is that they are far, very far from optimal…

    But is it? Shouldn’t those library functions be heavily optimized since they’re so widely used? (And if they aren’t, why not?)
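For reference, the naive baseline these library functions compete against is just a sliding-window comparison. A minimal sketch in Python (names are mine, for illustration only):

```python
def naive_find(haystack, needle):
    """Naive substring search: try every alignment (the strstr-style baseline)."""
    n, m = len(haystack), len(needle)
    for i in range(n - m + 1):
        # Compare the needle against the window starting at position i
        if haystack[i:i + m] == needle:
            return i
    return -1

naive_find("hello world", "world")  # 6
naive_find("hello world", "xyz")    # -1
```

Optimized implementations (e.g. Two-Way or SIMD-assisted scans) avoid re-examining characters the way this inner comparison does.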

    1. 1

      I guess it is a very simple case of one size fits all.

    1. 1

      Actually, the mod by a power of two is always a simple AND with 2^n − 1… If you compute the mod of a value by 11, all your values will be between 0 and 10… If you compute the mod of v by 8, all your values will be between 0 and 7, which is v & 0b111.
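A quick check of that identity in Python (the constants are just for illustration):

```python
# For a power of two m = 2**n, v % m is the same as v & (m - 1):
# the mask keeps the low n bits, which is exactly the remainder.
for v in range(100):
    assert v % 8 == v & 0b111    # 8 = 2**3, mask 0b111 = 7
    assert v % 16 == v & 0b1111  # 16 = 2**4, mask 0b1111 = 15
```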

      1. 1

        what is LispE

        1. 1

          If you go up a couple of directories in that github you’ll find the ReadMe for the language.

          Seems to be an interpreted, lazy lisp with pattern matching & arithmetic types a la Haskell. Sounds fun!

          1. 1

            And array instructions à la APL… It is fun… :-)

        1. 2

          It almost feels like Python was written in a deliberately slow way. Like:

          sum([w * e for (w,e) in zip(a,x)])
          

          Simply removing square brackets to avoid creation of intermediate lists in memory would speed this line up very significantly.
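For concreteness, here are the two forms side by side (the input lists are made-up values):

```python
a = [0.5, 1.5, 2.0]
x = [2.0, 4.0, 3.0]

# List comprehension: builds the whole list of products in memory first
s1 = sum([w * e for (w, e) in zip(a, x)])

# Generator expression: feeds sum() one product at a time
s2 = sum(w * e for (w, e) in zip(a, x))

# Both compute the same dot product: 13.0
```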

          And that’s not even touching on the subject of numpy: most people actually using Python for statistics would use it for such a task, because why not? So it’s not really “LispE vs. Python”, it’s “LispE vs. deliberately unoptimized pure-Python code”.

          1. 4

            For your second point, the problem with punting to a library is that the same (or similar) libraries can be used from Lisp, making the comparison pointless. I’m not sure about LispE, but Common Lisp at least has bindings for a lot of math libraries (BLAS, GSL, LAPACK, etc.).

            On the other hand, using one of those seems like overkill for the examples in the article. And if a language is so slow that any meaningful work needs to pull in third-party libraries like NumPy, then that’s worth knowing, too.

            1. 2

              Simply removing square brackets to avoid creation of intermediate lists in memory would speed this line up very significantly.

              Didn’t seem to have much effect when I tried it out. If anything, the code became slightly slower: around 63 ms vs. 59 ms on my computer. Why? No idea. The variance on either was ±3 ms anyway, just from eyeballing it, so it’s a little difficult to know whether that difference was real.

              1. 1

                That’s a good observation, thanks for going through the trouble :-). It is true that on small lists generator expressions can be slower than list comprehensions. It was just something that jumped out at me from a cursory look.

                1. 2

                  Yeah, list comprehensions are faster up to a certain size, because of the machinery needed to run generators. Unfortunately, I think that size is pretty large :)

              2. 2

                And that’s not even touching on the subject of numpy: most people actually using Python for statistics would use it for such a task, because why not? So it’s not really “LispE vs. Python”, it’s “LispE vs. deliberately unoptimized pure-Python code”.

                This seems like a pretty incurious reading of the article. Mine is that it’s quite explicit on exactly this question: the answer is, this algorithm was chosen as an exemplar of a sufficiently computationally complex yet still-understandable benchmark case for comparing the two languages as general compute languages. Not as a demonstration of why you should use this lisp to do your data engineering. The fact that the domain of this algorithm, in the real world, tends to revolve around some other highly optimized library is completely neither here nor there.

                1. 2

                  Thank you, you are absolutely right

              1. 1

                What is LispE? I can’t find anything about it using google. Also not mentioned in the Lisp wikipedia article anywhere.

                1. 1

                  The github repo says:

                  Lisp Elémentaire, a version of Lisp that is ultra-minimal but contains all the basic instructions of the language.

                  1. 1

                    This is a Lisp that is being developed at Naver Labs Europe (France). You have pre-compiled versions in the binaries section for Windows and macOS. You can also download it and compile it for Linux…

                  1. 1

                    Hum… I wonder if you have had a look at Haskell? Function composition is very elegant in this language:

                    filter even . map (+ 1) $ [1..10]

                    It even has an operator to flip arguments…

                    1. 2

                      Look at K and Mathematica (aka Wolfram) for two languages that integrate the ideas from APL and Lisp in a more interesting way. In these languages, multidimensional arrays are represented by nested lists, and all scalar operations are generalized to operate on lists. This is a powerful idea whose benefits go well beyond the linear algebra implemented by LispE.

                      The idea is most powerful if you have only one kind of list (like in K). So a character string is just a list of characters. A matrix is a list of lists of numbers. Instead of representing bit strings as integers (as LispE does with the bitwise operators), a bit string is a list of booleans. All of your scalar operations (on numbers, booleans, and characters) are generalized to work on lists.
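A sketch of that generalization in Python (my own illustration, not how K implements it): one helper pushes any binary scalar operation down through nested lists.

```python
import operator

def scalar_op(op, a, b):
    """Apply a binary scalar operation, generalized over nested lists."""
    if isinstance(a, list) and isinstance(b, list):
        # Pairwise over two lists (assumes equal lengths)
        return [scalar_op(op, x, y) for x, y in zip(a, b)]
    if isinstance(a, list):
        return [scalar_op(op, x, b) for x in a]   # scalar on the right
    if isinstance(b, list):
        return [scalar_op(op, a, y) for y in b]   # scalar on the left
    return op(a, b)

scalar_op(operator.add, [[1, 2], [3, 4]], 10)  # [[11, 12], [13, 14]]
scalar_op(operator.mul, [1, 2, 3], [4, 5, 6])  # [4, 10, 18]
```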

                      1. 1

                        Do you have any good resources for learning K? I have a few bookmarked here but in everything I’ve found there seems to be a lack of practical exercises/projects so I end up forgetting everything I read.

                        1. 1

                          I’m not sure if any of my resources are “good”. Another problem is that there are many dialects of K.

                          1. 1

                            Thanks. The last one is the one I’m trying to follow this time.

                          2. 1

                            K crash course was good. It refers to k7, which is no longer current, though it can still be run via web.

                            Q for mortals is decent, though very slow.

                            1. 1

                              I believe this is the same as https://github.com/kparc/kcc

                              1. 1

                                Latter is incomplete, in the process of being updated to k9.

                                1. 1

                                  Oh, good to know.

                          3. 1

                            LispE operates on tensors, which are implemented as nested lists.

                            1. 2

                              Tensors do not an apl make. Tensors, a concise notation, a rank operator…


                              In apl, everything is an array, and functions operate homogenously on the same. Power emerges from that.

                              In lisp, everything is an atom or a pair of things, and functions operate homogenously on the same. Power emerges from that, too.

                              Newer lisps add things like arrays, but an array is little more than a concession to performance; its role is similar to that of the cons: a container for other objects. Lisp’s language is abstraction, and it speaks it well.

                              When you try to blend the two, you end up with something that is neither a very good lisp nor a very good apl. (It may be good at being something else. I do not mean to make a value judgement.) Tensors feel more scalar than lisp’s aggregates and more aggregate than lisp’s scalars, and the homogeneity of each language is lost.

                              1. 1

                                Actually, you should have a look at how lists are implemented in LispE (see https://github.com/naver/lispe/blob/master/include/listes.h). They are implemented as arrays, not as linked lists. I have worked with many versions of Lisp over the years, and I had an issue with the idea of pushing elements at the beginning of the list, which is usually quite unnatural in most languages. I opted instead for a different implementation composed of two layers: ITEM (the array implementation) and LISTE, which contains a pointer to an ITEM. LISTE also contains an index to the first element in the array. A CDR, for instance, creates a LISTE object that shares the same ITEM as the initial list, but moves its index one notch over.
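A rough Python analogue of that two-layer scheme (hypothetical names mirroring the description above, not the actual C++ implementation):

```python
class Liste:
    """A list view over a shared backing array (the ITEM layer)."""
    def __init__(self, item, home=0):
        self.item = item   # shared array of elements (ITEM)
        self.home = home   # index of this view's first element

    def car(self):
        return self.item[self.home]

    def cdr(self):
        # No copying: the new view shares the same ITEM,
        # its start index just moves one notch over.
        return Liste(self.item, self.home + 1)

l = Liste([1, 2, 3])
rest = l.cdr()
rest.car()            # 2
rest.item is l.item   # True: same backing array, only the index changed
```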

                                1. 1

                                  I don’t see why something’s being unnatural in one language should have any bearing on that thing’s role in another language.

                                  That being said, you can push elements to the end of a list by keeping a pointer to the final cons. Another pattern which is very common is to add elements to the head of a list and reverse it once done.

                                  That’s all mostly beside the point, though; again, the semantic role of arrays in lisp is almost identical to that of conses in this context.

                                  1. 1

                                    You may be interested in gamelisp for a different take on how to do a “lisp” with a vector rather than cons cells.

                            1. 7

                              I don’t see why this had to be implemented in C++ when you’re already working with a Lisp. The nice thing about Lisp is that the macro system allows stuff like this to be added.

                              Examples, CL: https://github.com/mrcdr/polisher and https://www.cliki.net/ugly-tiny-infix-macro

                              Scheme even has a SRFI dedicated to this: https://srfi.schemers.org/srfi-105/srfi-105.html

                              1. 2

                                LispE is described as a pretty minimal LISP, so it may not have macros; I didn’t find any mention of them when I skimmed the language description.

                                I’m wondering if this could be applied to concatenative (Forth-like) languages. I have a toy one I’m working on, that I might do some syntactic experiments with.

                                  1. 1

                                    LispE is described as a pretty minimal LISP, so it may not have macros

                                    Then that should probably be implemented as soon as possible, at least much sooner than infix operator features.

                                    I’m wondering if this could be applied to concatenative (Forth-like) languages

                                    If you’re referring to infix operators, it seems to be possible: https://arduino-forth.com/article/FORTH_exemples_convertInfixToPostfix

                                  2. 1
                                    1. 1

                                      because with macros you cannot easily handle operator precedence

                                      1. 1

                                        But it has already been done in the examples I’ve shown.

                                        Anyways, whatever. I would implement the entire thing in Lisp from the ground up, like SBCL has done, but judging from your codebase it looks like you have a strong preference for C. I won’t argue with you any further, but I urge you to explore Lisp itself a bit before taking the C route.

                                        1. 1

                                          You don’t even need macros: you can write a metacircular interpreter that can be switched to infix or postfix notation. Changing how a Lisp is interpreted is the whole point of a Lisp.
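To make the idea concrete, here is a toy evaluator in Python (my own sketch, nothing to do with LispE's internals) where a single flag switches the reader between prefix and infix expressions:

```python
import operator

OPS = {'+': operator.add, '-': operator.sub, '*': operator.mul}

def evaluate(expr, infix=False):
    """Evaluate a nested binary expression, in prefix or infix notation."""
    if not isinstance(expr, list):
        return expr  # atoms evaluate to themselves
    # In infix mode the operator sits between the operands
    op, a, b = (expr[1], expr[0], expr[2]) if infix else (expr[0], expr[1], expr[2])
    return OPS[op](evaluate(a, infix), evaluate(b, infix))

evaluate(['+', 1, ['*', 2, 3]])              # prefix: 7
evaluate([1, '+', [2, '*', 3]], infix=True)  # infix: 7
```

A full metacircular interpreter would go further and let the interpreted language redefine its own evaluator, but the switch above captures the spirit.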

                                          1. 1

                                            Ah, I’m showing my ignorance it seems.

                                            I haven’t had the ability to explore metacircular evaluators yet, but I am excited to see they are a topic of SICP which I am currently reading. Do you know any other introductions to the topic?

                                            Changing how a lisp is interpreted is the whole point of a lisp.

                                            This in particular is what piqued my interest, as I love Lisp but don’t have any experience “changing how it’s interpreted”.

                                            1. 1

                                              Reading the documentation of Racket is probably the best place to start. It is the Lisp that I would advise anyone starting off to look at. It feels like Python felt in the early ’00s.

                                              1. 1

                                                Thanks for the tip, I’ll check it out.

                                        1. 1

                                          actually, there is a macro implementation: defmacro (see https://github.com/naver/lispe/wiki/6.1-Description-of-methods-and-operators#defmacro)

                                          1. 1

                                            Why was this implemented in C instead of in terms of a macro then?

                                          1. 5

                                            I was implementing a programming language with multi-threading, and I was heavily testing it when I started having weird crashes that I could not understand. It took about a week to discover that the problem was linked to my reference counting mechanism. Sometimes the reference would drop to 0 and the object would be deleted while still in use by another thread. Sometimes the reference would be 2 and the object would never be destroyed, creating massive memory leaks (it was implemented in C++). Eventually, I understood that it was a concurrency problem and implemented the reference counter as an atomic value, which saved the day.

                                            1. 4

                                              This is progressing towards the rediscovery of regular expressions.

                                              1. 4

                                                Indeed. If anyone finds this article interesting and doesn’t know regular expressions (“regexes”) yet, I recommend reading regular-expressions.info/tutorial.html. When I learned regexes from that site I found it to be well-written. The site plugs the author’s own regex-testing tool in between explanations, but you can just use regex101.com, which is free and equally powerful.

                                                Here’s an example of using a regex in Python to extract text within square brackets:

                                                import re
                                                string = "123[4567]890"
                                                re.search(r'\[(.*)\]', string).group(1)  # evaluates to '4567'
                                                
                                                # You could also write the regex with whitespace for readability:
                                                # re.search(r'\[ (.*) \]', string, re.X).group(1)
                                                

                                                Regexes have some advantages over the extract DSL defined in the article. They support powerful features such as extracting multiple parts of the text with one search. They are supported by all major programming languages. Most text editors let you use them to search your text. They are also very concise to type. However, they have flaws of their own, particularly how hard they can be to read. So though regexes are useful to learn, they are not the ultimate way of extracting parts of text.

                                                Here are some projects that aim to improve on regexes (but are much less widespread):

                                                • Regexes in the Raku language. Raku, formerly known as Perl 6, breaks compatibility with Perl 5’s regex syntax (the same syntax used by most regex engines) in an attempt to make regexes more readable.
                                                • Egg expressions, or eggexes, are a syntactic wrapper for regexes that will be built into the Oil shell.
                                                1. 2

                                                  And Parse in Red is also a nice alternative to regexes.

                                                  1. 2

                                                    I’d prefer r'\[(.*?)\]' or r'\[([^]]*)\]' to avoid multiple square brackets in the string matching more than expected. Also, in newer versions of Python, you can use [1] instead of .group(1).

                                                    https://www.rexegg.com/ is another great site for learning regexp. And https://github.com/aloisdg/awesome-regex is a good collection of resources for tools, libraries, regexp collections, etc.
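To illustrate the difference on a string with two bracketed groups (the example string is mine):

```python
import re

s = "a[b]c[d]e"
re.search(r'\[(.*)\]', s).group(1)   # greedy: 'b]c[d' (spans both groups)
re.search(r'\[(.*?)\]', s).group(1)  # non-greedy: 'b'
re.search(r'\[([^]]*)\]', s)[1]      # negated class: 'b'; [1] works in Python 3.6+
```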

                                                  2. 3

                                                    Perhaps we can coin a new aphorism! Greenspun’s Tenth Zawinski’s Law: Any sufficiently complicated Lisp text processing program contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of regular expressions.

                                                    Edit: Or perhaps ‘Every Lisp program attempts to expand until it contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of regular expressions. Programs which cannot so expand are replaced by those which can.’

                                                    1. 4

                                                      Zork was originally implemented in an obscure LISP dialect called MDL, at MIT. (I’ve seen snippets of the source code and it was not purely S-expressions; it used square and angle brackets too.) It has an interesting history beyond that too…

                                                      It was then translated into FORTRAN(!) by “a DEC engineer who wishes to remain nameless,” probably because the task drove him mad. That was the form, named DUNGEON, in which it spread around the ARPAnet in the late ’70s.

                                                      After that, the Zork implementors founded a startup called Infocom and managed to port it to 8-bit personal computers like the Apple ][. They split the game into three parts, but even so they had to resort to some interesting tricks to squeeze it onto a 112KB floppy. The main one was to compile the MDL to a bytecode virtual machine called the Z-machine, which made the code small and portable. I think this is one of the earliest cases of using a bytecode interpreter for portability. (The Z-machine instruction set is still in use by the interactive fiction community, although they had to extend it several times to allow for larger games. These days there’s an entirely different language called Inform that you write the games in. The Z-machine interpreter has been ported to a ton of systems including Web browsers.)

                                                      Scott Adams is another interesting story — he played the original ADVENTURE and tried to write something similar but much smaller on a TRS-80 with 8KB of RAM, in BASIC. The source code for that is quite tricky and strange.

                                                      1. 1

                                                        I owned a TRS-80 clone in 1981, and the game was shipped with it.