1. 4

    Wait, Ruby doesn’t have constant folding for literals?

    1. 5

      I’m not a rubyist, but I think that it’s because symbols are supposed to take the place of string literals, and ruby strings are mutable by default.

      Symbols are interned automatically, and you’re expected to use symbols in most places where you would use interned strings in other languages. My guess is that symbols are meant to be thought of as keys with semantic meaning (and since each key is meaningful, it may be reused many times), whereas strings are thought of as fancily encoded arrays of bytes that may be sliced, diced, and combined with other strings any which way.
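      The interning difference is easy to see in a quick irb session; a minimal sketch (not from the thread, and using .dup so it behaves the same even if frozen string literals are enabled):

      ```ruby
      # Every :foo is the same interned object; each "foo" literal is a
      # fresh, mutable string.
      a = :foo
      b = :foo
      puts a.equal?(b)   # true -- symbols are interned

      s = "foo".dup      # .dup guards against frozen-literal settings
      t = "foo".dup
      puts s.equal?(t)   # false -- two distinct string objects
      ```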

      Strings are mutable by default, and you must “freeze” a string (as noted by @benolee) if you want it to be immutable. If string literals were interned by default while strings stayed mutable, they would have very odd semantics. There are two options:

      1. We intern mutable strings. This means that when I ask for a “foo”, but someone else has already used “foo” in the code and then modified it to be uppercase, I actually see “FOO” when I inspect my “foo”.

      2. We intern immutable strings, but copy-on-write (COW) them, adjusting the variable’s reference so that they turn into mutable strings later on. So when I check whether my two literal strings “bar” are references to the same object, they evaluate to true! But when I uppercase one or both of them, they point to separate objects. Or else we create a copy when we inspect the object, so that a.equal?(b) is consistent. This means that equal? may entail an object allocation… again, a very weird semantic.
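      Option 1’s weirdness is easy to simulate by hand: if the runtime interned mutable string literals, two “independent” literals would alias one object, so one caller’s mutation would leak into the other. A hedged sketch of that aliasing:

      ```ruby
      # Pretend the runtime interned the mutable literal "foo": both call
      # sites would share one object (simulated here with explicit aliasing).
      interned = "foo".dup   # .dup guards against frozen-literal settings
      mine   = interned      # my "foo" literal
      theirs = interned      # someone else's "foo" literal

      theirs.upcase!         # they mutate "their" string in place...
      puts mine              # ...and my "foo" now prints FOO
      ```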

      So yeah, it seems a little unusual to me, since I spend most of my time on the JVM, but I guess it works for them.

      1. 2

        Ruby 2.1 added some optimizations for frozen strings: http://tmm1.net/ruby21-fstrings
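        For the curious, the freeze behavior itself is easy to demo; per the linked post, MRI ≥ 2.1 also deduplicates identical "lit".freeze literals at compile time:

        ```ruby
        s = "hello".freeze
        puts s.frozen?   # true

        begin
          s << " world"  # mutating a frozen string raises
        rescue => e
          puts e.class   # FrozenError (RuntimeError before Ruby 2.5)
        end

        # The 2.1 optimization: identical frozen literals share one object.
        puts "hello".freeze.equal?("hello".freeze)   # true on MRI >= 2.1
        ```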

        1. 1

          Not for String literals, anyway.

        1. 2

          That does sound annoying. I guess you’re effectively forced to wrap fmt.Print(m) in a local Debug(m) helper that just calls fmt.Print(m) if you want to avoid the hassle.

          1. 1

            I’ve run into it in C before with certain compiler flags that complain about unused variables. I’d just wrap the code in an if (0) {} rather than commenting it out, so the code is still there as far as the compiler is concerned, but it just never runs. Then once you’re done debugging, rip out the whole block.

            1. 1

              I’ve seen people use preprocessor conditionals in the same way, which I guess has the added advantage of not getting compiled at all.

              Example:

              #if 0
              ...code...
              #endif
              
              1. 1

                Yeah, but then you still have variable declarations (and, in Go’s case, imports) that will still be there after the preprocessor rips out all of the code between #if/#endif.

                1. 1

                  Ah, I see. I misunderstood your original comment. Neat trick.

          1. 2

            So Mirah without type annotations then? ;)

            Cool stuff, didn’t know Mr. Nutter was working on this.

            1. 1

              I could be wrong, but I think this used to be called “fastruby.” If I remember correctly, you can use all of ruby’s core + stdlib, and it’ll grab the necessary dependencies during compilation. Joking aside, this seems Xtra Cool, as I think Mirah doesn’t come with ruby’s stdlib or anything like that. I’m on a phone, or I’d try it.

            1. 1

              Ruby has quite an ornate grammar, so the first thing I tried broke in opal. I opened an issue.

              https://github.com/opal/opal/issues/137

              Looking through the issue tracker there are a number of other ruby features which are unimplemented/implemented poorly. Like blocks: https://github.com/opal/opal/issues/130

              It’s a nice idea, but they have a long slog ahead of them to implement ruby.

              1. 1

                If you try to reimplement the Ruby grammar yourself, you’re gonna have a bad time.

                This is still a neat project, even though it will probably only be a toy.

                1. 2

                  There are two known-good ways to re-implement the ruby grammar:

                  • cargo-cult what’s in parse.y, EXPR_MID and all (there seem to be a bunch of unused lexer states that appear in every attempt).

                  • use a generalized parser and disambiguate the parse tree later.

                  The ruby grammar is somewhat ambiguous, so you have to disambiguate either in the lexer or in the parser. The lexer route is the parse.y technique – using the symbol table, and having the parser change the lexer state to force it to return specific tokens (DO_COND/DO_LAMBDA, etc.). The parser route is cleaner, but comes with the cost of GLR parsing.

                  1. 1

                    Well, if it ever gets beyond the toy stage, it would be fun to build a Ruby VM by cross compiling it to JS and evaluating it with V8. It would provide a third Ruby VM that actually compiles to machine code (Rubinius and JRuby being the only others I know of).

                    1. 1

                      I know there’s at least one MRI bytecode VM written in JS, but I can’t think of the link right now.

                      1. 1

                        coldruby looks like what you’re thinking of. From the readme, it runs YARV bytecode in a JavaScript runtime. Although I don’t see it mentioned, there’s an older project, hotruby, that’s got to be the inspiration for coldruby. It might be fun to flesh out @tenderlove’s Scheme-to-YARV-bytecode compiler to mix ruby, scheme, and js on coldruby (although that misses the point, and the coolest part, of his gist: omg, RubyVM::InstructionSequence.load with fiddle!).
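                        The bytecode coldruby consumes is easy to poke at from MRI itself; a quick sketch using the RubyVM::InstructionSequence API (MRI-only):

                        ```ruby
                        # Compile a snippet to YARV bytecode, run it, and dump the listing.
                        iseq = RubyVM::InstructionSequence.compile("1 + 2")
                        puts iseq.eval     # 3
                        puts iseq.disasm   # human-readable YARV bytecode
                        ```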

                1. 3

                  I work with MediaWiki all day every day, and I simply cannot agree that this approach is better.

                  First off, in what way is this better? You may want to add that to your README. As far as I can tell, you’re just randomizing in the application layer instead of the database layer (using cached data, no less). I’d like to know your arguments for this approach: we already need to load the article from the database once the random article has been selected, so doing the selection in the database layer cuts down on queries.

                  Is your random somehow scientifically more random? Do we need that kind of precision?

                  Finally, as a side point: if you have a problem with the way MediaWiki does something, write an extension to fix it, or try contributing to the project. It’s an open project with a sane review process, so there’s nothing stopping you. That way you could be helping to improve the product rather than just bemoaning it and hacking up your own solution with an entirely different toolset. But perhaps we’re better off this way; a cached csv of URLs is just not a maintainable approach to this problem.

                  1. 2

                    Yeah, this is stupid-simple, a 5-minute hack (there’s almost no reason to use csv, and the redirect probably isn’t even valid html!). It’s not meant to replace Special:Random at all, but merely to provide a randomly chosen article from the (static) list of articles curated for Wikipedia Version 0.8.

                    It’s only “better” in that the ~47,000 articles chosen for Wikipedia 0.8 are specifically higher quality and more interesting in general than the stubs and other nearly-contentless articles Special:Random redirects you to.
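                    The whole hack fits in a few lines; a sketch (the two hard-coded titles here stand in for the cached 0.8 list):

                    ```ruby
                    # Pick a random title from the curated list and build the redirect
                    # URL. In the real hack the ~47,000 Wikipedia 0.8 titles would come
                    # from the cached list; two hard-coded ones stand in here.
                    def random_article_url(titles)
                      "https://en.wikipedia.org/wiki/#{titles.sample.gsub(' ', '_')}"
                    end

                    titles = ["Quantum mechanics", "General relativity"]
                    puts random_article_url(titles)
                    ```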

                    I originally wanted a way to restrict Special:Random to certain Categories, and that actually exists on the toolserver here: http://toolserver.org/~erwin85/randomarticle.php. However, within the category Physics, I got maybe 10 articles before it started repeating results (in the same order). And I didn’t want to specify every Category. I just wanted a curated list of “good” articles, at random.

                    Special:Random is probably used for many things other than just providing a randomly “good” article for me to read on my phone while lying in bed. I’m not trying to replace that, and I didn’t really mean for the name to imply that. I don’t see how linking to a random 0.8 article would be an extension anybody would want in Wikipedia, and there’s no reason to include Wikipedia-specific tools in MediaWiki. Sorry for the confusion.

                    1. 1

                      Thanks for giving a detailed response on this. When it comes to MediaWiki as a platform, manually curating articles isn’t scalable. There are programmatic ways to do this right that you can then contribute to the core project.

                      I would suggest randomly selecting one of the top N articles in a given category sorted by some combination of page views, revisions, age, or number of editors. Take a look at http://www.mediawiki.org/wiki/API:Main_page if you’re interested in really playing with what the MediaWiki API has to offer.
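                      For example, the list=categorymembers module enumerates a category’s pages; a hedged sketch of building such a query (the module and cm* parameter names are from the MediaWiki API docs; ranking by views or editors would take further queries):

                      ```ruby
                      require "uri"

                      # Build a MediaWiki API query URL for the members of a category.
                      def category_members_url(category, limit: 50)
                        params = {
                          action:  "query",
                          list:    "categorymembers",
                          cmtitle: "Category:#{category}",
                          cmlimit: limit,
                          format:  "json",
                        }
                        "https://en.wikipedia.org/w/api.php?" + URI.encode_www_form(params)
                      end

                      puts category_members_url("Physics")
                      ```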