1. 18
    1. 11

      For the fun of it, two alternatives to shorten your solutions.

      The first uses juxt and is mostly point-free, but terse:

      (->> "aaaabbbcca"
           (partition-by identity)
           (map (juxt (comp str first) count))))

      The second one with for, a bit chattier but also more legible:

      (for [p (partition-by identity "aaaabbbcca" )]
        [(-> p first str) (count p)])

      Since you were mentioning suggestions in the article, this is one of the

      1. 6

        Here’s my point-free take on it using Dyalog APL

              (((1∘+ +\) ⊢ ≠ ¯1∘⌽) (((⊃,≢) ⊢) ⌸) ⊢) 'aaaabbbcca'
        a 4
        b 3
        c 2
        a 1
        1. 4

          Nice! Here’s mine:

            a 4  b 3  c 2  a 1
                 ↑x ⍝new formatting, same great taste!
           a 4
           b 3
           c 2
           a 1

          And for bonus points, the decode:

                 g x

          I really wish dyalog had capped forks—they would obviate the last set of parentheses in the encoder.

          1. 4

            I love it! I wonder what differences in memory or performance would be in applying (⊃,≢) as you do to each partition after grouping them, vs during the grouping as I do with the key operation. I’d forgotten about the “n-wise reduce” variant you used, which is much nicer than my ancient rotation idiom.

            It’s interesting how similar the Clojure and APL solutions for a problem like this can be. I mean (juxt first count) and (⊃,≢) are remarkably alike. It makes me wonder what Clojure would be like if it had function trains instead of threading macros.

            1. 2

              This has to be the most interesting discussion I’ve seen this week. Downside: now I’ll have to relearn APL enough to understand your solutions. :D

            2. 1

              memory or performance

              IME performance is generally not super intuitive to reason about in APL, because of the interpreters’ ability to do pattern recognition.

              It’s interesting how similar the Clojure and APL solutions for a problem like this can be. I mean (juxt first count) and (⊃,≢) are remarkably alike. It makes me wonder what Clojure would be like if it had function trains instead of threading macros.

              This is a good read.

          2. 1

            duh; simpler: (⊃,≢)¨⊢⊆⍨(+\1,2≠/⊢)

        2. 2

          Having never seen this language before, it looks like someone has incorrectly rendered a bunch of UTF-8 characters lol

      2. 3

        That’s a nice use of juxt to make it point-free (what do you mean, mostly point-free?). My first intuition was pretty much the same with an anonymous function instead of juxt.

        I believe getting used to threading macros make data manipulation tasks like this one much, much easier to approach. Both when writing and when reading.

        I have a hard time believing that most candidates cannot solve the problem. Do they claim to know Clojure?

        1. 3

          It looks like the original tweet wasn’t about clojure?

      3. 3

        Unrelated, but I thought I’d comment anyhow since I know you’ll read it:

        Your brainfuck interpreter changed the way I thought about everything. Long, long story short, more than ten years ago I began my journey learning Scheme through SICP, and over time I just lost the spark and sorta fell away from writing beautiful programs and wanted to just write clever programs. I was working on a little side project involving a genetic algorithm and brainfuck, and along the way, I thought “oh, what the hell, I may as well read a bf interpreter and then just try and write my own” and I stumbled across yours.

        I won’t bore you, but let me just tell you that when I read that code, it became as clear as day to me how long it had been since I read something elegant and beautiful. Since then I’ve been getting deeper and deeper into Clojure, and that old part of my brain that sought after beautiful programs is starting to feel rejuvenated. I’d been reading Ruby classes and Python classes for SO long, it’s been really nice digging my brain out from all of that.

        This may be a really weird comment to read, sorry. I just really had a ~*~moment~*~ after reading your code. It’s immaculate.

        1. 3

          This means a lot, thank you.

      4. 3

        Thanks. Both suggestions look good, the first one little bit more readable.

      5. 3

        Oh wow, juxt there makes it sweet. Nicely done!

    2. 9

      While not a (single) programming language, my first thought was to use uniq(1):

      echo 'aaaabbbcca' | grep -o '.' | uniq -c
      1. 1

        Wow, that’s elegant.

    3. 5

      Way to nerd-snipe me.

      Using regex:

      (map (juxt first count)
           (str/split "aaaabbbcca" #"(?<=(.))(?!\1)"))

      The regex is a little weird – basically, we want to split the string on any boundary where the characters before and after the boundary are different. So “aaaabbbcca” becomes four strings: [“aaaa” “bbb” “cc” “a”]. It does this by using a lookbehind for any character – the (?<=(.)) – and then a negative lookahead for not that character – (?!\1). We don’t want to include either of those characters in the match, so the characters are returned as part of the new strings.

      The rest is relatively simple: juxt returns a function that takes a single argument, then returns a vector with the results of calling the two functions separately on that argument.

      I don’t know that I’d write this in production code, but it is kind of fun.

      1. 3

        Kind of the same idea but in Raku:

        "aaaabbbcca".subst( /(.) $0*/ , {"({$0},{$/.chars})" }, :g )

        The regex for the substitution is /(.)$0*/ where $0 is the first character matched by the joker grouped that could appears zero or more times. the second argument is a block where a string to follow the format (letter, number of times in a row) is followed. $0 is the character and $/ contains the match objects. The chars method returns the number of characters in the match object by grapheme. The last argument :g applies the substitution globally, without it the method will stop after the first ‘a’.

        my $str = "aaaabbbcca";
        $str ~~ s:g/(.) $0+/{ "({$0}, {$/.chars})" }/

        is the same with in-place substitution using s/// construct instead of the subst methods.

    4. 4

      Ruby one-liner, if you like long lines.

      irb(main):007:0> "aaaabbbcca".split("").chunk_while {|a,b| a==b}.map {|a| [a.first, a.size]}
      => [["a", 4], ["b", 3], ["c", 2], ["a", 1]]

      Never mind that it took 7 tries…..

      1. 2
        'aaaabbbcca'.chars.chunk(&:itself).map{|x,y| [x, y.size]}
        1. 2

          Bravo! I daresay that’s optimally golfed

      2. 2

        Two improvements:

        .split("") == .chars

        You can use chunk_while(&:==)

        1. 1

          Oh wow, that’s fantastic. I’d overlooked both entirely. Thanks!

    5. 3

      If you don’t know of partition-by this code also works:

      (defn translate [in]
        (loop [left in
               counts []]
          (if (empty? left)
             (map #(list (str (first %)) (second %)) counts)
             (recur (rest left)
                    (if (= (first (last counts)) (first left))
                        (update-in counts [(dec (count counts)) 1] inc)
                        (conj counts [(first left) 1]))))))

      Ugly as sin, but works. 8 minutes including writing this response, no documentation lookups, all in a readline environment. (I know about help but am not yet sure how to find useful functions other than scrolling around the cheatsheet)

    6. 3

      Reading the Clojure solution involves following the thought process of how the author ended up there. Reading the equivalent Go solution has very little cognitive overhead.

      How does Clojure programmers write code that is not just brief and elegant in the eyes of the author, but can also be read quickly by others without time consuming mental gymnastics?

      1. 4

        I guess it’s just a matter of what you’re used to. I find both Go solutions above arduous to understand because you have to follow the algorithm and construct what it functionally does in your head. Whereas in most of the clojure solutions the code directly tells you what’s happening. Eg. the top post with comments:

        ; threading macro (take input, then do stuff to it in steps)
        (->> "aaaabbbcca"
             ; split into lists of consecutive same character
             (partition-by identity)
             ; map each item (list of chars) with a function
               ; return list of outputs of functions
                 ; first character as string
                 (comp str first)
                 ; count of characters

        The information is much more dense. If you are good at reading it, you understand it at a glance.

        I believe the same is true for the APL solutions. I can’t read them, but someone who does sees the algorithm clearly.

        1. 5

          I find both Go solutions above arduous to understand

          Same here; they just seem super tedious and low-level; like you’re spoon-feeding the compiler on basic things instead of just describing the algorithm naturally.

          If you don’t have the vocabulary to describe an algorithm, of course the low-level version is going to be easier to read, much like someone who is only just learning English will find the “Simplified English” wikipedia more accessible than the regular one that often uses specialized jargon. But if you work with algorithms every day, you owe it to yourself to invest in learning the jargon!

          1. 1

            Isn’t this just because Clojure has a much bigger standard library than Go? I’m sure the Go solutions would be much higher level if you allowed for third party packages.

            1. 3

              I don’t think it has to do with the size necessarily but rather the focus on sequence manipulation as a core competency; the acknowledgement that data structures are at the center of what languages need to be good at.

              Allowing 3rd-party packages would make it shorter at the expense of readability; you can’t assume your reader is familiar with idioms outside the language.

          2. 1

            Lower level also means that it’s closer to what’s happening on the computer when the code is running. Depending on the type of program, this can be useful.

      2. 2

        Sort of the same principle as how people can jump into a rails codebase that’s got thousands of classes and be productive quickly.

        It’s a lot of convention, and a lot of practice.

        That Clojure code is also very Clojure-y if you know what I mean, it’s not doing anything clever or obscure.

        You might benefit from looking over some style guides and really reflecting how your code reads in the lens of what the style guide says.

      3. 2

        That’s a great question and it basically boils down to good taste. You can do code golfing and try to get the code as small and concise as possible, or you can write it the laborious but obvious way. This goes for any language with abstractions, you can always go too far in using certain abstractions. Nowadays I lean more towards “dumb” code because it’s not just easier to write but also easier to understand. However, the “code golf” style solution can be more robust. “Dumb” code tends to have more edge cases (i.e. bugs).

        I like how the above short go code deals with the starting letter edge case by simply prefixing the string; I struggled a bit with that myself as I was coming up with a solution. Now that I think about it, there’s an edge case that gets triggered if the first character in the string is a space; then the code crashes.

    7. 2

      Just because I haven’t seen these yet: in Ruby you can use Enumerable#tally and in Python you can use Counter (in ‘collections’)

      Both of these give the equivalent result in one call, so are good for some golfing.

      1. 2

        Enumerable#tally is not a valid solution. It returns => {"a"=>5, "b"=>3, "c"=>2}

        1. 2

          I see - I hadn’t spotted the difference, thanks!

          1. 1

            Np! I actually made the same exact mistake trying to solve it, that’s why I commented on yours.

        2. 1

          Same with Counter:

          from collections import Counter
          # gives [('a', 5), ('b', 3), ('c', 2)],
          # not [('a', 4), ('b', 3), ('c', 2), ('a', 1)]
    8. 2

      Not that the requirements say anything about space consumption, but most (all?) of the short solutions I’ve seen so far are not streaming friendly. Here’s a lazy version in Clojure that does not materialize intermediate sequences to count them:

      (defn count-consecutive [coll]
        (letfn [(head [s]
                    (when-first [x s]
                      (tail x 1 (next s)))))
                (tail [group acc s]
                  (if-let [[x & xs] s]
                      (if (= group x)
                        (tail group (inc acc) xs)
                        (cons [group acc] (head s))))
                    [[group acc]]))]
          (head coll)))
      (count-consecutive "aaaabbbcca")
    9. 2

      I like Ruby on this:

      “aaaabbbcca”.chars.slice_when {|c,n| c != n }.map {|s| [s[0], s.size] }

    10. 2

      I know this is a Clojure post, but out of curiosity I wrote it in Go (what I’ve been working in lately) just to see if I could solve it quickly. Took about 10 minutes, subtract 3-4 for fighting with runes:

      package main
      import "fmt"
      type result struct {
      	letter string
      	count  int
      func main() {
      	const input = "aaaabbbcca"
      	var ret []result
      	currentLetter := string(input[0])
      	countCurrentLetter := 1
      	for _, elem := range input[1:] {
      		elemAsString := string(elem)
      		if currentLetter == elemAsString {
      		} else {
      			ret = append(ret, result{currentLetter, countCurrentLetter})
      			currentLetter = elemAsString
      			countCurrentLetter = 1
      	ret = append(ret, result{currentLetter, countCurrentLetter})
      	fmt.Printf("%+v", ret)

      It’s not particularly elegant, but it works.

      1. 2

        It’s not particularly elegant, but it works.

        That’s my problem with many other (non-Lispy) languages. The programs are not elegant, even though they do work. What works for a computer, don’t always work for me.

      2. 1

        Okay, I am 5 months late, but this code is terrible and I must object to it because there’s no good Go code in this thread. You are mixing up two problems, lexing a token, and figuring out the next token. Apart from that, the code is very nonIdiomaticWithTheseVariableNames, but more importantly blows up on non-ASCII strings.

        Here’s two solutions, one imperative: https://play.golang.org/p/-zdWZAnmBip, and one recursive: https://play.golang.org/p/TBudEZBphv7.

        The proposed solutions:

        1. actually work on unicode input
        2. there’s no else in sight.
        3. all ifs are early returns.
        4. all loops are exhaustive, no weird [1:] boundary conditions.
        5. I don’t have to keep around accumulators for the results.
        6. no useless types.
        7. much easier to read because the code just tells you what it does, why it does it is obvious.
        1. 2

          Cool, I guess?

          I didn’t say I solved it well. I hacked something together.

      3. 1

        Here’s my take on it. It took me (roughly) the same 8-10 minutes to type it in the web-ui. In Emacs I could shave some time off of it.

        type tuple struct {
        	s string
        	i int
        func splitStringReturnTuples(str string) []tuple {
        	str = " " + str
        	res := []tuple{}
        	for i := 1; i < len(str); i++ {
        		if str[i] != str[i-1] {
        			res = append(res, tuple{string(str[i]), 1})
        		} else {
        	return res

        Runnable code at the go playground

        1. 2

          This loops over the bytes in the string instead of the runes in the string. Try inserting a multi-byte rune such as 本 in the string, and see what happens.

          1. 2

            The problem statement clearly stated the data set, there was no multi-byte symbols. But the tweet gave a clear understanding that interviewer expects solution to be give in a limited time frame. Therefore the solution was provided in terse notation with abbreviated variables taking provided test input and returning expected output. Not more nor less.

            But point is taken. Here’s the code correctly handling multi-byte encodings. The logic is the same, but the part of casting passed string into a slice of runes.

            When I interview people I don’t expect them to write code perfectly handling every possible input in the limited time. What I’m interested in first, if they are able to come up with straightforward solution leveraging data structures and algorithms helping them solve the problem with optimal complexity. Second, if they can clearly communicate their approach. And coding comes third.

            1. 2

              That makes sense. I did not mean to criticize your solution in particular, just highlight that this is a common “gotcha” in Go. Casting strings to []rune or looping with for _, r := range str is, as far as I know, the only built-in way to access the letters in strings correctly. I’ve seen many problems arise from assuming that str[x] returns a rune instead of a byte. I think it would be more useful and intuitive if []byte(str)[x] was needed to return a byte, while just str[x] could be used to return a rune.

    11. 2

      Since everyone is posting their solutions, here’s mine in Erlang:

      L = "aaaabbbcca",
              (Elt, [ {[Elt], N} | R ]) -> [ {[Elt], N + 1} | R];
              (Elt, Acc) -> [{[Elt], 1} | Acc]
      > [{"a",4},{"b",3},{"c",2},{"a",1}]
    12. 2


      (for/fold ([p #f]
                 [cnt 0]
                 [counts null]
                 #:result (cdr (reverse (cons `(,p ,cnt) counts))))
                ([c (in-string "aaaabbbcca")])
        (if (equal? p c)
            (values c (add1 cnt) counts)
            (values c 1 (cons `(,p ,cnt) counts))))
      1. 1

        Also using fold, with mnm.l :

        (def challenge (IN)
            (foldr (\ (C ACC)
                     (let ((((V . N) . TL) . ACC))
                       (if (= C V)
                         (cons (cons C (+ N 1)) TL)
                         (cons (cons C 1) ACC))))
              IN NIL))
      2. 1

        Racket’s group-by is wonderful but I usually want to group consecutive equal items into clumps as they arise rather than a single monolithic group.

        (define (group xs)
          (match xs
            [(cons x _)
             (define-values (ys zs) (splitf-at xs (curry equal? x)))
             (cons ys (group zs))]
            [_ null]))
        (define (encode xs)
          (for/list ([x xs])
            (list (first x) (length x))))
        (encode (group (string->list "aaaabbbcca")))
    13. 1

      I started using this question in job interviews a year or two ago and now this tweet and post has to go viral. Thanks a lot jerk!

      I like this python one-liner solution (with some help from the standard library):

      >>> [(k, len(list(g))) for k, g in itertools.groupby("aaaabbbcca")]
      [('a', 4), ('b', 3), ('c', 2), ('a', 1)]
    14. 1

      I saw that tweet and really enjoyed it. I spent a few minutes while doing other work kicking around the solution in F# in my head. It’s a good example of a question tough enough to exercise a few brain cells but not one of those “I’m the interviewer see how smart I am” things. Of course the best answer, which a few replies gave, was simply to print out the required result. Don’t write code you don’t have to, and since you were given the answer and you can write code to get that using only one statement, do it.

      But of course that’s too cheeky. For the record, in F# it looks to me like:

      “aaaabbbcca” |> Seq.countBy id;;

      I don’t know how to make it any simpler than that.

      1. 4

        Unfortunately, this doesn’t return the expected output. That last “a” is what throws a wrench in solutions like this.

        1. 1

          Ah okay. My bad. I didn’t see the second “a”

          Let’s try this again. This is a bit more gnarly. Needs cleaning up a lot. I wrote for clarity instead of brevity.

          let rec breakIntoRepeatingChunks<'a when 'a:equality>(inputSeq:'a seq) acc =
              if inputSeq|>Seq.isEmpty then acc else
              let letterWereMatching=inputSeq|>Seq.head
              let chunk=inputSeq |> Seq.takeWhile(fun y->y=letterWereMatching)
              let chunkLength=chunk|>Seq.length
              let newInputSeq = inputSeq |> Seq.skip chunkLength
              let newItem=seq{letterWereMatching, chunkLength}
              breakIntoRepeatingChunks newInputSeq (newItem |> Seq.append acc)
    15. 1

      Is it cheating to use the stdlib function that already does the work?

      (frequencies "aaaabbbcca")
      ;; => {\a 5, \b 3, \c 2}

      Or if you wanted to respect the types in the OP:

      (->> "aaaabbbcca"
                 (map (fn [[k v]] [(str k) v]))
                 (into []))
      ;; => [["a" 5] ["b" 3] ["c" 2]]
      1. 2

        Frequencies doesn’t help here. “A” needs to be in the output twice.

      2. 2

        Check the requirements again, it’s not frequency, it’s “runs” of characters.

        I started working on this in Perl then rapidly lost interest when I saw what was required ;)

        Edit changed my mind, here’s a straightforward solution, slightly golfed to fit in 24 lines https://gist.github.com/gustafe/117768d83df175b308aa15dcc75d2061

      3. [Comment removed by author]

    16. 1

      Here’s my very elegant and short take in Sed:

      : next_char
      t inc_start
      : inc_start
      s/^\([0-9]*\)0|/\11/; t inc_end
      s/^\([0-9]*\)1|/\12/; t inc_end
      s/^\([0-9]*\)2|/\13/; t inc_end
      s/^\([0-9]*\)3|/\14/; t inc_end
      s/^\([0-9]*\)4|/\15/; t inc_end
      s/^\([0-9]*\)5|/\16/; t inc_end
      s/^\([0-9]*\)6|/\17/; t inc_end
      s/^\([0-9]*\)7|/\18/; t inc_end
      s/^\([0-9]*\)8|/\19/; t inc_end
      : inc_loop
      t inc_loop
      s/^|/1/; t inc_end
      b inc_start
      : inc_end
      t inc_start
      t next_char


      sh$ echo aaaabbbcca | sed -f my-take.sed
    17. 1

      So, two things:

      1. I don’t know that Run Length Encoding is the best filter for candidates, it feels a little bit on the puzzly side.
      2. Here’s my quick and dirty solution using janet’s seq. It’s imperative, and not pretty and functional (janet can do both), but I was able to build it quickly, and it illustrates the power of seq (and loop), and I think it’d be nice and clear to explain for non-functional programmers:
      (def input "aaaabbbcca")
      (var old (string/from-bytes (input 0)))
      (var cnt 0)
      (pp (array/push (seq [c :in (map string/from-bytes input)
                            :after (++ cnt)
                            :when (not= old c)
                            :after (do (set cnt 0) (set old c))] 
                        [old cnt])
                      [old cnt]))
      # => @[("a" 4) ("b" 3) ("c" 2) ("a" 1)]
    18. 1

      Since no one contributed a Rust example:

      fn run_length_encode(ins: &str) -> Vec<(char, usize)> {
          let mut out = vec![];
          let mut i = ins.chars();
          if let Some(mut c) = i.next() {
              let mut count = 1;
              for new_c in i {
                  if new_c == c {
                      count += 1;
                  } else {
                      out.push((c, count));
                      count = 1;
                      c = new_c;
              out.push((c, count));

      and in Haskell we can use group and mapMaybe for the heavy lifting:

      import Data.List (group)
      import Data.Maybe (mapMaybe)
      runLengthEncode :: Eq a => [a] -> [(a, Int)]
      runLengthEncode = mapMaybe f . group
          f [] = Nothing 
          f (x:xs) = Just (x, 1 + length xs) 
    19. 1

      My simple procedural answer in Julia:

      function rle(itr)
      	acc = Tuple{eltype(itr), Int}[]
      	x = iterate(itr)
      	isnothing(x) && return acc # iterable is empty
      	last, state = x
      	n = 1
      	for v in Iterators.rest(itr, state)
      		if last == v
      			n += 1
      			push!(acc, (last, n))
      			last = v
      			n = 1
      	push!(acc, (last, n))
      	return acc



      Give a run-length-encoding of iterable itr.

      Bonus features:

      • works on arbitrary iterables
      • typed


      julia> rle("aaabbca")
      Tuple{Char, Int64}[('a', 3), ('b', 2), ('c', 1), ('a', 1)]
      julia> rle(rand(Bool, 5))
      Tuple{Bool, Int64}[(true, 1), (false, 1), (true, 3)]
      julia> let io = IOBuffer("foo\nbar\n")
      Tuple{String, Int64}[("foo", 1), ("bar", 1)]
    20. 1

      My initial solution, written in 11 minutes with the help of online documentation:

      (vec (map
             (fn [run] (conj '() (count run) (first run)))
             (partition-by identity input)))

      The hard part was discovering the partition-by function. Knowing about Ruby’s Enumerable#partition method helped a lot with that.

      After reading some other answers, I decided I should not accept my hacky conj, and should look up the actual way to construct a list. My final solution:

      (vec (map
             (fn [run] (list (first run) (count run)))
             (partition-by identity input)))

      I was testing with ClojureScript, so first was enough to get a one-character string like "a". With Clojure on JVM, it is indeed necessary to use (comp str first) instead, as the post does.

      I used vec and list because the original question specified the output using that syntax: [("a", 4), ("b", 3), ("c", 2), ("a", 1)]. When I look at the question more, though, I think that data structure was meant to be Python, so my vec was unnecessary.

    21. 1

      Haskell.. Includes the newline, but that’s an easy fix..

      main = interact soln
      soln = show . reverse . go []
      go counts [] = counts
      go counts (x:xs) = go (inc x counts) xs
      inc c ((x,ct):xs) | c == x = (x,ct+1):xs
      inc c xs = (c, 1):xs
    22. 1

      interesting interview-question, my problem with it is that the required effors totally depends on whether your language has this builtin or not.

      for example, look at the python solution:

       [(a,len(list(b))) for a,b in itertools.groupby('aaabbbbbbcca')]

      itertools.groupby does all the work, you just format the result to the required specification.

      the article used clojure where partition-by seems to do the same.

      if your language does not provide this function, you will have to reimplement it, or do something else. my point is, the effort is a lot larger.

      to take the argument to the extreme: imagine the interview-question is : “how to reverse an array”, and some candidates use languages where “array.reverse” exists, and some use languages where it does not exist.

      1. 1

        I don’t think the interview question was reliant on any one language - the interviewee has 20m to solve it, presumably in the language used by the organization being applied to.

        I solved in unimaginative Perl, while I’m sure it can be done in a one-liner using regex.

        1. 2

          yes, the task can be done in 20m in any language, i agree (not sure about assembly-level stuff though). i guess it just feels like you are measuring two different things depending on the programming language, that’s all.