I love it! I wonder what the differences in memory or performance would be between applying (⊃,≢), as you do, to each partition after grouping them, versus doing it during the grouping, as I do with the key operator. I’d forgotten about the “n-wise reduce” variant you used, which is much nicer than my ancient rotation idiom.
It’s interesting how similar the Clojure and APL solutions for a problem like this can be. I mean (juxt first count) and (⊃,≢) are remarkably alike. It makes me wonder what Clojure would be like if it had function trains instead of threading macros.
IME performance is generally not super intuitive to reason about in APL, because of the interpreters’ ability to do pattern recognition.
That’s a nice use of juxt to make it point-free (what do you mean, mostly point-free?). My first intuition was pretty much the same with an anonymous function instead of juxt.
I believe getting used to threading macros makes data manipulation tasks like this one much, much easier to approach, both when writing and when reading.
I have a hard time believing that most candidates cannot solve the problem. Do they claim to know Clojure?
Unrelated, but I thought I’d comment anyhow since I know you’ll read it:
Your brainfuck interpreter changed the way I thought about everything. Long, long story short, more than ten years ago I began my journey learning Scheme through SICP, and over time I just lost the spark and sorta fell away from writing beautiful programs and wanted to just write clever programs. I was working on a little side project involving a genetic algorithm and brainfuck, and along the way, I thought “oh, what the hell, I may as well read a bf interpreter and then just try and write my own” and I stumbled across yours.
I won’t bore you, but let me just tell you that when I read that code, it became as clear as day to me how long it had been since I read something elegant and beautiful. Since then I’ve been getting deeper and deeper into Clojure, and that old part of my brain that sought after beautiful programs is starting to feel rejuvenated. I’d been reading Ruby classes and Python classes for SO long, it’s been really nice digging my brain out from all of that.
This may be a really weird comment to read, sorry. I just really had a ~*~moment~*~ after reading your code. It’s immaculate.
(map (juxt first count)
(str/split "aaaabbbcca" #"(?<=(.))(?!\1)"))
The regex is a little weird. Basically, we want to split the string on any boundary where the characters before and after the boundary are different, so "aaaabbbcca" becomes four strings: ["aaaa" "bbb" "cc" "a"]. It does this with a lookbehind for any character, (?<=(.)), followed by a negative lookahead asserting that the next character is not that same one, (?!\1). Both assertions are zero-width, so neither character is consumed by the split; they stay in the resulting substrings.
The rest is relatively simple: juxt returns a function that takes a single argument, then returns a vector with the results of calling the two functions separately on that argument.
I don’t know that I’d write this in production code, but it is kind of fun.
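To see the two pieces on their own, this is roughly how they behave at a REPL (assuming clojure.string is required as str):

(require '[clojure.string :as str])

(str/split "aaaabbbcca" #"(?<=(.))(?!\1)")
;; => ["aaaa" "bbb" "cc" "a"]

((juxt first count) "aaaa")
;; => [\a 4]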
Kind of the same idea, but in Raku. The regex for the substitution is /(.)$0*/, where (.) captures the first character and $0* matches zero or more repetitions of that same character. The second argument is a block that builds a string in the format (letter, number of times in a row): $0 is the captured character and $/ holds the match object, whose chars method returns the length of the match in graphemes. The last argument, :g, applies the substitution globally; without it the method would stop after the first run of ‘a’. The same can be done with in-place substitution, using the s/// construct instead of the subst method.
If you don’t know of partition-by, this code also works. Ugly as sin, but it works: 8 minutes including writing this response, no documentation lookups, all in a readline environment. (I know about help, but I’m not yet sure how to find useful functions other than scrolling around the cheatsheet.)
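A sketch of what such a partition-by-free version can look like, carrying the runs along in a reduce (illustrative, not the exact snippet):

;; build up [[char count] ...] pairs, extending the last pair while the
;; character repeats and starting a new pair when it changes
(defn rle [s]
  (reduce (fn [acc c]
            (let [[ch n] (peek acc)]
              (if (= ch c)
                (conj (pop acc) [ch (inc n)])
                (conj acc [c 1]))))
          []
          s))

(rle "aaaabbbcca")
;; => [[\a 4] [\b 3] [\c 2] [\a 1]]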
Reading the Clojure solution involves following the thought process of how the author ended up there. Reading the equivalent Go solution has very little cognitive overhead.
How do Clojure programmers write code that is not just brief and elegant in the eyes of the author, but can also be read quickly by others without time-consuming mental gymnastics?
I guess it’s just a matter of what you’re used to. I find both Go solutions above arduous to understand because you have to follow the algorithm and construct what it functionally does in your head, whereas in most of the Clojure solutions the code directly tells you what’s happening. E.g. the top post with comments:
; threading macro (take input, then do stuff to it in steps)
(->> "aaaabbbcca"
     ; split into lists of consecutive same character
     (partition-by identity)
     ; map each item (list of chars) with a function
     (map
      ; return list of outputs of functions
      (juxt
       ; first character as string
       (comp str first)
       ; count of characters
       count)))
The information is much more dense. If you are good at reading it, you understand it at a glance.
I believe the same is true for the APL solutions. I can’t read them, but someone who does sees the algorithm clearly.
I find both Go solutions above arduous to understand
Same here; they just seem super tedious and low-level; like you’re spoon-feeding the compiler on basic things instead of just describing the algorithm naturally.
If you don’t have the vocabulary to describe an algorithm, of course the low-level version is going to be easier to read, much like someone who is only just learning English will find the “Simplified English” wikipedia more accessible than the regular one that often uses specialized jargon. But if you work with algorithms every day, you owe it to yourself to invest in learning the jargon!
Isn’t this just because Clojure has a much bigger standard library than Go? I’m sure the Go solutions would be much higher level if you allowed for third party packages.
I don’t think it has to do with the size necessarily but rather the focus on sequence manipulation as a core competency; the acknowledgement that data structures are at the center of what languages need to be good at.
Allowing 3rd-party packages would make it shorter at the expense of readability; you can’t assume your reader is familiar with idioms outside the language.
Lower level also means that it’s closer to what’s happening on the computer when the code is running. Depending on the type of program, this can be useful.
That’s a great question and it basically boils down to good taste. You can do code golfing and try to get the code as small and concise as possible, or you can write it the laborious but obvious way. This goes for any language with abstractions, you can always go too far in using certain abstractions. Nowadays I lean more towards “dumb” code because it’s not just easier to write but also easier to understand. However, the “code golf” style solution can be more robust. “Dumb” code tends to have more edge cases (i.e. bugs).
I like how the above short go code deals with the starting letter edge case by simply prefixing the string; I struggled a bit with that myself as I was coming up with a solution. Now that I think about it, there’s an edge case that gets triggered if the first character in the string is a space; then the code crashes.
Not that the requirements say anything about space consumption, but most (all?) of the short solutions I’ve seen so far are not streaming friendly. Here’s a lazy version in Clojure that does not materialize intermediate sequences to count them:
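A sketch of that idea: each run is counted by walking it in a loop, so only the count is kept; the details here are illustrative.

;; lazily produce one [letter count] pair per run; each run is counted by
;; walking it, without holding an intermediate sub-sequence
(defn rle-lazy [s]
  (lazy-seq
    (when (seq s)
      (let [c (first s)
            [n more] (loop [n 1, r (rest s)]
                       (if (and (seq r) (= c (first r)))
                         (recur (inc n) (rest r))
                         [n r]))]
        (cons [(str c) n] (rle-lazy more))))))

(take 2 (rle-lazy "aaaabbbcca"))
;; => (["a" 4] ["b" 3])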
I know this is a Clojure post, but out of curiosity I wrote it in Go (what I’ve been working in lately) just to see if I could solve it quickly. Took about 10 minutes, subtract 3-4 for fighting with runes:
package main

import "fmt"

type result struct {
    letter string
    count  int
}

func main() {
    const input = "aaaabbbcca"
    var ret []result
    currentLetter := string(input[0])
    countCurrentLetter := 1
    for _, elem := range input[1:] {
        elemAsString := string(elem)
        if currentLetter == elemAsString {
            countCurrentLetter++
        } else {
            ret = append(ret, result{currentLetter, countCurrentLetter})
            currentLetter = elemAsString
            countCurrentLetter = 1
        }
    }
    ret = append(ret, result{currentLetter, countCurrentLetter})
    fmt.Printf("%+v", ret)
}
It’s not particularly elegant, but it works.

That’s my problem with many other (non-Lispy) languages. The programs are not elegant, even though they do work. What works for a computer doesn’t always work for me.
Okay, I am 5 months late, but this code is terrible and I must object to it because there’s no good Go code in this thread. You are mixing up two problems, lexing a token and figuring out the next token. Apart from that, the code is very nonIdiomaticWithTheseVariableNames, but more importantly it blows up on non-ASCII strings. Here are two solutions, one imperative: https://play.golang.org/p/-zdWZAnmBip, and one recursive: https://play.golang.org/p/TBudEZBphv7.
Here’s my take on it. It took me (roughly) the same 8-10 minutes to type it in the web-ui. In Emacs I could shave some time off of it.
type tuple struct {
    s string
    i int
}

func splitStringReturnTuples(str string) []tuple {
    str = " " + str
    res := []tuple{}
    for i := 1; i < len(str); i++ {
        if str[i] != str[i-1] {
            res = append(res, tuple{string(str[i]), 1})
        } else {
            res[len(res)-1].i++
        }
    }
    return res
}
This loops over the bytes in the string instead of the runes in the string. Try inserting a multi-byte rune such as 本 in the string, and see what happens.
The problem statement clearly stated the data set; there were no multi-byte symbols. But the tweet gave a clear understanding that the interviewer expects a solution in a limited time frame. Therefore the solution was provided in terse notation with abbreviated variables, taking the provided test input and returning the expected output. No more, no less.

But point taken. Here’s the code correctly handling multi-byte encodings. The logic is the same, apart from casting the passed string into a slice of runes.
When I interview people I don’t expect them to write code perfectly handling every possible input in the limited time. What I’m interested in first is whether they can come up with a straightforward solution, leveraging data structures and algorithms that help them solve the problem with optimal complexity. Second, whether they can clearly communicate their approach. And coding comes third.
That makes sense. I did not mean to criticize your solution in particular, just highlight that this is a common “gotcha” in Go. Casting strings to []rune or looping with for _, r := range str is, as far as I know, the only built-in way to access the letters in strings correctly. I’ve seen many problems arise from assuming that str[x] returns a rune instead of a byte. I think it would be more useful and intuitive if []byte(str)[x] was needed to return a byte, while just str[x] could be used to return a rune.
I saw that tweet and really enjoyed it. I spent a few minutes while doing other work kicking around the solution in F# in my head. It’s a good example of a question tough enough to exercise a few brain cells but not one of those “I’m the interviewer see how smart I am” things. Of course the best answer, which a few replies gave, was simply to print out the required result. Don’t write code you don’t have to, and since you were given the answer and you can write code to get that using only one statement, do it.
But of course that’s too cheeky. For the record, in F# it looks to me like:
"aaaabbbcca" |> Seq.countBy id;;
I don’t know how to make it any simpler than that.
Let’s try this again. This is a bit more gnarly. Needs cleaning up a lot. I wrote for clarity instead of brevity.
let rec breakIntoRepeatingChunks<'a when 'a: equality> (inputSeq: 'a seq) acc =
    if inputSeq |> Seq.isEmpty then acc
    else
        let letterWereMatching = inputSeq |> Seq.head
        let chunk = inputSeq |> Seq.takeWhile (fun y -> y = letterWereMatching)
        let chunkLength = chunk |> Seq.length
        let newInputSeq = inputSeq |> Seq.skip chunkLength
        let newItem = seq { letterWereMatching, chunkLength }
        breakIntoRepeatingChunks newInputSeq (newItem |> Seq.append acc)
Here’s my very elegant and short take in Sed:

h
: next_char
s/^/|\
/
t inc_start
: inc_start
s/^\([0-9]*\)0|/\11/; t inc_end
s/^\([0-9]*\)1|/\12/; t inc_end
s/^\([0-9]*\)2|/\13/; t inc_end
s/^\([0-9]*\)3|/\14/; t inc_end
s/^\([0-9]*\)4|/\15/; t inc_end
s/^\([0-9]*\)5|/\16/; t inc_end
s/^\([0-9]*\)6|/\17/; t inc_end
s/^\([0-9]*\)7|/\18/; t inc_end
s/^\([0-9]*\)8|/\19/; t inc_end
: inc_loop
s/^\([0-9]*\)9|/\1|0/
t inc_loop
s/^|/1/; t inc_end
b inc_start
: inc_end
s/\n\(.\)\1/|\
\1/
t inc_start
P
s/^[^[:space:]]*.//
s/\(.\).*/\1/p
g
s/\(.\)\1*//
/^$/q
h
t next_char
Result:
sh$ echo aaaabbbcca | sed -f my-take.sed
4
a
3
b
2
c
1
a
I don’t know that Run Length Encoding is the best filter for candidates, it feels a little bit on the puzzly side.
Here’s my quick and dirty solution using janet’s seq. It’s imperative, and not pretty and functional (janet can do both), but I was able to build it quickly, and it illustrates the power of seq (and loop), and I think it’d be nice and clear to explain for non-functional programmers:
(def input "aaaabbbcca")
(var old (string/from-bytes (input 0)))
(var cnt 0)
(pp (array/push (seq [c :in (map string/from-bytes input)
                      :after (++ cnt)
                      :when (not= old c)
                      :after (do (set cnt 0) (set old c))]
                  [old cnt])
                [old cnt]))
# => @[("a" 4) ("b" 3) ("c" 2) ("a" 1)]
Since no one contributed a Rust example:

fn run_length_encode(ins: &str) -> Vec<(char, usize)> {
    let mut out = vec![];
    let mut i = ins.chars();
    if let Some(mut c) = i.next() {
        let mut count = 1;
        for new_c in i {
            if new_c == c {
                count += 1;
            } else {
                out.push((c, count));
                count = 1;
                c = new_c;
            }
        }
        out.push((c, count));
    }
    out
}
and in Haskell we can use group and mapMaybe for the heavy lifting:
import Data.List (group)
import Data.Maybe (mapMaybe)

runLengthEncode :: Eq a => [a] -> [(a, Int)]
runLengthEncode = mapMaybe f . group
  where
    f [] = Nothing
    f (x:xs) = Just (x, 1 + length xs)
My simple procedural answer in Julia:

function rle(itr)
    acc = Tuple{eltype(itr), Int}[]
    x = iterate(itr)
    isnothing(x) && return acc # iterable is empty
    last, state = x
    n = 1
    for v in Iterators.rest(itr, state)
        if last == v
            n += 1
        else
            push!(acc, (last, n))
            last = v
            n = 1
        end
    end
    push!(acc, (last, n))
    return acc
end

Docstring: Give a run-length-encoding of iterable itr.
The hard part was discovering the partition-by function. Knowing about Ruby’s Enumerable#partition method helped a lot with that.
After reading some other answers, I decided I should not accept my hacky conj, and should look up the actual way to construct a list. My final solution:
I was testing with ClojureScript, so first was enough to get a one-character string like "a". With Clojure on JVM, it is indeed necessary to use (comp str first) instead, as the post does.
I used vec and list because the original question specified the output using that syntax: [("a", 4), ("b", 3), ("c", 2), ("a", 1)]. When I look at the question more, though, I think that data structure was meant to be Python, so my vec was unnecessary.
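Roughly that shape, for illustration (not necessarily the exact final version):

;; runs of equal characters, each turned into a (letter count) list,
;; collected into a vector to mirror the [("a", 4) ...] shape
(vec (for [run (partition-by identity "aaaabbbcca")]
       (list (str (first run)) (count run))))
;; => [("a" 4) ("b" 3) ("c" 2) ("a" 1)]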
Haskell. It includes the newline, but that’s an easy fix.
main = interact soln
soln = show . reverse . go []
go counts [] = counts
go counts (x:xs) = go (inc x counts) xs
inc c ((x,ct):xs) | c == x = (x,ct+1):xs
inc c xs = (c, 1):xs
Interesting interview question. My problem with it is that the required effort totally depends on whether your language has this built in or not.
for example, look at the python solution:
[(a,len(list(b))) for a,b in itertools.groupby('aaabbbbbbcca')]
itertools.groupby does all the work, you just format the result to the required specification.
the article used clojure where partition-by seems to do the same.
if your language does not provide this function, you will have to reimplement it, or do something else. my point is, the effort is a lot larger.
to take the argument to the extreme: imagine the interview-question is : “how to reverse an array”, and some candidates use languages where “array.reverse” exists, and some use languages where it does not exist.
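To make the “reimplement it” point concrete, here is roughly what a from-scratch stand-in for partition-by identity looks like in Clojure (a sketch, just to show the extra effort involved):

;; group consecutive equal elements by hand: grow the current run while the
;; element matches, otherwise close it and start a new one
(defn runs [xs]
  (loop [acc [] cur [] xs (seq xs)]
    (cond
      (empty? xs) (if (seq cur) (conj acc cur) acc)
      (or (empty? cur) (= (first xs) (peek cur)))
        (recur acc (conj cur (first xs)) (rest xs))
      :else
        (recur (conj acc cur) [(first xs)] (rest xs)))))

(runs "aaaabbbcca")
;; => [[\a \a \a \a] [\b \b \b] [\c \c] [\a]]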
I don’t think the interview question was reliant on any one language - the interviewee has 20m to solve it, presumably in the language used by the organization being applied to.
I solved it in unimaginative Perl, while I’m sure it can be done in a one-liner using regex.
Yes, the task can be done in 20m in any language, I agree (not sure about assembly-level stuff though). I guess it just feels like you are measuring two different things depending on the programming language, that’s all.
For the fun of it, two alternatives to shorten your solutions. The first uses juxt and is mostly point-free, but terse. The second one, with for, is a bit chattier but also more legible.

Since you were mentioning suggestions in the article, this is one of the
Here’s my point-free take on it using Dyalog APL
Nice! Here’s mine:
And for bonus points, the decode:
I really wish Dyalog had capped forks; they would obviate the last set of parentheses in the encoder.
This has to be the most interesting discussion I’ve seen this week. Downside: now I’ll have to relearn APL enough to understand your solutions. :D
This is a good read.
duh; simpler:
(⊃,≢)¨⊢⊆⍨(+\1,2≠/⊢)
Having never seen this language before, it looks like someone has incorrectly rendered a bunch of UTF-8 characters lol
It looks like the original tweet wasn’t about clojure?
This means a lot, thank you.
Thanks. Both suggestions look good, the first one a little bit more readable.
Oh wow, juxt there makes it sweet. Nicely done!
While not a (single) programming language, my first thought was to use uniq(1):

Wow, that’s elegant.
Way to nerd-snipe me.
Using regex:
Ruby one-liner, if you like long lines.
Never mind that it took 7 tries…..
Bravo! I daresay that’s optimally golfed
Two improvements: .split("") is the same as .chars, and you can use chunk_while(&:==).
Oh wow, that’s fantastic. I’d overlooked both entirely. Thanks!
Sort of the same principle as how people can jump into a rails codebase that’s got thousands of classes and be productive quickly.
It’s a lot of convention, and a lot of practice.
That Clojure code is also very Clojure-y if you know what I mean, it’s not doing anything clever or obscure.
You might benefit from looking over some style guides and really reflecting how your code reads in the lens of what the style guide says.
Just because I haven’t seen these yet: in Ruby you can use Enumerable#tally and in Python you can use Counter (in ‘collections’)
Both of these give the equivalent result in one call, so are good for some golfing.
Enumerable#tally is not a valid solution. It returns
=> {"a"=>5, "b"=>3, "c"=>2}
I see - I hadn’t spotted the difference, thanks!
Np! I actually made the same exact mistake trying to solve it, that’s why I commented on yours.
Same with Counter:
I like Ruby on this:
The proposed solutions: no else in sight, the ifs are early returns, no [1:] boundary conditions.

Cool, I guess?
I didn’t say I solved it well. I hacked something together.
Runnable code at the go playground
Since everyone is posting their solutions, here’s mine in Erlang:
Racket:

Also using fold, with mnm.l:

Racket’s group-by is wonderful, but I usually want to group consecutive equal items into clumps as they arise rather than a single monolithic group.

I started using this question in job interviews a year or two ago and now this tweet and post has to go viral. Thanks a lot, jerk!
I like this python one-liner solution (with some help from the standard library):
Unfortunately, this doesn’t return the expected output. That last “a” is what throws a wrench in solutions like this.
Ah okay. My bad. I didn’t see the second “a”
Is it cheating to use the stdlib function that already does the work?
Or if you wanted to respect the types in the OP:
Frequencies doesn’t help here. “A” needs to be in the output twice.
Check the requirements again, it’s not frequency, it’s “runs” of characters.
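In Clojure terms the difference is easy to see; frequencies collapses the two runs of a into one total, while the run-based version keeps them apart:

(frequencies "aaaabbbcca")
;; => {\a 5, \b 3, \c 2}   ; the second run of "a" disappears into the total

(map (juxt (comp str first) count) (partition-by identity "aaaabbbcca"))
;; => (["a" 4] ["b" 3] ["c" 2] ["a" 1])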
I started working on this in Perl then rapidly lost interest when I saw what was required ;)
Edit: changed my mind. Here’s a straightforward solution, slightly golfed to fit in 24 lines: https://gist.github.com/gustafe/117768d83df175b308aa15dcc75d2061
[Comment removed by author]
Here’s my very elegant and short take in Sed:
Result: