1. 6
  1. 2

    Going to repeat the author’s praise of ggplot2 here - it is a very simple library to use! You can stack layers on top of each other and generally manipulate your graph to display pretty complex data. My prof last semester used it heavily for ML graphs, and even made a fork for animated / web hosted plots.

    The Python adaptation is also simple to use: https://plotnine.readthedocs.io/en/stable/

    1. 2

      (copy of HN comment, full thread)

      “The popularity of the Tidyverse is a major blow to your motivation to learn R. Why would anyone want to learn a language that is treated as secondary to some packages? Worse still, if that turns out to be the best way to use R, then you’re forced to admit that R is a polished turd with a fragmented community.”

      As others have mentioned, just use tidyverse. I picked it up 4 years ago, and last week I went back to the code I wrote then.

      I was productive in minutes. I could read the code, modify it, and easily test it in the REPL. The docs for dplyr are good.

      ggplot2 is still awesome and the docs are good there too. ggplot2 is the fastest way to figure out what you want and make a pretty plot.

      (However one thing that still annoys me is that R moves faster than Debian. So it’s possible to do install.packages() in R, and it will break telling you your Debian R interpreter is too old. There is no easy solution for this, just a bunch of workarounds)

      OK, sure you can call it a polished turd, and to some degree that’s true. But a polished turd is better than just using … a turd!

      The error messages in R are not quite as good as Python, but I wouldn’t call it a problem. I’m able to localize the source of an error, even when using tidyverse.

      My article comparing tidyverse to some other solutions:

      What Is a Data Frame? (In Python, R, and SQL) http://www.oilshell.org/blog/2018/11/30.html

      “But would I recommend learning it to anyone else? Absolutely not. We can do so much better.”

      I would recommend it, with the caveat that it’s one of the hardest languages I’ve had to learn. However, that is partly because it changes how you think. But if you have a certain type of problem then you have to change how you think, or you’ll never get it done. Data analysis is surprisingly laborious, even for people who have, say, written compilers and such.

      1. 2

        “But would I recommend learning it to anyone else? Absolutely not. We can do so much better.”

        I’m wondering what the author had in mind when he said we could do better? I’m not aware of any other ecosystem with such ergonomic and performant libraries.

        1. 1

          Yeah the problem is that Julia and Python are both imitating R and playing “catch up”. Unfortunately that takes 10 or more years. Meanwhile R has a lot of very smart people moving the ecosystem forward.

          I started using R before tidyverse – around 2010, and tidyverse was 2016 or so. And maybe in 2016 you could make an argument that Python was catching up.

          But then tidyverse was released and documented with books, and now R is light years ahead again.

          So yeah, in my mind there’s basically nothing equivalent, so if you want to solve certain problems, you just have to suck it up.

          Sort of like shell. Shell has a lot of warts, but at the end of the day it’s easier just to learn it than to flail around with inferior alternatives.

          (As a side note, tidyverse generally performs well, but with R in general you do have to be wary of performance. So often I use Python or shell as “pre-filtering steps” to cut down what you deal with in R. I generally deal only with clean TSV files in R; dealing with arbitrary text can be very slow)
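
          A rough sketch of what such a pre-filtering step might look like in Python. The log format, the regex, and the `prefilter` name are all made up for illustration; the idea is only that malformed lines get dropped and embedded tabs get replaced before R ever sees the file.

          ```python
          import csv
          import io
          import re

          # Hypothetical input format, assumed for this sketch:
          # "<timestamp> <LEVEL> <free-form message>"
          LINE_RE = re.compile(r"^(\S+)\s+(INFO|WARN|ERROR)\s+(.*)$")


          def prefilter(lines, out):
              """Write well-formed lines to `out` as TSV; return the number of data rows."""
              writer = csv.writer(out, delimiter="\t", lineterminator="\n")
              writer.writerow(["timestamp", "level", "message"])
              rows = 0
              for line in lines:
                  match = LINE_RE.match(line.strip())
                  if match is None:
                      continue  # drop anything that does not parse
                  ts, level, msg = match.groups()
                  # A tab inside the message would break the column layout.
                  writer.writerow([ts, level, msg.replace("\t", " ")])
                  rows += 1
              return rows


          raw = [
              "2018-11-30T10:00:00 INFO started",
              "garbage line with no structure",
              "2018-11-30T10:00:05 ERROR disk\tfull",
          ]
          buf = io.StringIO()
          count = prefilter(raw, buf)
          ```

          The resulting file has a header and exactly three columns per row, so on the R side something like `read.delim()` or `readr::read_tsv()` should be able to load it without any cleanup.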

      2. 2

        What this article doesn’t mention is my biggest frustration with R: lack of language support for 64-bit integers. Given that I frequently work with timestamps with nanosecond precision, it’s frustrating. Even with bit64, things are not so easy.
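
        To illustrate why this hurts: R's native integer type is 32-bit, and its default numeric type is a 64-bit double, which only represents integers exactly up to 2^53. The sketch below uses Python to mimic what happens when a nanosecond timestamp is pushed through a double. The timestamp value is made up for illustration.

        ```python
        # Nanoseconds since the Unix epoch; a made-up value that fits in int64.
        ts_ns = 1_543_572_000_123_456_789

        # A double has a 53-bit significand, so larger integers get rounded.
        fits_exactly = ts_ns < 2**53

        # Round-trip through a double, as happens when the value is stored
        # in a floating-point column.
        as_double = float(ts_ns)
        round_trip = int(as_double)

        # How many nanoseconds were lost to rounding.
        error_ns = abs(round_trip - ts_ns)
        ```

        At this magnitude adjacent doubles are 256 apart, so the round-tripped value can be off by up to 128 ns, which is exactly the precision a nanosecond timestamp is supposed to carry.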