I use Google Keep notes. I have a daily checklist and a weekly checklist that each get rotated with unfinished tasks. It’s nice because it works great on mobile and my desktop.
I’m a bit of a Scala newcomer and pretty much write all my code to run on Spark, but have completed a few of the Coursera courses. Can someone point me at the shortcomings or danger zones of Scala? To me at least it doesn’t seem like there is a language out there that is functional, type safe and expressive like Scala which works universally in back-end services, web apps, and big data apps running on Spark/Kafka/etc. The only other language that is similar in this respect is Clojure but it’s not statically typed, and static typing is something I’m drawn to coming from Python. Are there alternatives to Scala people should be looking at?
To me, the big danger zone is that Scala has two largely incompatible groups of users. One is happy to use a better Java, the other is essentially writing Haskell on the JVM. Neither group likes the other’s code. The Haskell-flavored-Scala folks tend to be blowhards about it more often than the better-Java folks, so expect that style to win in any given project/organization. Also expect to lose people because of it. I’ve seen this happen from up close and from afar.
I wish Scala the language were separable from Scala the community. There are some great ideas in there, but I’m happier using most of them in a different language.
F# and OCaml seem like strong contenders. OCaml less so, but that’s mostly because of library support.
You can take a look at Haskell or close to Haskell languages:
At least Haskell is functional, type safe, and certainly as expressive as Scala. I have used Haskell alongside Kafka, and for backend services as well as web apps. The first link talks about how it is used by tweag.io to run on Spark; I don’t have any personal experience with that.
I do think Scala is the best language going at the moment. But there are various rough edges, partly to do with JVM compatibility and partly to do with backwards compatibility with previous versions of Scala, plus the occasional bad design decision; just stuff like null, ClassTag, the contortions needed to implement HList and Record (which largely don’t cause problems for correct code but show up in things like the error messages you get when you make a mistake), the lack of enums or any truly first-class kind of sum type….
Pitfalls to avoid: SBT, akka actors, the cake pattern, pattern matching where it’s impossible to tell safe from unsafe, monad-like types that don’t obey the laws, implicit parameters used as actual parameters (something Odersky is now promoting), lack of kind polymorphism…
In terms of alternatives, F# and OCaml don’t have HKT; Haskell is an option but seems to introduce as many problems as it solves (laziness making performance reasoning hard, lack of good tooling, limited compatibility with other ecosystems). I had high hopes for Ceylon but I’ve come to think union types are the wrong thing (they’re inherently noncompositional compared to the opaque disjoint union kind of sum type). I’m excited for Idris - that seems to take the good parts of Scala and also bring something new and valuable to the table.
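To make the union-type complaint concrete, here is a small Python sketch of one common reading of "noncompositional": an untagged union of "missing" and "present but empty" collapses into a single case, while a tagged (disjoint) sum keeps them apart. This is only an illustration of that reading, not necessarily what the comment above had in mind; the names `Some`, `Nothing`, `lookup_untagged` and `lookup_tagged` are made up for the example.

```python
from dataclasses import dataclass
from typing import Dict, Generic, Optional, TypeVar, Union

T = TypeVar("T")

# Untagged unions flatten: "optional of optional" is the same type as "optional".
assert Optional[Optional[int]] == Optional[int]


def lookup_untagged(d: Dict[str, Optional[int]], key: str) -> Optional[int]:
    # None can mean "key missing" or "key present with value None".
    return d.get(key)


@dataclass(frozen=True)
class Some(Generic[T]):
    """Tagged wrapper: a value is present (even if that value is None)."""
    value: T


@dataclass(frozen=True)
class Nothing:
    """Tagged marker: no value is present."""


def lookup_tagged(
    d: Dict[str, Optional[int]], key: str
) -> Union[Some[Optional[int]], Nothing]:
    # The wrapper keeps the two cases distinct, so nesting composes.
    return Some(d[key]) if key in d else Nothing()


settings: Dict[str, Optional[int]] = {"timeout": None}

# The untagged lookup cannot tell these two situations apart.
print(lookup_untagged(settings, "timeout") is lookup_untagged(settings, "retries"))

# The tagged lookup can.
print(lookup_tagged(settings, "timeout"), lookup_tagged(settings, "retries"))
```

In Scala terms the tagged version corresponds to `Option[Option[A]]`, which stays nested instead of flattening.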
So, to go full circle, if I were to do it again, I’d probably spend most of my time building something clever with a HyperLogLog, only to eventually cave in and resort to something inefficient, bland and boring.
Why not HyperLogLog when it’s the most efficient solution?
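For anyone who hasn’t met it, HyperLogLog estimates the number of distinct items in a stream using a small fixed array of registers instead of storing every item. Below is a rough pure-Python sketch of the basic algorithm, assuming 64-bit hashes taken from SHA-1 and the standard small-range correction. The class and method names are invented for illustration, and this is not how any particular database extension implements it.

```python
import hashlib
import math


class HyperLogLog:
    """Approximate distinct counter using 2**p small registers."""

    def __init__(self, p: int = 12) -> None:
        self.p = p
        self.m = 1 << p                      # number of registers
        self.registers = [0] * self.m
        # Bias-correction constant for m >= 128 registers.
        self.alpha = 0.7213 / (1 + 1.079 / self.m)

    def add(self, item: str) -> None:
        # Take the first 8 bytes of SHA-1 as a 64-bit hash (an assumption
        # for this sketch; any well-mixed 64-bit hash would do).
        digest = hashlib.sha1(item.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        width = 64 - self.p
        idx = h >> width                     # top p bits pick a register
        rest = h & ((1 << width) - 1)        # remaining bits
        # Rank = 1-based position of the leftmost 1-bit in the remaining bits.
        rank = width - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def count(self) -> float:
        # Harmonic mean of 2**register across all registers, scaled by alpha.
        raw = self.alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            return self.m * math.log(self.m / zeros)
        return raw


hll = HyperLogLog(p=12)
for i in range(100_000):
    hll.add(f"user-{i}")
print(round(hll.count()))  # an estimate near 100000, not an exact count
```

With `p=12` the sketch keeps 4096 registers and the expected relative error is around 1.04 / sqrt(4096), roughly 1.6%. Adding the same item again never changes the registers, which is what makes it useful for distinct counts.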
This is great, thanks! Is there anything you’d say specifically if someone asked you why to choose Postgres over MySQL? Let’s say they’re using AWS and not Citus. I had a hard time making an argument for Postgres at my current job, where basically all the SQL databases are MySQL. In my case it’s for analytics, where upserts are really useful, and at the time Postgres didn’t have them. HyperLogLog is also a really useful feature but AFAIK you can’t add the extension on AWS RDS.
For me the biggest thing is around flexible datatypes (arrays, hstore, range types, and JSONB) along with the corresponding indexes that can be used with them like GIN and GiST. And then there are the extensions: even if you’re not using HyperLogLog you might need geospatial support via PostGIS, maybe full text search, maybe foreign data wrappers, and there’s a much longer list.
At my last job we ran a few services in docker on version 1.8 and 1.9. Garbage collection was the biggest issue I experienced when trying to run many containers on a host. Every time we deployed a new image all the running containers would get trashed and take up disk. Other than that I never had any major issues except for stupid CID files not getting removed. After reading this I don’t think I’d want to use docker for anything critical, but it’s definitely nice for running many services in development.
I previously had a Mac but 4 months ago I switched to the ThinkPad X1 Carbon. It’s light and the keyboard is better than a Mac’s IMO, but the screen could use more vertical space. There are definitely some hardware driver issues when running Linux which I wrote about in my installation guide: https://gist.github.com/jjmalina/5e13b2269ec97895ea5fda9df6d26751
The three main arguments in favor of PHP are: no shared state, concurrency, and programmer workflow. The deficiencies are mostly around the language itself but HHVM fixes this. Couldn’t one make the same argument for other languages? You could use Openresty with NGINX and write your embedded server in Lua. PHP definitely has a much bigger ecosystem than Lua but the ecosystem is hardly mentioned and I think that’s a big factor when choosing a language.
Also what server do you have to run PHP in? Apache?
What hardware are you running it on? I switched from Mac OS X to Debian 8.5 Jessie on a Lenovo Thinkpad X1 Carbon and it was kind of a pain to set up because the graphics and wifi drivers didn’t support the hardware, so I had to upgrade the kernel to get newer drivers. Is OpenBSD also plagued with driver instability like Linux is?
Well the iwm wireless driver still has some known issues but generally works fine. Several developers use x1 carbon machines.
The AP portion seems a lot like Cassandra, and I’m not sure if the differences matter.
What’s the big win of using InfluxDB and their own clustering solution over just using an existing technology like Cassandra?
I don’t have any experience running Cassandra but InfluxDB’s query API and ease of deployment are advantages IMO.
This is a great intro and useful resource. I’ve been using OpenResty at work for both a log collection server and a REST API proxy, and the performance of it is amazing. The biggest thing I’ve had trouble with is setting up Lua and installing dependencies via luarocks. AFAIK there’s nothing like virtualenv for Lua, and if you need a particular version of Lua on OS X it’s a real pain to install. It’s not really a big deal anymore since we just run it in docker now, but I wish the tooling and docs for Lua were better. I can attest that busted works well for unit testing though :)
Very true, the various Lua and LuaJIT versions have definitely made it quite difficult on OSX over time. After a bunch of reading I found that the homebrew packagers worked a way around all this so that the lua51 package installs the appropriate luarocks binary, you just have to remember to use luarocks-5.1 in the shell. (There’s no ‘canonical’ lua52 or lua53 that I can see in the homebrew repo but there is a homebrew/versions/lua53.) On debian it’s way easier, I just install everything from apt including luarocks and it Just Works.
Thing is the OpenResty install bundles up all the JIT stuff necessary for it to work and installs it along with the nginx core and all the resty.* scripts so you don’t really need to worry about that - at least in my experience, at least on debian. I just use the OR environment and if I need to run something on the CLI then I just invoke the JIT binary directly.
Also, since I got into doing all server-side stuff on Vagrant debian boxes, I tend to avoid doing anything that will eventually get deployed on Linux on Mac anyway, ‘cos although homebrew is a straight-up Gift From Zeus, there’s still too much inconsistency on stuff like this, and spinning up a VM is just too easy.
All that said, I’m sure there are ways to get Lua > 5.1 working and happy on OSX including luarocks-5.x - but given that I don’t need anything specific in 5.2/5.3, I’ve never needed to dig in to it.
I’ve also used Celery. Run one beat mode process and then a worker process for each queue/task type, all configured with supervisord. It’s kind of annoying and confusing to set up because you need RabbitMQ, and once or twice a year the EC2 instance RabbitMQ runs on gets hosed and your whole setup goes down, so then you need to look into RabbitMQ clustering.
The other alternative I’ve tried is CloudWatch Events on a cron schedule which send messages to a queue or SNS topic. SQS and SNS are simpler in terms of code but not as high throughput. It’s also a ton of work to automate the setup of SNS topics and SQS queue subscriptions, so you’d need something like Terraform to handle that for you.
Overall I’m not stoked on either solution. The next thing I’d want to try is CloudWatch + Kinesis.