I’m sorry, but the calculation of money lost while devs are waiting for tests is misleading. Until testing takes up a significant portion of a developer’s time, six minutes here and there mean nothing; furthermore, devs should be encouraged to take some time off the screen between tasks to clear their minds, for the sake of their mental health.
Also the time saved does not add up with each additional employee.
I want to take a break when I want to. If I’m in a flow state the slowness of tests can be really frustrating.
In that situation I’d probably go with something that combines the technical, theoretical, and practical aspects of computer science. There are probably better topics, but a short intro to computer vision (facial recognition, etc.) with a demo will not only be easy to remember but also combine a lot of sub-disciplines.
“Fly, you fools!”
Easy:
python -c "import antigravity"
Insert obligatory flying python XKCD here
I’ll have finished writing an ant-to-Gradle migration tool, yet I’m no closer to understanding Gradle; Groovy syntax is so weird.
Ooh, I might be interested in that. I have little doubt the horrifying abominations that my employer has built in ant will strenuously resist automatic conversion, but even the possibility of reducing the build tool count by one is intriguing.
Well, for one, you can interact with ant from within Gradle, so you can pitch that to your manager/employer.
My tool essentially translates the dependencies and classpath files for our specific project layout, but if you need help in the future I could help.
Work:
Automatically migrating dependencies from ant projects into Gradle files. This is proving to be not that simple. Does anyone have a better idea than parsing the Gradle file and adding dependency expressions into the AST?
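For what it’s worth, depending on how uniform the build files are, plain text surgery on the dependencies block can be enough — no AST required. A hedged Python sketch (the function and the coordinate format are made up for illustration, and it will misbehave on exotic layouts such as nested braces or commented-out blocks):

```python
import re

def add_dependencies(gradle_text, coordinates):
    """Insert implementation lines at the top of the first
    dependencies { ... } block of a build.gradle string."""
    lines = "\n".join(f"    implementation '{c}'" for c in coordinates)
    match = re.search(r"dependencies\s*\{[ \t]*\n?", gradle_text)
    if match is None:
        # No dependencies block yet: append a fresh one.
        return f"{gradle_text}\ndependencies {{\n{lines}\n}}\n"
    at = match.end()
    return gradle_text[:at] + lines + "\n" + gradle_text[at:]

print(add_dependencies("dependencies {\n}\n",
                       ["com.google.guava:guava:31.1-jre"]))
```

Crude, but for machine-generated files with a known layout it sidesteps Groovy parsing entirely.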
Personal: Road cycling season is more or less over until next March, just in time for me to start an off-season training plan.
Road cycling. I ride outside during the summer and ride on Zwift during the winter (starting now in October). I follow a training plan but it can be hard to stick to it when you have work and exams.
I’m still trying to find a personal project that combines cycling/Zwift and programming/tech.
This is what we do, per policy.
It depends on what you mean by “scale”…
The easiest way to scale, provided you have $$’s, is to just throw more hardware at the problem (RAM, CPU, etc.). You can scale x86 hardware up quite far, and can probably handle most workloads just fine while still sitting on a single machine.
If you can’t scale on hardware, for whatever reason, then you have to get more creative and work a little harder. The next easiest is to figure out what the pain points are for scaling app X, and work on the code/configuration to lower the resources required to do the same amount of work. For custom apps, this could be something simple like changing how you store X in memory, or it could be porting that work to a more performant language (like, say, a Cython module if the code was originally in Python), etc. If it’s a Java app, it might just be re-configuring the JVM a little bit.
Next is scaling by increasing the number of running copies of said application. This is where it can get… hard, and is definitely more unique to each application. For instance, scaling nginx across multiple copies is really easy, since it’s pretty much stateless and has no DB or anything you have to deal with. Scaling Postgres across multiple copies gets a lot harder, since it’s a DB and is very stateful.
For us, most web stuff is mostly stateless, so we scale by just starting more instances of the application. For stuff like Postgres, we scale hardware rather than doing something like Citus or offloading reads to a hot standby or whatever. That stuff gets complicated quickly.
Generally the easy answer is to just throw more physical resources at the problem; that almost always works to make things faster. You do have to know which resource to increase, though (CPU speed, memory, I/O bandwidth, storage, etc.). Luckily, every OS out there gives you the tools you need to figure that out pretty easily.
Excellent comment. I agree with all of it. I’ll add that one can scale the database by starting with, or migrating to, a database that scales horizontally. There are quite a few of them. This is true for some other stateful services too. One might also use load balancers and/or sharding to keep too much traffic from landing on one machine. That gets a bit more complex, but there are still tools and specialists that can help.
I agree there are stateful databases that scale multi-node better out of the box than Postgres (PG) does. I specifically picked PG as the example here because it doesn’t scale multi-instance/multi-machine out of the box very well.
Once you get to multi-machine stateful applications, there are a lot of different trade-offs to handle. I’m not sure there is any one out-of-the-box multi-machine stateful DB that will work for basically every workload the way Postgres does. That is, PG is useful for basically any DB workload out of the box, provided it can fit on a single physical machine. I’d love examples of general-purpose DBs like PG that are multi-node out of the box with basically no downsides.
But basically my advice is: once you have a stateful thing you have to go multi-node with, you need either a good, well-paid consultant or good in-house technical staff, as it’s not an easy problem, and it is not well solved for all use cases. Or, to put it another way: avoid multi-node stateful things until absolutely forced to do so, and then go there with eyes wide open and lots of technical knowledge at hand. Luckily, if you do get to that requirement, you probably have lots of $$$ to shove at the problem, which helps immensely.
Well said again. Yeah, I don’t know whether any of those advanced DBs cover every kind of workload, or with what drawbacks. I’d want to see experimental comparisons on realistic data.
You sound like you know a lot about this topic. Hypothetically, if it’s even possible, what would you do if the load balancer that you put in front of your workers can’t handle the incoming load? How do you load-balance the load balancer?
It’s definitely possible. You have some options, depending on the kind and amounts of traffic we are talking about.
It depends somewhat on whether the load balancer (LB) is hardware- or software-based, etc. Most people are doing software-based ones these days.
Roughly in order of preference, though the order is debatable:
Ensure you are using a high-throughput LB (haproxy comes to mind as a good software-based one).
Simplify the load balancer configs to the bare minimum, i.e. get them doing the least amount of work possible. The less work you have to do per request, the more you can do with X resources.
Find the bottleneck(s); for most LB workloads the problem is network I/O, not CPU, memory, or disk. So ensure your hardware is built for peak I/O (make sure your NICs are configured for IP offloading, your kernel is tuned for I/O, etc.).
Scale out the LB with multiple instances. This gets… interesting, as suddenly you need your traffic to hit more than one machine. You can do that in a variety of ways, depending on the actual traffic in question. The easiest is probably just lazy DNS RR (i.e. have your name return multiple A/AAAA records for the host you are load balancing, with each IP being an LB).
Rinse and repeat the above, until you have room to breathe again.
There are more complicated ways to do this, depending on traffic load. Once you need to scale past simple DNS RR, you probably want a network expert, as the approach depends on the kind of traffic (is it all inbound, or mostly outbound, etc.). For example, there are some neat things you can do where the LBs handle only the inbound traffic, and all outbound/return traffic comes directly from the machine(s) doing the work without going back through the LB. This requires special configuration and is not overly simple to do.
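As a toy illustration of the lazy DNS RR idea (the records below are invented, from the documentation range): once a name returns several A records, even dumb client-side rotation spreads load across the LBs.

```python
import itertools
import random

def make_picker(addresses, shuffle=True):
    """Cycle through the A/AAAA records returned for a load-balanced
    name. Shuffling first keeps many clients from all hammering the
    first record in the DNS answer."""
    addrs = list(addresses)
    if shuffle:
        random.shuffle(addrs)
    return itertools.cycle(addrs)

# In real code these would come from e.g.
# socket.getaddrinfo("lb.example.com", 443)
records = ["192.0.2.10", "192.0.2.11", "192.0.2.12"]
picker = make_picker(records, shuffle=False)
print([next(picker) for _ in range(4)])
# -> ['192.0.2.10', '192.0.2.11', '192.0.2.12', '192.0.2.10']
```

Real resolvers and OSes already rotate answers to some degree; the point is just that no coordination between the LBs is needed.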
But it all basically comes down to the general case above: load balancers are generally stateless, so the usual “just run a bunch of copies” plan tends to work fine.
If you need your LBs to do a lot of CPU-based work, then you can use a layered fan-out model, where you have several layers of LBs, each doing a small piece of the total work, provided you can fit it all within your RTT budget.
Also, if you get extreme, you can do user-space or unikernel designs where the entire software stack is taken over to do only LB duties, and you can really tune that stack for your specific workload.
More than this, and into specific workloads for you, I’d be happy to consult, but it won’t be free ;)
This is how you do it[1]; no cloud required. From what I understand, the cloud is essentially composed of what’s in the linked article along with sophisticated configuration. People’s apps run on this kind of setup in the cloud if you go deep enough, but it’s handled by people who worry about the plumbing for you. I really must say that the article I’ve linked is excellent and comes with source code. You should check out his other projects as well.
[1]https://vincent.bernat.ch/en/blog/2018-multi-tier-loadbalancer
The related question to this is, of course: what do you do if the load balancer dies?
As much as I’d love to see people actually pay for tools, I don’t quite get why they’re trying this with the JDK.
Because people will fall into the trap, and they will get a call from Oracle’s compliance department, and Oracle will make money, and that will buy enough fuel to power the Larry Ellison for another day or so.
Wanted to mention the following article on the same note:
https://palisadecompliance.com/oracle-org-chart/
Granted, the article is opinionated and a bit dated (2013), but it shows what Oracle is capable of.
I don’t know if it’s deliberate or by accident, but I like the idea of fueling a Larry Ellison.
But it will be a drag to ensure proper compliance with these new rules…
“…You need to think of Larry Ellison the way you think of a lawnmower. You don’t anthropomorphize your lawnmower, the lawnmower just mows the lawn, you stick your hand in there and it’ll chop it off, the end. You don’t think ‘oh, the lawnmower hates me’ – lawnmower doesn’t give a shit about you, lawnmower can’t hate you. Don’t anthropomorphize the lawnmower. Don’t fall into that trap about Oracle.” – Bryan Cantrill
https://www.youtube.com/watch?v=-zRN7XLCRhc
What about “soft features” like ecosystem, compatibility, maintainability, testability…? Those are very important for use cases that require the program to work reliably, and they should definitely be considered when choosing a language for a project.
Edit: especially for a project that aims to be “highly reliable” on its homepage.
Popularity and eco-system seem highly correlated.
Not quite sure what you mean by compatibility, but if you mean operating systems, it is quite correlated with popularity.
I didn’t quite rank maintainability; I think it’s possible to make a mess in pretty much any language, but it is a good question whether some languages produce more maintainable code than others. Rust advertising fearless concurrency is essentially a maintainability issue.
For testability I have some plans that I might write about later, but it is good to not choose something that paints you into a corner for sure.
Reliability to me means thorough testing of error conditions and minimizing the code in the fault kernel of a program. For this tool in particular, that means the write path the data takes should be small and tested (across all error paths). The error model a language chooses affects this quite a lot: http://joeduffyblog.com/2016/02/07/the-error-model/
If I am totally honest, many languages don’t meet the quality standard (yet) either, which I sort of counted under ‘stability’ even if I didn’t mention it explicitly.
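To make “tested across all error paths” concrete, here is a hedged Python sketch of fault injection on a small write path. Everything here (the atomic_write helper, the FlakyOS shim) is invented for illustration, not taken from the tool in question:

```python
import os
import tempfile

def atomic_write(path, data, os_mod=os):
    """Write data via temp file + rename, so a failure at any step
    leaves the old file intact. os_mod is injectable so tests can
    substitute a shim whose calls fail on demand."""
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    try:
        os_mod.write(fd, data)
        os_mod.fsync(fd)
        os_mod.close(fd)
        os_mod.replace(tmp, path)  # atomic rename on POSIX
    except Exception:
        try:
            os_mod.close(fd)  # may already be closed; ignore
        except OSError:
            pass
        os.unlink(tmp)
        raise

class FlakyOS:
    """Proxy for the os module that raises on the Nth call made
    through it -- one loop over N covers every error path."""
    def __init__(self, fail_on):
        self.fail_on = fail_on
        self.calls = 0
    def __getattr__(self, name):
        real = getattr(os, name)
        def wrapper(*args, **kwargs):
            self.calls += 1
            if self.calls == self.fail_on:
                raise OSError(f"injected failure in {name}")
            return real(*args, **kwargs)
        return wrapper
```

A test then loops `fail_on` from 1 to 4 (write, fsync, close, replace) and asserts the original file contents survive each injected failure — exactly the kind of exhaustiveness a small fault kernel makes feasible.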
“I didn’t quite rank maintainability, I think it’s possible to make a mess of pretty much any language…”
I think this is in large part because languages that focus on correctness often do so at the expense of difficult refactoring and daunting cognitive overhead. This manifests as a maintainability issue in actual projects, but the self-selecting nature of the maintainers means the issue is unconsciously swept under the rug.
At the same time, few languages offer features specifically geared toward keeping code refactorable—let alone aim to be both correct and manageable (both on the screen and in the mind). One rare exception, an example of the latter, is Jon Blow’s Jai prototype.
Planned to install a crank-based power meter on my road bike, but destroyed (or rather, will destroy) a set of chainrings after using the wrong chainring bolts (in hindsight, I over-tightened them and stripped them when trying to remove them).
So now I’ve had to order another set of chainrings and another set of chainring bolts, which will only arrive on Monday.
Other than that I’m relaxing; I finished a project on Friday, and a new project/assignment from my boss is coming Monday.
Also a question: why is cycling so popular in this community (and on HN)?
It’s not that nobody knows how this stuff works when they use a web framework; it’s just that nobody wants to invest the time to make the same mistakes as others when it comes to edge cases and the like.
I don’t know who this article is aimed at, but for someone who knows how this stuff works it’d be a PITA to rewrite everything from scratch for no good reason. I do have to admit, though, that knowing what’s going on underneath is useful.
And some of the things he calls out as “oh-no! a framework!” are more library than framework. What stood out to me was Flask, which is a pretty thin abstraction over the http request-response cycle.
I’ll try to train a regression model on my cycling performance data (HR, cadence, power, speed) to see if I can accurately predict my power output without a power meter.
Early tests have shown that the power meter in my smart trainer is too inaccurate (~10%) and that my indoor training sessions are not a good source of training data; I’ll see if real-world data from a friend improves the results.
Has anyone done anything resembling this?
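The curve-fitting core of the idea is small enough to sketch. A hedged toy in Python — the data below is synthetic and the power≈linear-in-HR relationship is invented, so treat it as the shape of the approach, not a claim about real rides:

```python
import random

def fit_linear(xs, ys):
    """Ordinary least squares for y ~ a*x + b, closed form."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return a, my - a * mx

# Fake "rides": pretend power is roughly linear in heart rate, plus noise.
random.seed(42)
hr = [random.uniform(110, 180) for _ in range(200)]
power = [2.5 * h - 120 + random.gauss(0, 10) for h in hr]

a, b = fit_linear(hr, power)
print(f"power = {a:.2f} * HR + {b:.1f} W (estimated)")
```

A serious attempt would regress on HR, cadence, and speed together and handle the lag between effort and heart rate; this only shows the univariate core.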
I was supposed to go on a two-day road cycling trip, but then not only did my wheelset have some problems, the organizer also had to cancel the ride because too few people signed up.
Bummer; I enjoyed this ride very much last time.
Now I’ve got all this time to spend on finishing a paper and doing what I can only assume will be a very fun assignment in Prolog.
What fascinates me about complexity theory is how easily we can go from “yeah, sure, no problem” to “the universe will be dead by then”.
Shortest path from A to B: “yeah, sure, no problem”. Optimal round trip through a set of cities: “better make some tea, this is going to take a while” (if you discard the “good enough” solutions).
The same goes for computability theory; learning that “there is no way on earth you will get a result in finite time for all possible inputs” was such a mind-opening experience. I wish I understood more of it.
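The jump is easy to demo. A small Python sketch — the distances are made up — contrasting how harmless small cases of the round-trip problem look with the (n−1)! blow-up of brute force:

```python
import itertools
import math

def tour_length(order, dist):
    """Total length of a round trip visiting cities in `order`."""
    legs = zip(order, order[1:] + order[:1])
    return sum(dist[a][b] for a, b in legs)

def brute_force_tsp(dist):
    """Try every round trip starting at city 0 -- (n-1)! candidates."""
    n = len(dist)
    best = min(itertools.permutations(range(1, n)),
               key=lambda rest: tour_length((0,) + rest, dist))
    return (0,) + best

# 4 cities arranged in a square: the best tour is the perimeter, length 4.
dist = [[0, 1, 2, 1],
        [1, 0, 1, 2],
        [2, 1, 0, 1],
        [1, 2, 1, 0]]
tour = brute_force_tsp(dist)
print(tour, tour_length(tour, dist))

# The "make some tea" part: candidate tours for n cities.
for n in (5, 10, 20, 60):
    print(n, math.factorial(n - 1))
```

Four cities mean 6 candidates; 60 cities mean roughly 10^80 — about the number of atoms in the observable universe, which is where “the universe will be dead by then” comes from.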
Something I like to use to set up a VPN in a hurry is the OpenVPN install script by Nyr: https://github.com/Nyr/openvpn-install It takes very little time to set up and is easy to use, but I can totally see why someone would want to avoid that script (you’re blindly trusting some shell script, curl | sh style) in favour of a solution like the one presented in this post.
I use that script too - spin up a lowendspirit box, run the script and you have a cheap VPN set up in a few minutes.
Please don’t make your first submission to our community a feedback test for your product (judging by your wording). We are not your marketing channel.
Ok, I understand. I will try to post higher quality content in the future. I was just trying to find out if this is something people would use. Source is available at https://github.com/aiosin/gitraqr