1. 1

    I remember being hugely excited when instant apps were first introduced by Google. Unfortunately, none of that hype survived long enough for users or app owners to adopt them. Do you know why that would be? Is it the friction of maintaining a separate version of the app that fits within the limitations? Or are app owners simply happy with customers using the mobile website, given that most apps are practically webviews?

    1. 1

      I think one of the biggest problems is that the time put in by interviewers is significantly lower than the time put in by interviewees. If it matched, we would see a lot more multi-layered, good problem solving from both sides, which would help assess candidates better and remove biases.

      Usually things work out as is, and that’s why nobody bats an eye, but I would hope people and companies strive to do better.

      1. 3

        I was talking about this post with a co-worker today who has also been working on polygon tracing for farms. He pointed out that the solution in this post, while good at minimizing the fence/perimeter, will likely have issues when a user tries to draw something that really does look like the “before” images in this post. If they have a lake or something else that cuts a jagged edge into the perimeter, this software will mess it up. While the example at the top is a bit annoying, it’s not too hard to learn the right way to do things and avoid the problem from then on. But with this solution, the software is modifying your data to fit its own model of what looks right, and you can’t change your behaviour, because the software is always trying to optimise for the smallest fence line, which isn’t always the reality.

        1. 3

          From our current experience, we might be able to handle this case too.

          Take the error cases we have recorded in the blog post: there are cases where it does fail because the desired boundary isn’t the smallest path. Even then, it is fairly easy and intuitive for a user to add more points so that the path goes through what is required.

          At the same time, it is not always going to work, so yes, there are those few cases where it will still fail, and we are fine with making the user do more work there (delete everything, start again) while optimizing for the majority of cases.

          1. 2

            Thanks for the response. How well does it behave while editing? I can imagine such a system might radically change the polygon during editing, when it finds shorter paths after a new node is added. Also, is the performance decent for real-time editing, with things like dragging a node around the map?

            1. 2

              Sometimes it does go bad, but most of the time it is more intuitive than any other user interface we tried. Check these 2 links, which show that it will work after a few manageable tries:

              But again, in terms of possibilities, there is always going to be a case which is not going to work at all (sadly). And it does change radically and unexpectedly at times. One clever trick here is to stick the initialization to the user-marked points, because most likely they marked them in close to the correct order.

              Regarding performance, we are using the 2-opt algorithm, so the number of steps is limited by N^2 * the number of iterations.

              We can set the number of iterations based on expectations of N. For example, we work with 10 iterations and it seems to converge well most of the time. Assuming 20 points and ~1 ms per step, that worst case would be about 4 seconds (20^2 * 10 = 4,000 steps). In reality, I have never seen it become unresponsive even for 50 points (which means each step takes far less than 1 ms).
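
              Roughly, the loop looks like this (a Python sketch just to illustrate the idea; the real solver is JavaScript, and this is not the actual implementation). It starts from the user-marked order and keeps reversing segments while that shortens the closed tour:

                import math

                def two_opt(points, max_iterations=10):
                    """Rough 2-opt sketch: points is a list of (x, y) tuples in the
                    order the user marked them; returns an index order for the tour."""
                    n = len(points)
                    # Initialize with the user-marked order -- it is usually already
                    # close to correct, which is what makes convergence quick.
                    order = list(range(n))

                    def dist(a, b):
                        (x1, y1), (x2, y2) = points[a], points[b]
                        return math.hypot(x2 - x1, y2 - y1)

                    for _ in range(max_iterations):          # capped at ~10 in practice
                        improved = False
                        for i in range(1, n - 1):            # O(N^2) pairs per iteration
                            for j in range(i + 1, n):
                                a, b = order[i - 1], order[i]
                                c, d = order[j], order[(j + 1) % n]
                                # Reversing order[i..j] swaps edges (a,b)+(c,d) for (a,c)+(b,d);
                                # keep the move only if it shortens the fence.
                                if dist(a, c) + dist(b, d) < dist(a, b) + dist(c, d):
                                    order[i:j + 1] = reversed(order[i:j + 1])
                                    improved = True
                        if not improved:
                            break                            # converged before the cap
                    return order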

          2. 1

            I think they have a manual override, although presented as tongue in cheek:

            Technical support: Just kidding, there is a manual override. Just mark them in order.

          1. 1

            Really fun article! One question: what was the reasoning behind implementing a solver yourself instead of using an off the shelf solution? Was it because of the “it must run on an old smartphone” requirement?

            1. 2

              Honestly, I wanted to use a library but couldn’t find any JavaScript solver which uses any of the old, boring (read: fast) algorithms such as 2-opt or 3-opt, or the LKH solutions I wanted.

              You will find a few JS implementations based on genetic algorithms, but even though they are cooler, I don’t like the idea of using non-deterministic algorithms.

              Again, 2-opt is so easy that it is documented on Wikipedia :) So I had no inhibitions about implementing it myself. The best available heuristic (LKH) is a lot more complicated, which is why I haven’t implemented it (yet).

              1. 1

                Makes perfect sense. Thank you for explaining!

            1. 2

              So, there’s a middle ground between “let your ORM do N+1 queries but use caching to mitigate it” and “don’t use an ORM”. The N+1 problem is due to lazy loading, which is often a good default for an ORM. But a full-featured ORM will let you eager load, either at the mapping level or on a case-by-case basis. The Django ORM is pretty simple (which is fine! Sometimes that is what you want!) and doesn’t really let you do this. @mordae posts how to turn on eager loading temporarily in SQLAlchemy, which is probably your best option in Python. But lots of other ORMs like Hibernate/NHibernate also let you do this. So I’m going to strongly disagree that “throw away ORMs” is the “correct” way to fix it. “Use a more capable ORM if you need to” is probably the correct answer.
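
              For anyone curious, case-by-case eager loading in SQLAlchemy looks roughly like this (a sketch with made-up Farm/Crop models, not anyone’s actual schema):

                from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
                from sqlalchemy.orm import Session, declarative_base, joinedload, relationship

                Base = declarative_base()

                class Crop(Base):
                    __tablename__ = "crop"
                    id = Column(Integer, primary_key=True)
                    name = Column(String)

                class Farm(Base):
                    __tablename__ = "farm"
                    id = Column(Integer, primary_key=True)
                    area = Column(Integer)
                    crop_id = Column(Integer, ForeignKey("crop.id"))
                    crop = relationship(Crop)        # lazy by default: the N+1 trap

                engine = create_engine("sqlite://")
                Base.metadata.create_all(engine)

                with Session(engine) as session:
                    # Lazy loading: one query for the farms, plus one per farm.crop access.
                    for farm in session.query(Farm).all():
                        _ = farm.crop

                    # Case-by-case eager loading: related crops are fetched up front.
                    for farm in session.query(Farm).options(joinedload(Farm.crop)).all():
                        _ = farm.crop                # no extra query here

              (Mapping-level eager loading would instead be relationship(Crop, lazy="joined") in the model definition.)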

              1. 1

                Makes sense. I have personally not worked with other ORMs, so I can’t speak about them here. But I agree a more powerful ORM should be able to solve these problems, and possibly even handle the partial-queries example I added in response to @jrwren.

              1. 1

                At the same time, do you think it would make sense for a company to develop this commercially?

                Technology giants are so valuable because of their ability to correlate users’ data and create insights about users, which helps them generate business based on similarities between users. (Again, it would be much better for users to get a model which is locally optimized, which is something Google has been adding to our phones’ keyboards.)

                1. 1

                  IMHO, this is not only about privacy but also about offloading the computation to a distributed network of devices. As you say, Google has tested it on Google Keyboard (https://ai.googleblog.com/2017/04/federated-learning-collaborative.html). And as far as I know, some other companies, such as Mozilla, are investigating this technique.

                1. 6

                  Or maybe configure loading properly, including failing on unexpected loads early.
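
                  In SQLAlchemy, for example, “fail early” can look like this (a small sketch, assuming a Farm model with a crop relationship and an open session):

                    from sqlalchemy.orm import raiseload

                    # Relationships not explicitly eager-loaded now raise instead of
                    # silently firing extra queries, so an accidental N+1 fails fast in tests.
                    farms = session.query(Farm).options(raiseload("*")).all()
                    farms[0].crop   # raises InvalidRequestError rather than querying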

                  1. 2

                    Sounds like a good solution if you are looking to optimize your queries. Honestly, we aren’t.

                    A simple example is a REST Framework model API. We want to use this API to fetch different data fields every time. There is no smartness in the ORM to decide whether to prefetch or not. Unless I take the time to optimize each of the cases manually (or mindlessly always do JOINs I don’t need), I can’t make it return the data in an efficient way.

                    1. 6

                      why aren’t you looking to optimize your queries?

                      1. 1

                        I will try to explain the above example in detail.

                        We write an API which simply returns data from our model directly.

                          from django.db import models

                          class Crop(models.Model):
                              name = models.CharField(max_length=100)

                          class Farm(models.Model):
                              crop = models.ForeignKey(Crop, on_delete=models.CASCADE)
                              area = models.FloatField()

                        Now, we want to access a Farm using an API: /farm

                        But we built this API with partial-fields support so that we can request only the data we need:

                        /farm?fields=crop  # Needs to join with Crop table
                        /farm?fields=area  # Does not need to join with Crop table
                        

                        The above is a simple example, but things can get a lot more complex when we access 3 layers of relationships. In most of our cases, we are not seeing a lot of latency with caching added (see the results in my tests; we went from 7s to 3s for a very, very large query), so we aren’t worrying about optimizing it.

                        Of course, we should split this into multiple APIs which do different things, but given that the current approach lets us move faster, we have stuck with it.
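
                        For context, the ?fields= behaviour isn’t something DRF gives you out of the box; it comes from a serializer along these lines (a sketch of the common “dynamic fields” pattern from the DRF docs, with simplified names, not our exact code):

                          from rest_framework import serializers

                          class DynamicFieldsModelSerializer(serializers.ModelSerializer):
                              """Keep only the fields named in the ?fields=... query parameter."""

                              def __init__(self, *args, **kwargs):
                                  super().__init__(*args, **kwargs)
                                  request = self.context.get("request")
                                  wanted = request.query_params.get("fields") if request else None
                                  if wanted:
                                      for name in set(self.fields) - set(wanted.split(",")):
                                          self.fields.pop(name)

                          class FarmSerializer(DynamicFieldsModelSerializer):
                              crop = serializers.StringRelatedField()  # serializing this touches Crop

                              class Meta:
                                  model = Farm                         # the Farm model shown above
                                  fields = ["crop", "area"]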

                  1. 4

                    Adding plug and play Caching is a no brainer, you can get significant optimizations by doing nothing, even with badly written queries.

                    I don’t think it’s that simple. There’s a reason cache invalidation comes up every single time someone makes a “two hard problems in computer science” joke.

                    A better framing would be: “If you introduce caching, be very careful. It’s not a no-brainer.”

                    1. 2

                      Completely agreed that cache invalidation is hard.

                      In the above blog post, I am working with a caching library which takes care of invalidation. Note that the Cachalot library is highly inefficient: it drops all cached querysets related to a model whenever a single object changes. If that’s a red flag for you, then you shouldn’t use it. It works very well for our use case.
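
                      For reference, the plug-and-play part is roughly this much configuration (a sketch, assuming django-cachalot with a Redis cache backend; adapt to your own setup):

                        # settings.py
                        INSTALLED_APPS = [
                            # ... existing apps ...
                            "cachalot",          # caches ORM querysets transparently
                        ]

                        CACHES = {
                            "default": {
                                "BACKEND": "django_redis.cache.RedisCache",   # any shared cache backend works
                                "LOCATION": "redis://127.0.0.1:6379/0",
                            }
                        }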