Threads for mxp

  1. 3

    I will continue to work on Marmot, focusing on two major things:

    • File-based configuration as an option; I believe I’ve added enough CLI flags
    • S3 snapshots of the database. I started off with NATS as the file store, which has solved the problem for me, but I know for a fact that people will want to use S3 and other blob stores to store snapshots.

    Then I will continue to work on the documentation and a website for it. Turns out quite a few people actually respond to a website rather than a boring GitHub repo.

    1. 3

      Very interesting. I’ve been considering implementing Viewstamped Replication for Marmot; maybe I can use this project to inspire some ideas.

      1. 1

        Have you considered exploring Marmot instead of Litestream or LiteFS? You can actually build a CouchDB-like multi-master service on top of SQLite (I’ve done demos with PocketBase and KeystoneJS).

        1. 2

          Interesting - I hadn’t heard of this! I’ll definitely add it to my ever-growing list.

          1. 2

            Custom protocol to communicate with the DB. It’s in the same category as rqlite, but worse, because now you have to implement the protocol as well.

          1. 1

            For a second I wondered whether it was some version of Infinispan. Pretty sure Amazon will shut this down once too many people use it.

            1. 3

              When I saw Next.js and all the SSR shenanigans, it was pretty evident to me that we need very small SPAs. The moment a person navigates to a brand new URL, you are better off serving a fresh page than introducing complex routing on the client side. It instantly took me back to the PHP + jQuery days! I think React just got out of hand in this case.

              1. 4

                This seems both very clever and on its face a very inconvenient way of programming. Is there an example of this being used usefully?

                1. 1

                  I can definitely give you an example. Imagine a business scenario where you want to check whether a person is eligible for a car loan: they must have a monthly salary above 150K and a credit score above 700. You can easily add a rule to your code like person.salary > 150000 && person.credit_score > 700, but these kinds of rules usually change with evolving business needs. Having to recompile and deploy a whole system every time a business rule changes is a bad idea; that’s where people rely on rule engines. Similarly, if you want to grant a user access to a resource based on a particular role in the user.roles array, you don’t want to redeploy something as critical as an identity service every time. All of these rule engines and access control systems share one fundamentally important thing: validating a boolean expression before executing something. This is where I believe RuES will be helpful. It can really help to have an isolated, independently managed rules system; at larger orgs, security teams usually love a system where they can isolate such rules. Here is an example of such a system in OpenStack called Oslo.
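A minimal sketch of the idea in the comment above: the car-loan rule is kept as data rather than compiled into the program, so it can change without a redeploy. The rule format, the `evaluate` helper, and the field names are hypothetical and chosen only for illustration; this is not RuES's API and not JMESPath syntax.

```python
import json
import operator

# Comparison operators a rule may reference by name.
OPS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "==": operator.eq,
}

def evaluate(rule, facts):
    """Return True when every condition listed under "all" holds for the facts."""
    return all(
        OPS[cond["op"]](facts[cond["field"]], cond["value"])
        for cond in rule["all"]
    )

# Hypothetical rule document; in practice it would be loaded from a store
# that is managed separately from the application code.
CAR_LOAN_RULE = json.loads("""
{"all": [
  {"field": "salary", "op": ">", "value": 150000},
  {"field": "credit_score", "op": ">", "value": 700}
]}
""")

eligible = evaluate(CAR_LOAN_RULE, {"salary": 180000, "credit_score": 720})
rejected = evaluate(CAR_LOAN_RULE, {"salary": 180000, "credit_score": 650})
```

Raising the credit-score threshold here means editing the JSON document, not the code that calls `evaluate`.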

                  1. 1

                    Ok generically you’re talking about rule systems. But what makes this system convenient? Tbh one post request per rule seems pretty inconvenient; and why not embed inside a host process?

                    1. 1

                      I’ve answered the questions about hosting inside the process, and about the value, below in another comment. I am already working on a batch call; it’s not finished yet, but I plan to have it out ASAP in v0.3.0.

                      1. 1

                        Eh, with a batch call it seems like it would work for the use case you described. Seems like a case of de gustibus non disputandum est.

                        1. 2

                          As I promised the latest version has batch API now :)

                          1. 1

                            It’s a new year’s miracle 😃

                1. 2

                  Will continue to work on my hobby project RuES. I have to say I am getting used to Rust; it’s like having a code reviewer with me who stops me in place if I do something stupid. My next features are date parsing and formatting functions, and geo functions. I have a smaller list, but I want to make sure things are performant and working before I start extending it.

                  1. 2

                    @mxp can you explain a bit the benefits this offers over embedding one of the JMESPath libraries in your software directly? https://jmespath.org/libraries.html shows well tested libraries for most popular programming languages.

                    I hear about “rule engines” but I’m a bit stymied to understand when I want one, the concept is too abstract for me.

                    1. 1

                      You can definitely embed the logic and take on the responsibility of updating and managing rules, including deployments. I believe it’s the same question as why have memcached when you can embed a cache library. The obvious advantages are reusability, centralization of rules management, and, most importantly, something I believe the security folks on your team will like: sandboxing. A bug in an in-process library has a broader attack surface than an isolated sidecar or a server.

                      I am envisioning this more as a Redis of expression evaluation, where I want to provide a broad set of very commonly used functions and operators (for example, I have already added support for regex matching). JMES is the entry point; the value lies in the operations and the broad range of things rules can do. Just as you could have implemented your own version of Redis with all its data structures, having it out of the box makes it so much easier and more reusable.

                      1. 2

                        I get that, but this is so many orders of magnitude slower than linking in a JMESPath library or even running it as a separate local process. And these sorts of tests/queries can easily become bottlenecks in a larger operation. (For example, the “map function” in CouchDB, evaluated using an external JS process.)

                        1. 1

                          I would respectfully disagree; a map function for larger queries is a totally different case, especially the CouchDB part with external evaluation.

                          I can give you the counterexample of Redis and Lua, which has been out in production for years now, and I’ve seen people successfully deploy production-grade apps on it. I gave the rationale for why you might centralize it in a comment above, along with how I plan to add a batch call to avoid multiple round-trips. That, combined with a sidecar approach, should be able to give you sub-millisecond responses (like benchmarks).

                      2. 1

                        Some common places to use them could be things like:

                        • determining if a loan application has all appropriate bits (credit, history, job, etc.)
                        • determining applicability of a candidate for insurance

                        You can often think of them as glorified Boolean if thens. Right now I use them in a variety of situations:

                        • look at a particular protocol message and compare it to a bunch of rules to determine if someone is attempting something nefarious
                        • look at log messages to determine if they meet any of several hundred conditions and then alerting / routing / etc appropriately

                        Is that a bit more concrete? The “promise” you often hear is of allowing domain experts to define the rules and actions so they don’t need to be put directly into code.

                      1. 2

                        Have been working on my small open source log filtering application https://github.com/maxpert/drep (written in Rust to ensure no memory leaks and a low footprint), which I believe will be very useful for containers to dynamically enable or disable logs on the fly. I’ve personally struggled with the problem of enabling and disabling logs on the fly when they were being piped into a log aggregator. After looking around a little, I couldn’t find a UNIX-philosophy tool, so I decided to write my own.

                        1. 3

                          The best thing one can do to learn this is to build a simple P2P file transfer system. That is how I learned all of this. A lot of the time you take things for granted and don’t pay attention to details despite being told about them. Implementing P2P really helps you ask these questions.

                          1. 2

                            The current data points used for generating fingerprints are: user agent, screen print, color depth, current resolution, available resolution, device XDPI, device YDPI, plugin list, font list, local storage, session storage, timezone, language, system language, cookies, canvas print

                            Curious if a browser plugin that randomizes or obfuscates these exists.

                            1. 5

                              The tor browser (which is a set of firefox configurations + extensions) blocked this successfully for me.

                              1. 5

                                The Tor browser does the best possible thing: it gives everyone the same UA, resolution, etc. And more importantly, it picks the most common values that are observed on the web for those. Every Tor browser user looks like the most statistically average web user in the world.
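A rough sketch of why uniform values defeat fingerprinting, assuming a fingerprint is simply a hash over the collected data points. The `fingerprint` helper and all attribute values below are hypothetical; this is neither the site's actual algorithm nor the values the Tor browser really reports.

```python
import hashlib
import json

def fingerprint(attributes: dict) -> str:
    """Hash a canonical (key-sorted) serialization of the browser attributes."""
    canonical = json.dumps(attributes, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Two ordinary browsers reporting their real, differing values (made up here).
alice = {"user_agent": "UA-1", "resolution": "2560x1440", "timezone": "Europe/Berlin"}
bob = {"user_agent": "UA-2", "resolution": "1366x768", "timezone": "America/Chicago"}

# A browser that reports the same values for every user.
uniform = {"user_agent": "UA-common", "resolution": "1000x1000", "timezone": "UTC"}

# Differing attributes give differing fingerprints, so users can be told apart.
distinct = fingerprint(alice) != fingerprint(bob)

# Identical attributes give identical fingerprints, so users blend together.
blended = fingerprint(dict(uniform)) == fingerprint(dict(uniform))
```

Randomizing values instead would also change the hash, but each random combination can itself be rare; reporting the most common values keeps every user in the largest possible crowd.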

                              2. 5

                                Firefox has privacy.resistFingerprinting, which I’ve used reasonably successfully. Sometimes it breaks sites that display time (e.g. Gmail); other times it breaks in bigger ways, e.g. when writing to a <canvas> element. So it’s not uncommon for me to need to temporarily disable it on a one-off basis.

                                1. 3

                                  I’m running Firefox from the Debian repos with essentially all the privacy settings enabled as well as a bunch of extensions for fingerprint blocking, tracker blocking etc and it seems to have stopped this site from doing its tricks :)

                                  1. 1

                                    Brave has something builtin AFAIK

                                    1. 2

                                      I temporarily installed Brave just to test this, then removed it because I find other things about it worrisome. But it did successfully block this specific site from identifying me. Vanilla Firefox did not block it. Tor browser successfully blocked it. So did Vivaldi.

                                      1. 1

                                        What were the worrisome parts? Maybe I can evaluate them too.

                                        1. 5

                                          They have, in the past, decided it was OK to inject content into websites for their own financial gain. Here’s an example. This is related. Their administration of the “Brave Rewards” program (stripping ads from sites, showing their own stuff, and holding payments until the sites step up and ask for them) is also a little disturbing if less likely to be privacy-violating.

                                          In short, if I want an alternate Blink-based thing, I think Vivaldi is less likely to have a profit motive where they benefit from compromising my interests. And if I want something really privacy focused, I don’t think a Blink thing is likely the smart play anyway. So there’s no upside to make me want to keep Brave around given what they’ve shown me.

                                  1. 6

                                    One of these days, I promise, one of these days I plan to use Nim!

                                    1. 1

                                      I wonder if a basic md5 on the workspace ID would have been enough to use the built-in partitioning features of Postgres. 480 is indeed a weird choice; I’m pretty sure it has deeper roots that are not mentioned in the article. There is also no mention of whether the logical-to-physical map was stored somewhere or is just a mathematical mapping.

                                      Edit: Based on footnote 3, it seems like it’s just a mathematical function. I would love to see the shard rebalancing strategy here, and how it plays out in the future.
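A minimal sketch of what a purely mathematical mapping could look like, assuming md5 over the workspace ID as the hash and contiguous blocks of logical shards per physical host. Only the figure 480 comes from the comment above; the host count, the function names, and the block layout are assumptions for illustration and are not taken from the article.

```python
import hashlib

LOGICAL_SHARDS = 480  # figure mentioned in the comment above
PHYSICAL_HOSTS = 32   # assumption for illustration; must divide LOGICAL_SHARDS evenly

def logical_shard(workspace_id: str) -> int:
    """Map a workspace ID to a logical shard using md5 as a stable hash."""
    digest = hashlib.md5(workspace_id.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % LOGICAL_SHARDS

def physical_host(shard: int, hosts: int = PHYSICAL_HOSTS) -> int:
    """Arithmetic logical-to-physical mapping: contiguous shard blocks per host."""
    return shard // (LOGICAL_SHARDS // hosts)

shard = logical_shard("workspace-1234")  # hypothetical workspace ID
host = physical_host(shard)
```

One possible appeal of a number like 480 is that it divides evenly by many host counts (for example 32, 40, 48, or 60), so under this sketch a rebalance only changes the `hosts` argument while every workspace keeps its logical shard.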

                                      1. 9

                                        I converted from MySQL (before the whole MariaDB fork), and I’ve been happier with every new version. My biggest moment of joy was JSONB, and it keeps getting better. Can we please make connections lighter so that I don’t have to put something like PgBouncer in the middle? I would love to see that in future versions.

                                        1. 6

                                          Connections are lighter in Postgres 14!

                                          1. 5

                                            Do you have more information about it? I am interested too :)

                                            1. 4

                                              This writeup (not by me) might be of interest: https://pganalyze.com/blog/postgres-14-performance-monitoring

                                            2. 4

                                              I am all ears!

                                              1. 2

                                                One link is in a reply to the other comment in this tree, and some other details can be found via links from this blog post: https://www.depesz.com/2020/08/25/waiting-for-postgresql-14-improvements-for-handling-large-number-of-connections/

                                          1. 1

                                            <3 <3 <3 for the team! I just love it!

                                            1. 3

                                              Should I continue learning Django or is it better to move to the likes of Rocket and Actix? (0 experience in Rust, btw)

                                              And what’s the consensus on SSR vs REST + SPA?

                                              1. 6

                                                Play with rust and see if you like it first. No sense in discussing frameworks if you don’t even like working in the language.

                                                1. 4

                                                  Should I continue learning Django or is it better to move to the likes of Rocket and Actix? (0 experience in Rust , btw)

                                                  It depends on what you want to achieve. I can tell you there are billion-dollar companies out there running on Django and Python (e.g. Clubhouse). Rust itself has a steeper learning curve and requires almost a rewiring if you are a traditional programmer, but it gets a lot of things right and makes you think through a lot that you used to ignore. So pick what you want.

                                                  what’s the consensus on SSR vs REST + SPA

                                                  It’s hilarious that we keep doing these cycles between server heavy vs client heavy views. I’ve seen it 3 times in my life already. So learn it but don’t make it a religion.

                                                  1. 2

                                                    I’m late to the party, but if you’re familiar with Python it’s probably worth starting here. If you’re not very experienced with webapp development Django is pretty nice (at least last few times I’ve used it).

                                                    I haven’t tried any of the web frameworks for Rust, but I imagine that the learning curve is steep with a new language, a new framework which is in heavy development and isn’t as well known as say Django. That means searching for help when running into problems is a lot harder.

                                                    1. 1

                                                      And what’s the consensus on SSR vs REST + SPA?

                                                      I’m not sure there is a consensus, but in the React world there seems to be solid momentum around “why not both?”. Next.js (and Gatsby?) will let you mix SSR, static, and client-side/SPA fairly seamlessly. Making a page static or SSR is often just a matter of moving your query to a specially-named function. Combined with TypeScript it’s a pretty nice place to be. But, as with many things in JS-land, there can be rather a lot of complexity under the hood.

                                                    1. 6

                                                      I did some initial benchmarks and plan to do a blog post detailing my findings.

                                                      1. 2

                                                        I can’t wait to try Shenandoah and ZGC in production (one of my services does over 1.5K RPS with G1). I moved over to Kotlin a long time ago; seeing the same features land in Java, along with more JVM improvements, gets me excited and confident in the future of the JVM ecosystem.

                                                        1. 12

                                                          It really is an exciting time to be working in a JVM language. I too moved over to Kotlin a while back, but I still closely follow what’s going on in Java.

                                                          My hunch is that a lot of people who currently dismiss Java and the JVM as slow bulky dinosaur tech are going to be shocked when some of the major upcoming changes get released. Loom (virtual threads) in particular should drive a stake through the heart of async/await or reactive-style programming outside a small set of niche use cases, without sacrificing any of the scalability wins. Valhalla (user-defined types with the low overhead of primitive types) and Panama (lightweight interaction between Java and native code) will, I suspect, make JVM languages a competitive option for a lot of applications where people currently turn to Python with C extensions.

                                                          1. 2

                                                            My hunch is that a lot of people who currently dismiss Java and the JVM as slow bulky dinosaur tech are going to be shocked when some of the major upcoming changes get released

                                                            I agree with this re the JVM, but isn’t Java mostly releasing language-level changes that are just catch-up with things that have been commonplace elsewhere for years?

                                                            1. 4

                                                              That’s a fair point, sure.

                                                              Maybe a better way to frame it is that as language changes roll out, it’ll get harder to point to Java and say, “That’s such an obsolete, behind-the-times language. It doesn’t even have thing X like the other 9 of the top 10 languages have had for years.”

                                                              Of course, Java will never (and should never, IMO) be on the bleeding edge of language design; its designers have made a deliberate choice to let other languages prove, or fail to prove, the value of new language features. By design, you’ll pretty much always be able to point to existing precedent for anything new in Java, and it’ll never look as modern as brand-new languages. My point is more that I think the perception will shift from, “Java is obsolete and stagnant” to, “Java is conservative but modern.”

                                                            2. 1

                                                              My Android client app shares code with the backend (both are in Java).
                                                              Android’s Java is at about the JDK 8+ level (https://developer.android.com/studio/write/java8-support-table). The backend is currently using JDK 11.

                                                              So sharing the code between client and backend is becoming more challenging.

                                                              I think that if I move the backend to JDK 17, it will be harder to share code (if I take advantage of JDK 17 features on the backend).

                                                              I guess the solution is to move both backend and frontend to Kotlin… but that’s a lot of work without significant business value.

                                                              1. 3

                                                                Nothing prevents you from using JDK 17 with the Java 8 language level, essentially marking any use of new features as a compile error and making sure your compiler produces Java 1.8 compatible bytecode. That’s what we do in our library that needs to be JDK 8+ (but we use JDK 8 on the CI side to compile the final JARs to be on the safe side). Then you can run that code on JVM 17 on the server and take advantage of the JVM improvements (but not the new language features). We have decided to add Kotlin gradually where it makes sense instead of doing a full rewrite (e.g. when we’d want to use coroutines).

                                                                1. 2

                                                                  You could also just stick to 11. It’ll be supported for years.

                                                            1. 2

                                                              I’ve been working on a very unconventional idea of exposing an HTTP interface for Postgres https://github.com/maxpert/phanpy

                                                              There are options like PostgREST or GraphQL, but I’ve never been convinced that DSLs or new query languages are as robust and powerful as SQL itself (I could keep ranting about my experiences with all these DSLs in production, but I won’t do it here). It streams the rows, hence the memory overhead is extremely low (over 1K RPS under 50MB). While I am still working on adding resilience features like circuit breakers and stampede prevention, I am open to suggestions and feedback.
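A small sketch of the row-streaming idea mentioned above: each row is serialized and handed to the consumer as it is read, so memory use does not grow with the size of the result set. The standard-library `sqlite3` module stands in for Postgres here, and the `stream_rows` helper and the `users` table are hypothetical; this does not show how phanpy itself is implemented.

```python
import json
import sqlite3

def stream_rows(conn, sql, params=()):
    """Yield one JSON line per row instead of materializing the whole result set."""
    cursor = conn.execute(sql, params)
    columns = [col[0] for col in cursor.description]
    for row in cursor:  # rows are fetched lazily as the consumer iterates
        yield json.dumps(dict(zip(columns, row))) + "\n"

# In-memory database used only as a stand-in for a real Postgres connection.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("ada",), ("linus",)])

# An HTTP handler would write each line to the response as it is produced;
# collecting into a list here is only for demonstration.
lines = list(stream_rows(conn, "SELECT id, name FROM users ORDER BY id"))
```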

                                                              1. 1

                                                                What makes PostgREST exceptional is the fact that it propagates HTTP headers to the query environment (including auth proxy headers).

                                                                Is phanpy safe in this regard, or is it possible to craft such a query that would trick your views and stored procedures into thinking the caller is somebody else?

                                                                1. 1

                                                                  Not yet. I am still answering the fundamental questions. It is extremely important that I validate the fit and correctness of the solution first. These edge cases can land later.