1. 2

    It depends on the size of what you do… Sure, full stack at Netflix sounds impossible, and everyone is a specialist in the service they manage… but probably some are full stack at the level of their service.

    1. 4

      NER, a classifier and a recommender system. Planning data collection, and I have to implement push notifications in our CMS… Is the week only 5 days?

      1. 4

        I think the comparison is flawed. Static Sites are just a frontend, and you could generate a static site from WordPress if you wanted to…

        A CMS, on the other hand, is a content management system: basically a backend for people to manage content.

        Many CMSs provide a frontend, which is not static, but there are other types, called headless or API-first CMSs, that make a wonderful backend for static websites.

        Headless CMSs work great on the web, as a backend for apps, PWAs, web apps, etc., and integrate with everything: chatbots, TV, whatever.
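
        For illustration, here is a minimal sketch of the API-first idea: a build script pulls content as JSON and renders static pages from it. The endpoint URL and JSON fields below are made up and not tied to any particular CMS.

        import json
        import urllib.request

        # Hypothetical headless-CMS endpoint; the URL and JSON fields are invented.
        CONTENT_API = "https://cms.example.com/api/articles"

        def fetch_articles():
            # The CMS only serves content as JSON; it knows nothing about presentation.
            with urllib.request.urlopen(CONTENT_API) as resp:
                return json.load(resp)

        def render_static_page(article):
            # The same JSON could just as well feed an app, a PWA or a chatbot.
            return "<article><h1>{title}</h1><p>{body}</p></article>".format(**article)

        if __name__ == "__main__":
            for article in fetch_articles():
                print(render_static_page(article))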

        1. 1

          CMSs that use dynamic code in the frontend tend to rely on that functionality. Maybe not the core product, but many of the needed plugins do.

          Static Sites are just a frontend, and you could generate a static site from WordPress if you wanted to…

          You don’t want to do that, believe me! I had to implement exactly this a few months ago. The pain of manually figuring out which files the export plugin missed, so that I could add them by hand, is big. Trying to teach people that most plugins won’t work, because code will not be executed for each request, seems to be in vain. They happily install whatever might solve their problem and then complain that stuff isn’t working as expected on the static version, forcing me to try to implement that functionality at the webserver level, should that even be possible.

          1. 1

            I would of course not use WordPress as the backend for my static site… But there are many excellent headless CMSs that you can use for that: Strapi, Cloud CMS, Contentful, Kentico Cloud, etc.

            I personally use Cloud CMS.

            I agree that if your users are used to WordPress, and they have the freedom to add plugins etc., that is not a viable solution. What I meant with my comment is that a static site is your frontend; a CMS is where you maintain content, or where you are supposed to maintain content.

            But CMSs have grown to become monsters that supposedly do everything… like shops, etc.

            What I advocate in the company I work for is “content as a micro service” where your CMS is used only to create and manage content and not to manage the design, the shop, etc.

            This is a little outdated and does not reflect exactly where I am with this thinking, but it is nevertheless the underlying idea: https://jboye.com/2017/06/27/the-making-of-ers-2-0/

            1. 1

              Thanks that was an interesting read!

              1. 1

                Thank you!

        1. 2

          Starting a text classifier and named entity recognition

          1. 2

            What tech stack?

            I’ve used StanfordNLP’s NER on a project previously (we literally just needed NER and some date recognition, no sentiment/etc.) and while we got it to work, the amount of work required to get it to a usable stage felt like overkill - it didn’t help that I had to delve back into Java to get a usable HTTP interface for it.

            1. 2

              If you’re looking for something better and non-Java (with a more permissive license), I recommend checking out spaCy - https://spacy.io

              The API is a pleasure to work with, and lots of really good NER comes with the pretrained models.
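
              For anyone curious, here is a minimal sketch of NER with a pretrained spaCy model (it assumes the en_core_web_sm model has been downloaded):

              import spacy

              # Assumes: pip install spacy && python -m spacy download en_core_web_sm
              nlp = spacy.load("en_core_web_sm")

              doc = nlp("Apple is looking at buying a U.K. startup for $1 billion.")
              for ent in doc.ents:
                  # Each entity carries its text span and a label such as ORG, GPE or MONEY.
                  print(ent.text, ent.label_)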

              1. 1

                It was more of a general curiosity than a current requirement, but thanks for the reference.

              2. 1

                I am doing NER with spaCy and classification with TensorFlow. I am also experimenting with prodi.gy, a tool developed by the same people as spaCy that offers an easy interface to work with. For now I still have some issues with my own word vectors (4M words): I get some buffer overflows that I do not yet understand.
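
                For illustration, a minimal sketch of attaching custom word vectors to a spaCy vocab (spaCy v2-era API; the word and values below are made up, not the actual pipeline described above):

                import numpy
                import spacy

                # Start from a blank English pipeline and attach a made-up 300-dimensional vector.
                nlp = spacy.blank("en")
                vector = numpy.random.uniform(-1, 1, (300,)).astype("float32")
                nlp.vocab.set_vector("example", vector)

                # The vector is now available on tokens of that word.
                print(nlp("example")[0].vector[:5])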

            1. 1

              I find the idea nice, but I do not understand what it does exactly. OK, the projector is able to recognise a piece of paper and project the output of the code on top of it. But do you really need to print the code on the paper? That does not seem right.

              What else can you do with it?

              1. 2

                What else can you do with it?

                Show it off to the internet to show how clever you are?

                This is one of those cases of cleverness for cleverness’ sake.

                1. 2

                  Art for art’s sake. OK, why not.

              1. 2

                It’s obvious to anyone running a business that GDPR is a massive pain in the ass, and a huge threat. 20M euros in fines will destroy any medium-sized company too.

                Oh, but if a company is fined under GDPR, surely that means it deserved to die, right? Good riddance! … to any valuable products or services it provided, and good riddance to all the jobs it had created too!

                The GDPR has been successfully sold to the masses, as something that will supposedly prevent sleazy ad companies from invading your privacy. But do you really think Google will be invading it any less than before?

                What about governments then? Do you think intelligence agencies will spy on you less?

                This is the main reason why GDPR is such a fucking farce. They tell you they’re protecting your privacy, while invading it as much as they possibly can.

                1. 4

                  Agree it’s a huge hassle.

                  On the other hand, people really suffered from not being able to get new bridges when engineering requirements were brought in, but 50 years later we no longer had lethal collapses on a regular basis.

                  Being able to make Google wipe out everything they know about me is pretty cool. I’ve nearly finished getting their hooks out of my stuff.

                  1. 2

                    What about governments then? Do you think intelligence agencies will spy on you less?

                    Nope, since GDPR is primarily about working with commercial entities rather than clandestine government agencies.

                    But do you really think Google will be invading it any less than before?

                    I expect them to comply with the law.

                    I also expect companies will pop up with low-cost solutions to deal with user data, similar to how PCI regulation created an industry of payment providers that handle that aspect of the transaction. Cloud providers can offer encrypted user-data stores and architectures for them. And designing a new system for GDPR is not super challenging; the important parts of the law tend to be pretty straightforward.

                    As someone who was involved in implementing GDPR at a company, I believe the law is a good first iteration. I’m sure we’ll find that some things in it are irrelevant and some things in it are harmful, but I believe in pushing for privacy.

                    Do you have an alternative? You’ve consistently commented on GDPR being a bad idea and implied, but not outright said, that it will have no effect. Is your suggestion that we should just drop the idea and let companies do what they want? Do you have a suggestion for alternative legislation?

                    1. 0

                      I also expect companies will pop up with low-cost solutions to deal with user data

                      Don’t want to deal with the VAT-MESS? -Oh no problem! You just pay someone else to take care of that bullshit.

                      Don’t want to deal with the GDPR? -Oh no biggie. There’s a service to deal with that bullshit.

                      But a burden is still a burden, even if you pay someone else to deal with it, and there’s a limit to the burdens a business can bear.

                      I suspect the real goal of all these new burdensome regulations is to gradually cull small (and even medium) sized businesses, as part of a drive to centralize our societies ever further, so that we’re all easier to rule over.

                      I believe the law is a good first iteration. I’m sure we’ll find that some things in it are irrelevant and some things in it are harmful, but I believe in pushing for privacy.

                      It’s far from a good first iteration. They’re threatening one-man companies with 20M EUR fines for not complying with rules that are basically impossible to fully comply with. That’s not something to cheer for, and that doesn’t happen by accident - genuinely retarded people don’t get to a position where they’re writing EU-wide laws.

                      People keep telling us we’ll just have to wait and see how the law will be interpreted. That sounds vaguely benign, but what that means in the real world is observing which companies get destroyed for which arbitrary/political reasons.

                      It’s a bit like waiting to see who gets executed for wearing the kind of clothes the Emperor doesn’t happen to like. Is there no problem once everyone knows what kind of clothes he’s unhappy with?

                      Do you have an alternative? You’ve consistently commented on GDPR being a bad idea and implied, but not outright said, that it will have no effect.

                      How about “no onerous bullshit legislation”? Of course it will have effects, and they’ll be a massive net negative. How about tens of thousands of companies not wasting time researching and complying with onerous bullshit legislation, and concentrating on providing valuable goods and services instead?

                      Even if GDPR actually makes some privacy-invading scumbags call it a day, it’s not even meant to do anything about the police states budding everywhere.

                      Pretty much everyone on this forum is intimately familiar with how the people running governments operate… so why are you seemingly fine with… well, anything governments do?

                      1. 2

                        Even if GDPR actually makes some privacy-invading scumbags call it a day, it’s not even meant to do anything about the police states budding everywhere.

                        You keep on bringing up government surveillance but GDPR does not have anything to do with that. It’s a fine fight to have, but it’s not related to this particular discussion; there are other laws and legislation around government agencies.

                        How about “no onerous bullshit legislation”?

                        This is an entirely unactionable suggestion. One person’s onerous bullshit legislation is another’s opportunity. There is no meaningful way to turn this useless platitude into a working economic system.

                        1. 2

                          You keep on bringing up government surveillance but GDPR does not have anything to do with that.

                          Privacy, hello?

                          This is an entirely unactionable suggestion.

                          You’re saying onerous bullshit legislation has to be created, but that’s not true.

                          One person’s onerous bullshit legislation is another’s opportunity.

                          Duh? Of course it benefits whoever charges you money for dealing with the bullshit. So what?

                          There is no meaningful way to turn this useless platitude into a working economic system.

                          That sounds like you’re over-exerting yourself in trying to sound smart.

                    2. 1

                      This is exactly the point I am trying to make. How can a business take advantage of GDPR and build a “legal” tracking system that you can turn into a recommender system, for example?

                      1. 0

                        Well you’re basically just advertising your http://grakn.ai/ service, and trying to polish the GDPR turd in the process.

                        1. 1

                          I am not actually working for GRAKN, which is not a service but a database. I built a proof of concept for the company I work for. I had considered Neo4j for the task but found GRAKN better suited. GRAKN appreciated my proof of concept and asked me to publish my paper.

                    1. 4

                      I am working on a proof of concept for GDPR using a graph database and Vue.js. On Wednesday I will be speaking about API-first CMSs at the WHO in Copenhagen.

                      1. 2

                        GDPR is going to be a hot topic next year. Is your idea to demonstrate links between data points?

                        1. 3

                          Yes it is! I am preparing a GitHub repo and a few blog posts. I will share it all when it is ready.

                          1. 1

                            Please do, I’m interested in this matter!

                            1. 1

                              Hello, as promised I have published the first part here: https://blog.grakn.ai/gdpr-threat-or-opportunity-4cdcc8802f22 and the second part here: https://medium.com/@samuelpouyt/grakn-ai-to-manage-gdpr-f10cd36539b9 and I have yet to publish the API example. Code is available here: https://github.com/idealley/grakn-gdpr

                        2. 1

                          Are you talking about GDPR at the WHO? Or an actual CMS?

                          1. 3

                            At the WHO I am speaking about Cloud CMS, an actual CMS we have implemented where I work, but I am speaking generally about API-first CMSs and the benefits they can bring to a company, especially if you need to publish to different channels.

                            1. 1

                              Have you spoken at any other humanitarian agencies yet or worked at an NGO in a technical capacity before?

                              1. 1

                                I am working at an NGO. And we have implemented it. I agree it requires some technical knowledge, but the benefits are huge!

                                I have not spoken at humanitarian agencies on this topic, but I have in other digital circles.

                                1. 1

                                  Cool, well good luck! I haven’t been to the Copenhagen office before; I’ve been to GVA and in-country offices. They only let me out of my cage to see the outside world once in a blue moon.

                                  1. 1

                                    I was also in a cage. One day I was invited, but my boss said no. I took the days off using my extra hours and paid for it myself, like this trip to Copenhagen. :( But all the rest is fun!

                        1. 2

                          This is a series of blog posts. It will be followed by an implementation example with a hypergraph database: GRAKN.AI

                          1. 1

                            I guess our saboteurs do it unconsciously, or at least I would hope so. But it is really interesting to see that those techniques are really applied today… especially the channel one. My guess is that it is also out of laziness or a lack of balls, as decisions can be offloaded and/or postponed.

                            1. 2

                              GDPR is covered by trashing encryption keys.
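
                              For illustration, a minimal sketch of the per-user key (“crypto-shredding”) idea, using Python’s cryptography package; the in-memory key store and helper names are made up:

                              from cryptography.fernet import Fernet

                              # Hypothetical per-user key store; in practice this would live in a KMS/HSM.
                              user_keys = {}

                              def store_user_record(user_id, plaintext: bytes) -> bytes:
                                  # Each user gets their own symmetric key; data is only ever stored encrypted.
                                  key = user_keys.setdefault(user_id, Fernet.generate_key())
                                  return Fernet(key).encrypt(plaintext)

                              def read_user_record(user_id, token: bytes) -> bytes:
                                  return Fernet(user_keys[user_id]).decrypt(token)

                              def forget_user(user_id):
                                  # "Trashing" the key renders every ciphertext for this user unreadable,
                                  # including copies sitting in old backups.
                                  del user_keys[user_id]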

                              1. 2

                                I’d like trashable per-customer keys to be a good answer, but:

                                • You have to back up the keys (or risk losing everything), and those backups need to be mutable (so you’re back to square one with backups)
                                • Your marketing department still want a spreadsheet of unencrypted customer data
                                • Your fraud department need to be able to efficiently identify similar customer records (hard when they’re all encrypted with different keys)
                                • Your customer support department wants to use SAAS instead of a crufty in-house thing (and answer users who tweet/facebook at them)
                                1. 3

                                  You have to back up the keys (or risk losing everything), and those backups need to be mutable (so you’re back to square one with backups)

                                  Generally backups are done daily and expire over time. GDPR requires that a user’s deletion request take effect within 30 days, so this can be solved by expiring backups after 30 days.

                                  Your marketing department still want a spreadsheet of unencrypted customer data

                                  Depending on what marketing is doing, often aggregates are sufficient. I’m not sure how often marketing needs personally identifiable information.

                                  Your fraud department need to be able to efficiently identify similar customer records (hard when they’re all encrypted with different keys)

                                  Again, aggregates are usually sufficient here. But to do more one probably does need to build specialized data pipeline jobs that know how to decrypt the data for the job.

                                  Your customer support department wants to use SAAS instead of a crufty in-house thing (and answer users who tweet/facebook at them)

                                  I’m not quite sure what this means so I don’t have a response to it.

                                  1. 1

                                    You also have to make sure re-identification is not possible… This is quite challenging, and there are no guidelines as to what extent this should be achieved.

                                    1. 1

                                      Generally backups are done daily and expire over time. GDPR requires that a user’s deletion request take effect within 30 days, so this can be solved by expiring backups after 30 days.

                                      Fair point - that’s really only a slight complication.

                                      Depending on what marketing is doing, often aggregates are sufficient. I’m not sure how often marketing needs personally identifiable information.

                                      Marketing don’t like being beholden to another team to produce their aggregates, but this is much more of an organizational problem than a technical one. Given the size of the fines I think the executive team will solve it.

                                      Again, aggregates are usually sufficient here. But to do more one probably does need to build specialized data pipeline jobs that know how to decrypt the data for the job.

                                      Fraud prevention is similar in difficulty to infosec, and it can hit margins pretty hard.

                                      There are generally two phases: detecting likely targets, and gathering sufficient evidence.

                                      For instance, I worked on a site where you could run a contest with a cash prize. Someone was laundering money through it by running lots of competitions and awarding their sockpuppets (which was bad for our community since they kept trying to enter the contests).

                                      The first sign something was wrong came from complaints that obviously-bad entries were winning contests. We found similarities between the contest holder accounts and sockpuppet accounts by comparing their PII.

                                      Then, we queried everyone’s PII to find out how often they were doing this, and shut them down. I’m not clear how we could have done this without decrypting every record at once (I suppose we could have decrypted into an ephemeral DB and then shut it down after querying).

                                      Customer support

                                      For instance, lots of companies use (eg) ZenDesk to help keep track of their dealings with customers. This can end up holding information from emails, phone systems, twitter messages, facebook posts, letters, etc.

                                      This stuff isn’t going to be encrypted per-user unless each of your third-party providers happen to also use the technique.

                                      Summary: It’s not a complete technique, but you’ve gotten past my biggest objections and I could see it making the problem tractable.

                                  2. 1

                                    Lobsters is open source. Anybody want to make a patch to make it use per-user keys? I’m curious to see what’s involved.

                                    1. 1

                                      Good question though: what happens if a citizen of the EU uses his right to be forgotten? Does the user have a shiny “permanently forget me” button? The account deletion feature seems to fall a bit short of that?

                                      1. 1

                                        I suspect it’s “the site admin writes a query”.

                                    2. 1

                                      Actually, you are wrong… you have to make sure that a user’s data is portable, meaning that it can be exported and transferred to someone else, and you cannot keep data if you do not need it… You also have to be able to show what data you have about the user… so if you cannot decrypt what you have to show the user, you are not compliant.

                                      1. 1

                                        Those are two separate requirements of GDPR, and being able to export a user’s data in a reusable format is only required if they haven’t asked for their data to be deleted.

                                        I think you’re missing a key part. If a user asks for their account to be deleted, you don’t need to be able to make their data portable anymore, you just need to get rid of it. If you delete the encryption key for your user’s data, you can no longer decrypt any data you have on a user - which means legally you don’t have that data. There is nothing to show the user, or make portable.

                                        1. 2

                                          I see your point and that indeed works only for deletion requests.

                                    1. 2

                                      How exactly does the EU think it can make people not sell to EU citizens if they have no local presence?

                                      1. 3

                                        Nice share, thank you. I want to get started already! And rewrite our main API ;)

                                        1. 3

                                          The parser was my favorite. I could be misreading it since I don’t know the language. The time example did look really easy to express and/or follow.

                                          1. 2

                                            It is indeed a nice design.

                                            You can read more about Parsec in this paper: https://web.archive.org/web/20140528151730/http://legacy.cs.uu.nl/daan/download/parsec/parsec.pdf

                                            If you plan to give it a try, you should start directly with Megaparsec: https://hackage.haskell.org/package/megaparsec

                                            1. 2

                                              IMHO, Megaparsec has a terrible roadblock for a Haskell beginner. The first thing you’ll try to do with it will be something like this (if you’re lucky enough to get the imports right):

                                              import Text.Megaparsec
                                              import Text.Megaparsec.Char.Lexer
                                              
                                              main = parseTest decimal "123"
                                              

                                              Then you’ll immediately be greeted by a compiler error:

                                               error:
                                                  • Ambiguous type variable ‘e0’ arising from a use of ‘parseTest’
                                                    prevents the constraint ‘(ShowErrorComponent
                                                                                e0)’ from being solved.
                                              

                                              For a Haskell beginner, it would be almost impossible to guess what to do next. Well, I’ll give spoilers here:

                                              ...
                                              import Data.Void
                                              type Parser = Parsec Void String
                                              
                                              main = parseTest (decimal :: Parser Integer) "123"
                                              

                                              Normally, you’d rely on tutorials around the net to get you covered, but megaparsec had a recent major release which broke all the tutorials you get by googling “Megaparsec getting started”.

                                            2. 1

                                              Indeed!

                                          1. 2

                                            Good intro to a relatively obscure corner of CS! Looking forward to the sequels. Search, especially isomorphic search in hypergraphs, is where it starts to get tricky.

                                            1. 1

                                              What I like about GRAKN is that it allows the use of hypergraphs and inference on top of them in a very easy and straightforward way. Do you know of any resources on isomorphic search in hypergraphs? (Sorry, I did not look for it.)

                                              1. 2

                                                Nothing specifically for hypergraphs AFAIK, but then I am no expert. There are several methods for ordinary graphs which one can use as a starting point; a good summary is here. I wish I had read it before I started my pet project that makes use of HGs. I ended up basically re-inventing neighbourhood signature pruning, a la GraphQL. However, I don’t have enough theoretical training to claim it’s the best approach.

                                                1. 1

                                                  Thank you.

                                            1. 1

                                              Go for an alternative on your servers e.g. https://www.totaljs.com/messenger/

                                              1. 2

                                                I’m wondering when/if email providers such as MailChimp will provide such functionality. That could be really nice. For now it is a bit cumbersome for communications people…

                                                1. 13

                                                  @itamarst dude, time to step it up. There’s not much content here, and this is a clickbaity blogspam title. The article doesn’t offer much - one time this thing happened to you, and a nice thing to do when you make software is iterate?

                                                  Probably being a bit harsh, but I don’t come to lobste.rs for articles that are short, sugary and don’t particularly contribute anything other than ‘bad things are bad, good things are good’ with no exploration or development or original content.

                                                  1. 2

                                                    Hey, sorry you didn’t like the article.

                                                    I don’t know what to say about “clickbait title”, I’m just not very good at good titles mostly.

                                                    As far as actual content: iteration is pretty basic, yes, except many people don’t do it. So still seems worth repeating.

                                                    But beyond that… the point I’m trying to get across isn’t quite the same as “iterate”:

                                                    • Iteration is a technique, “start with simple version and flesh it out in each release,” let’s say.
                                                    • Incremental results are a desirable goal, with iteration being one way to achieve that goal.

                                                    And so I think the idea of incremental results is more powerful because it’s applicable more broadly. E.g. sometimes you can achieve incremental results without iteration:

                                                    • If each individual feature provides value on its own, then you can get incremental results with less iterative, more cumulative development.
                                                    • Incremental results can be structured into the output, e.g. the way Progressive JPEGs render incrementally.

                                                    But perhaps that should all have been in the original article.

                                                    1. 7

                                                      Your article also doesn’t provide any evidence that what you’re espousing does produce success. Your correct way isn’t actually a success story; it’s a figment of your imagination. It might be correct, but there is no way to know based on this article. Maybe your correct way would have resulted in a 6-month to 1-year delay for other reasons.

                                                      1. 2

                                                        No, but I implemented a program per “the right way” [1]. It’s written in Lua using LPeg (because of the amount of parsing involved) and it has proved to be “mostly fast enough.” It’s only recently (the past two months out of 18 months in production) that a performance issue has popped up (because traffic to the service has vastly increased), and the issue only happens intermittently and usually doesn’t last long; by the time the alarm fires, it’s over. Not a bad thing really, as the code is straightforward and easy to work on [2].

                                                        [1] It was intended to be a proof-of-concept (at least in my mind), only it ended up in production, without anyone informing me (long story).

                                                        [2] Thank God Lua has coroutines. It makes writing network code much nicer, as it’s imperative instead of a nest of callbacks with logic spread out all over the place.

                                                      2. 3

                                                        As thin as it may seem to some, I wish my boss would read it…

                                                        1. 4

                                                          Your boss will read the whole thing and will only remember “But maybe a Python version would have been fast enough.”

                                                          1. 2

                                                            Ha ha ha, true. Unfortunately.

                                                    1. 5

                                                      I agree the article is a bit thin, but it’s true. The way I’ve heard it described is doing a vertical slice of an entire feature, rather than horizontally filling out your implementation. The end user features have to drive the architecture and not the other way around.

                                                      Another way to think about it is to do end-to-end tests first; don’t write tests for internal interfaces which will undoubtedly change based on contact with the real world.
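
                                                      For illustration, a minimal sketch of what an end-to-end-first test could look like against a hypothetical messaging service (the base URL and endpoints are made up), rather than tests against internal interfaces:

                                                      import requests

                                                      # Hypothetical end-to-end test: exercise the public API, not internal modules.
                                                      # Assumes a local test instance of the service is running at this URL.
                                                      BASE_URL = "http://localhost:8080"

                                                      def test_user_can_send_and_receive_a_message():
                                                          # Send a message through the public endpoint...
                                                          resp = requests.post(BASE_URL + "/messages", json={"to": "bob", "body": "hi"})
                                                          assert resp.status_code == 201

                                                          # ...and check it arrives, end to end, without poking at internals.
                                                          inbox = requests.get(BASE_URL + "/users/bob/messages").json()
                                                          assert any(m["body"] == "hi" for m in inbox)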

                                                      1. 1

                                                        I also think that ‘vertical slices’ can be simpler to implement. Here, though, it seems that some kind of messaging protocol had to be implemented. It seems, at least to me, complicated to slice that vertically, as everything has to work together. No?

                                                        1. 2

                                                          Yes, I think you mean it’s silly to slice it horizontally if you really need the end-to-end messaging; doing it vertically is obvious. I think the advice is obvious in most circumstances, but I guess in a big organization you can fall into the trap of “well this team will just implement the interface, and then we’ll build to that interface”.

                                                          But that doesn’t really work… Well, you might be able to brute force it, but the result will be suboptimal (buggy, hard to maintain, performing badly). A better approach is to have a small team build out the end-to-end proof of concept, and then split up the work with a more stable architecture.

                                                          Certain architectural decisions are almost set in stone from the beginning, and very hard to change later.

                                                      1. 1

                                                        That is impressive. Thank you for sharing.

                                                        1. 1

                                                          This post has plenty of issues IMO.

                                                          For instance:

                                                          const add = (a, b) => a + b
                                                          The first line creates a function add() that takes one parameter and it returns another function that also takes one parameter.

                                                          Or the two code samples shown: https://gist.github.com/idealley/1066aca705b768e5f869674e489347c3/0540e51d02106a6e7904e762cc002ed1c2c4ccba and https://gist.github.com/idealley/3691634227195f6d26027e93b9485849/3471b723bbfd47af561e99cb03d2a66d7a986374 which are pretty bad:

                                                          • no consistency (x ;, x;, if (a), if (a ), if () {}, if ()\n{})
                                                          • loose equalities (!=)
                                                          • use of indexOf instead of includes
                                                          • a || b hack instead of default parameters

                                                          Overall, I think this blog post gives a pretty bad example to the reader.

                                                          1. 2

                                                            Thank you. I will improve it.