I am wondering if commits added during code review should be “fixup” of the previous commit. Those fixup commits would be autosquashed before the branch gets merged.
What do you think about this?
Great idea! Thank you for sharing :)
It did feel “wrong” to recommend using --force-with-lease during the review process. Using git commit with either --fixup or --squash after code review would be the best of both worlds.
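The fixup/autosquash flow described above can be sketched end-to-end. This is a minimal, self-contained example driven from Python so it runs in a throwaway repository; the file name and commit messages are invented for illustration.

```python
# Sketch of the --fixup / --autosquash review flow in a temp repo.
import os
import subprocess
import tempfile
from pathlib import Path

def git(*args, cwd, env=None):
    return subprocess.run(["git", *args], cwd=cwd, env=env, check=True,
                          capture_output=True, text=True).stdout

repo = tempfile.mkdtemp()
git("init", "-q", cwd=repo)
git("config", "user.email", "dev@example.com", cwd=repo)
git("config", "user.name", "Dev", cwd=repo)

# The original commit under review.
Path(repo, "feature.txt").write_text("v1\n")
git("add", ".", cwd=repo)
git("commit", "-q", "-m", "Add feature", cwd=repo)
sha = git("rev-parse", "HEAD", cwd=repo).strip()

# Address review feedback as a fixup of that commit (message becomes
# "fixup! Add feature", so autosquash knows where it belongs).
Path(repo, "feature.txt").write_text("v2\n")
git("add", ".", cwd=repo)
git("commit", "-q", "--fixup", sha, cwd=repo)

# Before merging, autosquash folds the fixup back into its target.
# GIT_SEQUENCE_EDITOR=true accepts the generated todo list as-is.
env = {**os.environ, "GIT_SEQUENCE_EDITOR": "true"}
git("rebase", "-i", "--autosquash", "--root", cwd=repo, env=env)

print(git("log", "--format=%s", cwd=repo))  # only "Add feature" remains
```

The same flow by hand is just `git commit --fixup <sha>` during review, then `git rebase -i --autosquash` before the merge.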
I'm always switching between AZERTY and QWERTY, so this project helps me a lot. Also, if you're into Homebrew or Swift, there are open issues labeled “help wanted”.
The provided link leads to a 404. Maybe replace with https://github.com/spencertipping/jit-tutorial ?
There is even: https://github.com/kbarrette/mediummode
That actually seems like it would be more irritating than hard mode…
We had a similar issue, where our test suite took about 20 minutes to run. We did similar things to shave off a couple of minutes, but the biggest improvements came from a different approach:
Slow tests are often caused by slow code.
Your tests are calling stuff in your codebase. If your codebase has performance problems, your tests will, too. We halved our test runtime by optimizing a few methods and fixing a quirk in our caching layer.
In our case, we should rely less on the database. For example, we should test a database scope that fetches specific data once against the database, and stub it in the other tests.
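In Python terms (all names and the schema invented), testing the query scope once against a real database and stubbing it everywhere else might look like:

```python
# Test a query helper once against a real (in-memory) database, then
# stub it in tests that only care about the result, not the SQL.
import sqlite3
from unittest.mock import patch

def active_user_names(conn):
    """The 'scope': the only code that knows the actual SQL."""
    rows = conn.execute(
        "SELECT name FROM users WHERE active = 1 ORDER BY name")
    return [name for (name,) in rows]

def greeting_lines(conn):
    """Code tested elsewhere; it just consumes the scope."""
    return [f"Hello, {name}!" for name in active_user_names(conn)]

# Test 1: hit the database once to prove the scope's SQL is right.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, active INTEGER)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [("Ana", 1), ("Bob", 0), ("Eve", 1)])
assert active_user_names(conn) == ["Ana", "Eve"]

# Test 2: everywhere else, stub the scope so no SQL runs at all.
with patch(f"{__name__}.active_user_names", return_value=["Zoe"]):
    assert greeting_lines(conn) == ["Hello, Zoe!"]
```

The Rails equivalent would be one test exercising the scope against the database and `stub`/`double` in the rest.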
This site is extremely wide and horrible to read on Firefox because I need to scroll right all the time.
Yep! I submitted a PR: https://github.com/schneems/schneems/pull/14
I’ve had the same problem. I’m not sure what’s going on.
It’s fixed now.
I'm very impressed by all of those scripts. If someone succeeds at using FactoryDoctor, I'm interested.
On the input/output uniqueness validation, the canonical way to do this is to create and maintain your own bloom filter; the precise method depends pretty heavily on what your stack and your use case looks like, but, e.g., https://github.com/Baqend/Orestes-Bloomfilter might give you some ideas/pointers/clues.
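As a starting point, here is a minimal Bloom filter sketch (parameters are illustrative only; a real deployment would size the bit array and hash count from the expected item count and acceptable false-positive rate, which is what libraries like the one linked handle for you):

```python
# A minimal Bloom filter for "have I seen this before?" checks.
import hashlib

class BloomFilter:
    def __init__(self, m_bits=1024, k_hashes=3):
        self.m, self.k = m_bits, k_hashes
        self.bits = 0  # one big integer used as a bit array

    def _positions(self, item):
        # Derive k positions by salting a single hash function.
        for seed in range(self.k):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item):
        # False means definitely unseen; True means *probably* seen.
        return all(self.bits >> pos & 1 for pos in self._positions(item))

seen = BloomFilter()
for event_id in ("evt-1", "evt-2", "evt-3"):
    if event_id in seen:
        print("duplicate:", event_id)
    seen.add(event_id)
```

The key property for a uniqueness check is that a negative answer is exact, so only the (rare) positives ever need a slower authoritative lookup.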
On the Lambda timeouts, what specifically are you looking to understand? https://aws.amazon.com/lambda/faqs/ tells us that there’s a 3 second execution timeout that you can configure up to 300 seconds if you need. If you want to understand why you’re timing out, then that’s an application level concern that you need to instrument through application level logging/tracing/analysis libraries.
On Lambda error reporting, generally the best practice is to take the logs and put them somewhere as a unified whole, and then run whatever processing on them you believe may be necessary, rather than split them at the application level and then try to have multiple handling paths. This is especially true in distributed environments where there might be hundreds or thousands of workers running on different machines and you may have a variety of questions that might have correlated answers, like timeouts and such (e.g. a centralized db table is missing an index and the query planner gave up).
On Redshift, there are probably many SQL commands “missing”; the world of SQL database command implementation is one of thousands of overlapping Venn diagrams, none of which are quite alike. In some cases the underlying technical implementation may make certain commands impossible or incredibly expensive; in others, it may be that there hasn’t been enough obvious demand yet to implement the functionality. For DISTINCT ON, you might try something like this: https://gist.github.com/jmindek/62c50dd766556b7b16d6 and for GENERATE_SERIES(), you might try the ‘old school’ way: https://www.periscopedata.com/blog/generate-series-in-redshift-and-mysql.html
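For illustration, the usual ROW_NUMBER() workaround for DISTINCT ON looks like this. SQLite is used here only so the snippet is runnable; the schema and data are invented, but the SQL has the same shape on Redshift.

```python
# Emulating Postgres's DISTINCT ON ("latest row per group") with a
# ROW_NUMBER() window function, the common Redshift workaround.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id TEXT, ts INTEGER, action TEXT)")
conn.executemany("INSERT INTO events VALUES (?, ?, ?)", [
    ("u1", 1, "login"), ("u1", 5, "purchase"),
    ("u2", 2, "login"), ("u2", 3, "logout"),
])

# DISTINCT ON (user_id) ... ORDER BY user_id, ts DESC becomes:
latest = conn.execute("""
    SELECT user_id, ts, action FROM (
        SELECT *, ROW_NUMBER() OVER (
            PARTITION BY user_id ORDER BY ts DESC) AS rn
        FROM events
    ) WHERE rn = 1
    ORDER BY user_id
""").fetchall()

print(latest)  # [('u1', 5, 'purchase'), ('u2', 3, 'logout')]
```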
(disclaimer: I work for Amazon, but not in AWS)
Thanks a lot for your answer. I didn’t know about Bloom filters; I’ll take a look.
For Lambda timeouts, it’s only because I like to understand what’s happening and whether I can do something about it. The only way I’ve found is to add logs at every step of the Lambda to check where it stops.
Yes, you’re correct. I just thought that maybe AWS could split this into two types of errors (Lambda errors and timeouts). I saw that people were using Elasticsearch and Kibana, or IOpipe, but we are a small team and we can’t spend much time on this for the moment. With the number of Lambdas we have, CloudWatch is nearly enough.
It’s funny, because the last comment is mine :)
Thanks for the other link.
I’m not working for them, but I wanted to discuss this kind of project. I think this is the first time I’ve seen something like this. Do you think it’s a good thing?
I’m not sure that random people can fix random open-source projects.
From lobsters: https://lobste.rs/s/ciwlu0/volkswagen
Identical post: http://blog.cleancoder.com/uncle-bob/2015/10/14/VW.html
Argh, sorry, I’ll be more careful next time.
URL has some garbage at the end.
Real URL: http://www.craigkerstiens.com/2016/01/08/writing-better-sql/
Sorry. Thanks to Irene.
While we’re on the subject, does anyone have a great recommendation for a book on SQL? I don’t have to write a ton of super complex queries in my job, but once every month or two, some task calls for a good bit of SQL writing, and I’d like to get a better foundation than just “what I’ve picked up over the years plus Google”.
Not a book recommendation but a couple pieces of advice which helped me shift out of the procedural mindset:
Think about the problem in terms of sets and transformations rather than individual instances.
When formulating a query, start with the SELECT and write a list of what you want to see in the result set. This is the goal you’re working towards. Add in the appropriate joins to build a set of the data you require. This is your starting point. Then figure out the intermediate set transformations required to get from start to finish. Incidentally, this made the ordering of SELECT, FROM, and WHERE click for me; I was previously thinking in terms of FROM, WHERE, and SELECT.
Hopefully that’s not too elementary. Coming from a similar background I’d never really had that spelled out to me.
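The SELECT-first approach above can be shown with a tiny worked example (the schema is invented, and SQLite is used only so the snippet runs):

```python
# 1. Write the SELECT list (the goal).
# 2. Add the FROM/JOINs that can produce it (the starting set).
# 3. Add WHERE/GROUP BY to transform that set into the goal.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (author_id INTEGER, published INTEGER);
    INSERT INTO authors VALUES (1, 'Ana'), (2, 'Bob');
    INSERT INTO posts VALUES (1, 1), (1, 1), (2, 0);
""")

rows = conn.execute("""
    SELECT a.name, COUNT(*) AS published_posts  -- step 1: the goal
    FROM authors a
    JOIN posts p ON p.author_id = a.id          -- step 2: the raw set
    WHERE p.published = 1                       -- step 3: transform it
    GROUP BY a.name
""").fetchall()

print(rows)  # [('Ana', 2)]
```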
I came across the advice in Joe Celko’s SQL for Smarties, which I think is probably too basic for your needs. I haven’t read anything else by him so can’t vouch but “Joe Celko’s Thinking in Sets: Auxiliary, Temporal, and Virtual Tables in SQL” might be helpful? I’ve also heard good things about “The Essence of SQL” but it’s out of print so good luck finding a copy!
I find it amazing how differently you approach this than I do, yet I would assume we would still end up writing very similar queries.
How do you approach it?
I tend to think of the database like a tree or perhaps a map (as in Google, not reduce). I look at the fields I know I need to return, and then mentally map them across the database. I start my query by working from the trunk, or root node if you don’t like wood, of my tree and then map out the query from there. The root node isn’t always the same table, so that can vary based upon the required fields. After selecting all the fields I need from the root, I proceed to join the next table required by my SELECT. That isn’t always direct, and many times there are tables in between. The process repeats till I have all the tables I need.
This line of thinking has lent itself well to pretty much every dataset I’ve encountered. Words like “set”, “transformation”, and “instance” never even crossed my mind.
Now obviously words like “set” and “instance” have a great deal of meaning in database land, but as far as writing queries go, those aren’t words I tend to think of.
I use CTEs a lot in Postgres, so I find that I work towards the final result in a different way, more like I would in code - by treating the query as a series of data transformations. So, for example, if I want to produce a result for a set of users on a range of days, I write a CTE to get the users, then another one to generate the range of dates, then another one to combine them to get all possible “user-days”, then another to retrieve the data for each and process it, and so on.
This results in very readable queries - way better than having subqueries all over the place. There are performance caveats to using CTEs so sometimes I have to structure the query differently, but it works well for a lot of them.
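A runnable sketch of that CTE chain, with an invented schema (SQLite stands in for Postgres here so the snippet executes; in Postgres the date CTE would typically be `generate_series` instead of recursion):

```python
# Build users, generate a date range with a recursive CTE, cross-join
# them into "user-days", then attach per-day data with a LEFT JOIN.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE logins (user_name TEXT, day TEXT);
    INSERT INTO logins VALUES ('ana', '2017-03-01'), ('ana', '2017-03-03');
""")

rows = conn.execute("""
    WITH RECURSIVE users(name) AS (
        SELECT DISTINCT user_name FROM logins
    ),
    days(day) AS (                     -- stands in for generate_series
        SELECT '2017-03-01'
        UNION ALL
        SELECT date(day, '+1 day') FROM days WHERE day < '2017-03-03'
    ),
    user_days AS (                     -- every (user, day) combination
        SELECT name, day FROM users CROSS JOIN days
    )
    SELECT ud.name, ud.day, COUNT(l.user_name) AS login_count
    FROM user_days ud
    LEFT JOIN logins l ON l.user_name = ud.name AND l.day = ud.day
    GROUP BY ud.name, ud.day
    ORDER BY ud.day
""").fetchall()

print(rows)
# [('ana', '2017-03-01', 1), ('ana', '2017-03-02', 0), ('ana', '2017-03-03', 1)]
```

Each CTE reads as one transformation step, which is what makes this style so much more legible than nested subqueries.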
The docs for Postgres are amazing, and a good resource for this. It will call out which parts of SQL it’s explaining are Postgres-specific/not in the SQL standard.
I’m very much a first-principles thinker, so I hope this recommendation isn’t off the mark for your needs. I really liked “Relational Theory for Computer Professionals” by C.J. Date. The book is roughly broken up into parts: the first is an introduction to relational theory and what that’s all about. It’s the best intro to relational algebra that I’ve ever seen; a close second is the Stanford online database class (you can study just the SQL part of the course). The second part of the book takes what you now know about relational algebra and shows how it fits with SQL.
This helped me peel away some of the syntactic confusion around SQL syntax and made the whole concept make more sense.
My 2 favorite resources over the years have been “SQL for Web Nerds” (http://philip.greenspun.com/sql/) and the official Postgres docs.
You can also check this book: https://twitter.com/BonesMoses/status/832983048266330113. His blog is also full of very good SQL.
I don’t agree about ORMs. I’m so glad to use ActiveRecord or Ecto and not write all the SQL. I find it easier to reuse (scopes), more secure (http://rails-sqli.org/), easier for newcomers, and more consistent across projects.
That’s all convenient until you need to use things like PostGIS or another non-lowest-common-denominator piece of database functionality, or deal with performance issues.
A good candidate for the testing and practices tags. :)
But I can’t change the tags after posting?
Curious about a fast VPN for Western Europe. I used PrivateInternetAccess before, but it was slow.
I don’t have any recommendations but a reasonable starting point is /r/vpnreviews.
Working on user stats using AWS Redshift + RDS. Discovered dblink and materialized views: https://aws.amazon.com/blogs/big-data/join-amazon-redshift-and-amazon-rds-postgresql-with-dblink/
Big database migration in progress. We’re moving two hot tables of about 500M rows each from Postgres to DynamoDB. Having a lively workload on Postgres has made for a lively time keeping the DB in good shape, so we’re excited to move it to a more fully managed DB service. Shout-outs to the composite bridge design pattern and to AWS support in Sydney for shooting the shit about Postgres transaction_id exhaustion.
I’m looking for more resources like this; I’ve enjoyed this talk. As for the psqlrc, you can find it here: http://www.craigkerstiens.com/2015/12/29/my-postgres-top-10-for-2016/
That’s quite a good approach. For me, the most difficult part of top-down design is knowing when to stop defining “the whole thing”. At which level?