1. 8

    The default built-in storage engine is a log-structured merge database. It is very fast, faster than LevelDB, supports nested transactions, and stores all content in a single disk file.

    Damn. Time to go read that source code.
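
    For a head start, the LSM layer underneath sqlite4 exposes a small key/value C API. A minimal sketch based on my reading of the draft docs – treat the exact names and signatures as approximate:

    ```c
    /* Sketch of sqlite4's LSM key/value API (signatures approximate,
     * from the draft docs). Nested transactions are expressed as levels. */
    #include "lsm.h"

    int main(void) {
      lsm_db *db = 0;
      lsm_new(0, &db);                  /* 0 = default environment */
      lsm_open(db, "test.db");          /* all content lives in this one file */

      lsm_begin(db, 1);                 /* open an outer transaction */
      lsm_insert(db, "key", 3, "val", 3);

      lsm_begin(db, 2);                 /* open a nested transaction */
      lsm_insert(db, "tmp", 3, "xyz", 3);
      lsm_rollback(db, 2);              /* discard only the nested level */

      lsm_commit(db, 0);                /* commit everything still open */
      lsm_close(db);
      return 0;
    }
    ```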

    1. 3

      I hope you report back with your findings!

      1. 1

        Do you know if sqlite4 is ready for usage? Sqlite.org doesn’t have any direct reference to it; you seem to have to know about the src4 URL to get to it, and then you have to pull the code directly from fossil, as far as I can tell.

        1. 1

          It doesn’t seem to be generally available so I’m guessing no.

      1. 2

        I built something exactly like this. I never got around to monetizing it. I am almost certain it uses Quicklook.framework to generate the PNGs.

        Good work, @jpadilla. Have you been able to work out how to view specific pages in a Word, PDF, or PPT file? I think that could be really useful.

        I pitched my product as a way for mobile devices to quickly view the contents of an otherwise heavy document format without third-party libs.

        1. 1

          We’re leveraging a couple of existing tools like ImageMagick, Tesseract, and Ghostscript, plus a couple of others, in custom workflow software we built. We’ve looked into Quicklook.framework for iWork, Omnigraffle, and Sketch files, which already have “previews” inside. The tricky part is that those files are actually packages (folders), which makes uploading them awkward. I haven’t seen anyone doing it right on the web, so one idea might be to build a simple desktop uploader/sync tool.
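
          As a rough illustration of what the Ghostscript step in a pipeline like this looks like (the gs flags are real; the wrapper function and paths are hypothetical):

          ```c
          /* Hypothetical wrapper: rasterize every page of a PDF to PNGs by
           * shelling out to Ghostscript. png16m is the 24-bit color PNG
           * device; -r150 means 150 dpi; -o implies -dBATCH -dNOPAUSE and
           * accepts a printf-style per-page filename template. */
          #include <stdio.h>
          #include <stdlib.h>

          int pdf_to_pngs(const char *pdf_path) {
            char cmd[1024];
            snprintf(cmd, sizeof cmd,
                     "gs -q -sDEVICE=png16m -r150 -o page-%%03d.png \"%s\"",
                     pdf_path);
            return system(cmd);   /* returns the shell's exit status */
          }

          int main(void) { return pdf_to_pngs("input.pdf"); }
          ```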

          Edit: Yeah, we currently extract all pages from all the documents we can. Here’s an example JSON result for a .doc file with multiple pages.
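
          Roughly speaking (the field names and URLs here are invented for illustration, not our exact schema), a multi-page result is shaped something like:

          ```json
          {
            "id": "doc_123",
            "source": "report.doc",
            "status": "done",
            "pages": [
              { "number": 1, "png": "https://example.com/doc_123/1.png" },
              { "number": 2, "png": "https://example.com/doc_123/2.png" }
            ]
          }
          ```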

          We currently have two APIs: one that we’re pitching for client-side web apps, and a more complete one using webhooks. We have a couple of client libraries, including an Ember.js component, that use the simple API that powers the demo on the website.

          1. 1

            My focus at the time was high quality PDF to PNG and Quicklook.framework beat out ImageMagick by a long shot.

            1. 1

              That might actually be interesting. I could always check whether the PDF has a Quicklook preview and use that instead, which would be faster.

              1. 1

                I don’t recall PDFs having embedded Quicklook previews, but the framework could generate PNGs from PDFs really fast, and they looked as good as they would in Preview.app. The only drawback was that it had to run on OS X and Mac hardware, but I found a good Mac hosting company.
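
                A minimal sketch of the approach, assuming the Quicklook C API on OS X (QLThumbnailImageCreate is the real entry point; the paths, size, and build line are placeholders):

                ```c
                /* Render a PDF (or any Quicklook-able document) to PNG via
                 * Quicklook.framework. OS X only. Compile roughly with:
                 *   clang thumb.c -framework QuickLook -framework ImageIO \
                 *     -framework CoreServices -framework CoreFoundation \
                 *     -framework CoreGraphics -o thumb
                 */
                #include <QuickLook/QuickLook.h>
                #include <ImageIO/ImageIO.h>
                #include <CoreServices/CoreServices.h>

                int main(void) {
                  CFURLRef in = CFURLCreateWithFileSystemPath(
                      NULL, CFSTR("doc.pdf"), kCFURLPOSIXPathStyle, false);
                  CFURLRef out = CFURLCreateWithFileSystemPath(
                      NULL, CFSTR("doc.png"), kCFURLPOSIXPathStyle, false);

                  /* Ask Quicklook for a thumbnail no larger than 1024x1024. */
                  CGImageRef img =
                      QLThumbnailImageCreate(NULL, in, CGSizeMake(1024, 1024), NULL);
                  if (img) {
                    CGImageDestinationRef dst =
                        CGImageDestinationCreateWithURL(out, kUTTypePNG, 1, NULL);
                    CGImageDestinationAddImage(dst, img, NULL);
                    CGImageDestinationFinalize(dst);
                    CFRelease(dst);
                    CGImageRelease(img);
                  }
                  CFRelease(in);
                  CFRelease(out);
                  return 0;
                }
                ```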

                If you’d like to chat further, feel free to reach me at stefan at natchev.com. Cheers.

        1. 1

          The fact that Ruby came in 3 times more popular than the nearest other language in 2008 should explain the “Sharp decline” in the years since.

          1. 2

            Thanks for the background. Just a general question about this tool: what would be the use case for someone targeting both C++ and Python? Is this meant for producing, for instance, portable libraries, or full applications?

            1. 3

              A few possible use cases:

              • I believe it was actually started by one of the maintainers of the Haxe / Sublime Text plugin. The plugin had to be written in Python, but he wanted to be able to write it in Haxe (to invite more collaboration), so he decided to try to implement a Python target, and a few other people pitched in to help. So that is one use case: you need to write code for a Python environment, but are most comfortable with Haxe.

              • You have a Haxe code base, but you need to write some tools (build tools, etc.) and Python has better library support. For example, Flambe is written in Haxe but uses a Python-based tool to monitor assets and perform live reloading. That tool could be rewritten in Haxe now.

              • You have an existing library (like this 2D physics library) written in Haxe, and you want to use it in Python. This would generate the Python code; then you just need to figure out how to wrangle it into a format you can bundle as a library.

              • If you had an existing Python web-app, but wanted to use something like Haxe for the client side JS, you could now use Haxe/Python on the server to glue the two together, allowing sharing of code between client and server, and having a more water-tight integration.

              At the end of the day, I think this was written as a “because I can” target – it was not difficult to start for the original project (a Sublime Text plugin), and a few people helped out. It gives me some hope that whatever the next big platform is, if I still prefer the Haxe language it’s not too hard to just write another target :)

            1. 3

              Perhaps requiring the poster to add the first comment, explaining what motivated him/her to submit it, might be a good way to start a discussion. It would also act as a first filter against people submitting flamebait or other low-quality articles.

              1. 2

                Given what I’ve heard about the pitfalls of outsourcing, I wonder why outsourcing test implementation hasn’t taken greater hold in corporate America. All one would need is two independent companies: one writes the tests, while the other is tasked with “grading” a subset of those tests, assessing the testability of the code under test, and making recommendations for refactoring. If a contract can be written to make each company’s profits dependent on its performance, this system could eventually be made to work well. It would take some doing, and some building of relationships and expectations. It would not be simple, but it could work.

                Such a task would fit what I’ve heard about the work patterns, and even the work pathologies, of outsourcing companies in India. It’s a suitable task for first-year hires. Such companies are motivated to produce large amounts of output, but can also be directed, through proper incentives, to produce decent output. This would be particularly true if compensation structures included a disincentive for “dead” tests that never find errors.

                And yes, such a system can be gamed. Any system can be gamed. There is no substitute for building and maintaining working relationships amongst good people who want to produce good results.

                1. 5

                  In my experience, writing a good test is more difficult than writing a good implementation. Perhaps for others it’s the other way around?

                  1. 2

                    I’ve found that the difficulty of writing tests depends on several factors; three of the most important, in my unscientific experience, are:

                    1. Difficulty of the problem-at-hand
                    2. The relative impedance between the implied solution and the proposed API
                    3. Relative ‘power’[1] or capability of the underlying framework

                    The first is largely dealt with by breaking the problem down further, though sometimes you simply cannot – problems which are hard to verify are – indeed – hard to test (testing being essentially just verification by another name). In these cases I suspect your experience doesn’t hold, in that both halves of the ‘problem’ of writing tested code are difficult.

                    More commonly I suspect that #2 and #3 bite you (as they do me). In particular, #2 is a very difficult problem, because if I take as an a priori requirement that a given API be supplied, then writing the implementation might be utterly trivial, but mapping it to that particular API takes several intermediate steps. Take, for instance, the standard MVC architectural pattern. We have at least three layers of abstraction between the client-observed API (the “V”) and the storage API (the data store behind the “M”). Trying to write code that tests the V and involves the M is quite hard, and usually brittle, and therefore quite painful. The mitigation strategy involves either confining tests and assertions to the V (e.g., using something like capybara to poke a UI and then observe changes in the UI to confirm the expected behavior), or using mocks and the like to ensure that stuff that doesn’t result in UI changes at least gets the right set of messages sent down the line.

                    Fundamentally those are just hard problems to solve, but the tests doubtless add value – when they work. They do function as regression catchers (in a limited sense, particularly when refactoring), and they also serve as a way to work through building out the set of API transformations that turn a click into SQL. Depending on your team, your domain, and your preference, sometimes tests make the most sense to provide this ability, sometimes types do, sometimes QA people do – each has costs and benefits.

                    Tests, I think, stand in between types and QA. Types have a lot of power to make static assertions and prevent you from changing assumptions on the fly – and after all, programming is just managed, repeated assumption and assertion – but they can also constrain you when refactoring if not well designed, preventing you from making changes until all the details are worked out. They can force you into local minima with respect to complexity. QA people, on the other hand, allow you to freely change code, and often let changes that technically break expected behavior ‘sneak past’, especially unintentionally. They are also open to human error in a way types and tests are not. Tests are a nice middle ground: they allow static assertions without necessarily tying you to preserving every behavior exactly as it was when refactoring (that is, they let you cut corners, as with QA), but at the same time they can be difficult to write, difficult to maintain (especially when written poorly), and – especially – difficult to believe. That is to say, it can be difficult to know that a particular test, especially one with mocks, is actually testing anything. There are tools that help address this problem in some areas, but it is a real problem.

                    I guess my belabored point here is that writing a good test can be more difficult than writing a good implementation, but it’s not necessarily the case, nor are the two correlated – that is, a bad test can effectively test a good implementation, a good test can effectively test a bad implementation, and so on. The point of the test is to verify, in some mostly-static sense, that your assumptions and assertions are correct.

                    [1] I hate this term, or – at least – have come to hate it. Software is not ‘powerful’, it is sometimes ‘capable’ and often ‘incapable’, but capability and power are different concepts and it’s hard to say that something is narrowly capable using the language of ‘power’. For instance, a DSL can be very capable (and should be, by definition), but it is not necessarily “powerful” (in the sense of ‘power’ as being the ability to do (general) work, rather than specific work).

                1. 2

                  Add one more to the list. I recently discovered Tup: http://gittup.org/tup

                  Does anybody have any experience using Tup over Make? Is it yet another DSL, or does it offer something more compelling?

                  1. 2

                    I futzed with it a while ago. I found it interesting, but ultimately not very flexible – in particular, doing any sort of compilation across nested subdirectories was pretty gnarly (I would’ve had to write some script to create a Tupfile that generated the right dependency lines, because tup doesn’t see past the first layer of subdirs). The neat thing is the use of fsevents to track dependencies – that was a stroke of genius. I just wish it were a bit more amenable to deeply nested source trees that have code at every level (like the one I’m stuck with).
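
                    For a taste of the DSL: a Tupfile is mostly a list of rules of the form `: inputs |> command |> outputs`. From memory of the gittup.org docs, so treat the details as approximate:

                    ```
                    # Hypothetical Tupfile: compile each .c file, then link.
                    # %f expands to the inputs, %o to the outputs,
                    # %B to the input's basename.
                    : foreach *.c |> gcc -c %f -o %o |> %B.o
                    : *.o |> gcc %f -o %o |> hello
                    ```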