1. 18

    What I found most interesting in this piece (in other words, what resonates most with my personal motivation and biases) is down in the comments, from Mathias Hasselmann:

    “They simply don’t know their target audience. They design with that toxic idea in mind that “Grandma and Grandpa must be able to use this”, entirely ignoring the world outside their ivory towers, where casual users happily do all their computing on mobile devices, not desktops. The desktop has shrunk. It’s not mainstream anymore. It’s a tool for information workers again, and making the desktop useless for information workers will not bring a single mobile user back, but it will scare away more and more professionals, for Linux at least.”

    I have a much longer diatribe in the out-queue, as this strongly relates to my projects and my frustration with how user interaction with computing is developing. But really: the concessions everyone (from OSX to Windows and onwards) seems to make race away from “silent/passive by default, with mechanisms configurable to your desires” towards “preset hidden policies that match our perception of what you want - it just works”, rather than advancing the former to be more ergonomic, discoverable, etc.

    1.  

      For someone who knows absolutely nothing about gaming, World of Warcraft, or this thing in particular… what is this? Can someone explain?

      1.  

        World of Warcraft is a massively popular 16-year-old game, and maybe one of the most popular games ever. Since its launch in 2004 it has been changing and evolving into what it is today, which is something completely different from what it was at its inception. Given that a large number of people would like to play Vanilla WoW - that is, the first version of the game, before any expansion was released - Blizzard has decided to roll out a “classic” version with all the content prior to the first expansion. That first expansion was released in early 2007, and there have been many more since.

        Before the company’s official announcement that they would be releasing this classic version, fans made many requests for it, but Blizzard turned them down citing arguments such as “vanilla WoW doesn’t exist anymore since the codebase has continued to evolve” and “vanilla WoW would be looking back, and we want to move forward”. However, a vanilla WoW private server named Nostalrius, maintained by fans for fans, gained such popularity that at its peak it had more than 100k players. Sadly, it had to be closed in 2016 after Blizzard sent them a cease-and-desist order. It would seem that from the whole episode the company realized there was actually a market for a classic WoW, and they eventually changed their mind.

        1. 6

          World of Warcraft is a popular commercial subscription-based cloud-hosted enterprise legacy app featuring a low-grade CRM system married to a highly complex logistics system in a standard three-tier architecture deployed in a fully sharded configuration. Like many legacy systems, it has undergone significant schema mutation over the course of its deployed lifecycle in response to customer demand. Notably, it started out with a mostly-denormalized schema and, with the advent of improved database performance, a better understanding of the customer base’s requirement envelope, and feature creep, it has moved towards Codd’s third normal form.

          As with many legacy apps, some customers’ business needs mandate that they stay pinned to older versions of the app. Interestingly, customers have here asked that an earlier version of a cloud-provided app be made available 12 years later, which poses some interesting issues having to do with incompatible schema migration. Given that the app is also written in a mix of obscure legacy languages, the traditional approach of simply migrating the queries and schema together is technically formidable.

          One established practice here is to create a proxy facade layer. In this pattern, you keep the interface to the legacy client application exactly the way it is, but create an intermediate layer which translates the db calls to and from the normalized format. This incurs round trip cost and bugs are common in edge cases, especially in frequently-undocumented minor shifts in API and field meaning, and especially given the expected low coverage of unit and functional tests in a 12 year old codebase. This technique is frequently overused owing to underestimation of the cost and time complexity of ferreting out the edge cases.

          The other established practice is to perform a one-time wholesale schema migration, normally done either through an ETL toolchain like Informatica, or more commonly through hand-written scripts. This approach frequently takes more developer time than the facade approach, owing to needing to “get it right” essentially all-at-once, and having a very long development loop.

          Whatever the technique used, schema migration programs of this scope need a crisp definition of what success looks like that’s clearly understood by all the involved developers, project managers, data specialists, and product leaders. Too frequently, these types of programs fail owing to incomplete specification and lack of clearly defined ownership boundaries and deliverable dependencies. The industry sector in which this legacy app resides is at greater than average risk for failure of high-scope projects due to fundamental and persistent organizational immaturity and improperly managed program scopes.

          Also, they better not nerf fear, because rogues were super OP in vanilla and getting the full 40 down the chain to rag with portals was tough enough.

          1.  

            As someone who levelled through Stranglethorn Vale via painstaking underwater+Unending Breath grinds in order to escape OP rogue stunlock love, I say to you: Bravo, Sir! Also, f**k the debuff cap.

        1. 5

          Last week

          • Fiddled with different ways of attaching to processes and viewing their states.
          • Some other technical stuff that went well

          This was for the low level debugger I’m trying to make.

          So, from what I’ve read and seen, tools that attach to and inspect other processes tend to just use gdb under the hood. I was hoping for a more minimal debugger to read and copy.

          lldb almost does what I need because of its existing external Python interface, but documentation for writing a stand-alone tool (started from outside the debugger rather than inside) is scattered. I haven’t managed to make it single-step.

          Using raw ptrace and trying to read the right memory locations seems difficult because of things like address randomization. And getting more information involves working with even more memory mapping and other conventions.
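
          For reference, here is about the minimum for the raw approach - a sketch I put together, assuming Linux and a same-user target, with the address already adjusted for any randomized load base (YAMA may still require relaxing ptrace_scope):

          #include <errno.h>
          #include <stdio.h>
          #include <stdlib.h>
          #include <sys/ptrace.h>
          #include <sys/types.h>
          #include <sys/wait.h>

          int main(int argc, char **argv)
          {
              pid_t pid = (pid_t)atoi(argv[1]);          /* target process id */
              long addr = strtol(argv[2], NULL, 16);     /* address to inspect */

              if (ptrace(PTRACE_ATTACH, pid, NULL, NULL) == -1) {
                  perror("attach");
                  return 1;
              }
              waitpid(pid, NULL, 0);                     /* target is now stopped */

              errno = 0;
              long word = ptrace(PTRACE_PEEKDATA, pid, (void *)addr, NULL);
              if (errno)
                  perror("peek");
              else
                  printf("0x%lx: 0x%lx\n", addr, word);

              ptrace(PTRACE_SINGLESTEP, pid, NULL, NULL); /* execute one instruction */
              waitpid(pid, NULL, 0);

              ptrace(PTRACE_DETACH, pid, NULL, NULL);    /* resume the target */
              return 0;
          }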

          I wish all these conventions were written down in some machine-readable, language-agnostic way so I don’t have to human-read each one and try to implement it. Right now this is all implicit in the source code of something like gdb. That’s a lot of extra complexity which has nothing to do with what I’m actually trying to accomplish.

          The raw ptrace approach would also likely only work for Linux. And it is possibly strongly tied to C or assembly.

          The problem with the latter is that eventually I will want to do this to interpreters written in C, or even interpreters written in interpreters written in C. That seems like even more incidental complexity.

          An alternative is to log everything and have a much fancier log viewer after the fact. This way the debugged program only needs to emit the right things to a file or stdout. But this loses the possibility of any interactivity.

          Plus, all of this would only be worth it if I can get some state visualization customizable to that specific program (because usually it will be an interpreter).

          Other questions: How to avoid duplicating the work when performing operations from “inside the program” and from “outside” through the eventual debugger?

          Other ideas: Try to do this with a simpler toy language/system to get an idea of how well using such a workflow would work in the first place.

          Some references

          This week

          • Well, now that I have a better idea of how deep this rabbit hole is, I need to decide what to do. Deciding is much harder than programming…
          • Or maybe I should do one of the other thousand things I want to and have this bit of indecision linger some more.
          1. 5

            I wrote a very simple PoC debugger in Rust if you are interested in the very basics: https://github.com/levex/debugger-talk

            It uses ptrace(2) under the hood, as you would expect.

            1. 1

              Thanks! I’ve had a look at your slides and skimmed some of your code (I don’t have Rust installed, or running it would be the first thing I’d do).

              I see that you’re setting breakpoints by address. How do you figure out the address at which you want to set a breakpoint though?

              How long did it take to make this? And can you comment on how hard it would be to continue from this point on? For example reading C variables and arrays? Or getting line numbers from the call stack?

              1. 2

                Hey, sorry for the late reply!

                In the talk I was setting breakpoints by address indeed. This is because the talk focused on the lower-level parts. To translate line numbers into addresses and vice-versa you need access to the “debug information”. This is usually stored in the executable (as described by the DWARF format). There are libraries that can help you with this (just as the disassembly is done by an excellent library instead of my own code).
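
                To make the lower-level part concrete: once you have an address, the breakpoint itself is just a one-byte patch on x86 - roughly like this (a sketch; pid is assumed to come from an earlier PTRACE_ATTACH, error handling elided):

                #include <stdint.h>
                #include <sys/ptrace.h>
                #include <sys/types.h>

                /* Plant a software breakpoint: save the original word at addr and
                 * overwrite its low byte with int3 (0xCC). When the target hits it,
                 * the kernel stops it with SIGTRAP; to resume transparently, restore
                 * the word, back RIP up by one, and PTRACE_CONT. */
                long set_breakpoint(pid_t pid, uintptr_t addr)
                {
                    long orig = ptrace(PTRACE_PEEKTEXT, pid, (void *)addr, NULL);
                    long patched = (orig & ~0xffL) | 0xcc;
                    ptrace(PTRACE_POKETEXT, pid, (void *)addr, (void *)patched);
                    return orig;   /* caller keeps this to remove the breakpoint later */
                }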

                This project took about a week of preparation and work. I was familiar with the underlying concepts, however Rust and its ecosystem was a new frontier for me.

                Reading C variables is already done :-), reading arrays is just a matter of a new command and reading variables sequentially.

                1. 1

                  Thanks for coming back to answer! Thanks to examples from yourself and others I did get some stuff working (at least on the examples I tried) like breakpoint setting/clearing, variable read/write and simple function calls.

                  Some things from the standards/formats are still unclear, like why I only need to add the start of the memory region extracted from /proc/pid/maps if it’s not 0x400000.

                  This project took about a week of preparation and work. I was familiar with the underlying concepts, however Rust and its ecosystem was a new frontier for me.

                  A week doesn’t sound too bad. Unfortunately, I’m in the opposite situation using a familiar system to do something unfamiliar.

                  1. 2

                    I think that may have to do with whether the executable you are “tracing” is a PIE (Position-Independent Executable) or not.
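
                    Concretely (a sketch, assuming the first line of /proc/<pid>/maps is the executable’s image): for a PIE, runtime address = region start + the offset nm reports, while a classic non-PIE is linked at 0x400000 and nm already prints absolute addresses - which is why you only need to add the region start when it isn’t 0x400000.

                    #include <stdint.h>
                    #include <stdio.h>
                    #include <sys/types.h>

                    /* Return the start of the first mapping in /proc/<pid>/maps.
                     * For a PIE this is the load base to add to nm/DWARF offsets;
                     * for a classic non-PIE it is simply 0x400000. */
                    uintptr_t load_base(pid_t pid)
                    {
                        char path[64];
                        unsigned long base = 0;
                        snprintf(path, sizeof path, "/proc/%d/maps", (int)pid);
                        FILE *f = fopen(path, "r");
                        if (f) {
                            if (fscanf(f, "%lx-", &base) != 1)
                                base = 0;
                            fclose(f);
                        }
                        return (uintptr_t)base;
                    }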

                    Good luck with your project, learning how debuggers work by writing a simple one teaches you a lot.

                2. 2

                  For C/assembly (and I’ll assume a modern Unix system) you’ll need to read up on ELF (object and executable formats) and DWARF (debugging records in an ELF file), which contain all that information. You might also want to look into the GDB remote serial protocol (I know it exists, but I haven’t looked much into it).

                  1. 1

                    Well, I got some addresses out of nm ./name-of-executable but can’t peek at those directly. Probably need an offset of some sort?

                    There’s also dwarfdump I haven’t tried yet. I’ll worry about how to get this info from inside my tracer a bit later.

                    Edit: Nevermind, it might have just been the library I’m using. Seems like I don’t need an offset at all.

                    1. 2

                      I might have missed some other post, but is there a bigger writeup on this project of yours? As to the specifics of digging up such information, take a look at ECFS - https://github.com/elfmaster/ecfs

                      1. 1

                        I might have missed some other post, but is there a bigger writeup on this project of yours?

                        I’m afraid not, at least for the debugger subproject. This is the context. The debugger would fit in two ways:

                        • Since I have a GUI maker, I can try to use it to make a graphical debugger. (Ideally, allowing custom visualizations created for each new debugging task.)
                        • A debugger/editor would be useful for making and editing [Flpc](https://github.com/asrp/flpc) or similar. I want to be able to quickly customize the debugger to also be usable as an external Flpc debugger (instead of just a C debugger). In fact, it’d be nice if I could evolve the debugger and target (=interpreter) simultaneously.

                        I’m mostly thinking of using it for the earlier stages of development, though. Even if I should already be past that stage, being able to (re)make things quickly would make me more inclined to try out major architectural changes, and to add more functionality in C more easily.

                        Ideally, the debugger would also be an editor (write a few instructions, set a SIGTRAP, run, write a few more instructions, etc.; write some other values to memory here and there). But maybe this is much more trouble than it’s worth.

                        Your senseye program might be relevant depending on how customizable (or live-customizable) the UI is. The stack on which it’s built is completely unknown to me. Do you have videos/posts where you use it to debug and/or find some particular piece of information?

                        As to the specifics of digging up such information, take a look at ECFS - https://github.com/elfmaster/ecfs

                        I have to say, this looks really cool. Although in my case, I’m expecting cooperation from the target being debugged.

                        Hopefully I will remember this link if I need something like that later on.

                        1. 2

                          I have to say, this looks really cool. Although in my case, I’m expecting cooperation from the target being debugged.

                          My recommendation, coolness aside: for the ECFS part, Ryan is pretty darn good with the ugly details of ELF, and his code and texts are valuable sources of information on otherwise undocumented quirks.

                          Your senseye program might be relevant depending on how customizable (or live customizable) the UI is. The stack on which its built is completely unknown to me. Do you have videos/posts where you use it to debug and/or find some particular piece of information?

                          I think the only public trace of that is https://arcan-fe.com/2015/05/24/digging-for-pixels/ but it only uses a fraction of the features. The cases I use it for on about a weekly basis touch upon materials that are NDAd.

                          I have a blog post coming up on how the full stack itself maps into debugging and what the full stack is building towards, but the short-short (yet long, sorry for that; the best I could do at the moment) version:

                          Ingredients:

                          Arcan is a display server - a poor word for an output control, rendering and desktop IPC subsystem. The IPC subsystem is referred to as SHMIF. It also comes with a mid-level client API, TUI, which roughly corresponds to ncurses, but with a more desktop-y feature set; it sidesteps terminal protocols for better window manager integration.

                          The SHMIF IPC part that is similar to a ‘Window’ in X is referred to as a segment. It is a typed container composed of one big block (video frame), a number of small chunked blocks (audio frames), and two ring buffers as input/output queues that carry events and file descriptors.

                          Durden acts as a window manager (meta-UI). This mostly means input mapping, configuration tracking, interactive data routing and window layouting.

                          Senseye comes in three parts. The data providers, sensors, have some means of sampling with basic statistics (memory, file, ..) which gets forwarded over SHMIF to Durden. The second part is the analysis and visualization scripts built on the scripting API in Arcan. Lastly, there are translators: one-off parsers that take some incoming data from SHMIF, parse it and render some useful human-level output, optionally annotated with parsing-state metadata.

                          Recipe:

                          A client gets a segment on connection, and can request additional ones. But the more interesting scenario is that the WM (Durden in this case) can push a segment as a means of saying ‘take this, I want you to do something with it’, and the type is a mapping to whatever UI policy the WM cares about.

                          One such type is Debug. If a client maps this segment, it is expected to populate it with whatever debugging/troubleshooting information the developer deemed relevant. This is the cooperative stage: it can be activated and deactivated at runtime without messing with STDERR, and we can stop with the printf() crap.

                          The thing that ties it all together: if a client doesn’t map a segment that was pushed on it, because it doesn’t want to or already has one, the shmif-api library can sneakily map it and do something with it instead. Like provide a default debug interface preparing the process to attach a debugger, or activate one of those senseye sensors, or …

                          Hierarchical dynamic debugging, both cooperative and non-cooperative, bootstrapped by the display server connection - retaining chain of trust without a sudo ptrace side channel.

                          Here’s a quick PoC recording: https://youtu.be/yBWeQRMvsPc where a terminal emulator (written using TUI) exposes state machine and parsing errors when it receives a “pushed” debug window.

                          So what I’m looking into right now is writing the “fallback” debug interface, with some nice basics, like stderr redirect, file descriptor interception and buffer editing, and a TUI for lldb to go with it ;-)

                          The long term goal for all this is “every byte explained”, be able to take something large (web browser or so) and have the tools to sample, analyse, visualise and intercept everything - show that the executing runtime is much more interesting than trivial artefacts like source code.

                          1. 1

                            Thanks! After reading this reply, I’ve skimmed your latest post submitted here and on HN. I’ve added it to my reading list to consider more carefully later.

                            I don’t fully understand everything yet, but I get the gist of a number of the pieces.

                            I think the only public trace of that is https://arcan-fe.com/2015/05/24/digging-for-pixels/ but it only uses a fraction of the features.

                            Thanks, this gives me a better understanding. I wouldn’t mind seeing more examples like this, even if contrived.

                            In my case I’m not (usually) manipulating (literal) images or video/audio streams though. Do you think your project would be very helpful for program state and execution visualization? I’m thinking of something like Online Python Tutor. (Its source is available, but unfortunately everything is mixed together and it’s not easy to just extract the visualization portion. Plus, I need it to be more extensible.)

                            For example, could you make it so that you could manually view the result for a given user-input width, then display the edges found (either overlaid or separately) and finally, after playing around with it a bit (and possibly with objective functions other than edges), automatically find the best width as shown in the video? (And would this be something that’s easy to do?) Basically, a more interactive workflow.

                            The thing that ties it all together - if a client doesn’t map a segment that was pushed on it, because it doesn’t want to or already have one, the shmif-api library can sneakily map it and do something with it instead.

                            Maybe this is what you already meant here and by your “fallback debug interface”, but how about having a separate process for “sneaky mapping”? That way SHMIF remains a “purer” IPC, but you can add an extra process in the pipeline to do this kind of mapping. (And some separate default/automation can be toggled to have it happen automatically.)

                            Hierarchical dynamic debugging, both cooperative and non-cooperative, bootstrapped by the display server connection - retaining chain of trust without a sudo ptrace side channel.

                            Here’s a quick PoC recording: https://youtu.be/yBWeQRMvsPc where a terminal emulator (written using TUI) exposes state machine and parsing errors when it receives a “pushed” debug window.

                            Very nice! Assuming I understood correctly, this takes care of the extraction (or in your architecture, push) portion of the debugging.

                            1. 3

                              Just poke me if you need further clarification.

                              For example, could you make it so that you could manually view the result for a given user-input width, then display the edges found (either overlayed or separately) and finally after playing around with it a bit (and possibly other objective functions than edges), automatically find the best width as show in the video? (And would this be something that’s easy to do?) Basically, a more interactive workflow.

                              The real tool is highly interactive; that’s its basic mode of operation. It’s just the UI that sucks, and that’s why it’s being replaced with Durden, which has been my desktop for a while now. This video shows a more interactive side: https://www.youtube.com/watch?v=WBsv9IJpkDw including live sampling of memory pages (somewhere around 3 minutes in).

                              Maybe this is what you already meant here and by your “fallback debug interface” but how about having a separate process for “sneaky mapping”? So SHMIF remains a “purer” IPC but you can an extra process in the pipeline to do this kind of mapping. (And some separate default/automation can be toggled to have it happen automatically.)

                              It needs both. I have a big bag of tricks for the ‘in process’ part, and with YAMA and other restrictions on ptrace these days, the process needs some massage to be ‘external debugger’-ready. Though some default of “immediately do this” will likely be possible.

                              I’ve so far just thought about it interactively, with the sort-of goal that it should be, at most, 2-3 keypresses from having a window selected to digging around inside its related process, no matter what you want to measure or observe. https://github.com/letoram/arcan/blob/master/src/shmif/arcan_shmif_debugif.c (not finished by any stretch) binds the debug window to the TUI API and will present a menu.

                              Assuming I understood correctly, this takes care of the extraction (or in your architecture, push) portion of the debugging

                              Exactly.

                              1. 2

                                Thanks. So I looked a bit more into this.

                                I think the most interesting part for me at the moment is the disassembly.

                                I tried to build it, just to see. I eventually followed these instructions but can’t find any Senseye-related commands in any menu in Durden (global or target).

                                I think I managed to build senseye/senses correctly.

                                Nothing obvious stands out in tools. I tried both symlinks

                                /path/to/durden/durden/tools/senseye/senseye
                                /path/to/durden/durden/tools/senseye/senseye.lua
                                

                                and

                                /path/to/durden/durden/tools/senseye
                                /path/to/durden/durden/tools/senseye.lua
                                

                                Here are some other notes on the build process

                                Libdrm

                                On my system, the include flag -I/usr/include/libdrm and linker flag -ldrm are needed. I don’t know CMake, so I don’t know where to add them. (I manually edited and ran the commands make VERBOSE=1 was running to get around this.)

                                I had to replace some CODEC_* with AV_CODEC_*

                                Durden

                                Initially, Durden without -p /path/to/resources would not start, saying some things were broken. I can’t reproduce it anymore.

                                Senseye
                                cmake -DARCAN_SOURCE_DIR=/path/to/src ../senses
                                

                                complains about ARCAN_TUI_INCLUDE_DIR and ARCAN_TUI_LIBRARY not being found:

                                CMake Error: The following variables are used in this project, but they are set to NOTFOUND.
                                Please set them or make sure they are set and tested correctly in the CMake files:
                                ARCAN_TUI_INCLUDE_DIR
                                
                                Capstone

                                I eventually installed Arcan instead of just having it built and reached this error

                                No rule to make target 'capstone/lib/libcapstone.a', needed by 'xlt_capstone'.
                                

                                I symlinked capstone/lib64 to capstone/lib to get around this.

                                Odd crashes

                                Sometimes, Durden crashed (or at least exited without notice) like when I tried changing resolution from inside.

                                Here’s an example:

                                Improper API use from Lua script:
                                	target_disphint(798, -2147483648), display dimensions must be >= 0
                                stack traceback:
                                	[C]: in function 'target_displayhint'
                                	/path/to/durden/durden/menus/global/open.lua:80: in function </path/to/durden/durden/menus/global/open.lua:65>
                                
                                
                                Handing over to recovery script (or shutdown if none present).
                                Lua VM failed with no fallback defined, (see -b arg).
                                
                                Debug window

                                I did get target->video->advanced->debug window to run though.

                                1. 2

                                  I’d give it about two weeks before running senseye as a Durden extension is in a usable shape (with most, but not all features from the original demos).

                                  A CMake FYI - normally you can patch the CMakeCache.txt and just make. Weird that it doesn’t find the header though, src/platform/cmake/FindGBMKMS.cmake quite explicitly looks there, hmm…

                                  The old videos represent the state where senseye could run standalone and did its own window management. For running senseye in the state it was in before I started breaking/refactoring things, the setup is a bit different and you won’t need durden at all. Just tested this on OSX:

                                  1. Revert to an old arcan build ( 0.5.2 tag) and senseye to the tag in the readme.
                                  2. Build arcan with -DVIDEO_PLATFORM=sdl (so you can run inside your normal desktop) and -DNO_FSRV=On so the recent ffmpeg breakage doesn’t hit (the AV_CODEC stuff).
                                  3. Build the senseye senses like normal, then arcan /path/to/senseye/senseye

                                  Think I’ve found the scripting error, testing when I’m back home - thanks.

                                  The default behavior on a scripting error is to shut down forcibly, even if it could recover, in order to preserve state in the log output. The -b argument lets you set a new app (or the same one) to switch to and migrate any living clients to; arcan -b /path/to/durden /path/to/durden would recover “to itself”. Surprisingly enough, this can be so fast that you don’t notice it has happened :-)

                                  1. 1

                                    Thanks, with these instructions I got it compiled and running. I had read the warning in senseye’s readme but forgot about it after compiling the other parts.

                                    I’m still stumbling around a bit, though that’s what I intended to do.

                                    So it looks like the default for sense_mem is to not interrupt the process. I’m guessing the intended method is to use ECFS to snapshot the process and view later. But I’m actually trying to live view and edit a process.

                                    Is there a way to view/send things through the IPC?

                                    From the wiki:

                                    The delta distance feature is primarily useful for polling sources, like the mem-sense with a refresh clock. The screenshot below shows the alpha window picking up on a changing byte sequence that would be hard to spot with other settings.

                                    Didn’t quite understand this example. Mem diff seems interesting in general.

                                    For example, I have a program that changes a C variable’s value every second. Assuming we don’t go read the ELF header, how can senseye be used to find where that’s happening?

                                    From another part of the wiki

                                    and the distinct pattern in the point cloud hints that we are dealing with some ASCII text.

                                    This could use some more explanation. How can you tell it’s ASCII from just a point cloud?

                                    Minor questions/remark

                                    Not urgent in any way

                                    • Is there a way to start the process as a child so ./sense_mem needs less permissions?
                                    • Is there a way to view registers?
                                    Compiling

                                    Compiling senseye without installing Arcan with cmake -DARCAN_SOURCE_DIR= still gives errors.

                                    I think the first error was about undefined symbols that were in platform/platform.h (arcan_aobj_id and arcan_vobj_id).

                                    I can try to get the actual error message again if that’s useful.

                                    1. 2

                                      Thanks, with these instructions I got it compiled and running. I had read the warning in senseye’s readme but forgot about it after compiling the other parts. I’m still stumbling around a bit, though that’s what I intended to do.

                                      From the state you’re seeing it in, it is very much a research project hacked together while waiting at airports :-) I’ve accumulated enough of an idea to distill it into something more practically put together - but I’m not quite there yet.

                                      Is there a way to view/send things through the IPC?

                                      At the time it was written, I had just started to play with that (if you see the presentation slides, that’s the fuzzing bit; the actual sending works very much like a clipboard paste operation). The features are in the IPC system now, though not yet mapped into the sensors.

                                      So it looks like the default for sense_mem is to not interrupt the process. I’m guessing the intended method is to use ECFS to snapshot the process and view later. But I’m actually trying to live view and edit a process.

                                      Yeah, sense_mem was just about the whole “what does it take to sample / observe process memory without poking it with ptrace” question. Those controls and some other techniques are intended to be bootstrapped via the whole IPC system in the way I talked about earlier. That should kill the privilege problem as well.
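
                                      (For the Linux mechanics: one way to do that kind of stop-free sampling is process_vm_readv - the sketch below is an illustration of the idea rather than the sense_mem code; pid, address and the one-page window are placeholders, and it needs the same permission YAMA grants ptrace, without ever stopping the target.)

                                      #define _GNU_SOURCE
                                      #include <stdio.h>
                                      #include <stdlib.h>
                                      #include <string.h>
                                      #include <sys/types.h>
                                      #include <sys/uio.h>
                                      #include <unistd.h>

                                      #define PAGE 4096

                                      int main(int argc, char **argv)
                                      {
                                          pid_t pid  = (pid_t)atoi(argv[1]);
                                          void *addr = (void *)strtoul(argv[2], NULL, 16);
                                          static unsigned char prev[PAGE], cur[PAGE];

                                          struct iovec local  = { cur,  PAGE };
                                          struct iovec remote = { addr, PAGE };

                                          /* Re-sample once a second and report changed bytes -
                                           * the essence of a refresh-clocked memory sensor. */
                                          while (process_vm_readv(pid, &local, 1, &remote, 1, 0) == PAGE) {
                                              for (size_t i = 0; i < PAGE; i++)
                                                  if (cur[i] != prev[i])
                                                      printf("+%zu: %02x -> %02x\n", i, prev[i], cur[i]);
                                              memcpy(prev, cur, PAGE);
                                              sleep(1);
                                          }
                                          return 0;
                                      }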

                                      Didn’t quite understand this example. Mem diff seems interesting in general.

                                      The context menu for a data window should have a refresh clock option. If that’s activated, it will re-sample the current page and mark which bytes changed. Then the UI/shader for alpha window should show which bytes those are.

                                      For example, I have a program that changes a C variable’s value every second. Assuming we don’t go read the ELF header, how can senseye be used to find where that’s happening?

                                      The intended workflow was something like “dig around in memory, look at projections or use the other searching tools to find data of interest” -> attach translators -> get symbolic /metadata overview.

                                      and the distinct pattern in the point cloud hints that we are dealing with some ASCII text. This could use some more explanation. How can you tell it’s ASCII from just a point cloud?

                                      See the linked videos on “voyage of the reverser” and the Recon 2014 video of “cantor dust”, i.e. a feedback loop of projections + training + experimentation. The translators were the tool intended to make the latter stage easier.

                3. 3

                  If you are looking for references on debuggers, then the book How Debuggers Work may be helpful.

                1. 2
                  1. 6

                    Having gone through the process multiple times in my more formative years, I can only chime in and say that the value of this exercise can’t be overstated - a lot of computing unfolds the more you do it ‘raw’ (including reversing), and the deeper you dive (everything is buggy, the bugs need to be discovered and replicated, and timing is a bitch) the more you get. It’s the perfect area for training reverse engineering, cracking, …

                    See also: https://patpend.net/articles/ar/aev021.txt

                    1. 4

                      FWIW - though I rarely agree with the decisions in libinput on this matter or others, who-t deserves praise for the work, rigour and analysis done here.

                      That said, there’s some kind of “facepalm” to be had in that more effort and real engineering is being put into assuring a physical mapping between mouse samplerate/sensor resolution and physical travel than into other parts of the display stack (big topic, but it ties into mixed-DPI output and Wayland’s quite frankly retarded solution of using buffer scale factors).

                      My personal opinion / experience is that acceleration is the wrong solution to the problem - and there’s a draft in my article pile digging into ‘why’. The problem being that with big / multiple screens, the travel time (= effort providing the input) for moving the mouse cursor between different targets (windows) is what warrants breaking linearity.

                      An alternative is what I added in durden for keyboard-dominant window management schemes, where the WM remembers position per window and ‘warps’ when your keyboard bindings change the selected window.

                      For stacking/floating, get an eye tracker(!) and let eye gaze region determine cursor start position (bias with a sobel filter and contrast within that region) and mouse motion set linear delta from that. Nobody has done that yet though ;-)

                      1. 1

                        I seem to recall similar acceleration settings on other platforms (though, with a single slider for acceleration ratio). Has anybody done an analysis with similar curves for, ex., mouse acceleration on Windows?

                        1. 1

                          In part 3: “I’ll probably also give the windows/OS X approaches a try (i.e. same curve, different constant deceleration) and see how that goes. If it works well, that may be a solution because it’s easier to scale into a large range. Otherwise, shrug, someone will have to come up with a better solution.”

                          1. 1

                            Ah, I missed that. Thanks!

                        1. 2

                          So I couldn’t tell from skimming this article, does this software at all address the X11-lets-anyone-be-a-keylogger problem?

                          1. 5

                            This article is only ‘ranty technical notes from when I …’; the full ‘how does this actually solve the X11 and Wayland problems (that I know of)’ needs a much more careful and thorough explanation.

                            But yes, the software itself addresses the keylogger problem, the clipboard monitoring problem, the window/screen sharing problem, the display gamma control problem and the window management problem without sacrificing features.

                            1. 2

                              Where can I find more info on those claims?

                              1. 4

                                It’s currently scattered across the wiki and the blog posts. It’s a huge topic that I think should be considered in the context of the features each security risk is related to, not in isolation like “no screen reading” or “no keylogging”, or the perspective suffers. Therefore, for a coherent comparison, there is a three-article series currently being written, with the first one close to finished. Wait for ‘Arcan vs. Xorg: nearing feature parity’ (followed by ‘at feature parity’ and ‘beyond feature parity’).

                                The much condensed basic idea:

                                1. The WM explicitly sets routing policy for all input, including the ability to filter, synthesise (fake), broadcast or multicast it.
                                2. The WM has access to trust origin via two mechanisms: trusted clients are spawned by the display server process, inheriting IPC primitives; untrusted clients connect through a consume-on-use named connection point.
                                3. A client connection gets a fixed set of basic primitives, and negotiates for more with the default policy being reject.
                                4. The WM can assign different policy- and permission- sets based on trust origin from 2.
                                5. User picks the WM (set of scripts).
                          1. 3

                            doing what weston-launch did, but in a saner way

                            weston-launch is already pretty sane :) Shameless plug: I Rewrote It In Rust (well, not protocol compatible with the original weston-launch, just the same idea.) My version uses Capsicum sandboxing on FreeBSD. The (privileged) parent has a socket to the child, a kqueue and a descriptor to /dev, that’s it. No way to open ~/.ssh/secret_key, it can only openat files under /dev.
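
                              The essence of that parent/child split, sketched in C for readers unfamiliar with Capsicum (a concept sketch, not the actual code; paths are illustrative):

                              #include <sys/capsicum.h>
                              #include <err.h>
                              #include <fcntl.h>
                              #include <unistd.h>

                              int main(void)
                              {
                                  /* Acquire the only directory we will ever need, up front. */
                                  int devfd = open("/dev", O_RDONLY | O_DIRECTORY);
                                  if (devfd < 0)
                                      err(1, "open /dev");

                                  /* Enter capability mode: global namespaces disappear. */
                                  if (cap_enter() < 0)
                                      err(1, "cap_enter");

                                  /* Still works - relative to the held /dev descriptor... */
                                  int ev = openat(devfd, "input/event0", O_RDONLY);

                                  /* ...but this can never succeed: absolute paths now fail
                                   * with ECAPMODE, so ~/.ssh/secret_key is out of reach. */
                                  int key = open("/root/.ssh/secret_key", O_RDONLY);

                                  (void)ev; (void)key;
                                  return 0;
                              }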

                            [evdev] gets an ever increasing pile of “helper” libraries (libevdev, libudev, libinput, libratbag, libkilljoy, libreadabuttonpress, libvaluebetween0and1, …)

                              Okay, that’s not fair. udev is hotplug, and ratbag is for configuring dank gaming LEDs on gaming mice. It makes perfect sense that the kernel reports simple events and userspace (libinput) interprets them in complex ways (touch gestures, calibration, etc.).

                            evdev is one of the few parts of Linux that don’t suck, and I’m very happy to have it on FreeBSD too.
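
                              To show how simple the kernel’s side is, here is a minimal evdev reader (a sketch; the Linux header is shown - FreeBSD’s evdev exposes the same struct via dev/evdev/input.h - and the device node is just an example):

                              #include <fcntl.h>
                              #include <linux/input.h>   /* struct input_event */
                              #include <stdio.h>
                              #include <unistd.h>

                              int main(void)
                              {
                                  /* Every report is just (time, type, code, value); all the hard
                                   * interpretation - gestures, acceleration, keymaps - happens in
                                   * userspace, e.g. libinput. */
                                  int fd = open("/dev/input/event0", O_RDONLY);
                                  struct input_event ev;
                                  while (read(fd, &ev, sizeof ev) == (ssize_t)sizeof ev)
                                      printf("type=%u code=%u value=%d\n",
                                             (unsigned)ev.type, (unsigned)ev.code, ev.value);
                                  return 0;
                              }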

                            1. 6

                              weston-launch is already pretty sane :)

                                It’s the decoupling between -launch and the main process being a hard-coded exec that I have a beef with. It hurts when you have multiple versions installed, etc. So yes, I’ve had futile weston debugging sessions boiling down to forgetting that weston-launch does this.

                                I took a peek at your plug - now my Rust is a bit non-existent, but to me it looks like you can path-traverse out of drm and input from those starts_with checks and collect sensitive data from the other device nodes in there? i.e. why xenocara went with a whitelist.

                              evdev is one of the few parts of Linux that don’t suck, and I’m very happy to have it on FreeBSD too

                                Have you actually seriously worked with it, as in the real thing, and not a reasonably clean “we ported the structs and ioctls” without someone wrapping the udev+libevdev+device-fixup-database stack?

                                The cracks don’t start to show until you work with input devices that aren’t your basic keyboard and mouse. They widen quickly, and grow large enough to swallow you whole after that. The ‘big box of fun’ input devices that I try whenever I feel the need to lower my spirits covers everything from joysticks that randomly get exposed as a keyboard and as a joystick depending on what else I have plugged into the bus, to mice and keyboards that expose LEDs but then require separate custom HID packets for the changes to activate, etc.

                              To clarify my stance: evdev is much too simple for the devices it provides access for - lacking basics like filter controls, ambiguous and unreliable device identity (both collisions and failure to track the same device moving between connected ports), lost relationships (for devices that split up into 2+ separate subdevices). Just look at this small slice of nightmare fuel, a condensed version of what udev is doing.

                              1. 1

                                  Yeah, protecting other devices is not something I’ve bothered to do yet. But it’ll be easy to just open the input and dri subdirectories. It’s a bit unfortunate that the virtual terminals aren’t in a subdirectory, but opening the vt doesn’t take user input.

                                About evdev, well, Synaptics touchpad + TrackPoint and Windows Precision touchscreen work fine. I actually added Apple Magic Trackpad support to bthidd and that mostly works too. I also made a little daemon that generates events on a virtual device based on a user script that gets events from other devices.

                                  We don’t have gamepad support yet… also we don’t have udev at all, only a stub libudev that uses devd for hotplug notifications.

                                Is there a real need to classify devices as mouse/keyboard/etc? Why not assume that any device is everything at once? The only meaningful distinction I can think of is touchpad vs touchscreen (both send absolute events)…

                                1. 3

                                    Most of the gain from knowing the separation is in the user interface and the performance-optimization stages, where it can be incredibly noticeable. Since everything is dynamic / hotplug and non-deterministic, it’s hard to know if the system is in a usable state or not if you don’t preserve type. Then there’s the threat model that comes with something like a rubber ducky…

                                    Some general points: for keyboards you want to load/probe for translation tables/state machines, repeat-rate controls, and recovery keybindings.

                                  For mice you want dynamic sample rate controls dependent on the type of your input focus, and with most optical mice today, detect / discard jitter if at rest. The more you can early-discard or memory map instead of queue+copy the better. If you work with accessibility or want to reduce physical motion, pair it with an eye tracker (adding support for that right now).

                                    For gamepads, you really want them disabled at the lowest level until they are actually needed, or they can murder your input stack. Something like the PS3 or PS4 controller exposes 30-ish analog inputs, each at very high sample rates; even if they’re not used, they get sampled, read, forwarded, queued, copied, and much later discarded.

                                    Touchpad vs touchscreen (there’s a lot here): both device classes can actually send both absolute and relative events - think of the Z axis or “pressure” (some touchscreens can even give you a rough 2.5D heightmap of what they “see”, while others hide the analysis in firmware). These behave/“feel” better if you align or resample them to the sample rate of the current output display, and there are big gains if you hook up to CPU/GPU governors. Precision and calibration vary with room temperature and grounding / signal environment.

                                  1. 1

                                    Yeah, protecting other devices is not something I bothered to do yet. But it’ll be easy to just open the input and dri subdirectories. A bit unfortunate that the virtual terminals aren’t in a subdirectory, but opening the vt doesn’t take user input.

                                    I guess it’s not very portable, but in the past I’ve used devfs rulesets[1] in chroots or jails for isolating device nodes for certain daemons.

                                    [1] https://www.freebsd.org/cgi/man.cgi?query=devfs.rules&sektion=5&n=1

                              1. 1

                                This is pretty cool and could drive me to try out arcan. I wonder what would be needed to build durden on openbsd.

                                1. 2

                                  Can’t find my notes on what I needed, but from a clean 6.3 there were not many packages needed (for arcan, that is; durden is “only” lua scripts): llvm+clang+cmake for build deps, then freetype/mesa/sqlite/egl/gbm along with the in-source stuff for lua/openal.

                                  The “last mile” stuff is still rather rough around the edges, but we’re just about reaching the whole “sane config presets”, “proper packaging” etc. stage - mechanisms first, then policies :-)

                                1. 8

                                    Finishing bringup of arcan on OpenBSD 6.3 (+article). Mostly rework left to take advantage of EVFILT_DEVICE for display hotplug. If there’s any time left over, I’ll investigate some bug where the keyboard goes nuts if I run wsconsctl from within a graphical terminal (manual ssh in and kill needed to recover, console filled with wskbd_input: evar->q == NULL).

                                  1. 5

                                    Lots of hacking this week as I was forced to take some vacation. What better way is there than to spend it coding on my own projects? The big item will be posted to lobste.rs - but I also plan to start the revival of my experimental visual debugging / reverse engineering toolkit, Senseye (https://github.com/letoram/senseye/wiki) If you are interested in the domain - make sure to look at the voyage of the reverser presentation from BH2010.

                                    1. 8

                                        These kinds of posts tend to take one of two stances on ‘network transparency’, depending on what the author actually wants to attack, and if there is a reference to the “standup over substance” Daniel Stone talk on X11 vs Wayland, you can be sure it will be the ‘haha, X isn’t actually transparent’ one, followed by perplexed users saying that it is actually very useful to them. The reddit thread referencing this article is one such example.

                                        The first camp talks about the drawing primitives and commands as such, saying that the actual drawing commands should be sent and treated locally in the same way as they are treated remotely - and since the toolkits have all turned data types and parameters into opaque pixmaps, opaque transfers are needed.

                                        The ‘my user experience’ camp talks about the convenience of just saying ‘run graphics program’ from a remote shell and having it appear on the local machine similarly enough to how it would behave had it actually been run locally. That’s the meaning of transparency to them.

                                        The latter perspective is, in my opinion, more interesting, since it can be improved on in many more ways than just switching compression schemes. Incidentally, I might just have something for that in the works…

                                      1. 3

                                        Most basic applications work fine with X11 forwarding. Stone’s talk stating X isn’t transparent is really referring to specific use cases (direct rendering and opengl), which, of course, isn’t going to work over the network.

                                          I agree with his talk that we need to move forward from X, but you can’t just hand-wave away a lot of features that people currently use. Usability matters. It took Pulseaudio a long time to get to a usable state where it’s not the first thing I try to disable (I think it works pretty well now, especially with bluetooth devices). Systemd is still terrible in terms of usability, with its awful CLI interface.

                                        1. 4

                                            Stone’s talk stating X isn’t transparent is really referring to specific use cases (direct rendering and opengl), which, of course, isn’t going to work over the network.

                                            Gaming over the network has been around for some time. Some products were around before the cloud became a thing. These days, some are doing gaming desktops and rendering workstations with cloud VMs. It’s probably more that X is an outdated design being made to do things in a totally different context it can’t handle.

                                          1. 2

                                            specific use cases (direct rendering and opengl), which, of course, isn’t going to work over the network.

                                            Couldn’t OpenGL, at least, be made to work over a network? It’s just commands which have larger effects; I’d think some sort of binary protocol could transmit it fairly effectively.

                                        1. 15

                                          Spiritually similar to Baudelaire and “the finest trick of the devil is to persuade you that he does not exist”, the biggest deception search engines played was switching the meaning of ‘search’ from ‘select from what it knows, that which bests matches the criterion you provided’ into ‘select from what it knows, influenced by what it knows about you, that which <some deity - advertisers, hostile governments, …> wants you to know’.

                                          The saving grace is still how embarrassingly crap they are at it ;-)

                                          1. 6

                                            I’m all over the place this week. Some minor things to fix on the OpenBSD port of Arcan, along with a writeup on the porting experience as such. Then there is some low level stuff when it comes to accelerated/compressed handle passing/mapping/drawing for multi-planar YUV formats that falls in ‘tedious, please kill me, has to be done..’. To not burn out on that I’m playing around with support for 180/360 stereoscopic video playback on HMDs.

                                            1. 4

                                              Your best bet is probably to avoid modifying the environment at all, if possible.

                                              This is sound advice. Consider the environment immutable and your sanity will be preserved.

                                              1. 3

                                                This starts getting a bit rough when you’re using libraries that are relying on stuff like locale.

                                                I don’t know if the solution is to edit the environment, but the fact that so many C libraries change behavior based on env vars bubbles up in so many places. It’s a bit of a rough foundation.

                                                1. 2

                                                  I treat the environment as immutable, but often call sub-processes with altered/augmented environments, e.g.

                                                  #!/usr/bin/env bash
                                                  doSomethingWith "$FOO" "$BAR"
                                                  FOO="A different foo" someOtherThing
                                                  

                                                  For LOCALE in particular, Haskell (compiled/interpreted with GHC, at least) will crash on non-ascii data if LOCALE isn’t set to an appropriate value, so I often invoke Haskell programs with LOCALE="en_US.UTF-8" myHaskellProgram.

                                                  I run into this a lot in build scripts (e.g. using Nix), since their environments are given explicitly, to aid reproducibility (rather than, say, inheriting such things from some ambient user or system config).

                                                  I imagine this would be extra painful if using libraries which need conflicting env vars.

                                                  Racket handles this quite nicely using dynamic binding (AKA “parameters”), which we can override for the evaluation of particular expressions. It feels very much like providing an overridden env to a sub-process. For example, I wrote this as part of a recent project (my first Racket project, actually):

                                                  ;; Run BODY with VARS present in the environment variables
                                                  (define (parameterize-env vars body)
                                                    (let* ([old-env (environment-variables-copy (current-environment-variables))]
                                                           [new-env (foldl (lambda (nv env)
                                                                             (environment-variables-set! env (first  nv)
                                                                                                         (second nv))
                                                                             env)
                                                                           old-env
                                                                           vars)])
                                                      (parameterize ([current-environment-variables new-env])
                                                        (body))))
                                                  

                                                  Looking at it now, it might have been nicer as a macro, like (with-env (("FOO" "new foo value")) body1 body2 ...).

                                                  1. 2

                                                    Locale-defined behaviour is indeed an ugly little duckling that will never become a swan. One of the weird bugs I’ve been called in on over the years involved a C+Lua VM + script that was outputting CSV with “,” as the element separator in a longer processing chain (the system was quite big; simplified for the sake of the story).

                                                    The dev had been observant and had actually checked the radix point and locale before relying on its behaviour in printf-related functions. Someone had linked in another library, though, that indirectly called some X+Dbus nonsense, received language settings that way, and changed the locale mid-stream - asynchronously. Sometimes the workload finished; sometimes a few gigabytes had gone by and corruption was a fact, as floats turned into a,b rather than a.b after a while…
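
                                                    The failure mode is trivial to reproduce (a toy version of the same class of bug, assuming a de_DE locale is installed):

                                                    #include <locale.h>
                                                    #include <stdio.h>

                                                    int main(void)
                                                    {
                                                        printf("%.2f\n", 3.14);   /* "3.14" - C locale radix point */

                                                        /* What the rogue library effectively did, mid-stream: */
                                                        setlocale(LC_NUMERIC, "de_DE.UTF-8");

                                                        printf("%.2f\n", 3.14);   /* "3,14" - the CSV is now corrupt */
                                                        return 0;
                                                    }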

                                                  1. 1

                                                    To add to your pile: Systemic Software Debugging - free, almost entirely unknown, and written by yours truly along with a colleague. It was not really intended as a book as such - the writing project had a few odd constraints, given that the target was course material for mostly self-taught seniors who worked almost exclusively with debugging difficult problems, and we needed some common ground for the other parts of the course that I, sadly, can’t go into. Got some stories out of it though :-) …

                                                    Some of my thesis work a while back was also in this direction, though it had to be disguised because academia and funding disapprove when you get too practical (though there is ‘retooling and securing systemic debugging’ as an article). From that dark period I’ve gone through a few books as well. Some that come to mind:

                                                    “The Science of Debugging” (Telles/Hsieh), “Debugging by Thinking” (Metzger), “If I Only Changed the Software, Why Is the Phone on Fire?” (Simone), the bland one by Zeller, “Advanced Windows Debugging” (Hewardt/Pravat), “How Debuggers Work” (Rosenberg), “Linkers & Loaders” (Levine), … but in general I’d give the literature (including my own modest contributions) a grunting sound as a kind-of review and summary.

                                                    1. 1

                                                      Awesome. Thank you! I’ve added it to my queue. It looks like a fun read.

                                                      I’ve read or own all the other books you recommended, except for Rosenberg’s book, strangely enough. The advice is helpful, though, because it’s good to know the common/popular set of books that are typical for the topic.

                                                    1. 6

                                                      Recently there’s been a lot of discussion of keyboard latency, as it is often much higher than reasonable. I’m interested in how much the self-built keyboard community is aware of the issue. Tristan Hume recently improved the latency of his keyboard from 30ms to 700µs.

                                                      1. 2

                                                        The Planck that Dan and I tested had 40ms of latency - not sure how much that varies from unit to unit though.

                                                        1. 3

                                                          I would expect very little, given the QMK firmware with a custom keymap. There’s typically only a handful of lines of C with a couple of ifs, no loops.

                                                        2. 2

                                                          Why are those levels of latency problematic? I would think anything under 50ms feels pretty much instantaneous. Perhaps for people with very high typing speeds or gamers?

                                                          1. 1

                                                            The end-to-end latency on a modern machine is definitely noticeable (often in the 100s of ms). Many keyboards add ~50 ms alone, and shaving that off results in a much nicer UX. It is definitely noticeable comparing, say, an Apple 2e (~25ms end-to-end latency) to my machine (~170ms end-to-end latency, IIRC).

                                                          2. 1

                                                            I recall reading about that. I’ll see about getting some measurements made, and see what it’s like on my Planck.

                                                            I’m interested in how much the self-built keyboard community is aware of the issue

                                                            I haven’t really seen much about it :/ If we could find an easy way of measuring latency without needing the RGB LEDs and camera, that would be good.

                                                            1. 2

                                                              a simple trick - use a contact microphone (piezo), jack it into something like https://www.velleman.eu/products/view/?id=435532

                                                          1. 1

                                                            How long until scammers leverage this in phishing campaigns as foreplay to extortion…