I’ve never heard of this joke, but if you want to find a cycle in a linked list, the canonical way to do it is to use two pointers and have one walk one step at a time, while the other walks two. If they are ever equal after they take a step, there’s a cycle. Using signals is – I think it’s safe to say – way over the top. (Unless that’s the joke.)
That is the joke. The way I heard it was: keep free()ing the nodes, and if there’s a crash (due to double-free) you found a cycle.
I’ve heard of the two-pointer approach (with pointer variables typically given the names “tortoise” and “hare”), but I really like the double-free approach.
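For reference, the tortoise-and-hare check described above is only a few lines of C; a minimal sketch:

```c
#include <stdbool.h>
#include <stddef.h>

struct node {
    int value;
    struct node *next;
};

/* Floyd's "tortoise and hare": advance one pointer by one step and the
 * other by two; if they ever meet after stepping, the list has a cycle. */
bool has_cycle(struct node *head)
{
    struct node *slow = head, *fast = head;

    while (fast != NULL && fast->next != NULL) {
        slow = slow->next;        /* one step */
        fast = fast->next->next;  /* two steps */
        if (slow == fast)
            return true;
    }
    return false;  /* fast ran off the end: no cycle */
}
```

No signals, no free()ing, and it runs in O(n) time with O(1) extra space.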
(1) The memory sizes are some fixed numbers, e.g., 32, 64, 128, etc., not random;
I had a hard time parsing this line. What does “fixed numbers” even mean? Does it mean the numbers have to be compile-time constants?
Sorry for my poor English. I use the memory size as the key of a hashtable, and the size of the hashtable’s key space affects performance. If the user only allocates memory in a few “fixed” sizes, it performs better; otherwise, performance may be degraded. For example, I use CUDA to do some mathematical computation and only allocate memory in a couple of sizes (8K, 16K), and performance improves remarkably compared to using the original CUDAMalloc* APIs, thanks!
I still don’t understand what you are saying.
What is “fixed” size memory?
Some magic numbers picked by performance tuning? Do I need compile-time constants? Can I give a number at runtime?
I guess you actually want to do a page aligned allocation. Is this what you are saying?
What “fixed” size means in this case is that you expect to have many allocations/frees of the same size, i.e., you are allocating and freeing many objects of the same size throughout execution of the program.
And you very likely do not want to page-align every allocation of, say, 100 bytes. It would be very inefficient if every allocation of 100 bytes landed on a different 4K page.
Seems that this implements a basic slab cache on top of malloc (or other) using C++ map data structures. Though I think maybe technically a slab cache allocates by dividing up a contiguous block of memory into specifically-sized chunks instead of piecing individual small chunks together in a queue. You can see in your example on your github page that your two 100-byte allocations are not at contiguous memory addresses.
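For what it’s worth, the slab-like idea can be sketched in a few lines of C: per-size free lists sitting in front of malloc. This is just an illustration of the technique, not the project’s actual (C++) implementation, and the bin sizes are made up:

```c
#include <stdlib.h>

/* Toy size-binned cache: freed blocks are kept on a per-size free list
 * and handed back on the next allocation of the same size, instead of
 * going straight back through malloc/free every time. */
#define NBINS 4
static const size_t bin_sizes[NBINS] = {32, 64, 128, 256};

struct chunk { struct chunk *next; };
static struct chunk *bins[NBINS];

static int bin_index(size_t size)
{
    for (int i = 0; i < NBINS; i++)
        if (size == bin_sizes[i])
            return i;
    return -1;  /* not a cached ("fixed") size */
}

void *cached_alloc(size_t size)
{
    int i = bin_index(size);
    if (i >= 0 && bins[i] != NULL) {  /* reuse a cached chunk */
        struct chunk *c = bins[i];
        bins[i] = c->next;
        return c;
    }
    return malloc(size);              /* fall through to malloc */
}

void cached_free(void *p, size_t size)
{
    int i = bin_index(size);
    if (i < 0) { free(p); return; }   /* uncached size: really free it */
    struct chunk *c = p;              /* push onto the per-size free list */
    c->next = bins[i];
    bins[i] = c;
}
```

This also makes the “fixed sizes” point concrete: only allocations whose size exactly matches a bin benefit; everything else degrades to plain malloc/free.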
I don’t know much about the speed of C++ things, but it would be nice to see some speed comparisons to malloc, since I think malloc does some binning/caching of its own instead of immediately returning stuff to the system.
Also, in allocate_memory(), if _alloc_fun() returns NULL due to underlying allocator (i.e., malloc) failing, it will still increment _used_mem, which will lead to incorrect accounting. It might also cause some problems when the NULL pointer is freed.
Thanks very much for your comment! I learnt a lot of things.
Also, in allocate_memory(), if _alloc_fun() returns NULL due to underlying allocator (i.e., malloc) failing, it will still increment _used_mem, which will lead to incorrect accounting.
Yes, this is a bug and I have already fixed it, thanks!
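For illustration, the fix presumably guards the counter on the allocator’s return value; a sketch using the names from the comment above (_alloc_fun, _used_mem), with malloc standing in for the real allocator hook:

```c
#include <stddef.h>
#include <stdlib.h>

/* Sketch of the accounting fix: only count memory that was actually
 * handed out.  _alloc_fun and _used_mem stand in for the project's
 * allocator hook and usage counter. */
static size_t _used_mem;
static void *(*_alloc_fun)(size_t) = malloc;

void *allocate_memory(size_t size)
{
    void *p = _alloc_fun(size);
    if (p == NULL)
        return NULL;    /* allocation failed: leave the counter alone */
    _used_mem += size;  /* only account for successful allocations */
    return p;
}
```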
Freeing a NULL pointer shouldn’t be an issue (at least when using free); the C99 standard draft says that:
The free function causes the space pointed to by ptr to be deallocated, that is, made available for further allocation. If ptr is a null pointer, no action occurs.
Yes that is correct according to the standard. The issue in ump was with the accounting inside ump, but nanxiao fixed that.
Tabbed terminals and auto-resizing of text that I get through xfce-terminal (you can do it in gnome-terminal and I’m sure others as well) are big for me. I’ve been having to use Windows lately (and am starting to get used to the cmd program), so the title of this article really got me excited, but it seems the changes are pretty minimal.
However! It is possible to get sshd running on WSL. Then you can ssh in from a terminal that supports tabs, or whatever other features you like (I have no idea if this will work well for non-English alphabets).
I enjoyed the video showing (hearing) those three keyboards in action. I routinely work in open air public spaces and keyboard noise is factor for me. This got a chuckle out of me:
The big brown fox jumps over the lazy dog.
The sentence “The quick brown fox jumps over the lazy dog” is a pangram: a sentence containing every letter in the (English) alphabet. It appears as a keyboard exercise due to that fact.
The name “zpojqwfejwfhiunz” also contains every letter in the English (Latin) alphabet… except for a few letters.
Interesting read, thanks nickpsecurity.
This really isn’t a scientific paper since there was no formal hypothesis stated nor experiment performed. It probably would’ve made a great presentation, though. What the authors present is a collection of missed optimizations that they found with the help of some software they wrote.
Would be nice to see if they could find a way to generalize their process or show that differential testing is somehow better at finding these missed optimizations. Currently it seems like they use their tool to generate an “interesting” score for a compiled program, then manually investigate it to look for missed optimizations.
As a complete side-note, I had to laugh when the author starts talking about lines of code, especially this gem: “At the time of writing, optdiff comprises 573 physical lines of Python code” WTF is a physical line of code?
I’m glad you found it interesting. I just skimmed it originally since it looked similar to one of my older ideas of using many compilers for a program with each function or module using the fastest compile. I never tried building it. Yet, you said one thing I didn’t get:
“This really isn’t a scientific paper since there was no formal hypothesis stated nor experiment performed.”
The scientific method basically says you need to observe stuff, make a hypothesis, conduct an experiment to refute/support it, belief increases w/ success (graduates to theory), and peer review follows. Your claim is a strong accusation. I read the paper in full to assess it. Here’s what’s in the paper:
Observation that running the same program through multiple compilers and looking at differences found lots of compiler bugs. This pattern has strong empirical evidence thanks to the Csmith paper.
Their hypothesis is they can do the same thing, even with some of same tools, to spot missed optimization opportunities.
Another hypothesis is that things like spills can be used to find those. Metrics like that are already well-supported by evidence since they’re what backend writers and assembly coders already try to minimize.
They conduct an experiment by running random code through multiple compilers using binary analysis to detect those patterns in the code. They score code based on these patterns.
Their experiments find missed optimization opportunities as predicted. They file bug reports that are low-priority since it’s not correctness but some eventually get action by compiler writers. That’s a little bit of independent replication in itself.
So, they have hypotheses, experiments, outcomes supporting the hypotheses, some negative news on loops they’re honest about, and some independent confirmation by compiler teams fixing what they talk about. From there, one can rate the details, demand replication, run a new study done better, and so on. They do have scientific evidence for their claims, though, for something worthy of either a replication or clean-slate project done by independent group. There was also some science on the tool development. They noted their unstructured-over-time process kept finding problems in the tools themselves with bug reports on them, too. So, the tooling itself had a mini-cycle of maybe it will work, try to use it, it didn’t meet expectations, modify it, and use it again.
“As a complete side-note, I had to laugh when the author starts talking about lines of code, especially this gem: “At the time of writing, optdiff comprises 573 physical lines of Python code” WTF is a physical line of code?”
Lmao! That’s great. Probably was aiming for “uncommented” or something. Wonder if English was second language or just a mental slip-up. I don’t look at lines of code much if we’re talking high-level languages built on complex runtimes. More like it depended on 573 lines of Python + native implementation of that runtime. They might only have to write and review 573 lines, though, which can be an argument favoring productivity and/or maintenance. Python is also fairly reliable, too, if one stays away from any weird or hardly-used features/libraries.
Again, I’d like to be clear that I thought this was a great read and very interesting work. I think my original comment sounded condescending. I didn’t mean it to sound that way.
The point I was trying to make when I said “no formal hypothesis nor experiment performed” is that it wasn’t formal. I agree that there were experiments performed and that they did find legitimate issues with the compilers. The reason I thought it was informal was, to put it stupidly, that there were no numbers.
The actual technique in question was the process of binary analysis and method of finding “interesting” differences. For example, I would be curious how often that led to false positives, i.e., a high interesting score but without any way to optimize. I would also like to see how the authors’ approach compared to other, existing approaches to locating missed opportunities for optimization (assuming they exist). That might just be future work for this research group.
Anyway, thanks for posting the paper. It’s the type of technical stuff that I like to read on Lobsters.
I thought you were being condescending. My tone was aimed at that. If not, I take that much of it back.
I agree their numbers should be in the paper or at least a supplement. You have good reasons for that. Also glad you enjoyed the paper.
I was in the middle of a long-winded comment, but I don’t have the time to point out all the problems. Quite simply, the article doesn’t make any sense. There is a term for this: it’s not even wrong.
In order to be “wrong”, arguments have to be based in some sort of reality, but just reach the incorrect conclusion. None of the author’s arguments make any sense, so the article cannot be judged as being “right” or “wrong”, because it’s not based in reality.
I somewhat agree with this but maybe not for the same reasons. Many of the points ignore obvious reasons that are still valid today. Such as:
Regarding print statements:
Oh, wait a minute, this is still the only way we can get anything done in the world of programming!
But what about debuggers? Tcpdump-like tools? These are much better than debugging via printing.
Our ancestors bequeathed to us cd, ps, mkdir, chown, grep, awk, fsck, nroff, malloc, stdio, creat, ioctl, errno, EBADF, ESRCH, ENXIO, SIGSEGV, et al.
What technical field is not besieged by jargon whose meaning is not obvious to the unfamiliar? Likewise, all of these can be aliased to more obvious terms, anything you desire, for only the cost of character space in your code or configs. So why don’t we see more of this? The author suggests it is because keyboards were physically painful to type on in the past.
You know what hurts? RSI. Silly point aside, the time savings are absolutely still relevant today.
The big problem with text-as-ASCII (or Unicode, or whatever) is that it only has structure for the humans who read it, not the computer.
Is this really a problem? Haven’t we solved inefficiencies relating to this with various types of preprocessing such as, well, code preprocessing? Not to mention compilation? Code aside, is the minimal overhead of parsing a config file when your daemon happens to restart that big? Maybe I’m being pedantic but aren’t scripting languages generally “slow” to execute yet quick to write, and the inverse for binary programs?
I feel that the author ignores obvious reasoning (obvious in that the alternative, while not sugar-coated, is certainly more modern than 70s Unix). This seems to fit every point in this article and, at the risk of ad hominem, paints a picture of a naive computer enthusiast.
A couple notes on the article (specifically, the one it links to at the beginning, The Logical Disaster of Null).
Null is a crutch. It’s a placeholder for I don’t know and didn’t want to think about it further
I disagree. In C, at least, NULL is a preprocessor macro, not a special object, “which expands to an implementation-defined null pointer constant”. In most cases, it’s either 0, or ((void *)0). It has a very specific definition and that definition is used in many places with specific meaning (e.g., malloc returns NULL on an allocation failure). The phrase, “It’s a placeholder for I don’t know and didn’t want to think about it further”, seems to imply that it’s used by programmers who don’t understand their own code, which is a different problem altogether.
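That specific meaning is easy to show in code; the routine malloc check relies on exactly one interpretation of NULL, “allocation failed” (duplicate is just an illustrative helper here):

```c
#include <stdlib.h>
#include <string.h>

/* NULL's one specific meaning in this API: malloc returns it on
 * failure, so the caller checks it before touching the memory. */
char *duplicate(const char *s)
{
    char *copy = malloc(strlen(s) + 1);
    if (copy == NULL)  /* allocation failed */
        return NULL;   /* propagate the failure to the caller */
    strcpy(copy, s);
    return copy;
}
```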
People make up what they think Null means, which is the problem.
I agree. However, again in C, this problem doesn’t really exist, since there are no objects, only primitive types. structs, for example, are just logical groupings of zero or more primitive types. I can imagine that, in object-oriented languages, the desire to create some sort of NULL object can result in an object that acts differently than non-NULL objects in exceptional cases, which would lead to inconsistency in the language.
Another article linked to in The Logical Disaster of Null talks about how using NULL-terminated character arrays to represent strings was a mistake.
Should the C language represent strings as an address + length tuple or just as the address with a magic character (NUL) marking the end?
I would certainly choose the NULL-terminated character array representation. Why? Because I can easily just make a struct that has a non-NULL-terminated character array, and a value representing length. This way, I can choose my own way to represent strings. In other words, the NULL-terminated representation just provides flexibility.
The C Standard allows that. It basically states that, in the source code, a value of 0 in a pointer context is a null pointer and shall be converted to whatever value that represents in the local architecture. So that means on a Honeywell DPS-8/M, the code:
char *p = 0;
is valid, and will set the value of p to be -1. This is done by the compiler. The name NULL is defined so that it stands out in source code. C++ has rejected NULL and you are expected to use the value 0 (I do not agree with this, but I don’t do C++ coding).
Correct. Just for reference, from the 1989 standard:
“An integral constant expression with the value 0, or such an expression cast to type void * , is called a null pointer constant.”
I would certainly choose the NULL-terminated character array representation. Why? Because I can easily just make a struct that has a non-NULL-terminated character array, and a value representing length. This way, I can choose my own way to represent strings. In other words, the NULL-terminated representation just provides flexibility.
That’s not a very convincing argument IMO since you can implement either of the options yourself no matter which one is supported by the stdlib, the choice of one doesn’t in any way impact the potential flexibility. On the other hand NULL-terminated strings are much more likely to cause major problems due to how extremely easy it is to accidentally clobber the NULL byte, which happens all the time in real-world code.
And the language not supporting Pascal-style strings means that people would need to reach for one of a multitude of different and incompatible third-party libraries and then convince other people on the project that the extra dependency is worth it, and even then you need to be very careful when passing the strings to any other third-party functions that expect them.
You make a good point. Both options for strings can be implemented. As for Pascal strings, it is nice that a string can contain a NULL character somewhere in the middle. I guess back in the day when C was being developed, Ritchie chose NULL-terminated strings because the traditional Pascal string capped length at 255 characters (it used the first byte to hold the length). Nowadays, since computers have more memory, you could just use the first 4 bytes (for example) to represent string length, in which case, in C, it could just be written as struct string { int length; char *letters; }; or something like that.
From Ritchie: “C treats strings as arrays of characters conventionally terminated by a marker. Aside from one special rule about initialization by string literals, the semantics of strings are fully subsumed by more general rules governing all arrays, and as a result the language is simpler to describe and to translate than one incorporating the string as a unique data type.”
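That struct idea can be fleshed out into a tiny working sketch (the names are illustrative); unlike NUL-terminated strings, it happily holds embedded zero bytes:

```c
#include <stdlib.h>
#include <string.h>

/* The length-prefixed representation sketched above: a count plus a
 * pointer, so the bytes may contain embedded '\0' characters. */
struct string {
    size_t length;
    char  *letters;
};

/* Build a struct string from a byte buffer (which may contain '\0'). */
struct string string_from(const char *bytes, size_t n)
{
    struct string s;
    s.length = n;
    s.letters = malloc(n);
    if (s.letters != NULL)
        memcpy(s.letters, bytes, n);
    else
        s.length = 0;  /* allocation failed: empty string */
    return s;
}
```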
I think this is a good idea. Oddly enough, there isn’t even a “vi” tag, just a “vim” tag.
Is it worth having two tags? A “text editors” or “editors” tag, and an “IDE” tag? At the very least, I think there should be a generic tag for text editors, whether or not we get rid of the “vim” or “emacs” tag (they are popular, so it may not make sense to get rid of them). IDEs may be different enough to warrant their own tag as well, as they tend to differ from text editors in that they frequently have built-in runtimes, debuggers, library management systems, etc.
Title is sort of the wrong message.
The right message is the critical region should be as small as possible, and only protect against racy access to shared resources.
Everything else that is in the critical region, but doesn’t need to be, increases contention without value.
I think that’s a fair summary. Having worked with Go for a while now, I think it’s interesting to try to formalize the guiding principles of writing robust thread-safe code. Keeping the critical section as small as possible is a well-known idea, and yet I still see examples of people locking around I/O, which might be an accident of the lock/defer unlock pattern.
If the I/O is the contended-for shared resource, e.g., interleaving the output from multiple threads into a single stream, then yes, I/O should also be “locked around”.
The point isn’t I/O or not I/O; the point is shared resources, any shared resource that, as /u/tscs37 said below, needs to be accessed atomically.
It is very useful to think in terms of invariants, i.e., relationships that must always hold between data items.
If a thread race may break the invariant (either by leaving the variables inconsistent relative to each other, or by providing a thread with an inconsistent snapshot of them) then you must lock around them to protect the invariant.
However, locking a larger region than absolutely necessary doesn’t make you safer; it merely, as you rightly observed, increases (sometimes drastically) the latency of your system.
It also, as /u/bio_end_io_t observed below, increases the opportunities for things to go very wrong, e.g., deadlocks and priority inversions.
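The invariant idea can be made concrete with a small pthread sketch (the variables and amounts are made up):

```c
#include <pthread.h>

/* Invariant: a + b == 100.  Any update that moves value between a and
 * b must hold the lock, so no other thread ever sees a half-applied
 * transfer that breaks the invariant. */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static int a = 70, b = 30;

void transfer(int amount)
{
    pthread_mutex_lock(&lock);
    a -= amount;  /* invariant broken here...         */
    b += amount;  /* ...and restored here, under lock */
    pthread_mutex_unlock(&lock);
}

int read_total(void)
{
    pthread_mutex_lock(&lock);  /* consistent snapshot of a and b */
    int total = a + b;
    pthread_mutex_unlock(&lock);
    return total;
}
```

Note the critical sections contain nothing but the reads and writes that the invariant relates; everything else stays outside the lock.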
which might be an accident of the lock/defer unlock pattern.
In C, which lacks the nifty Go “defer” thing, or C++ RAII, or the Ruby “ensure”, I have taken to actually introducing an entire new scope for each lock… e.g.:
void myFunc( void)
{
    DoStart();
    lock();{
        DoA();
        DoB();
    };unlock();
    DoEnd();
}
And then make sure as little as possible is in that scope.
Sort of makes it in your face obvious which is the critical section. (Which is especially visually handy if you need to take two nested locks, and allows you to rapidly check that you always take them in the same order everywhere.)
And Obvious is a Very Good Thing when dealing with multithreaded code.
I agree that keeping critical regions small and not locking around IO are good rules of thumb.
There are some subtleties, since you need to know what exactly “as small as possible” means. For example, in general, you don’t want to call kmalloc inside of a spinlock because it may block. Of course, you can pass the GFP_ATOMIC flag to kmalloc to avoid the block, but in most cases, you can allocate outside the critical section anyway.
So really, you want to avoid blocking inside a critical region, and use something like a mutex or semaphore that can release the CPU if there is a chance you will need to sleep inside the critical region, because, quite frankly, that can happen. I/O tends to be something that can block, so if you NEED locking around I/O, avoid using spinlocks.
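In userspace the same advice looks like this; a sketch with a pthread mutex standing in for the spinlock, where the possibly-blocking allocation happens before the lock is taken:

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* A shared singly-linked list protected by a mutex.  The node is
 * allocated *before* taking the lock, so the (possibly blocking)
 * allocation never happens inside the critical section. */
struct node { int value; struct node *next; };

static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static struct node *list_head;

bool list_push(int value)
{
    struct node *n = malloc(sizeof *n);  /* may block: do it unlocked */
    if (n == NULL)
        return false;
    n->value = value;

    pthread_mutex_lock(&list_lock);  /* critical section is... */
    n->next = list_head;             /* ...just two pointer writes */
    list_head = n;
    pthread_mutex_unlock(&list_lock);
    return true;
}
```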
There are two sides to it. There is, of course, the case where you need to keep the critical section as small as possible, and I think that covers the majority of code.
There is also code that needs to operate atomically, in transactions. This mainly affects anything that has to process a lot of data while holding its lock. In that case you want the lock to be as wide as possible, so as not to have to re-enter the locked region all the time.
DHS also said that its NPPD is “not aware of any current DHS technical capability to detect IMSI catchers.”
“NPPD is aware of anomalous activity outside the [National Capital Region] that appears to be consistent with IMSI catchers,”
These statements contradict each other.
Why?
First statement: DHS says NPPD is not aware of any DHS capability to detect IMSI catchers.
Second statement: NPPD has apparently detected IMSI catchers (or at least activity consistent with them).
While cd may be wasting some of my time it certainly isn’t the biggest culprit. I’m sure this is a horrific antipattern but I just use aliases for directories I commonly cd to, such as
alias proj="cd /path/to/project/root"
and so on
This is the single greatest English sentence I’ve ever read: “My made up estimate is that Autojump has saved me from at least a whole year of typing. At least that’s how it feels.”
If you’re not sold after reading that…
Prob better to just modernize the existing C. net-tools shouldn’t have to depend on the installation of a python environment.
Python can be compiled to native executables with Nuitka or Shedskin.
Which link against libpython…
C applications on Linux all depend on shared libraries. Installing a small .so file alongside the executable should be acceptable, IMO.
And Shedskin compiles directly to C++.
An Internet search tells me you still need a Python environment to use Shedskin, and that it only supports Python 2.4–2.6.
No, once you compile/transpile the Python application to C++, the resulting application does not need Python.
There is also a project named Grumpy, for compiling/transpiling Python to Go. Once that is done, Python is not needed for those executables either. Google used Grumpy to convert the Python code that runs YouTube to Go.