Some programs have safety critical code, where a bug can physically harm a user. One should definitely use a memory safe language for these cases.
Hm, this doesn’t sound exactly right. For safety-critical stuff, I would really prefer to have a proof that the program functions correctly; memory safety is far too weak a property. Memory safety in a lot of cases boils down to “we crash the process deterministically”, and I think this won’t be an acceptable outcome for many such systems.
I can see how in “normal code” a memory safety bug can be more damaging than a logic bug. My intuition is that for critical stuff this distinction becomes less crisp, and that any bug is game over.
I think it is also true that safer languages usually make general correctness easier to ensure, but that’s more of a correlation: there are memory safe languages with weak and error-prone type systems.
The conceptualization of UB and memory unsafety as an attack on the immune system is the right one to use here, I think - “it attacks your ability to diagnose faults in the system”. In a “safety critical” system, it is the faults themselves that do the damage. Memory unsafety right-shifts the costs (from concept to design, from design to development, from development to testing, from testing to post-disaster investigations).
The thing about memory safety is that memory (un)safety is not a property directly related to functional safety (the product does not harm anyone), but one that increases the proof burden because it always needs to be taken into account. Potential memory unsafety in safety-critical engineering is seen as a risk and a potential interference (where the behaviour of one part of the system - memory allocation - may interfere with the behaviour of the part under inspection - e.g. your motor control), so eliminating it in general reduces your proof burden, but you are correct - it only goes so far.
For some insight into the risks that need to be mitigated around allocation from the perspective of the avionics standards, I co-wrote a paper about that recently: https://www.adacore.com/uploads/techPapers/Safe-Dynamic-Memory-Management-in-Ada-and-SPARK.pdf
That could be true, except for the artificial complexity and iteration costs that can sometimes be introduced by the borrow checker’s restrictions. Sometimes, the cure is worse than the disease. Not often, but sometimes.
Not sure if you read all the way through, but the article mentions a few cases where one might opt for less safety, mostly around the costs that all memory safety approaches carry.
It’s nuanced because the definition of memory safety depends on some abstraction for memory regions. Memory safety is a spectrum. An MMU enforces that every memory access must be to some valid memory. A sanitizer-like approach means that every memory access is to a valid live object. A typical programming language adds a provenance model on top of that, so every pointer is derived from a pointer to a valid object, and any memory access via that pointer can only touch that object, and only for the lifetime of that object.
If you are implementing a memory allocator, then a definition of memory safety that is built on top of an object model doesn’t really help, because you are the thing responsible for providing the definition of an object. This is even more true for an OS kernel, because you are providing the process abstraction and are responsible for page table setup. You can pretend that you have a language with object-level memory safety, but anything that can modify page tables can bypass it, and if most of your code pokes the page tables directly then you get no benefit. You need to understand what benefits you want from memory safety, what is able to bypass memory safety, and how you limit the danger there.
Even if you are writing application code, C libraries can bypass language-level memory safety and the OS kernel can bypass everything (and needs to for I/O). You still need to think about what you’re getting from memory safety, whether it’s reducing cognitive load (I have a GC and plenty of RAM, I don’t need to think about lifetimes!), reducing bugs (my compiler errors if I write this, I don’t have to debug it!), or providing a strong security guarantee for isolating untrusted code (I can provide a plugin API without violating the integrity of data owned by my own code!).
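To make the spectrum point above concrete, here is a small illustrative C sketch (not from the original comment, just an illustration): the out-of-bounds read below lands in memory the MMU considers perfectly valid, so the hardware is satisfied, while a sanitizer or a provenance-based language model rejects it, because the pointer only carries the right to access the first array.

#include <stdio.h>

int main(void)
{
    int a[4] = {0, 1, 2, 3};
    int b[4] = {9, 9, 9, 9};

    int *p = a;
    /* Out of bounds for a, but very likely still inside mapped memory
       (possibly inside b), so the MMU raises no fault.  In C this is
       undefined behaviour, and a sanitizer or provenance model flags it,
       because p was derived from a and may only access a's elements.   */
    printf("%d\n", p[6]);

    (void)b;   /* keep b around so the example has a neighbouring object */
    return 0;
}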
It’s not nuanced - you always want memory safety. Memory unsafety is a bug, the severity of which might depend, but it is always a bug.
I’d argue that up until these last few years, it was nuanced. The proving out of Rust has destroyed the nuance for all cases except assembler.
Be careful what words you are choosing, because that claim - that memory unsafety is always a bug - is false. Even though more likely than not you know everything I’m outlining below.
A bug is when a program does not behave as desired. This includes crashes and security vulnerabilities.
Something is unsafe when using it allows us to write bugs. A C function that dereferences a pointer it accepts as an argument is unsafe, because if we give it an invalid pointer we get Undefined Behaviour™, nasal demons…
For instance, this API is unsafe:
#include <stdio.h>

// Prints "Hello <name>!"
// Does nothing if name is NULL
void hello(const char *name)
{
    if (name == 0) return;
    printf("Hello %s!\n", name);
}
Because if the string I provide is not NULL-terminated, I’m going to run into trouble:
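For instance, a call site along these lines (an illustrative sketch of the misuse, not an example from the original thread) - the array has no terminating '\0', so printf inside hello() walks past its end:

#include <stdio.h>

// Same hello() as above.
void hello(const char *name)
{
    if (name == 0) return;
    printf("Hello %s!\n", name);
}

int main(void)
{
    char name[5] = {'H', 'e', 'l', 'l', 'o'};  /* not NULL, but not null-terminated either */
    hello(name);                               /* reads past the end of name: undefined behaviour */
    return 0;
}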
The unsafety of hello() allowed the bug in user code. For hello() itself to have a bug, I need to make a mistake, such as forgetting the NULL check:
// Prints "Hello <name>!"
// Does nothing if name is NULL
void hello(const char *name)
{
    printf("Hello %s!\n", name);
}
C is dangerous and a huge source of bugs. But it is not itself a bug.
Likewise, walking a slackline in the mountains with no safety equipment is insanely dangerous, but you won’t die because of the line itself. Most likely it will be your inadequate competence, lack of attention, or hubris. (Never mind that adequate competence may be unattainable to begin with.)
Not wearing a seatbelt while driving, on the other hand, could very well ding your car insurance payment if you get into a collision: https://rates.ca/resources/how-does-seatbelt-infraction-affect-car-insurance-premiums
The bug there is then the lack of a comment on hello() saying “Takes a string that must be null-terminated”. The implementation is under-specified.
hello() does not take a length parameter. It’s obvious to anyone that it expects a null-terminated string. How else are you going to determine the end of the string - look for an EOM control character?
But that’s not important. Even if I conceded your point and fixed the comment, my main point would remain: hello() can be unsafe and bug-free.
I’m absolutely not convinced by the Google Earth argument:
For example, the Google Earth app is written in a non-memory-safe language but it only takes input from the user and from a trusted first-party server, which reduces the security risk.
As long as that server isn’t compromised, or a rogue DNS server doesn’t direct clients to a host controlled by attackers, or… PDF viewers also nominally only “take input from the user”, but I will run out of fingers if I try to count the arbitrary code execution vulnerabilities in PDF clients.
Anything that can take arbitrary input must never trust that input blindly.
Note that the author worked on the Google Earth team, and is speaking from experience.
I agree you should not trust input blindly, but it is a spectrum, isn’t it? I hope we can agree that memory safety is less important in Google Earth compared to Google Chrome. That’s all the author is saying.
I think by “take input from the user” he meant GUI events like click/drag/keystroke, which we can agree are pretty safe. (Mostly. I just remembered that “hey, type this short gibberish string into your terminal” meme that triggers a fork bomb…)
“Open this random PDF file” is technically a user event, but that event comes with a big payload of untrusted 3rd party data, so yeah, vulns galore.
I feel like Earth is a bad example, then. Google Earth lets you read/write your GIS data as a KMZ file. KMZ files are zipped(!) XML(!!) documents – that’s quite a bit of surface area for a malicious payload to work with.
Keep in mind Earth is sandboxed, since it runs on the web (and Android and iOS), so there’s just not much damage that can be done by a malicious KMZ.
😬 At least there probably aren’t too many people using that feature. Unless they start getting urgent emails from “the IT department” telling them “you need to update your GIS data in Google Earth right away!!! Download this .kmz file…”
But some of the people who do use it are very juicy targets who make heavy use of features like that - people mapping human rights abuses or wars, for example. It’s not like this software is just a toy that doesn’t see serious use.
Earth passed some pretty intense security reviews before it was launched and before every major feature release, and those security teams know their stuff.
Likely a big help was how well sandboxing works in WebAssembly, Android, and iOS.
Even without SPARK, Ada still has a lot of checks built into how it handles memory.
Pointers (really “access types”, but I’ll use “pointer” to make it read easier) are typed and associated with a memory allocator (typically the default one). Since each pointer type is tied to an allocator, it makes sense that you can’t assign a value of one access type to another: they might not free memory the same way.
type A is access Integer;
type B is access Integer
   with Storage_Pool => My_Special_Allocator;
First : A := ...
Second : B := ...
Second := First; -- compile error, can't convert an A into a B
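For comparison, here is a C-flavoured sketch (my illustration, not the commenter’s; pool_alloc is a made-up toy arena) of the kind of mistake this typing rule prevents - memory from one allocator being released through another:

#include <stdlib.h>

/* Toy bump allocator standing in for My_Special_Allocator. */
static char pool[1024];
static size_t used;

static void *pool_alloc(size_t n)
{
    if (used + n > sizeof pool) return NULL;
    void *p = pool + used;
    used += n;
    return p;
}

int main(void)
{
    int  *from_malloc = malloc(sizeof *from_malloc);
    char *from_pool   = pool_alloc(16);

    free(from_malloc);   /* fine: malloc memory goes back to malloc            */
    free(from_pool);     /* undefined behaviour: this never came from malloc,
                            but nothing in C's type system says so             */
    return 0;
}

In Ada the assignment above simply doesn’t compile, so this class of mistake is ruled out at the type level.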
Plain access types also must point to allocated memory. If you want to point to all possible forms, you use a general access type. These can point to the stack, but only if the variable is marked as aliased. They can also point to heap allocations.
type C is access all Integer;
D : aliased Integer;
Third : C := D'Access;
Third := C (First); -- legal with an explicit conversion, since C can access ANY Integer
There’s also a set of static and runtime accessibility rules to help ensure that an accessed object lives at least as long as the access type pointing to it. It’s not as sophisticated as the Rust borrow checker though. You can throw caution to the wind and use Value'Unchecked_Access (and GNAT’s Value'Unrestricted_Access) to completely subvert the rules, but you can turn this off with pragma Restrictions(No_Unchecked_Access).
When accepting pointers as parameters, you can accept either a typed pointer or an “anonymous access” type (like the access all type above), which accepts any kind of pointer to that designated type.
procedure Foo (Ptr : access Integer) is
begin
   -- ..
   null;
end Foo;
You can’t assign Ptr to an A or B here (without a cast). You also can’t free Ptr within Foo, because the free routine Unchecked_Deallocation is a generic procedure that must be instantiated (template-likes in Ada require explicit instantiation), and the instantiation takes the pointer type. The pointer passed here is anonymous, so you can’t free it (without forcing a cast). Along with the typed pointers mentioned at the start, you can use this to constrain who can free memory in general.
There are also a few guardrails, like a not null constraint you can add to an access type, and analogs to C++-style const* and *const.
Pointers also initialize to null, and a null check is performed before any access is attempted. Even though a pointer is set back to null when it is freed, you can still end up with use-after-free due to pointer aliasing.
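A C-flavoured sketch of that aliasing hazard (just an illustration - the same pattern applies to Ada access values, where deallocation nulls only the object you passed in):

#include <stdlib.h>

int main(void)
{
    int *a = malloc(sizeof *a);
    if (a == NULL) return 1;

    int *b = a;           /* alias of the same allocation                */

    free(a);
    a = NULL;             /* mirrors the freed pointer being set to null */

    /* a is safely null, but b still points at the freed memory, so this
       write is a use-after-free despite the null-on-free discipline.    */
    *b = 42;

    return 0;
}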
Pointers also aren’t really just generic memory addresses. If you want to convert to and from raw addresses and do pointer arithmetic, you have to go through the System.Address_To_Access_Conversions package.
In general, these rules make it pretty hard to shoot yourself in the foot, though you can, and I have before.
How can the author say that rolling your own memory allocator is safer than using malloc()? Raw memory managers (e.g. pml4t and all that other stuff malloc depends on) are significantly harder to develop than simply remembering to call free(). They also usually entail choosing arbitrary limits which is hard to do when developing things which don’t have definite requirements, like realtime and embedded. Leaning too heavily on the stack also has issues if you do something like allocate a buffer that’s larger than your guard page.
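To illustrate that last point about the stack (my sketch, not the commenter’s, assuming a typical 4 KiB guard page; compilers can mitigate this with stack probing, e.g. -fstack-clash-protection):

#include <string.h>

void risky(void)
{
    char buf[256 * 1024];    /* one frame far larger than a 4 KiB guard page */
    buf[0] = 'x';            /* with little stack headroom left, this store
                                can land beyond the guard page and corrupt an
                                adjacent mapping instead of faulting          */
    memset(buf, 0, sizeof buf);
}

int main(void)
{
    risky();                 /* harmless here with a fresh, roomy stack; the
                                danger shows up under deep recursion or with
                                small thread stacks                           */
    return 0;
}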
Use unsafe languages when you want to let the user hack their console using a long horse name.
Always, unless you can’t (legacy codebases, sigh).
Off topic but colored footnotes in the sidebar are quite nice.