Dataclasses have a lot of issues, including requiring type annotations, too many features, poor performance, etc. People want to have basic data types that are easy to define and simple to use without all the overhead of OOP. Also, immutable types like this could enable better sharing between sub-interpreters for performance.
I couldn’t really see the difference from NamedTuple subclasses, which behave pretty much like structs, the only caveat being reserved attribute names like ‘index’.
Namedtuple allows positional access, which is an error-prone interface that should be avoided. Namedtuples are really only there to help you migrate existing tuple code over to objects with attributes.
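To make the positional-access concern concrete, here is a minimal sketch. The `Point`, `Swapped`, and `Row` types are hypothetical and exist only for illustration; the calls are standard `collections.namedtuple` usage.

```python
from collections import namedtuple

# Hypothetical example type; the field names are illustrative only.
Point = namedtuple("Point", ["x", "y"])
p = Point(x=1, y=2)

# Attribute access is the readable interface.
by_name = p.x + p.y

# Because a namedtuple is still a tuple, all of these also work:
first = p[0]                  # indexing by position
x, y = p                      # positional unpacking
same_as_tuple = p == (1, 2)   # compares equal to a plain tuple

# Reordering the fields silently changes what [0] and unpacking mean,
# while access by name keeps working.
Swapped = namedtuple("Swapped", ["y", "x"])
s = Swapped(x=1, y=2)

# A field called "index" is accepted, but it shadows the inherited
# tuple.index method on that type.
Row = namedtuple("Row", ["index", "value"])
r = Row(index=5, value="a")
```

Code that relies on `p[0]` or on unpacking depends on field order, which is the part of the interface the comment above calls error-prone.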
With the “I haven’t done Python in a while” caveat, I wonder: why not make a new namedtuple-like that doesn’t allow indexing? The author’s reference implementation could easily be the template here, no?
Maybe the underlying question is “why can’t it just be a new collection class? why are built-in language support and a new struct keyword needed?” I think it’s this part:
    def __new__(cls, x: int, y: int) -> Self:
        """Create a new, immutable instance."""
        # Pretend this makes everything immutable in the end.
        self = mutable(cls.__slots__)
        self.x = x
        self.y = y
        return immutable(self)
There’s no immutable in userspace today as far as I understand it. Some values like frozenset and tuple are actually immutable, and there are proposals for others, but not for objects. You can simulate immutability like dataclasses already do but it’s not truly frozen. SimpleNamespace is fully mutable. So that’s part of the gap to fill.
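As a rough illustration of "simulated" immutability, the sketch below uses a hypothetical `Point` class with `@dataclass(frozen=True)`. It only shows that the guard is enforced by Python-level methods and can be bypassed; it says nothing about how the proposed struct would be implemented.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Point:  # hypothetical example type
    x: int
    y: int


p = Point(1, 2)

# Normal assignment is blocked: frozen=True installs a __setattr__
# that raises FrozenInstanceError.
try:
    p.x = 10
    blocked = False
except FrozenInstanceError:
    blocked = True

# The guard lives in the class's own __setattr__, so calling the base
# implementation directly still mutates the instance.
object.__setattr__(p, "x", 10)
```

After the last line `p.x` is 10, so the instance was never truly frozen. A tuple offers no comparable back door from Python code.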
I don’t get it. Dataclasses are already compact. This seems like new syntax for the sake of adding new syntax.
So the right thing to do is to make it possible to have immutable class fields, not create a literally new structure/syntax in the language… maybe?
All of the object and class machinery is very expensive, though. So the goal is to avoid that because of performance. Makes sense to me.
If the goal is performance, aren’t you in the wrong language? At some point either the dynamic features of the language are worth having, or they need to be shed (yes, pun). But what I see happening (from afar) is that the language keeps growing and growing in surface area instead of fixing/creating better baseline abstractions. We’re just minutes away from Python being the next “kitchen sink” language… it seems. Far from the language I loved using in the 00s and early 10s.
Oh well.
I agree it no longer “fits in your head”.
It seems like Mojo is sort of working in that same direction.
This could be a package for structs; it does not seem to be needed in the language definition itself. But it’s good to discuss it.
But why is it good to discuss it? What does this offer that dataclasses don’t? Honest question here, because I’m not seeing it, but I could be missing something.
One problem with @dataclass being a decorator is that it will either create a new class or modify an existing class depending on whether slots=True is set, and creating a new class is an odd behavior as well. Syntax could unify this to a degree and simplify it too.
Additionally, people won’t choose @dataclass as often. It’s one of those things that is less likely to be reached for because people prefer native features.
That’s about it though. Personally, I don’t really think a native version is a big win tbh.
One of my biggest issues with dataclasses, especially when working with people without a lot of Python experience, is that they’re based on deep, dark magic. The second someone asks “why do I have to put this @ symbol over my class”, bam, you either have to say “Trust me, you just have to”, or you will spend the next 7 hours explaining how decorators work and what metaclasses even are.
Decorators are just functions that take objects, modify or wrap them, and return a new object. Metaclasses need not apply for a basic understanding of decorators. Far cry from “black magic,” I’d say.
edit: and upon inspection of the dataclasses source, I don’t think it actually relies on metaclasses at all.
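For readers who want to see the "decorators are just functions" point in code, here is a minimal sketch. `tag`, `A`, `B`, and `Point` are hypothetical names for illustration; the last lines check that a dataclass-decorated class still has the default metaclass.

```python
from dataclasses import dataclass


def tag(cls):
    """Hypothetical decorator: takes a class, marks it, returns it."""
    cls.tagged = True
    return cls


@tag
class A:
    pass


class B:
    pass


B = tag(B)  # what the @tag line above is shorthand for


@dataclass
class Point:
    x: int
    y: int


# A dataclass is still an ordinary class whose metaclass is `type`.
metaclass_of_point = type(Point)
```

Both `A` and `B` end up with `tagged` set, and `metaclass_of_point` is plain `type`, which matches the observation that dataclasses do not rely on metaclasses.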
Point taken on metaclasses, although it’s worth mentioning that lots of other libraries that provide similar functionality to dataclasses use metaclasses, and so does other stuff in the stdlib, like enums.
That said, decorators are only simple if you know them. Even the concept of functions as first-class objects can be startling to newcomers. Would be nice if one could introduce simple data objects without having to go into that.
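The enum point can be verified with a short sketch. `Color` and `Ordinary` are hypothetical classes; `enum.EnumMeta` is the stdlib name for the enum metaclass (newer Python versions also expose it under another name).

```python
import enum


class Color(enum.Enum):
    RED = 1
    GREEN = 2


class Ordinary:
    pass


# Enum classes are created by a custom metaclass rather than plain `type`.
enum_metaclass = type(Color)
uses_custom_metaclass = enum_metaclass is not type

# The metaclass is what makes the class object itself iterable.
members = list(Color)
```

So a newcomer who inspects an enum class does meet a metaclass, whereas `type(Ordinary)` is just `type`.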
I prefer the way Mojo uses struct to add a type with new semantics & performance rather than syntactic sugar for dataclasses.
I’m imagining if struct is built into the syntax then it could provide better performance like that as well. Like sharing structs between sub-interpreters could be done within the runtime transparently. Seems like it could be more than sugar. Not sure how the immutability part of Brett’s mock-up would work, though.
I can’t get on board with dropping support for methods or optional fields. I use immutable dataclass objects all the time, but almost all of them have optional fields (with defaults) and convenience methods.
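A sketch of the usage pattern described above, with a hypothetical `User` type: a frozen dataclass with defaulted optional fields and a convenience method. `dataclasses.replace` is the standard way to derive a modified copy.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class User:  # hypothetical example type
    name: str
    email: Optional[str] = None   # optional field with a default
    active: bool = True

    def display_name(self) -> str:
        """Convenience method that only reads fields."""
        if self.email is None:
            return self.name
        return f"{self.name} <{self.email}>"


u = User("Ada")
# "Changing" a frozen instance means building a new one.
u2 = replace(u, email="ada@example.org")
```

Whether a struct without methods or defaults could cover this pattern is the open question raised in the comment.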
Cf. https://github.com/tc39/proposal-record-tuple