Me: Tracking my finances is hard, there’s lots of steps and it’s time consuming. I wonder if nerds have figured out a better way…
Article: Track your finances manually with a verbose data format!
Me: Maybe I’ll just check my bank account once a week like I do now…
Why concentrate on managing your finances when you can reinvent double-entry bookkeeping from first principles!
Every nerd who teaches themselves financial literacy builds their own double-entry bookkeeping tool.
The Lost Wisdom of the Ancient Renaissance Bankers.
I preferred Business Secrets of the Pharaohs.
Why is this me?
Hah, maybe if I understood double-entry bookkeeping, a lot of these discussions would make more sense.
A git diff shows you what has been removed, and what has been added, and where. That’s double-entry bookkeeping.
The point of double-entry bookkeeping isn’t the ledger (the diff) but the whole way credit balance accounts, the balance sheet and the accounting equation work together.
Yeah I think of the main principle as more that transactions always add up to 0. I.e. the conservation of money: Money is never created or destroyed, just moved between accounts. (Though revenue and expenses are both treated as bottomless and effectively “create” money in that sense).
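To make the “transactions always add up to 0” idea concrete, here is a minimal Python sketch; the transaction structure is an assumption for illustration, not the article’s actual schema:

    from decimal import Decimal

    # A transaction is modelled here as a list of (account, amount) postings.
    transactions = [
        [("assets:checking", Decimal("-42.50")),
         ("expenses:groceries", Decimal("42.50"))],
        [("income:salary", Decimal("-3000.00")),
         ("assets:checking", Decimal("3000.00"))],
    ]

    for tx in transactions:
        total = sum(amount for _account, amount in tx)
        # Conservation of money: every transaction's postings must sum to zero.
        assert total == Decimal("0"), f"unbalanced transaction: {tx} (off by {total})"

Note how the salary shows up as a negative posting against an income account; that is the sense in which revenue “creates” money even though every individual transaction still nets to zero.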
I only regret I have but one upvote to give.
The second step is “and your diffs are every transaction, instead of every month”
Very interesting analogy, I wonder if you could use this to actually build an accounting system with git?
It’s virtually impossible to create a future-proof way to solve this, given the (natural) difficulty of integrating with loads of different banks around the world, without doing some manual data entry. However, I should’ve clarified in the article that I do not recommend you enter data manually in TOML, but rather do some importing based on e.g. CSV statements. The advantage of TOML is that it is easy to read, edit and script around.
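As a rough illustration of that import-then-store-in-TOML workflow (the CSV column names, the account name and the TOML layout below are all assumptions, since the article’s actual schema isn’t quoted here), a small script could turn a bank statement into TOML records:

    import csv

    # Hypothetical bank statement with columns: Date, Description, Amount.
    def csv_to_toml(path, account="assets:bank:checking"):
        records = []
        with open(path, newline="") as f:
            for row in csv.DictReader(f):
                # Emit one [[transaction]] table per statement row; the TOML text
                # is built by hand here since the stdlib has no TOML writer.
                records.append(
                    "[[transaction]]\n"
                    f'date = "{row["Date"]}"\n'
                    f'description = "{row["Description"]}"\n'
                    f'account = "{account}"\n'
                    f'amount = {row["Amount"]}\n'
                )
        return "\n".join(records)

    # print(csv_to_toml("statement.csv"))

Real descriptions would need escaping and amounts are better kept as strings or integer cents, but this is the general shape of “easy to script around”.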
Oh yeah, that makes a lot more sense hah. My reply was in reaction to thinking that’s what you were proposing. Automating it in another way and just using TOML as the backing store makes a lot more sense.
I totally get your confusion! I should’ve been a bit more clear about that
My initial impression, too, was that you were proposing entering data in TOML, and that by the end of the article you would offer some converter from TOML to the ledger, hledger and/or beancount formats.
If you do not recommend entering data manually in TOML and if I gather it right, you argue for storing data in TOML, because there are tons of robust libraries to work with this human-readable and machine-friendly format. Hmm? The article title is a bit clickbait-y then (-;
I believe some of the mentioned PTA software has plug-in systems, perhaps reusable parser code, and some sort of export to some sort of CSV? How does that not solve the problems you are trying to solve? So far it seems to me that you are adding a superfluous layer between the wild world of financial data formats and PTA. Although you said it was for your personal use, so I guess it’s totally okay (-:
There are plugin systems and reusable parser code, yes, but they often lock you into a certain programming language, and I find that libraries for parsing common data formats like TOML are often of higher quality.
As for the title: yes, it’s clickbait.
Mint getting shut down has been a bit of a nightmare scenario for me as that’s where everything I have is budgeted.
That’s not a terrible idea, although having to write transactions by hand ultimately deterred me from using hledger/beancount.
I recently tried beancount again using the fava interface, but it was still very error prone and I then had to spend quite some time “debugging” where the missing $0.06 went!
I’ve found the perfect balance with CSV imports, specifically using the hledger CSV importer. It provides tons of power to pre-categorize transactions (see the rules sketch below), and the human error of entering the wrong amount/date is completely eliminated!
With that I have a ledger that I can trust, I haven’t had any issues with reconciliation, and I have all the power of plaintextaccounting back in my hands!
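For anyone curious what that pre-categorization looks like, a minimal hledger CSV rules file is roughly along these lines; the column layout, patterns and account names here are invented for illustration, so check the hledger CSV docs for the real options:

    # bank.csv.rules -- illustrative only
    skip 1
    fields date, description, amount
    date-format %Y-%m-%d
    account1 assets:bank:checking
    account2 expenses:uncategorized

    if SUPERMARKET|GROCERY
      account2 expenses:food:groceries

    if SALARY
      account2 income:salary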
I’ve just had a PR merged to add a feature to the hledger CSV rules handler: regex match groups are available in field assignments. Hopefully this is useful to someone other than just me!
FWIW, Beancount also provides a nice CSV importer interface to automate away the manual work.
Depending on how it’s implemented, the importer can take care of the amount/date and even add the balancing posting to the other account(s) based on any attribute of the original posting. I wrote about this workflow some time ago on my blog.
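The balancing-posting idea is simple enough to sketch outside any particular importer (the patterns and account names below are assumptions, not the blog post’s actual rules): pick the other account by matching an attribute such as the payee, then emit two postings that net to zero.

    import re
    from decimal import Decimal

    # Hypothetical payee-pattern -> balancing-account rules.
    RULES = [
        (re.compile(r"grocery|supermarket", re.I), "expenses:food:groceries"),
        (re.compile(r"payroll|salary", re.I), "income:salary"),
    ]

    def balanced_postings(payee, amount, account="assets:bank:checking"):
        # Choose the balancing account from the payee, defaulting to uncategorized.
        other = next((acct for pat, acct in RULES if pat.search(payee)),
                     "expenses:uncategorized")
        # Two postings that sum to zero: the bank-side leg and the balancing leg.
        return [(account, amount), (other, -amount)]

    # balanced_postings("ACME SUPERMARKET", Decimal("-42.50"))
    # -> [("assets:bank:checking", Decimal("-42.50")),
    #     ("expenses:food:groceries", Decimal("42.50"))]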
My secret sauce is beancount-import coupled with institutions that support OFX export. I’ve been using this combo for the past few years and never had issues–except for attempts at creating a NixOS derivation for it, which has left enough scars that I now do my reconciliation on a separate laptop.
How do you handle inter-account transactions? For example, I use Revolut, which has an insanely stupid CSV export format, and occasionally I top it up from my bank card. So I end up with a duplicate transaction, with slightly different data and up to 3 days’ difference.
I have written a script that tries to find these, which works quite well, but it is definitely not automatic, not even semi-automatic… and I haven’t even gotten to stuff like Revolut’s round-up spending going to a savings account…
(I am using beancount with a Java program that parses all my beancount files for existing transactions, matches new ones, and writes out the missing ones, with git as a “transaction handler”. But I don’t see much advantage to the textual format, besides beancount having a good GUI.)
Regarding inter-account transactions, what I do is only import one of them. An example is when I pay my credit card: I get a debit on my bank account and a credit on the credit card. The transactions on my credit card I ignore, and the ones from my debit card I record something like this:
    assets:checking            -amount
    liabilities:credit-card     amount
I don’t get it, why are there duplicates? Is it more that there is a gap between paying and the amount actually being settled?
The “easiest” thing to do might be to treat the Revolut exports as payments pulling from a “line of credit” account, and then the bank transactions as moving money from the bank account to the line of credit. That way you actually know how much you are floating. Though of course I don’t know what’s actually going on in your case, I’ve found that modeling based on what is happening bank-side at least gives me options to later look at things and filter nicely.
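One way to read that suggestion, in journal form (dates, payees and account names invented for illustration): the Revolut export produces spend postings drawn against a float account, only the bank side of the top-up is imported as a payment into that float account, and the float account’s balance then shows how much you are floating.

    2024-03-01 coffee shop                 ; spend, from the Revolut CSV
        expenses:coffee                     4.50
        liabilities:revolut-float          -4.50

    2024-03-03 revolut top-up              ; transfer, from the bank CSV
        liabilities:revolut-float          50.00
        assets:bank:checking              -50.00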
I use hledger to keep track of my spending, and the fact that it uses a special text format for transactions isn’t a huge concern for me. What is a big concern is that every financial institution I use makes it difficult to programmatically obtain my transaction records, and also to figure out what records I failed to grab properly if my calculated spending doesn’t match up with what the website says. This aspect is by far the biggest problem I have managing my finances and it’s not one that any program I run myself can help with, because the problem is entirely contained within the proprietary software running on the websites of financial institutions subject to their own business and regulatory priorities.
If there was a bank that made it possible for me to make an authenticated REST API request against a website to get my personal transaction data I would seriously consider switching to that bank. But I’m not aware of any such banks and there may be reasons why no bank offers this service.
In the UK, we have Monzo and Starling, which both have an authenticated REST API for accessing transactions (and more). The latter is particularly good, but both work great.
That part is so absurd that I don’t even know which part of it I don’t understand. They already have the data ready and kind of expose it. Why not add the extra endpoint that returns it in an actually correct format? (Actually correct because I remember CommBank using minutes instead of the month in the date, and ING listing the amount as 1234.34.56 for international payments of 1234.56; I gave up trying OFX since everyone gets it wrong and the unique IDs are not unique.)
While using beancount/ledger-cli to track investments and run queries for tax purposes, I would usually waste a non-trivial amount of time skimming through the docs trying to figure out their DSL syntax. This article shares an interesting idea I had for double-entry accounting software, where the records would be translated to Recutils instead of those famous DSLs.
I can envision that using TOML, while verbose, might introduce a new ecosystem dynamic, where the tool itself primarily serves as a validator (or even as a library) and the community provides a handful of importers and exporters handling the heavy lifting of populating your records. I mean, even Python now includes TOML in its standard library.
recutils looks interesting! Thanks for sharing. Is your code published anywhere?
Unfortunately, I don’t have the source anymore, it has been years since I moved from plain-text accounting back to basic Excel spreadsheets. :(
No worries, I’m in spreadsheets too.
Note that, as detailed in the link you provided, Python only includes a TOML reader in the standard library right now and recommends you use an external library to write TOML files. Additionally, this reader was only added in 3.11, and in my experience of publishing packages to PyPI a lot of users, particularly on Windows, are still using 3.8. For a recent Python project I ended up moving my configuration files to YAML after initially using TOML for these reasons.
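For reference, the read side really is just a few lines on 3.11+, while writing needs a third-party package such as tomli-w, one of the writers the tomllib docs point to (the filename here is made up):

    import tomllib   # stdlib, read-only, Python 3.11+
    import tomli_w   # third-party writer (pip install tomli-w)

    with open("ledger.toml", "rb") as f:   # tomllib only accepts binary files
        data = tomllib.load(f)

    # ... modify data ...

    with open("ledger.toml", "wb") as f:   # tomli_w.dump also writes bytes
        tomli_w.dump(data, f)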
I knew that it was read-only which might be fine for an “exporter”, but I didn’t know about the majority of Windows users using an older version. Is there a reason why?
Not sure, but that might just be specific to my user base; I just monitor pypistats.org to get an idea of who is using my software. I suspect most of my users who are on Windows installed Python for some reason or another at some point and never bothered to update it, while more of my users on Linux or Mac are running regular updates via their package manager.
Lazy end user reporting in: I would only bother to bump major versions of something like Python if something didn’t run. I’m on Python 3.11 right now and I don’t remember bumping my version ever.
I wonder if it’s just that the package managers auto-update (I know brew has a tendency to do this at least).
Python doesn’t have a built-in update mechanism on Windows. The vast majority of users have to manually download and run the installer from the website to get a new version.