import

Import new transactions from one or more data files to the main journal.

Flags:
  --catchup  just mark all transactions as already imported
  --dry-run  just show the transactions to be imported

This command detects new transactions in one or more data files specified as arguments, and appends them to the main journal.

You can import from any input file format hledger supports, but CSV/SSV/TSV files, downloaded from financial institutions, are the most common import source.

The import destination is the default journal file, or another specified in the usual way with $LEDGER_FILE or -f/--file. It should be in journal format.

Examples:

    $ hledger import bank1-checking.csv bank1-savings.csv
    $ hledger import *.csv

Import dry run

It's useful to preview the import by running first with --dry-run, to sanity check the range of dates being imported, and to check the effect of your conversion rules if converting from CSV. Eg:

    $ hledger import bank.csv --dry-run

The dry run output is valid journal format, so hledger can re-parse it. If the output is large, you could show just the uncategorised transactions like so:

    $ hledger import --dry-run bank.csv | hledger -f- -I print unknown

You could also run this repeatedly to see the effect of edits to your conversion rules:

    $ watchexec -- "hledger import --dry-run bank.csv | hledger -f- -I print unknown"

Once the conversion and dates look good enough to import to your journal, perhaps with some manual fixups to follow, you would do the actual import:

    $ hledger import bank.csv

Overlap detection

Reading CSV files is built in to hledger, and not specific to import; so you could also import by doing hledger -f bank.csv print >>$LEDGER_FILE. But import is easier and provides some advantages. The main one is that it avoids re-importing transactions it has seen on previous runs.
This means you don't have to worry about overlapping data in successive downloads of your bank CSV; just download and import as often as you like, and only the new transactions will be imported each time.

We don't call this "deduplication", as it's generally not possible to reliably detect duplicates in bank CSV. Instead, import remembers the latest date processed previously in each CSV file (saving it in a hidden file), and skips any records prior to that date. This works well for most real-world CSV, where:

1. the data file name is stable (does not change) across imports
2. the item dates are stable across imports
3. the order of same-date items is stable across imports
4. the newest items have the newest dates

(Occasional violations of 2-4 are often harmless; you can reduce the chance of disruption by downloading and importing more often.)

Overlap detection is automatic, and shouldn't require much attention from you, except perhaps at first import (see below). But here's how it works:

- For each FILE being imported from:

  1. hledger reads a file named .latest.FILE in the same directory, if any. This file contains the latest record date previously imported from FILE, in YYYY-MM-DD format. If multiple records with that date were imported, the date is repeated on N lines.
  2. hledger reads records from FILE. If a latest date was found in step 1, any records before that date, and the first N records on that date, are skipped.

- After a successful import from all FILEs, without error and without --dry-run, hledger updates each FILE's .latest.FILE for next time.

If this goes wrong, it's relatively easy to repair:

- You'll notice it before import when you preview with import --dry-run.
- Or after import when you try to reconcile your hledger account balances with your bank.
- hledger print -f FILE.csv will show all recently downloaded transactions. Compare these with your journal. Copy/paste if needed.
- Update your conversion rules and print again, if needed.
- You can manually update or remove the .latest file, or use import --catchup FILE.
- Download and import more often, eg twice a week, at least while you are learning. It's easier to review and troubleshoot when there are fewer transactions.

First import

The first time you import from a file, when no corresponding .latest file has been created yet, all of the records will be imported.

But perhaps you have been entering the data manually, so you know that all of these transactions are already recorded in the journal. In this case you can run hledger import --catchup once. This will create a .latest file containing the latest CSV record date, so that none of those records will be re-imported.

Or, if you know that some but not all of the transactions are in the journal, you can create the .latest file yourself. Eg, let's say you previously recorded foobank transactions up to 2024-10-31 in the journal. Then in the directory where you'll be saving foobank.csv, you would create a .latest.foobank.csv file containing

    2024-10-31

Or if you had three foobank transactions recorded with that date, you would repeat the date that many times:

    2024-10-31
    2024-10-31
    2024-10-31

Then hledger import foobank.csv [--dry-run] will import only the newer records.

Importing balance assignments

Journal entries added by import will have all posting amounts made explicit (like print -x). This means that any balance assignments in the imported entries would need to be evaluated. But this generally isn't possible, as the main file's account balances are not visible during import. So try to avoid generating balance assignments with your CSV rules, or importing from a journal that contains balance assignments. (Balance assignments are best avoided anyway.)

But if you must use them, eg because your CSV includes only balances: you can import with print, which leaves implicit amounts implicit.
(print can also do overlap detection like import, with the --new flag):

    $ hledger print --new -f bank.csv >> $LEDGER_FILE

(If you think import should preserve implicit balances, please test that and send a pull request.)

Import and commodity styles

Amounts in entries added by import will be formatted according to the journal's canonical commodity styles, as declared by commodity directives or inferred from the journal's amounts.

Related: CSV > Amount decimal places.

Import archiving

When importing from a CSV rules file (hledger import bank.rules), you can use the archive rule to enable automatic archiving of the data file. After a successful import, the data file (specified by source) will be moved to an archive folder (data/, next to the rules file, auto-created), and renamed similarly to the rules file, with a date. This can be useful for troubleshooting, detecting variations in your banks' CSV data, regenerating entries with improved rules, etc.

The archive rule also causes import to handle source glob patterns differently: when there are multiple matched files, it will pick the oldest, not the newest.

Import special cases

Deduplication

Here are two kinds of "deduplication" which import does not handle (and should not, because these can happen legitimately in financial data):

- Two or more of the new CSV records are identical, and generate identical new journal entries.
- A new CSV record generates a journal entry identical to one(s) already in the journal.

Varying file name

If you have a download whose file name varies, you could rename it to a fixed name after each download. Or you could use a CSV source rule with a suitable glob pattern, and import from the .rules file.

Multiple versions

Say you download bank.csv, import it, but forget to delete it from your downloads folder. The next time you download it, your web browser will save it as (eg) bank (2).csv.
The source rule's glob patterns are for just this situation: instead of specifying source bank.csv, specify source bank*.csv. Then hledger -f bank.rules CMD or hledger import bank.rules will automatically pick the newest matched file (bank (2).csv).

Alternatively, what if you download, but forget to import or delete, then download again? Now each of bank.csv and bank (2).csv might contain data that's not in the other, and not in your journal. In this case, it's best to import each of them in turn, oldest first (otherwise, overlap detection could cause new records to be skipped). Enabling import archiving ensures this. Then hledger import bank.rules; hledger import bank.rules will import and archive first bank.csv, then bank (2).csv.
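
The special cases above all depend on the skip rule described under Overlap detection, so it may help to see that rule spelled out. The following is a rough sketch only, not hledger's code. The function name new_records and the sample data are made up for illustration, and it assumes a headerless comma-separated file, oldest record first, with a YYYY-MM-DD date in the first field:

```shell
# new_records LATEST_FILE DATA_FILE
# Print the records of DATA_FILE that the documented skip rule would treat as new.
# Hypothetical helper for illustration; real CSV needs hledger's conversion rules.
new_records() {
  latest_file=$1
  data_file=$2
  if [ -s "$latest_file" ]; then
    latest_date=$(head -n 1 "$latest_file")   # the remembered latest date
    n=$(grep -c . "$latest_file")             # how many times it is repeated (N)
  else
    latest_date=""                            # no .latest file: first import
    n=0
  fi
  awk -F, -v latest="$latest_date" -v n="$n" '
    latest == ""             { print; next }   # nothing remembered: all records are new
    $1 < latest              { next }          # before the latest date: skip
    $1 == latest && seen < n { seen++; next }  # first N records on that date: skip
                             { print }         # everything else is new
  ' "$data_file"
}

# Usage with made-up sample data in a scratch directory.
demo_dir=$(mktemp -d)
printf '%s\n' \
  '2024-10-30,coffee,3' \
  '2024-10-31,rent,900' \
  '2024-10-31,lunch,12' \
  '2024-10-31,books,40' \
  '2024-11-01,gas,35' > "$demo_dir/foobank.csv"
printf '%s\n' 2024-10-31 2024-10-31 > "$demo_dir/.latest.foobank.csv"

new_records "$demo_dir/.latest.foobank.csv" "$demo_dir/foobank.csv"
# prints:
# 2024-10-31,books,40
# 2024-11-01,gas,35
```

With the date repeated twice in the .latest file, the first two 2024-10-31 records are skipped along with everything earlier, so only the third 2024-10-31 record and the 2024-11-01 record are printed. The dates are compared as plain strings, which works because YYYY-MM-DD sorts in date order.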