sqlite3 -readonly /path/db.sqlite "VACUUM INTO '/path/backup.sqlite';"
From https://sqlite.org/lang_vacuum.html: "The VACUUM command with an INTO clause is an alternative to the backup API for generating backup copies of a live database. The advantage of using VACUUM INTO is that the resulting backup database is minimal in size and hence the amount of filesystem I/O may be reduced."
Of course! You can't copy the file of a running, active db that's receiving updates; that can only result in corruption.
For replicating sqlite databases safely there is
As others have mentioned, an incremental rsync would be much faster, but what bothers me the most is that he claims sending SQL statements is faster than sending the database while COMPLETELY omitting the fact that you have to execute those statements. And then run /optimize/. And then run /vacuum/.
Currently I have a scenario in which I have to "incrementally rebuild" a database from CSV files. While in my particular case recreating the database from scratch is the better option, even with heavy optimization it still takes half an hour just to run the batch inserts on an empty in-memory database, create the indexes, and so on.
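For context, the kind of bulk-load script I mean looks roughly like this (file, table, and column names are made up; the PRAGMAs trade durability for speed, which is only acceptable because the database can always be rebuilt from the CSVs):

sqlite3 rebuild.sqlite <<'SQL'
-- relax durability while rebuilding; fine for a throwaway rebuild
PRAGMA journal_mode = OFF;
PRAGMA synchronous = OFF;
PRAGMA temp_store = MEMORY;

CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, price REAL);

-- load the CSV first, with no indexes in the way
.mode csv
.import --skip 1 data.csv items

-- then build indexes once, which is much cheaper than maintaining them per insert
CREATE INDEX idx_items_name ON items(name);
ANALYZE;
SQL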
Nice tricks in the article, but you can more easily use the builtin utility now :)
I blogged about how it works in detail here: https://nochlin.com/blog/how-the-new-sqlite3_rsync-utility-w...
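For anyone who just wants the short version: usage is roughly sqlite3_rsync ORIGIN REPLICA, where either side can be a local path or a user@host:path reachable over SSH. Paths and hosts below are placeholders; the sqlite.org docs have the authoritative options.

# push a local live database to a remote replica over SSH
sqlite3_rsync /data/app.sqlite user@backup-host:/data/app.sqlite

# or pull a remote origin down into a local replica
sqlite3_rsync user@primary-host:/data/app.sqlite /data/app.sqlite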
-z, --compress compress file data during the transfer
--compress-level=NUM explicitly set compression level
It's probably faster to compress with gzip first and transfer afterwards, but it's nice to have the option of improving the transfer with just a flag.

Another alternative is to skip compressing the dump output, let rsync calculate the differences from a previous uncompressed dump to the current dump, then have rsync compress the change sets it sends over the network (rsync -z).
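Concretely, that alternative looks something like this (paths and host are placeholders):

# regenerate a plain-text dump, overwriting last run's copy
sqlite3 -readonly /data/app.sqlite .dump > /backups/app.sql

# the remote side keeps its app.sql from the previous run, so rsync only
# transfers the changed blocks and compresses them in flight (-z)
rsync -az /backups/app.sql user@backup-host:/backups/app.sql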
I just tried some comparisons (albeit with a fairly small sqlite file). The text dump compressed to only about 84% of the size of the compressed binary database, which isn't negligible, but not necessarily worth fussing over in every situation. (The binary compressed to 7.1% of its original size, so the 84% is relative to that.)
bzip2 performed better on both formats: its compressed binary database came in at 91.5% of the size of gzip's compressed text, and bzip2's compressed text was 92.5% of the size of its compressed binary.
Though that is not available inside rsync, it indicates that if you're going with an external compression solution, maybe gzip isn't the best choice if you care about every percentage reduction.
If you don't care about every percentage reduction, maybe just rsync compression.
One thing worth mentioning: if you are updating the file, rsync will only compress what is actually sent. To replicate that with the text solution, you'd have to retain the text dump on both sides so the delta can be computed between them.
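If you want to run this kind of comparison on your own data, a rough sketch (file names are placeholders):

sqlite3 -readonly app.sqlite .dump > app.sql
for f in app.sqlite app.sql; do
  echo "$f: gzip $(gzip -9 -c "$f" | wc -c) bytes, bzip2 $(bzip2 -9 -c "$f" | wc -c) bytes"
done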
Isn't this a case for proper database servers with replication?
Or, if it's an infrequent process done for dev purposes, just shut down the application doing writes on the other side?
Just ssh into the machine, dump the SQL, and load it back into SQLite locally.
ssh username@server "sqlite3 my_remote_database.db .dump | gzip -c" | gunzip -c | sqlite3 my_local_database.db
Anyway, I don't think the database file size was really the issue; it was a relatively big schema but without many indices, and performance wasn't a big consideration - hence the backend would concatenate query results into an XML file and then pass it through an XML-to-JSON converter, causing 1-2 second response times on most requests. I worked on a rewrite in Go where requests were more like 10-15 milliseconds.
But I still used sqlite because it was actually a pretty good solution for the problem at hand: relatively low concurrency (up to 10 active simultaneous users), no server-side dependencies or installation needed, etc.
If retaining the snapshot file is of value, great.
I'd be a tiny bit surprised if rsync could recognize diffs in the dump, but it's certainly possible, assuming the dumper is "stable" (it probably is, since it's walking the tables as trees). The amount of change detected by rsync might actually be a useful thing to monitor.
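For the monitoring part, rsync can report this directly via --stats (paths and host are placeholders):

# "Matched data" vs "Literal data" in the summary shows how much of the dump
# rsync could reuse from the previous copy versus had to send fresh
rsync -az --stats /backups/app.sql user@backup-host:/backups/app.sql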
Why not just compress the whole database using `gzip` or `lz4` before rsyncing it instead? `zstd` works too, but it seems like it had a bug with compressing files whose content has been modified.
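One caveat with compressing before the transfer: ordinary gzip/zstd output tends to change throughout the file after even a small edit, which defeats rsync's delta transfer. Some builds of gzip and zstd accept an --rsyncable flag that mitigates this; a sketch, assuming your build supports it (names are placeholders):

# compress in an rsync-friendly way, then let rsync diff the compressed file
gzip --rsyncable -c app.sqlite > app.sqlite.gz
rsync -az app.sqlite.gz user@backup-host:/backups/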
Better yet, split your sqlite file into smaller pieces; it's not like it needs to contain all the app data in a single sqlite file.
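In case it isn't obvious how queries stay workable after splitting: SQLite can ATTACH several database files to one connection, so cross-file queries still work. A sketch with invented file and table names:

sqlite3 core.sqlite <<'SQL'
-- bulky or rarely-synced data lives in its own file
ATTACH DATABASE 'analytics.sqlite' AS analytics;

-- queries can join across the attached files
SELECT u.name, count(*) AS event_count
FROM users AS u
JOIN analytics.events AS e ON e.user_id = u.id
GROUP BY u.id;
SQL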
So I see the basic stuff still needs to be repeated, since people keep missing these kinds of things.
But I learned that you can easily dump SQLite to a text file - neat!
My first try would've been to copy the db file, gzip it, and then transfer it, but I can't tell whether compression would be all that useful on the binary format.
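For reference, the dump-to-a-text-file part really is a one-liner (database name is just an example):

sqlite3 app.sqlite .dump > app.sql       # full schema + data as SQL text
sqlite3 app.sqlite .schema > schema.sql  # or just the schema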
When is a guy supposed to get a coffee and stretch his legs anymore?