How to extract multiple .tgz Google Takeout archives

I love Google Photos for its ease of use and features. But it’s Google. As you may know, I like to selfhost all the things, but for the longest time I could not find a good selfhosted alternative to Google Photos.

Immich

Immich is a selfhosted photo and video management system. Sounds fancy, and it is. Aside from having a good mobile app for iOS that will do background backups, it sports facial recognition, hardware transcoding of videos, reverse geocoding and lots more. There’s a big disclaimer though: Immich is still under active development. I see that more as an encouragement to use it than a warning :-)

So, naturally, I want to move all my photos and videos (going back to 2006) from Google Photos to Immich.

Google Takeout

Due to great EU legislation Google Takeout exists. It allows you to easily create an archive of your data for export. Now, how useful that data export is going to be is another matter.

Because I have a lot of data, Google offers to split the export into multiple .tgz files of up to 50GB each. I now have six Google Takeout archives:

1-rwxrwxr-x 1 ariejan ariejan 50G Mar  5 11:16 takeout-001.tgz
2-rwxrwxr-x 1 ariejan ariejan 50G Mar  5 11:16 takeout-002.tgz
3-rwxrwxr-x 1 ariejan ariejan 50G Mar  5 11:16 takeout-003.tgz
4-rwxrwxr-x 1 ariejan ariejan 50G Mar  5 11:16 takeout-004.tgz
5-rwxrwxr-x 1 ariejan ariejan 50G Mar  5 11:16 takeout-005.tgz
6-rwxrwxr-x 1 ariejan ariejan 39G Mar  5 11:16 takeout-006.tgz

So, I was wondering, how do I unpack these files? Well, there are several ways, and I want to document them here.

cat | tar

With some Linux-fu I could easily pipe all these files into tar to extract them:

cat takeout-{001..006}.tgz | tar xzivf -

Or I could just glob all the files; the shell expands the glob in sorted order automatically. How nice is that!

cat takeout-*.tgz | tar xzivf -
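If you want to see the order before committing to a multi-hour extraction, you can print the glob expansion first. This is a minimal sketch that runs in a scratch directory with empty stand-in files (the file names mirror mine, the `demo` and `order` variables are just for illustration):

```shell
# Scratch directory with empty stand-in files, created out of order on purpose.
demo=$(mktemp -d)
cd "$demo"
touch takeout-003.tgz takeout-001.tgz takeout-002.tgz

# Print one name per line, in the order the shell would hand them to cat.
order=$(printf '%s\n' takeout-*.tgz)
echo "$order"
```

The zero-padded numbers are what make this work: with names like takeout-2.tgz and takeout-10.tgz, the glob would sort takeout-10.tgz first.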

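Both the brace-expansion and the glob variant pipe the data from the archives, in order, into tar, which extracts it as it arrives. The i flag (--ignore-zeros in GNU tar) matters here: each .tgz is an archive in its own right with its own end-of-archive marker, and without i tar would stop after the first one.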
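To convince yourself that concatenated archives extract cleanly, you can try the same trick on two tiny archives first. This is a sketch assuming GNU tar; the demo-001.tgz and demo-002.tgz names and their contents are made up for the experiment:

```shell
# Build two tiny, independent .tgz archives in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir"
mkdir part1 part2
echo "first" > part1/a.txt
echo "second" > part2/b.txt
tar czf demo-001.tgz part1
tar czf demo-002.tgz part2
rm -r part1 part2

# Concatenate both archives and extract them in one pass.
# i (--ignore-zeros) makes tar keep reading past the end-of-archive
# marker of the first archive instead of stopping there.
mkdir out
cat demo-*.tgz | tar xzif - -C out

# List what ended up in the output directory.
ls out
```

Both part1 and part2 should show up in out. If you drop the i flag, GNU tar should stop at the end of the first archive and only extract part1.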

pv | tar

If you want to go fancy and have pv installed, you can use that as well. pv is a utility that monitors progress of data through a pipe.

pv takeout-*.tgz | tar xzif -
88.9GiB 0:19:19 [89.2MiB/s] [==========>                             ] 30% ETA 0:43:04
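If you run this on different machines, not all of which have pv, a small wrapper can pick whichever is available. This is only a sketch: extract_takeout is a name I made up, and it assumes GNU tar and that the first argument is the directory to extract into.

```shell
# extract_takeout DEST ARCHIVE...
# Hypothetical helper: extracts all given archives into DEST in one pass,
# showing progress with pv when it is installed and using cat otherwise.
extract_takeout() {
  dest=$1
  shift
  mkdir -p "$dest"
  if command -v pv >/dev/null 2>&1; then
    pv "$@" | tar xzif - -C "$dest"
  else
    cat "$@" | tar xzif - -C "$dest"
  fi
}
```

You would call it as extract_takeout ./extracted takeout-*.tgz, so the shell still takes care of the ordering.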

Importing into Immich

Now, with the data extracted, how am I going to import it into Immich? There are tools for that and I’ll post again on how that process went.

Tags: protip tgz tar