jakewharton / docker-gphotos-sync

A Docker image for synchronizing your original-quality Google Photos

Home Page: https://hub.docker.com/r/jakewharton/gphotos-sync

License: MIT License

Dockerfile 35.75% Shell 64.25%
docker google-photos backup

docker-gphotos-sync's Introduction

Docker GPhotos Sync

A Docker container which runs the gphotos-cdp tool automatically to synchronize your Google Photos (in original quality!).

Motivation

Your photos are too valuable to leave solely in Google's hands. While it's extremely unlikely that Google would ever lose data, it's far more likely for you to lose access to your account (for whatever reason).

Currently, the only way to obtain original backups of your photos is through Google Takeout. Obtaining this backup is a tedious, manual process which is also not incremental.

Google Photos does have an API, which makes traditional incremental backup feasible, but it does not give you access to the original image, making it a lossy solution.

The gphotos-cdp tool uses the Chrome DevTools Protocol to drive the normal web interface of Google Photos and download original copies incrementally. This repository is a Dockerized version of the tool which runs it on a periodic basis.

Setup

Select and create two directories:

  • The "download" directory where images will be stored. (From now on referred to as /path/to/download)
  • The "config" directory where your Google account authentication credentials will be stored. (From now on referred to as /path/to/config)
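
As a minimal sketch, both directories can be created in one step. The helper name `create_sync_dirs` and the subdirectory names `download` and `config` are illustrative assumptions; any two paths will do.

```shell
#!/bin/sh
# Hypothetical helper: create the "download" and "config" directories
# under a single base directory. The names are illustrative only.
create_sync_dirs() {
    base="$1"
    mkdir -p "$base/download" "$base/config" || return 1
    # The config directory will hold a signed-in browser profile,
    # so restrict it to the current user.
    chmod 700 "$base/config"
}
```

Calling `create_sync_dirs /path/to` would yield /path/to/download and /path/to/config, matching the placeholders used in the rest of this document.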

Sign In

In order for the headless, automatic sync to work, you need to first authenticate with your Google account.

Linux

Open a terminal and run the following command.

chromium-browser --user-data-dir=/path/to/config https://photos.google.com

Click "Go to photos" and sign in to your Google account. You are free to use your real password or to create an app-specific password.

Close the Chrome window. You're done!

MacOS / Windows

To do this, we start a Linux-based Docker container with Chrome and remotely sign in.

In a terminal, run the following command, replacing /path/to/config with your chosen "config" directory.

$ docker run -p 6080:80 \
    -v /path/to/config:/config \
    dorowu/ubuntu-desktop-lxde-vnc

Visit http://localhost:6080 (or http://server-ip:6080 if running on a remote machine) which will connect to the container's desktop.

Inside the container desktop, click on the menu (lower left icon), go to "System Tools", and select "LXTerminal". Run the following command.

google-chrome --user-data-dir=/config --no-sandbox https://photos.google.com

(Note: Do not change the /config path!)

Click "Go to photos" and sign in to your Google account. You are free to use your real password or to create an app-specific password.

Close the Chrome window inside the container desktop.

Close your browser's tab.

Press CTRL+C to quit the Docker image. You're done!
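
After either sign-in route, a quick sanity check is to confirm that the browser actually wrote a profile into the "config" directory. The sketch below assumes the usual Chrome/Chromium user-data layout (a `Default` profile directory and a `Local State` file); the function name `has_browser_profile` is hypothetical, and a positive result only shows that a profile exists, not that the Google session is still valid.

```shell
#!/bin/sh
# Hypothetical check: does the directory look like a Chrome user-data dir?
# Assumes the standard layout with a "Default" profile and a "Local State" file.
has_browser_profile() {
    [ -d "$1/Default" ] && [ -f "$1/Local State" ]
}
```

For example, `has_browser_profile /path/to/config && echo "profile found"` prints the message only when both entries are present.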

Initial Sync

The first time this container runs, it will start from your oldest photo and cycle its way to your newest, downloading each along the way. This is a very slow operation that will take many hours or even days. Yes, days. In the process it will download tens or hundreds of gigabytes of images.

It is not required, but if you'd like to run this sync manually you can choose to do so. This allows you to temporarily interrupt it at any point and also intervene if it gets stuck.

$ docker run -it --rm \
    -v /path/to/config:/tmp/gphotos-cdp \
    -v /path/to/download:/download \
    jakewharton/gphotos-sync \
    /app/sync.sh

This will run until all photos have been downloaded. At this point, you should set it up to run automatically on a schedule.

Running Automatically

To run the sync automatically on a schedule, pass a valid cron specifier as the CRON environment variable.

$ docker run -it --rm \
    -v /path/to/config:/tmp/gphotos-cdp \
    -v /path/to/download:/download \
    -e "CRON=0 * * * *" \
    jakewharton/gphotos-sync

The command above will run every hour and download any new photos. For help creating a valid cron specifier, visit cron.help.
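
Before starting the container, it can help to catch an obviously malformed CRON value. The sketch below only counts fields, on the assumption that the value is handed to a standard five-field crontab (the container logs mention busybox crond); it does not validate ranges. The function name `is_five_field_cron` is hypothetical.

```shell
#!/bin/sh
# Hypothetical pre-flight check: succeed only if the specifier has
# exactly five whitespace-separated fields. Field values are not validated.
is_five_field_cron() {
    printf '%s\n' "$1" | awk 'NF == 5 { ok = 1 } END { exit !ok }'
}
```

For example, `is_five_field_cron "0 * * * *"` succeeds, while `is_five_field_cron "0 * * *"` fails because a field is missing.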

More

To be notified when sync is failing, visit https://healthchecks.io, create a check, and pass its ID to the container using the HEALTHCHECK_ID environment variable.

Because the sync can occasionally fail, it's best to set a grace period on the check which is a multiple of your cron period. For example, if you run sync hourly give a grace period of two hours.

To write data as a particular user, the PUID and PGID environment variables can be set to your user ID and group ID, respectively.
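
A minimal sketch of filling in PUID and PGID from the current user, using the standard `id` utility. The helper name `gphotos_id_flags` is hypothetical; it only prints the two `-e` flags so they can be spliced into a `docker run` command.

```shell
#!/bin/sh
# Hypothetical helper: print docker "-e" flags carrying the numeric
# user ID and group ID of whoever runs this script.
gphotos_id_flags() {
    printf -- '-e PUID=%s -e PGID=%s\n' "$(id -u)" "$(id -g)"
}
```

It could be used as `docker run ... $(gphotos_id_flags) jakewharton/gphotos-sync`, left unquoted so the shell splits the output into separate arguments.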

Diagnosing Blockages

The script will occasionally fail to download an image or video. This usually isn't something to worry about and it will resume when retried (either manually or automatically).

Sometimes, however, the script will get stuck on a single item and be unable to make progress. Usually this item will be a video.

When this happens, open the last Google Photos link from the logs. This is the last item that was successfully downloaded. Pressing the left arrow moves forward in time to the offending item. If you click "Download" in the three-dot menu and the item downloads, keep retrying the sync. If it fails to download, something is wrong on Google's side, and the only recourse is to delete the image or video. This will allow the script to continue on its next run.

Deleting an image or video should be a last resort. Retry at least 5 times, potentially waiting an hour or two in between. You can also get a Google Takeout of your Photos data and look for the item in the resulting archives.
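
To find the last Google Photos link without scrolling through the output, the log can be filtered. This is a sketch that assumes the log lines contain URLs of the form https://photos.google.com/photo/<id>, as in the log excerpts quoted in the issues further down; the function name `last_photo_url` and the container name in the usage note are assumptions.

```shell
#!/bin/sh
# Hypothetical filter: read log text on stdin and print the last
# Google Photos item URL that appears in it.
last_photo_url() {
    grep -o 'https://photos\.google\.com/photo/[A-Za-z0-9_-]*' | tail -n 1
}
```

With a container named gphotos-sync (an assumed name), `docker logs gphotos-sync 2>&1 | last_photo_url` would print the link to open in the browser.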

Docker Compose

version: '2'
services:
  gphotos-sync:
    image: jakewharton/gphotos-sync:latest
    restart: unless-stopped
    volumes:
      - /path/to/config:/tmp/gphotos-cdp
      - /path/to/download:/download
    environment:
      - "CRON=0 * * * *"
      #Optional:
      - "HEALTHCHECK_ID=..."
      - "PUID=..."
      - "PGID=..."

Note: You may want to specify an explicit version rather than latest. See https://hub.docker.com/r/jakewharton/gphotos-sync/tags.

Development

With Docker installed, docker build . will give you a SHA that you can use.

$ docker build .
...
Successfully built 7b431e7e9868

Use that SHA in place of jakewharton/gphotos-sync in the commands above to manually test.

LICENSE

MIT. See LICENSE.txt.

Copyright 2020 Jake Wharton

The Chrome installation in the Dockerfile is from Zenika/alpine-chrome. The jhead installation is from sourcelevel/engine-image-optim.

docker-gphotos-sync's Issues

Container exits after attempting 1 photo

crond[198]: crond (busybox 1.31.1) started, log level 6
[services.d] done.
INFO: Starting sync.sh PID 208 Fri May 29 14:31:22 UTC 2020
2020/05/29 14:31:22 Session Dir: /tmp/gphotos-cdp
2020/05/29 14:31:22 pre-navigate
2020/05/29 14:31:26 post-navigate
2020/05/29 14:31:27 Page loaded, most recent item in the feed is: AF1QipPMyx6SD5ntf3vOiWHpjFuTL4HtRE9eyhWEJnAY
2020/05/29 14:31:29 Event: {Type:keyDown Modifiers:Shift Timestamp:<nil> Text: UnmodifiedText: KeyIdentifier: Code:KeyD Key:D WindowsVirtualKeyCode:68 NativeVirtualKeyCode:68 AutoRepeat:false IsKeypad:false IsSystemKey:false Location:0}
2020/05/29 14:31:29 Event: {Type:keyUp Modifiers:Shift Timestamp:<nil> Text: UnmodifiedText: KeyIdentifier: Code:KeyD Key:D WindowsVirtualKeyCode:68 NativeVirtualKeyCode:68 AutoRepeat:false IsKeypad:false IsSystemKey:false Location:0}
2020/05/29 14:31:30 ERROR: unhandled page event *page.EventDownloadWillBegin
2020/05/29 14:31:30 Marking https://photos.google.com/photo/AF1QipPMyx6SD5ntf3vOiWHpjFuTL4HtRE9eyhWEJnAY as done
2020/05/29 14:31:30 Running /app/fix_time.sh on /download/AF1QipPMyx6SD5ntf3vOiWHpjFuTL4HtRE9eyhWEJnAY/IMG_1587.HEIC
Unable to set file mtime. Unsupported file extension: heic
OK
INFO: Completed sync.sh PID 208 Fri May 29 14:31:30 UTC 2020
[cmd] /app/sync.sh exited 0
[cont-finish.d] executing container finish scripts...
[cont-finish.d] done.
[s6-finish] waiting for services.
[s6-finish] sending all processes the TERM signal.
[s6-finish] sending all processes the KILL signal and exiting.

Everything seems to be set up correctly, but the container consistently exits after this one photo.

"more than one file (2) in download dir"

$ docker run --name gphotos-cdp-sync --rm -v /gphotos-sync:/tmp/gphotos-cdp -v /dl/cdp-gphotos-sync:/download jakewharton/gphotos-sync:0.3.1 /app/sync.sh

...snip
2020/11/15 21:26:24 Running /app/fix_time.sh on /download/AF1QipBizz_jZXshGM1QuBcdxkMmiP-_EDMi7eLjKxHY/DSC00874.JPG
/download/AF1QipBizz_jZXshGM1QuBcdxkMmiP-_EDMi7eLjKxHY/DSC00874.JPG
2020/11/15 21:26:25 Event: {Type:keyDown Modifiers:Shift Timestamp:<nil> Text: UnmodifiedText: KeyIdentifier: Code:KeyD Key:D WindowsVirtualKeyCode:68 NativeVirtualKeyCode:68 AutoRepeat:false IsKeypad:false IsSystemKey:false Location:0}
2020/11/15 21:26:25 Event: {Type:keyUp Modifiers:Shift Timestamp:<nil> Text: UnmodifiedText: KeyIdentifier: Code:KeyD Key:D WindowsVirtualKeyCode:68 NativeVirtualKeyCode:68 AutoRepeat:false IsKeypad:false IsSystemKey:false Location:0}
2020/11/15 21:26:25 ERROR: unhandled page event *page.EventDownloadWillBegin
2020/11/15 21:26:26 more than one file (2) in download dir "/download"
[cmd] /app/sync.sh exited 1
[cont-finish.d] executing container finish scripts...
[cont-finish.d] done.
[s6-finish] waiting for services.
[s6-finish] sending all processes the TERM signal.
[s6-finish] sending all processes the KILL signal and exiting.

This seems to be happening after a few photos are downloaded.

And the files:

-rwxr-xr-x 1 user grp       0 Nov 15 13:26 DSC00853.JPG*
-rwxr-xr-x 1 user grp 3009501 Nov 15 13:26 DSC00853.JPG.crdownload*
-rwxr-xr-x 1 user grp      76 Nov 15 13:26 .lastdone*

Seems like a race?
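
The `.crdownload` suffix is what Chrome gives a download that is still in progress, so the listing above shows an unfinished file sitting next to its zero-byte final name. A small sketch for spotting such leftovers in the top level of the download directory follows; the function name `list_partial_downloads` is hypothetical, and this only lists the files, it does not address the suspected race.

```shell
#!/bin/sh
# Hypothetical helper: list in-progress Chrome downloads (*.crdownload)
# directly inside the given directory, without descending into item folders.
list_partial_downloads() {
    find "$1" -maxdepth 1 -type f -name '*.crdownload'
}
```

For example, `list_partial_downloads /dl/cdp-gphotos-sync` would print the path of DSC00853.JPG.crdownload for the directory shown above.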

authentication not possible in -headless mode

System Windows10 1909
DockerDesktop 2.2.0.5

Sign-in via dorowu/ubuntu-desktop-lxde-vnc without problems

PS C:\Users\philbring\Documents\GooglePhotosBackup> docker-compose up
Creating network "googlephotosbackup_default" with the default driver
Creating googlephotosbackup_gphotos-sync_1 ... done Attaching to googlephotosbackup_gphotos-sync_1
gphotos-sync_1 | INFO: No CRON setting found. Running sync once.
gphotos-sync_1 | INFO: Add CRON="0 0 * * *" to perform sync every midnight
gphotos-sync_1 | INFO: Starting sync.sh PID 7 Mon Apr 13 06:44:10 UTC 2020
gphotos-sync_1 | INFO: Starting sync!
gphotos-sync_1 | 2020/04/13 06:44:10 Session Dir: /tmp/gphotos-cdp
gphotos-sync_1 | 2020/04/13 06:44:10 pre-navigate
gphotos-sync_1 | 2020/04/13 06:44:11 authentication not possible in -headless mode

downloading in "/download" took too long to start

I am not sure where this is hanging up but I presume it's because I have so many photos that it can't get to the end in time.
I have tried putting newer / random photo URLs into .lastdone but this continues to plague my backup keeping it from ever getting all the photos.

Custom `-run` script without re-building the container?

I'm wondering if you have ideas on how to run a custom script with -run without rebuilding the container. I mainly want to extend the current script to fix some EXIF data (which will require exiftool to be added to the container; see note 1), and to copy/move files to a different directory.

If it's something you expect others to want to do, maybe exiftool can be added to the container, and the -run script path can be an env var?

1: I'm noticing that some files have incomplete EXIF data (e.g., they have a modified date but no created date, and thus default to the file creation date). In those cases, I want to fix the timestamps.

First time sync - a tip

Can be useful with large libraries.

Instead of letting the app navigate to the first image, one could go to the Photos web app, scroll all the way to the bottom, find the last image, open it. Copy the resulting URL and then:

# In the download directory
$ echo -e $URL > .lastdone

Now, CDP will open that file and start navigating to the left and downloading images. Of course this means that the first image won't be downloaded, but you can do that manually.

Container does nothing?

Hello,

I've configured the container in Synology including setting up dorowu/ubuntu-desktop-lxde-vnc to log into photos.google.com and save the username/password to /config.

However, when I launch gphotos-sync, nothing seems to happen. The last log line I see is [services.d] done... and it just sits there.

Jake

lsof missing from images

lsof is missing because we rely on other images for Chrome. We probably need to build our own image, both to keep the size minimal and to install this tool.

Testing?

We can easily test failure scenarios, but it's hard to test regular working scenarios. Figure it out.

Chromium-browser is missing

When I try to run the first command, chromium-browser, in the terminal, I get an error that it is not found. The only thing it finds is "chromium-browser-sound.sh".
I cannot get Chrome to open so that I can sign in.
Thanks

authentication not possible in -headless mode

Is this project dead now? I haven't seen any response to my issue requests.

I'm now receiving the error in the title. I have re-authenticated several times (perhaps I need to update Chrome in that container).

INFO: Starting sync.sh PID 252 Tue Apr 12 00:29:11 UTC 2022
2022/04/12 00:29:12 Session Dir: /tmp/gphotos-cdp
2022/04/12 00:29:12 pre-navigate
2022/04/12 00:29:22 authentication not possible in -headless mode

any ideas?

Dockerfile no longer builds

I tried building the dockerfile, but it's using go get, which is deprecated on the version of go installed in the base image:

 => [build 5/6] RUN wget http://www.sentex.net/~mwandel/jhead/jhead-3.04.tar.gz     && tar zxf jhead-3.04.tar.gz     && cd jhead-3.04     && make     && make install                     1.7s
 => ERROR [build 6/6] RUN go get github.com/perkeep/gphotos-cdp@e9d1979707191993f1c879ae93f8dd810697fd6e                                                                                  0.6s
------                                                                                                                                                                                         
 > [build 6/6] RUN go get github.com/perkeep/gphotos-cdp@e9d1979707191993f1c879ae93f8dd810697fd6e:                                                                                             
#13 0.564 go: go.mod file not found in current directory or any parent directory.                                                                                                              
#13 0.564       'go get' is no longer supported outside a module.                                                                                                                              
#13 0.564       To build and install a command, use 'go install' with a version,                                                                                                               
#13 0.564       like 'go install example.com/cmd@latest'                                                                                                                                       
#13 0.564       For more information, see https://golang.org/doc/go-get-install-deprecation                                                                                                    
#13 0.564       or run 'go help get' or 'go help install'.                                                                                                                                     
------                                                                                                                                                                                         
executor failed running [/bin/sh -c go get github.com/perkeep/gphotos-cdp@e9d1979707191993f1c879ae93f8dd810697fd6e]: exit code: 1                                                              
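
The Go error output above names its own remedy: use `go install` with a version instead of `go get`. As an untested sketch, the failing Dockerfile instruction could be rewritten as follows. The helper name `use_go_install` is hypothetical, and whether the pinned gphotos-cdp commit builds cleanly with the newer Go toolchain is an assumption that would need checking.

```shell
#!/bin/sh
# Hypothetical helper: print the given Dockerfile with "RUN go get ..."
# instructions rewritten to "RUN go install ...", as the Go error suggests.
# The original file is not modified.
use_go_install() {
    sed 's/^RUN go get /RUN go install /' "$1"
}
```

Running `use_go_install Dockerfile > Dockerfile.new` leaves the original in place so the two can be compared before building.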

Hang process workaround - timeout

So I just noticed that the container was hung for me for the last 2 days. A workaround I figured is to set a timeout on the container itself:

# on my ubuntu server
$ timeout 3h docker run --name gphotos-sync --rm -v ...

This basically kills the container after 3h and my crontab carries on the next time it ticks. Thought it was worth sharing here.

groupmod not available using trunk

Just tried out trunk for the new ARM builds. Unfortunately it fails at creating the user.

A quick search brought up the Alpine specific adduser command: https://wiki.alpinelinux.org/wiki/Setting_up_a_new_user

[s6-init] making user provided files available at /var/run/s6/etc...exited 0.
[s6-init] ensuring user provided files have correct perms...exited 0.
[fix-attrs.d] applying ownership & permissions fixes...       
[fix-attrs.d] done.
[cont-init.d] executing container initialization scripts...   
[cont-init.d] 10-adduser.sh: executing...                     
/var/run/s6/etc/cont-init.d/10-adduser.sh: line 26: groupmod: not found
/var/run/s6/etc/cont-init.d/10-adduser.sh: line 27: usermod: not found
id: unknown user abc
id: unknown user abc

Initializing container

User uid:
User gid:

chown: unknown user/group abc:abc                             
[cont-init.d] 10-adduser.sh: exited 1.                        
[cont-finish.d] executing container finish scripts...         
[cont-finish.d] done.
[s6-finish] waiting for services.                             
[s6-finish] sending all processes the TERM signal.            
[s6-finish] sending all processes the KILL signal and exiting.

deleting the most recent item on the server causes hang

Following a successful sync, if the most recent item is deleted from Google Photos, the next sync hangs indefinitely after "Marking https://photos.google.com/photo/xxxxxxxxxxxxxx" without timing out (xxxxxxxxxxxxxx is the deleted item).
Uploading more items hence creating a new most recent item in the feed won't fix the problem. Only restoring the deleted item from the trash enables successful completion.
Looks like if the program can't find the last item it synced in the previous iteration on the server, it just hangs.

Allow custom healthcheck URL

I use a self-hosted HC instance and would like to use that for monitoring gphotos-sync.

It'll be more flexible if HEALTHCHECK_ID was instead HEALTHCHECK_URL.
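
One way the request could be met while staying backward compatible is sketched below. HEALTHCHECK_URL is the name proposed in this issue, not an existing option, and the function name `healthcheck_url` is hypothetical. The fallback assumes the image pings the hosted healthchecks.io endpoint, https://hc-ping.com/<id>, which is how that service addresses checks; how the image actually builds its ping URL is not shown here.

```shell
#!/bin/sh
# Hypothetical resolution logic: prefer a full HEALTHCHECK_URL (proposed,
# not an existing option), fall back to HEALTHCHECK_ID on hc-ping.com,
# and fail if neither is set.
healthcheck_url() {
    if [ -n "${HEALTHCHECK_URL:-}" ]; then
        printf '%s\n' "$HEALTHCHECK_URL"
    elif [ -n "${HEALTHCHECK_ID:-}" ]; then
        printf 'https://hc-ping.com/%s\n' "$HEALTHCHECK_ID"
    else
        return 1
    fi
}
```

A sync script could then ping with something like `url="$(healthcheck_url)" && curl -fsS "$url" > /dev/null`, so existing HEALTHCHECK_ID setups keep working unchanged.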

fix_time.sh fails for PNGs and halts the backup

This seems to be happening for screenshots I got in my photo library.

2020/10/18 08:28:55 Running /app/fix_time.sh on /download/AF1QipNs0cGskrFJsoxUW-X65C777zLQ3Gmp0O4VXFC2/Screenshot_20201017-215132.png
Unable to set file mtime. Unsupported file extension: png

Error on Synology DSM 7

I'm trying to follow this guide (https://github.com/JakeWharton/docker-gphotos-sync) using portainer but I'm getting this error (Error relocating /usr/lib/libnspr4.so) and I couldn't find a solution on google. Could you please help me out?

This is the log:

[s6-init] making user provided files available at /var/run/s6/etc...exited 0.
[s6-init] ensuring user provided files have correct perms...exited 0.
[fix-attrs.d] applying ownership & permissions fixes...
[fix-attrs.d] done.
[cont-init.d] executing container initialization scripts...
[cont-init.d] 10-adduser.sh: executing...
Initializing container

User uid: 1024
User gid: 100

[cont-init.d] 10-adduser.sh: exited 0.
[cont-init.d] 20-cron.sh: executing...

Initializing cron

*

[cont-init.d] 20-cron.sh: exited 0.
[cont-init.d] done.
[services.d] starting services
[services.d] done.
INFO: Starting sync.sh PID 215 Thu Jul 29 18:00:00 UTC 2021
2021/07/29 18:00:01 Session Dir: /tmp/gphotos-cdp
2021/07/29 18:00:03 chrome failed to start:
Error relocating /usr/lib/libnspr4.so: gettid: symbol not found
INFO: Starting sync.sh PID 234 Thu Jul 29 19:00:00 UTC 2021
2021/07/29 19:00:01 Session Dir: /tmp/gphotos-cdp
2021/07/29 19:00:03 chrome failed to start:
Error relocating /usr/lib/libnspr4.so: gettid: symbol not found

This is what I have in portainer:
version: '2'
services:
  gphotos-sync:
    container_name: gphotos-sync
    image: jakewharton/gphotos-sync:latest
    restart: unless-stopped
    volumes:
      - /volume1/docker/gphotos-cdp:/tmp/gphotos-cdp
      - /volume1/photo/gphotos-sync:/download
    environment:
      - TZ=${TZ}
      #hourly
      - "CRON=0 * * * *"
      #Optional:
      - "HEALTHCHECK_ID=..."
      - PUID=1024
      - PGID=100

Chromium browser missing in dorowu/ubuntu-desktop-lxde-vnc

I'm debugging some auth issues (related to issue 25), and was trying to use the VNC image:

docker run -p 6080:80 \
    -v /path/to/config:/config \
    dorowu/ubuntu-desktop-lxde-vnc

However, it appears that chromium-browser is missing. The README should be updated with an install command.

ARM/M1 compatibility

Hello!

Thanks for making this module :)

I'm trying to do my initial sign-in, but when running the command in LXTerminal I get a "qemu uncaught target signal 5" error that I suspect is because of my CPU architecture.

Is there another way to get the login configs I need? I am happy to dig around in browser console and localstorage if needed.

Thank you!

Do something about hangs

Sometimes the gphotos-cdp app hangs after running fix_time. Thankfully monitoring will detect this, assuming you have it set up. Other cron-based containers have additional cron schedules to force kill the main app if running.

Readme updates for log?

Hi there -- thanks so much for this tool!

For those of us that don't have experience with chrome dev tools, can you provide any insight in the readme how to look at the log? You state that you should look at the log if you think something is hung, but the only logs I can find in the container are from the chrome profile within /tmp/gphotos-cdp. Any chance you could update the README with some info on how to read those (or where the correct logs are)?

Error relocating /usr/lib/libnspr4.so: gettid: symbol not found

Chrome is failing to start on my system.

Ubuntu Server 20.04

Authenticated via Chromium on PopOS and scp'd the files over.

Error Log

[s6-init] making user provided files available at /var/run/s6/etc...exited 0.
[s6-init] ensuring user provided files have correct perms...exited 0.
[fix-attrs.d] applying ownership & permissions fixes...
[fix-attrs.d] done.
[cont-init.d] executing container initialization scripts...
[cont-init.d] 10-adduser.sh: executing... 

Initializing container

User uid: 1000
User gid: 1000

[cont-init.d] 10-adduser.sh: exited 0.
[cont-init.d] 20-cron.sh: executing... 

Not running in cron mode

[cont-init.d] 20-cron.sh: exited 0.
[cont-init.d] done.
[services.d] starting services
crond[202]: crond (busybox 1.31.1) started, log level 6
[services.d] done.
INFO: Starting sync.sh PID 212 Mon Oct  4 21:31:53 UTC 2021
2021/10/04 21:31:53 Session Dir: /tmp/gphotos-cdp
2021/10/04 21:31:53 chrome failed to start:
Error relocating /usr/lib/libnspr4.so: gettid: symbol not found
[cmd] /app/sync.sh exited 1
[cont-finish.d] executing container finish scripts...
[cont-finish.d] done.
[s6-finish] waiting for services.
[s6-finish] sending all processes the TERM signal.
[s6-finish] sending all processes the KILL signal and exiting.
