Comments (28)
Sorry, I was waiting for the weekend. I ran the script: my input folder is at 101.82 gigabytes and my output folder comes to 101.8 gigabytes, so it looks like those MKVs are transferring! I'm deleting them anyway, since they shouldn't even be in my library, but at least the script now accounts for MKV. I'm working through the 1,641 files with the wrong date, and luckily I'm getting somewhere: some are junk I can delete, and for the most important ones I can fix the dates from the filename. Anyway, thanks for your help!
from googlephotostakeouthelper.
I think I'm getting the same error here. I read somewhere to just delete the last picture it was working on, but it still crashes.
```
Traceback (most recent call last):
  File "/usr/local/bin/google-photos-takeout-helper", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.8/dist-packages/google_photos_takeout_helper/__main__.py", line 593, in main
    for_all_files_recursive(
  File "/usr/local/lib/python3.8/dist-packages/google_photos_takeout_helper/__main__.py", line 117, in for_all_files_recursive
    file_function(file)
  File "/usr/local/lib/python3.8/dist-packages/google_photos_takeout_helper/__main__.py", line 517, in fix_metadata
    set_creation_date_from_exif(file)
  File "/usr/local/lib/python3.8/dist-packages/google_photos_takeout_helper/__main__.py", line 366, in set_creation_date_from_exif
    exif_dict = _piexif.load(str(file))
  File "/usr/local/lib/python3.8/dist-packages/piexif/_load.py", line 46, in load
    exif_dict["GPS"] = exifReader.get_ifd_dict(pointer, "GPS")
  File "/usr/local/lib/python3.8/dist-packages/piexif/_load.py", line 108, in get_ifd_dict
    tag_count = struct.unpack(self.endian_mark + "H",
struct.error: unpack requires a buffer of 2 bytes
```
Ugh... piexif is super buggy 😕
Though I don't know if I can replace it... I need to just wrap it in try-catch
...
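That wrapping could look something like this. A minimal sketch, not the project's actual code: `safe_exif_load` is a name I made up, and the idea is just to catch the `struct.error` that piexif raises on truncated EXIF blocks instead of letting it kill the whole run.

```python
import struct

def safe_exif_load(load_fn, path):
    """Call a piexif-style loader; return None instead of crashing on bad EXIF.

    load_fn would be piexif.load in practice; it is passed in here so the
    sketch stays self-contained.
    """
    try:
        return load_fn(str(path))
    except (struct.error, ValueError):
        # Corrupt/truncated EXIF - give up on it so the caller can fall
        # back to the JSON or folder-name date instead
        return None
```

`set_creation_date_from_exif` could then check for `None` and fall through to the other date sources.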
I know... nevertheless, it's weird - I thought all exif operations were already wrapped in try-catch 🤔
I will improve this when I have some time 👍
This is a little off-topic, but I'll be adopting PhotoPrism, and I see a setting for not creating ExifTool JSON files, which implies it produces these same files. Does that mean I could import a Takeout directly into PhotoPrism? Regardless, I was going to wait until this issue is fixed before adding the pictures to PhotoPrism; I was just curious how it would work with existing ExifTool files, or whether it would ignore them.
Ah, I see they have their own page on it. I also looked for any open issues about the new Google Takeout years format, but I don't see anything, so I'm going to give their import process a shot.
> ExifTool JSON files

Google's JSONs are not from ExifTool :/

> they have their own page on it

Can you link it here? I'm curious what they have

> I'm going to give their import process a shot.

Let us know how well it works 👍
```
pip install -U google-photos-takeout-helper==2.1.0b2
```
As always, try it out and let me know if it works 👍
Nope. Still fails on the images it doesn't like:
```
/usr/local/bin/google-photos-takeout-helper -i brokenimages/ -o testout/
Heeeere we go!
=====================
Fixing files metadata and creation dates...
=====================
brokenimages/IMG_7322(1).jpg
Traceback (most recent call last):
  File "/usr/local/bin/google-photos-takeout-helper", line 8, in <module>
    sys.exit(main())
  File "/home/jasontitus/.local/lib/python3.8/site-packages/google_photos_takeout_helper/__main__.py", line 593, in main
    for_all_files_recursive(
  File "/home/jasontitus/.local/lib/python3.8/site-packages/google_photos_takeout_helper/__main__.py", line 117, in for_all_files_recursive
    file_function(file)
  File "/home/jasontitus/.local/lib/python3.8/site-packages/google_photos_takeout_helper/__main__.py", line 517, in fix_metadata
    set_creation_date_from_exif(file)
  File "/home/jasontitus/.local/lib/python3.8/site-packages/google_photos_takeout_helper/__main__.py", line 366, in set_creation_date_from_exif
    exif_dict = _piexif.load(str(file))
  File "/home/jasontitus/.local/lib/python3.8/site-packages/piexif/_load.py", line 43, in load
    exif_dict["Exif"] = exifReader.get_ifd_dict(pointer, "Exif")
  File "/home/jasontitus/.local/lib/python3.8/site-packages/piexif/_load.py", line 122, in get_ifd_dict
    value_num = struct.unpack(self.endian_mark + "L",
struct.error: unpack requires a buffer of 4 bytes
```
Um, did you even update the script?
```
pip install -U google-photos-takeout-helper==2.1.0b2
```
You should at least get the "oh-oh, script crashed" message I introduced in #56.
For me, the output for your problematic image looks like this:
```
Heeeere we go!
=====================
Fixing files metadata and creation dates...
=====================
test/IMG_4661(3).jpg
Can't read file's exif!
No exif for test/IMG_4661(3).jpg
Couldn't find json for file
Last chance, coping folder meta as date...
Couldn't pull datetime from album meta
ERROR! There was literally no option to set date!!!
TODO: We should do something about this - move it to some separate folder, or write it down in another .txt file...
=====================
Coping all files to one folder...
(If you want, you can get them organized in folders based on year and month. Run with --divide-to-dates to do this)
=====================
=====================
Removing duplicates...
=====================
DONE! FREEEEEDOOOOM!!!
Final statistics:
Files copied to target folder: 1
Removed duplicates: 0
Files for which we couldn't find json: 1
```
> > ExifTool JSON files
>
> Google's JSONs are not from ExifTool :/
>
> > they have their own page on it
>
> Can you link it here? I'm curious what they have
>
> > I'm going to give their import process a shot.
>
> Let us know how well it works 👍
I'm referring to this help topic.
And ah, ok, I just assumed it was ExifTool. Anyway, I did what the help topic suggested, and it does seem to import the data, but it's still a mess of course (maybe that's because of the new structure). I'm thinking of trying this out again with your patch. Just curious though: what happens to files that can't find their JSON? I'm assuming they get left in the original directory? Would I be left with a folder containing the original Takeout, and a folder that has most of the pictures but not the ones that failed?
> what happens to files that can't find JSON?

The script tries any other way to find their creation date - from exif or the folder name - and if there is absolutely no way, it just copies the file as-is.
Although I want to change this behavior later so it copies them to a separate folder.
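That fallback order can be sketched as a simple chain (the function names here are illustrative, not the script's actual API):

```python
def find_date(file, strategies):
    """Try each date source in order (json, exif, folder name, ...);
    return the first date found, or None if every source fails."""
    for get_date in strategies:
        d = get_date(file)
        if d is not None:
            return d
    return None  # nothing worked - the file just gets copied as-is

# Hypothetical sources for illustration: the json lookup fails, exif succeeds
from_json = lambda f: None
from_exif = lambda f: "2016:12:23 18:30:24"
```

Moving the no-date files to a separate folder would then just be an `if find_date(...) is None` branch in the copy step.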
Okay, so I was able to run this successfully this time, but I noticed the output folder is 2 gigabytes smaller: 93 gigabytes vs 95 gigabytes. I don't think every file made it into the output folder. What do you think of my results?
```
DONE! FREEEEEDOOOOM!!!
Final statistics:
Files copied to target folder: 20188
Removed duplicates: 0
Files for which we couldn't find json: 2761
Files where inserting correct exif failed: 2902
 - you have full list in takeout-combined2/failed_inserting_exif.txt
Files where date was set from name of the folder: 0
 (you have full list in takeout-combined2/date_from_folder_name.txt)
```
> Removed duplicates: 0

HUH. This is either very lucky, or very weird... especially for 95GB 🤔

> What do you think of my results?

Do you maybe have Linux/Mac? You can easily do:
```
cd your/takeout/folder
du -ch **/*.json
# This prints the total size of all the json files
```
For my 4.3GB sample it was 31MB...
Please try to find how to count it on Windoza, if you have the misfortune of having it.
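For the Windows case, a cross-platform equivalent of that `du` command can be done in a few lines of standard-library Python (a sketch; run it from inside the Takeout folder):

```python
from pathlib import Path

# Sum the sizes of all .json files anywhere under the current folder
total = sum(p.stat().st_size for p in Path('.').rglob('*.json'))
print(f"{total / 1024 ** 2:.1f} MB of JSON metadata")
```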
I'm gonna be honest - I don't know, and have no good way to test, whether this script works flawlessly and copies everything... all the workarounds around duplicates etc. made it complicated... but it should...
I just had an idea - I could replace the final copy command with move - just experimentally, to see if you maybe have some weird files in weird formats that are not included in the is_photo() or is_video() functions... This wouldn't test whether de-duplicating works well, but it would be something!
> Files where inserting correct exif failed: 2902

Out of curiosity - can you tell me (just ctrl+f in the .txt file in Notepad) how many of these files were jpg's and how many were png's etc.?
Total weight is 123M. Failed PNGs: 910, failed JPGs: 447. Let me know if you want me to test anything else.
> Failed PNGs

Failed files should be moved too, hmmm.
I think our remove_duplicates function could be broken somehow... either deleting something it shouldn't, or just not logging something it legitimately deleted 🤔
Try to find out if you have any photos/videos that are not from this list:
```python
photo_formats = ['.jpg', '.jpeg', '.png', '.webp', '.bmp', '.tif', '.tiff', '.svg', '.heic']
video_formats = ['.mp4', '.gif', '.mov', '.webm', '.avi', '.wmv', '.rm', '.mpg', '.mpe', '.mpeg', '.m4v']
```
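One quick way to run that check (a sketch using the lists above and Python's standard library; run it from inside the folder you want to audit):

```python
from pathlib import Path

photo_formats = ['.jpg', '.jpeg', '.png', '.webp', '.bmp', '.tif', '.tiff', '.svg', '.heic']
video_formats = ['.mp4', '.gif', '.mov', '.webm', '.avi', '.wmv', '.rm', '.mpg', '.mpe', '.mpeg', '.m4v']
known = set(photo_formats + video_formats)

# Collect every extension that is in neither list (lowercased, since the
# script lowercases before checking); .json is excluded as it's just metadata
unknown = sorted({p.suffix.lower() for p in Path('.').rglob('*')
                  if p.is_file()
                  and p.suffix.lower() not in known
                  and p.suffix.lower() != '.json'})
print(unknown)
```

Any extension it prints would be skipped by is_photo()/is_video() and left behind.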
Sorted through using extensions, and the only slight deviations I see are some files with capitalized extensions mixed in: .HEIC .JPG .MOV .MP4
This is from the output folder.
is_photo and is_video use .lower(), so that shouldn't be a problem...
So even in Google Photos, I have a lot of pictures that lost their metadata. I switched Google accounts at some point and uploaded all my pictures without the JSON - I don't even know whether Google's upload tools take the JSONs into account. Anyway, a quarter of my library shows up under a single day in Google Photos. As soon as I download or extract these pictures, they end up with a created date of today, but the filename is right.
Is it possible to add an option to treat files whose creation date is the current day as wrong, and get the date from the filename instead? For example, IMG_20161223_183024 1.jpg has a date of June 10th, 2020 in Google Photos - which I believe was the day I uploaded it to the second account - but when I download it the date becomes today, and when I extract it from the Takeout archive it also becomes today.
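For reference, the filename-parsing half of that idea could look something like this (a sketch; the regex assumes IMG_YYYYMMDD_HHMMSS-style names like the example above):

```python
import re
from datetime import datetime

def date_from_name(name):
    """Pull a datetime out of names like 'IMG_20161223_183024 1.jpg'.
    Returns None when the name has no recognizable timestamp."""
    m = re.search(r'(\d{8})_(\d{6})', name)
    if not m:
        return None
    try:
        return datetime.strptime(m.group(1) + m.group(2), '%Y%m%d%H%M%S')
    except ValueError:
        return None  # digits were there, but they aren't a real date
```

Files where this returns `None` (e.g. Snapchat's random names) would still need another source.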
Well, not quite a quarter of my pictures - 920 to be exact - but that's still a lot I need to fix somehow. A lot of them are saved Snapchats with random filenames, but besides those, I have a lot that have the date and time in the filename.
> Is it possible to add an option to count files that have creation dates as the current day as wrong and to get the date from the filename

Huh... that is doable...
Maybe I will do this in a separate branch, just for you, because it could mess up the script (and its performance) very much, and 99% of people won't use it.
Then you can just manually git pull it.
Ok, it's up to you. I was looking at using the --divide-to-dates parameter to see what pops up in today's folder, so I can see all the problem files I've accumulated from re-uploads to Google Photos. I found examples of using exiftool to do it, but I haven't tested the commands yet, because I'm moving the archives to another system with more storage so I stop getting low-disk-space warnings.
That's not to say this will find the 2GB of data missing from the output folder, but I could always run the tool on everything except the folder containing the missing-metadata files and see if it still happens.
> using the divide-to-dates

That's a good idea! Then you can do:
```python
import os
from datetime import datetime

for f in os.listdir():
    if f[0:4] == 'IMG_':
        # IMG_YYYYMMDD_HHMMSS... -> take year/month/day from the name
        date = datetime.strptime(
            f[4:8] + ':' + f[8:10] + ':' + f[10:12],
            '%Y:%m:%d'
        )
        ts = date.timestamp()
        os.utime(f, (ts, ts))
# This is just a reference sketch. I can finish it if you don't know how to do it yourself 👍
```

> 2GB of data

Perhaps #57 fixed your problem? Try searching for more weird files

> Sorted through using extension

...but inside the input folder
Okay, still missing 2GB. I'm on my Mac now, though, so I used HoudahSpot to do a more advanced search, and these are the extensions my Takeout has:
m4v, gif, heif, jpeg, mkv, mts, mp4, png, mov, bmp
I think it's the MKV files! They weren't even supposed to be in Google Photos - they accidentally got uploaded lol.
Yay! So my script isn't fundamentally broken (maybe) 🎉!
Updated it. Try to pip install -U ... and run again (good luck with that 95GB 😅 )