What version of the package or command are you using?
v3.1.1
What are you trying to do?
Trying to Extract and/or Unarchive files which are hardlinked to other files in a tarball.
What steps did you take?
tar.Unarchive(archivePath, destDir string)
tar.Extract(archivePath, target, destDir string)
What did you expect to happen, and what actually happened instead?
I expected a file which is a hardlink to another file in a tarball to be unarchived successfully. However, untarFile makes some bad assumptions about the value of hdr.Linkname, so calls to Unarchive and Extract always fail for such files. The following is a basic example of the original behavior.
Given a tarball with the following tree:
tarball.tar
dir-1/
-- dir-2/
---- file-a
---- file-b (hardlinked to file-a)
Now we unarchive:
toDir, _ := ioutil.TempDir("", "archiver")
err := archiver.DefaultTar.Unarchive("tarball.tar", toDir)
if err != nil {
	panic(err)
}
And get an error similar to the following:
panic: reading file in tar archive: /tmp/archiver798762007/dir-1/dir-2/file-b: making hard link for: link /tmp/archiver798762007/dir-1/dir-2/file-b/dir-1/dir-2/file-a /tmp/archiver798762007/dir-1/dir-2/file-b: no such file or directory
What this error is saying is that an attempt was made to create a hardlink from the new file /tmp/archiver548873541/dir-1/dir-2/file-b to the existing file /tmp/archiver548873541/dir-1/dir-2/file-b/dir-1/dir-2/file-a. Observe that /tmp/archiver548873541/dir-1/dir-2/file-b/dir-1/dir-2/file-a definitely does not exist as a file, though /tmp/archiver548873541/dir-1/dir-2/file-a does.
The first part of the bug is that untarFile joins the values of to (which is "/tmp/archiver548873541/dir-1/dir-2/file-b") and hdr.Linkname (which is "dir-1/dir-2/file-a") to form a new, incorrect path. To get the correct path to file-a, the destination path provided to Unarchive should be joined with hdr.Linkname, at least for tar (GNU tar) 1.30; this might be different with other implementations of tar.
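The corrected join for the Unarchive case can be sketched as follows. This is only an illustration: hardlinkSource is a hypothetical helper name, not a function in archiver, and it assumes hdr.Linkname is relative to the archive root, as observed with GNU tar 1.30.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// hardlinkSource is a hypothetical helper (not part of archiver).
// It returns the on-disk path of the file that a hard-link entry points at,
// assuming linkname (hdr.Linkname) is relative to the root of the archive
// and destination is the directory passed to Unarchive.
func hardlinkSource(destination, linkname string) string {
	// Tar headers use forward slashes; convert before joining.
	return filepath.Join(destination, filepath.FromSlash(linkname))
}

func main() {
	// Joins the destination directory with the link name, rather than
	// joining the new file's own path ("to") with the link name.
	fmt.Println(hardlinkSource("/tmp/archiver548873541", "dir-1/dir-2/file-a"))
}
```

On a Unix-like system this prints /tmp/archiver548873541/dir-1/dir-2/file-a, which is the file that exists after extraction.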
While this "should" fix issues with calls to Unarchive, calls to Extract a directory will not immediately work. For example, given the same tarball.tar, if instead we have the following code:
toDir, _ := ioutil.TempDir("", "archiver")
err := archiver.DefaultTar.Extract("tarball.tar", "dir-1/dir-2", toDir)
if err != nil {
	panic(err)
}
The resulting panic is pretty familiar:
panic: walking file-b: extracting file dir-1/dir-2/file-b: /tmp/archiver851272160/dir-2/file-b: making hard link for: link /tmp/archiver851272160/dir-2/file-b/dir-1/dir-2/file-a /tmp/archiver851272160/dir-2/file-b: no such file or directory
Notice that the path to the new file /tmp/archiver851272160/dir-2/file-b is correct. Also notice, however, that the path to the source of the link (file-a) is the same as with the Unarchive call. So, the fix here is to create the correct path to file-a by joining the paths:
- destination provided to Extract
- the value of path.Base(target), given that target was provided to Extract
- path.Base(hdr.Linkname) in untarFile
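The three-part join listed above can be sketched like this. extractedHardlinkSource is a hypothetical name used only for illustration, and the sketch assumes the link's source sits directly inside the extracted target directory; a link to a file in a deeper subdirectory of target would need more than path.Base(hdr.Linkname).

```go
package main

import (
	"fmt"
	"path"
	"path/filepath"
)

// extractedHardlinkSource is a hypothetical helper (not part of archiver).
// It joins the three parts named in the issue: the destination passed to
// Extract, the base name of target, and the base name of hdr.Linkname.
// Assumption: the linked-to file lives directly inside target.
func extractedHardlinkSource(destination, target, linkname string) string {
	// path.Base is used for the archive-side names because tar paths
	// always use forward slashes, regardless of the host OS.
	return filepath.Join(destination, path.Base(target), path.Base(linkname))
}

func main() {
	fmt.Println(extractedHardlinkSource(
		"/tmp/archiver851272160", // destination provided to Extract
		"dir-1/dir-2",            // target provided to Extract
		"dir-1/dir-2/file-a",     // hdr.Linkname seen in untarFile
	))
}
```

On a Unix-like system this prints /tmp/archiver851272160/dir-2/file-a, the sibling of the new file /tmp/archiver851272160/dir-2/file-b.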
How do you think this should be fixed?
- Write unit tests around this behavior, first. Having this already would have prevented any issues to begin with.
- Create multiple tarballs with different tar variants (BSD, GNU, etc.) and put files in these tarballs that are hardlinked to other files in the same tarball. Place these files at different directory depths within the tarball. Add these tarballs to the tests.
- See notes in previous section about the nature of the exact problem and how to fix.
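As a starting point for such tests, a fixture with a hard-link entry can be generated in memory with Go's standard archive/tar package. This is a sketch only: buildHardlinkTar and hardlinks are hypothetical helper names, and a tarball written by archive/tar is just one more variant, not a substitute for fixtures produced by GNU or BSD tar.

```go
package main

import (
	"archive/tar"
	"bytes"
	"fmt"
	"io"
)

// buildHardlinkTar (hypothetical helper) writes an in-memory tarball with the
// layout from the issue: dir-1/dir-2/file-a plus file-b hard-linked to file-a.
func buildHardlinkTar() ([]byte, error) {
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)
	content := []byte("hello")
	entries := []*tar.Header{
		{Name: "dir-1/", Typeflag: tar.TypeDir, Mode: 0755},
		{Name: "dir-1/dir-2/", Typeflag: tar.TypeDir, Mode: 0755},
		{Name: "dir-1/dir-2/file-a", Typeflag: tar.TypeReg, Mode: 0644, Size: int64(len(content))},
		// A hard-link entry carries no data; Linkname names the existing entry.
		{Name: "dir-1/dir-2/file-b", Typeflag: tar.TypeLink, Mode: 0644, Linkname: "dir-1/dir-2/file-a"},
	}
	for _, hdr := range entries {
		if err := tw.WriteHeader(hdr); err != nil {
			return nil, err
		}
		if hdr.Typeflag == tar.TypeReg {
			if _, err := tw.Write(content); err != nil {
				return nil, err
			}
		}
	}
	if err := tw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// hardlinks (hypothetical helper) maps each hard-link entry name to its Linkname.
func hardlinks(data []byte) (map[string]string, error) {
	links := map[string]string{}
	tr := tar.NewReader(bytes.NewReader(data))
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			return links, nil
		}
		if err != nil {
			return nil, err
		}
		if hdr.Typeflag == tar.TypeLink {
			links[hdr.Name] = hdr.Linkname
		}
	}
}

func main() {
	data, err := buildHardlinkTar()
	if err != nil {
		panic(err)
	}
	links, err := hardlinks(data)
	if err != nil {
		panic(err)
	}
	fmt.Println(links["dir-1/dir-2/file-b"])
}
```

A real test would write these bytes to a temporary file, call Unarchive or Extract on it, and then check that file-b exists in the destination and refers to the same file as file-a.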
A final note: these changes should fix unarchiving hardlinks from tar files, but no tests currently exist for creating tar files with hardlinks. It's very possible that functionality is also broken.