Automatically exported from code.google.com/p/boar
What steps will reproduce the problem?
1. Check out a workdir
2. Make the repository unavailable (move it or rename it)
3. Execute "boar info"
What is the expected output? What do you see instead?
Boar should give us any available info on the workdir. Now boar only reports an
error because the repo is unavailable.
Original issue reported on code.google.com by [email protected]
on 25 Sep 2011 at 9:10
I notice boar spends a significant amount of time listing files and folders
(common.py:get_tree); for instance, listing just 20000 files may take > 15s.
This is a usability issue for folders with hundreds of thousands of files. I have
a folder with 400,000 files and it takes forever to complete.
My suggestion is boar should display verbose progress for some of its steps
where it potentially takes a significant amount of time to complete. I think
just counting the number of files being listed in a folder should have a
positive psychological effect, though you can't know in advance the total
number of files.
Also printing line after line like this:
Remaining: 2053 files, 14959 Mb (0.0% complete, 0.0 Mb/s)
Remaining: 2052 files, 14415 Mb (3.6% complete, 16.3 Mb/s)
...
is not very user friendly. Instead it should update the progress on a single
line, similar to this: http://pypi.python.org/pypi/progressbar/2.2
Original issue reported on code.google.com by [email protected]
on 9 Feb 2012 at 9:34
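A minimal sketch of the single-line progress display requested above, assuming a terminal that honours carriage returns. The function names format_progress and show_progress are hypothetical and not part of boar; the message format is copied from the output quoted in the report.

```python
import sys


def format_progress(files_left, mb_left, percent, rate):
    """Build one status line in the same shape as the quoted boar output."""
    return "Remaining: %d files, %d Mb (%.1f%% complete, %.1f Mb/s)" % (
        files_left, mb_left, percent, rate)


def show_progress(line, stream=sys.stdout, width=79):
    """Overwrite the current terminal line instead of printing a new one."""
    # "\r" moves the cursor back to column 0; the padding erases leftovers
    # from a previous, longer line.
    stream.write("\r" + line.ljust(width))
    stream.flush()
```

Calling show_progress(format_progress(2052, 14415, 3.6, 16.3)) repeatedly keeps the output on one line; a final "\n" should be written once the operation is done. The same call with only a running file count would cover the directory listing phase, where the total is not known in advance.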
The ci, update, and status commands currently always work on the full workdir
tree. Scanning the full tree can be time consuming. To make it easier to use
work directories containing many files, it should be possible to execute
operations on only parts of the tree. For instance, if you know that you have
only changed files in a specific directory, it serves no purpose to scan any
files outside that directory.
Original issue reported on code.google.com by [email protected]
on 9 Feb 2012 at 9:58
The option to specify files to ignore would be very useful. Currently, I need
to be able to ignore Thumbs.db in all directories, so a spec like "Thumbs.db"
should ignore any file with that filename.
Similarly, other options might be "c:\temp" to ignore an entire subdirectory,
"*.tmp" to ignore any file ending in tmp, "c:\temp\*.tmp" for a combination.
Original issue reported on code.google.com by [email protected]
on 21 Feb 2011 at 2:50
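One way the ignore specs described above could be matched is sketched below using Python's standard fnmatch module. This is an illustration under assumptions, not boar's implementation: is_ignored is a hypothetical helper, paths are taken to be relative to the workdir root, and a pattern without a slash is assumed to apply to any path component.

```python
import fnmatch


def is_ignored(path, patterns):
    """Return True if a workdir-relative path matches any ignore pattern.

    Patterns without a slash are compared to every path component, so
    "Thumbs.db" matches in all directories and "temp" covers a whole
    subdirectory. Patterns with a slash are compared to the full path.
    """
    path = path.replace("\\", "/")
    parts = path.split("/")
    for pattern in patterns:
        pattern = pattern.replace("\\", "/").rstrip("/")
        if "/" in pattern:
            # Anchored pattern: match the whole path, or a directory prefix.
            if fnmatch.fnmatchcase(path, pattern) or path.startswith(pattern + "/"):
                return True
        elif any(fnmatch.fnmatchcase(part, pattern) for part in parts):
            return True
    return False
```

Note that in fnmatch the "*" wildcard also matches "/" characters, so "temp/*.tmp" would match files in nested subdirectories of temp as well.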
What steps will reproduce the problem?
1. Import a directory with option -v -n
What is the expected output? What do you see instead?
Boar should dry run through and print out all necessary information. Instead it
printed:
UnicodeEncodeError: 'ascii' codec can't encode character u'\xa1'
in position 0: ordinal not in range(128)
and stopped.
What platform are you using? (Windows XP, Windows 7, Linux, ...)
Windows 7
What version of Python are you using?
2.6
What version of boar are you using? (Mercurial change id or daily build
date)
Boar daily 13 Feb-2011
Please provide any additional information below.
The problem is that writing to sys.stdout cannot handle non-ASCII characters. A
common fix is to use Django's smart_str.
Reference:
http://www.saltycrane.com/blog/2008/11/python-unicodeencodeerror-ascii-codec-cant-encode-character/
Original issue reported on code.google.com by [email protected]
on 24 Feb 2011 at 4:30
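As an alternative to pulling in Django, the failure reported above can be avoided with the standard codec error handlers. The sketch below is written for Python 3 (the report concerns Python 2.6, where the details differ), and printable is a hypothetical helper name.

```python
import sys


def printable(text, encoding=None):
    """Return text with characters the target encoding lacks escaped."""
    encoding = encoding or getattr(sys.stdout, "encoding", None) or "ascii"
    # "backslashreplace" turns the character U+00A1 into the four
    # characters \xa1 instead of raising UnicodeEncodeError.
    return text.encode(encoding, errors="backslashreplace").decode(encoding)
```

Passing file names through such a helper before printing keeps verbose output going on consoles with a limited encoding, at the cost of showing escape sequences instead of the real characters.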
As it stands, the blobs/ dir is subdivided into 256 folders. For use cases
involving very large datasets (i.e. 1M+ files), having directories with 3000+
files in them gets unwieldy and can affect performance. What are your thoughts
on allowing an upgrade path for /blobs/12/34/1234567890abcdef ?
This would allow for virtually any size dataset (2 subdir nesting is what you
often see in the urls of file and imagehosts that store files by hash). If you
want to allow backward compatibility, you could specify a "repo version"
property either in the main repo dir or in the session file?
Anyway, really liking boar. I need to brush up on python a bit, but I'd like to
submit patches sooner rather than later, and not just endless issues/requests :P
Original issue reported on code.google.com by [email protected]
on 21 Dec 2011 at 5:57
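The proposed two-level layout can be sketched as a simple mapping from checksum to path. The helper blob_path is hypothetical, and it assumes that the existing 256 folders are named after the first two hex digits of the checksum, which the report does not state explicitly.

```python
import posixpath


def blob_path(md5sum, levels=2):
    """Map a checksum to a nested path such as blobs/12/34/1234567890abcdef.

    levels=1 gives a single level of 256 folders; levels=2 gives
    256 * 256 = 65536 leaf folders.
    """
    prefixes = [md5sum[i * 2:i * 2 + 2] for i in range(levels)]
    return posixpath.join("blobs", *(prefixes + [md5sum]))
```

With one million blobs spread evenly, one level leaves roughly 3900 files per folder, while two levels leave about 15.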
What steps will reproduce the problem?
1. Go to any non-workdir directory
2. Execute "status" or "info" commands.
What is the expected output? What do you see instead?
There is an ugly exception. There should be a nice message explaining that
these commands cannot be used outside a workdir.
Original issue reported on code.google.com by [email protected]
on 21 Jan 2012 at 4:57
What steps will reproduce the problem?
1. Create a repository with a repo v0 version of boar (boar-daily.11-Jul-2011
or earlier)
2. Make the repository read-only by changing the permissions or burn it to a
dvd.
3. Try to check out the contents of the repository.
What is the expected output? What do you see instead?
The operation should complete normally. Instead, an exception occurs because
the repository can't be upgraded to the current format due to write protection.
A workaround is to copy the repository to a location where it can be modified
to allow the repository to be upgraded.
Original issue reported on code.google.com by [email protected]
on 8 Oct 2011 at 11:14
A diff command to be able to see the differences between revisions.
I know there has been discussion of issues with linked vs independent snapshot
states with regard to efficiency trade offs. What are your thoughts on breaking
up the single bloblist into separate components more along the lines of the git
object model?
http://book.git-scm.com/1_the_git_object_model.html
I understand that simplicity is a guiding principle of boar's design, but having
separate per directory hashed "tree" objects can retain the same simple json
format, but should allow a diff operation to run more efficiently by quickly
surfacing only those directories that are dissimilar between revisions and
allowing the script to quickly drill down to those differences. Just a thought.
Original issue reported on code.google.com by [email protected]
on 19 Dec 2011 at 7:08
What steps will reproduce the problem?
1. Install ActivePython 2.7.1.3
2. Install Boar
3. execute boar
What is the expected output? What do you see instead?
C:\>boar.bat
C:\>C:\Python27\python.exe C:\Python27\boar\boar
Traceback (most recent call last):
File "C:\Python27\boar\boar", line 35, in <module>
import workdir
File "C:\Python27\boar\workdir.py", line 40, in <module>
import dbhash
File "C:\Python27\lib\dbhash.py", line 7, in <module>
import bsddb
ImportError: No module named bsddb
What platform are you using? (Windows XP, Windows 7, Linux, ...)
windows XP
What version of Python are you using?
2.7.1.3.
What version of boar are you using? (Mercurial change id or daily build date)
boar-daily.05-Feb-2011.zip
Please provide any additional information below.
from http://docs.python.org/library/bsddb.html
Deprecated since version 2.6: The bsddb module has been deprecated for removal
in Python 3.0.
Original issue reported on code.google.com by [email protected]
on 9 Feb 2011 at 11:11
What steps will reproduce the problem?
1. Create a small test file tree containing an unreadable sub directory (chmod
a-rx)
2. Import the tree with boar.
What is the expected output? What do you see instead?
The import goes without problems. Boar should indicate in some way that not the
entire tree could be imported.
This problem is not easy to fix in python earlier than 3.0, because of the way
the os.path.walk function behaves. There is no way to handle errors in a custom
way, and the default handling is to ignore unreadable directories. Implementing
a pure python walker would be a solution, but is likely to be too slow to be
worth it.
Original issue reported on code.google.com by [email protected]
on 7 Aug 2011 at 1:43
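For comparison, os.walk (as opposed to os.path.walk) accepts an onerror callback that receives the OSError raised while listing a directory. Whether it is fast enough for boar's purposes is not something this sketch settles; list_tree is a hypothetical helper that only shows how unreadable directories could be collected and reported.

```python
import os


def list_tree(top):
    """Walk top and return (files, errors) instead of dropping failures."""
    errors = []
    files = []
    # os.walk calls onerror with the OSError raised while listing a
    # directory; with the default onerror=None such errors are ignored.
    for dirpath, dirnames, filenames in os.walk(top, onerror=errors.append):
        for name in filenames:
            files.append(os.path.join(dirpath, name))
    return files, errors
```

An import could then finish normally and print a warning for every entry in errors, so the user knows that part of the tree was not imported.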
When boar can't update a file, an unhandled exception occurs and stops the
process.
What steps will reproduce the problem?
1. Attempt to update a directory containing files with the System and Hidden
attributes.
2. Observe the exception.
What is the expected output? What do you see instead?
I expect the action taken to be configurable, (ie. Can't update Thumbs.db,
Skip/All/Ignore/Fail)
The Traceback is as follows:
Updating: photos/2009/100CANON/Thumbs.db
Traceback (most recent call last):
File "C:\Python26\boar\boar", line 498, in <module>
main()
File "C:\Python26\boar\boar", line 470, in main
cmd_update(args[1:])
File "C:\Python26\boar\boar", line 280, in cmd_update
wd.update(new_revision = options.revision)
File "C:\Python26\boar\workdir.py", line 141, in update
fetch_blob(front, b['md5sum'], target_abspath, overwrite = True)
File "C:\Python26\boar\workdir.py", line 482, in fetch_blob
f = open(target_path, "wb")
IOError: [Errno 13] Permission denied:
u'E:\\Pictures\\photos/2009/100CANON/Thumbs.db'
What platform are you using? (Windows XP, Windows 7, Linux, ...)
Windows 7 Ultimate
What version of Python are you using?
Python 2.6.5
What version of boar are you using? (Mercurial change id or daily build date)
Daily Build 2/13
Original issue reported on code.google.com by [email protected]
on 21 Feb 2011 at 3:00
At the moment, when a user runs import on a directory and nothing has changed,
the import is still committed. Is there a use case for this?
I'm not sure if this behavior is by design, or if it would perhaps be more
intuitive to display an info/warning message to the user that nothing has
changed and not create the new session or at the very least ask for
confirmation from the user.
If this is by design, I'm just wondering what the use case for it is (sessions
where nothing has changed).
Original issue reported on code.google.com by [email protected]
on 15 Jan 2012 at 12:36
Watching a cursor blink for 20+ minutes and *hoping* the program you're running
is doing something and didn't just hang is a little unsettling. It's nice to be
able to monitor (or at least get some sense) of the progress of any program
that might take a particularly long time to complete execution (as is the case
with boar when importing large directories).
Having a --progress switch, varying levels of verbosity (ie. -v, -vv, ..), or
even a --debug switch (along the lines of wget) to print some sort of progress
output to the console would be much appreciated.
Even if there's a performance hit, for those that choose to enable it, having a
(processed/total) style progress indicator can be very nice, especially on huge
imports, as otherwise there is no way to gauge how far along boar is (or that
it is in fact making progress during certain phases like the initial hash
calculation).
Thanks :)
Original issue reported on code.google.com by [email protected]
on 21 Dec 2011 at 5:15
Add a --version command to print out the current version of boar. When
reporting issues or troubleshooting problems it makes it easy to confirm the
version you are running from the command line.
Original issue reported on code.google.com by [email protected]
on 8 Jun 2011 at 10:17
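A minimal sketch of such a switch using the standard argparse module. It is an assumption that boar could adopt argparse at all (its actual option parsing is not shown on this page), and the version string below is only an example value in the style of the daily build names.

```python
import argparse

# Example value only; a real build would substitute its own identifier.
BOAR_VERSION = "boar-daily.11-Jul-2011"


def build_parser():
    """Return a parser whose --version flag prints the version and exits."""
    parser = argparse.ArgumentParser(prog="boar")
    # The built-in "version" action prints the string and exits with code 0.
    parser.add_argument("--version", action="version", version=BOAR_VERSION)
    return parser
```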
This enhancement request is similar to Issue 13, but the idea would be to be
able to strike an entire directory from the "snapshot" of a given session
without disrupting the rest of the snapshot.
The use case is similar to that of issue 13. In the case of considerable
reorganization of a media collection and particularly when a large directory is
permanently removed from the collection and there is no longer a need/desire to
maintain copies of that data, it would be nice to be able to remove a directory
from a snapshot.
It would require iterating thru the bloblists of all linked sessions and
removing any entries occurring at and below a given directory and removing
blobs that are no longer present in any of the resulting bloblists.
To guard against corruption the linked sessions can first be cloned and the
operation can be conducted on this clone to create a fork of the session.
In fact the "purge" command can simply handle the bloblist cleanup, while a
second "cleanup" command can be created to remove blobs that no longer have any
entries in the bloblist.
Given that a guiding use case for boar is managing large media collections,
being able to gracefully retire content from boar sessions/repos is important.
I may upgrade all my music to a lossless format and decide to retire my "mp3"
folder from my boar music repo for instance.
Anyway, I can't think of any obstacles that should make this too difficult to
implement, however I've been wrong before... once ;)
Regardless, I think this is an important/desirable functionality for a tool
like boar.
Original issue reported on code.google.com by [email protected]
on 10 Jun 2011 at 1:09
Currently, "list" shows all sessions, and "list session" shows the revisions
within a session.
Extending this to "list session/offset" would make /offset much more
useful (as it stands, I'm not exactly sure what functionality
session/offset provides). Is there some way to drill down to only view things
by session/offset already, or is that an option to allow for future
functionality?
Original issue reported on code.google.com by [email protected]
on 21 Dec 2011 at 11:11
What steps will reproduce the problem?
1. Create a new directory with the following files with any content:
deleted_file.txt, subdir/file.txt.
2. Import the directory as a new session named "TestSession"
3. Check out the root of "TestSession" as workdir1
4. Check out "TestSession/subdir" as workdir2
5. Delete the file workdir1/deleted_file.txt
6. Commit the changes in workdir1
7. Execute an update command in workdir2
(These steps are also implemented in the attached test script)
What is the expected output? What do you see instead?
The update should run without any error messages. Instead, a message "Deletion
failed: deleted_file.txt" is shown.
Please use labels and text to provide additional information.
Original issue reported on code.google.com by [email protected]
on 11 Apr 2011 at 7:50
Attachments:
I know you stated that boar is designed to work with bash and windows cmd
shells, so I'm not sure if you ever have occasion to use cygwin on windows. It
would be nice if boar would do something intelligent when it sees a cygwin
style path in terms of converting it to a compatible path so regardless of what
style of path (*nix, windows, or cygwin) boar behaves correctly cross platform.
Cygwin paths looks like "/cygdrive/drive_letter/path/to/file".
When making a directory a working directory I know the path value is stored in
the info file; it is these that should be correctly interpreted if one
shell/platform is used to make a directory a working directory and a different
shell is used to make a check in from that directory.
Original issue reported on code.google.com by [email protected]
on 15 Jan 2012 at 7:21
Some commands that require a workdir or a repository to work on, will not give
a help message when --help is given. Instead, an error message about missing
repository will be printed. (Also, a few commands do not accept the --help
option at all)
> What is the expected output? What do you see instead?
All commands should always print a helpful message, and nothing else, when
--help is specified.
Original issue reported on code.google.com by [email protected]
on 9 Feb 2011 at 9:40
For purposes of organization and efficiency it makes sense to create separate
sessions for directories that are more or less self contained. Particular
photoshoots/projects, different events, etc. So rather than having a session
called "pics" I find that I have sessions called personal_xmas_2012,
work_client1_images, etc.
This is fine for a smaller number of sessions, but I'm in the process of
putting more or less all of my media under revision control and the session
list quickly balloons out of control.
It would be nice to be able to append some kind of "tag" metadata to
sessions to be able to quickly filter logical session groups. Like only showing
sessions with +work, or filtering on multiple tags like +work +2012. At the
moment I'm overloading the session name to accomplish this, but it would be
much nicer to do it with a separate metadata attribute that could be easily
added to the session.json files.
On a related note, it would be nice to be able to specify a session by using a
partial session name so long as it was unambiguous. So import -v -m "minor
edits" "photos" "very_long_session_name_blah_blah_blah" could be "very_long*"
(assuming there was nothing else that started with very_long).
Original issue reported on code.google.com by [email protected]
on 15 Jan 2012 at 6:32
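Both requests above, unambiguous partial session names and tag filtering, can be sketched in a few lines. The helpers resolve_session and filter_by_tags are hypothetical, as is the idea of keeping tags as a set per session; nothing here reflects how boar stores session metadata.

```python
def resolve_session(prefix, session_names):
    """Expand a partial session name if exactly one session starts with it."""
    if prefix in session_names:
        return prefix  # an exact name always wins
    matches = sorted(n for n in session_names if n.startswith(prefix))
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError("No session starts with %r" % prefix)
    raise ValueError("Ambiguous session %r: %s" % (prefix, ", ".join(matches)))


def filter_by_tags(sessions, wanted):
    """Return the names of sessions carrying every tag in wanted.

    sessions maps a session name to its collection of tags.
    """
    wanted = set(wanted)
    return sorted(name for name, tags in sessions.items() if wanted <= set(tags))
```

For example, filter_by_tags(sessions, ["work", "2012"]) corresponds to the "+work +2012" filter suggested in the report.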
BOAR: 3-Mar-2011 daily
OS: Windows 7 x64
PYTHON: 2.7.1
When doing an initial import (in this case I'm importing a large document
repository network mapped as drive letter X) boar quits with:
Traceback (most recent call last):
File "C:\[home]\boar\boar", line 501, in <module>
return_code = main()
File "C:\[home]\boar\boar", line 458, in main
return cmd_import(args[1:])
File "C:\[home]\boar\boar", line 274, in cmd_import
log_message = log_message)
File "C:\[home]\boar\workdir.py", line 183, in checkin
self.get_changes()
File "C:\[home]\boar\workdir.py", line 353, in get_changes
filelist[f] = self.cached_md5sum(fn)
File "C:\[home]\boar\workdir.py", line 310, in cached_md5sum
stat = os.stat(abspath)
WindowsError: [Error 2] The system cannot find the file specified:
u'x:\\/JOHNDOE/
DIR1/PDFsamTMPbufferELENL1.pdf'
I'm trying to do an import on a "live" document repository. At the moment, it
seems boar quits when it finds a locked file or a file it expects disappears. I
realize there are concerns with regard to snapshot integrity, but is there any
harm in giving the user the option to ignore/skip a file/folder rather than
aborting? It seems that would be a more graceful way to handle these cases. If
a file is locked for editing, or is a tmp file that disappears during the
import, the file should be easy enough to add when it becomes available by doing
an update, correct?
Anyway, just wondering what your thoughts are on alternatives to simply aborting
a large import process given how long such a process can take.
Original issue reported on code.google.com by [email protected]
on 8 Jun 2011 at 10:48
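The skip-instead-of-abort behaviour asked for above could look roughly like the following. This is a sketch under assumptions: md5_of_files and its ignore_errors parameter are hypothetical names, and the code is not taken from boar's workdir.py.

```python
import hashlib


def md5_of_files(paths, ignore_errors=False):
    """Return ({path: md5 hex digest}, [skipped paths]).

    With ignore_errors=True, files that vanish or cannot be read are
    recorded as skipped instead of aborting the whole run.
    """
    sums, skipped = {}, []
    for path in paths:
        try:
            digest = hashlib.md5()
            with open(path, "rb") as f:
                # Read in 1 MiB blocks so large files do not fill memory.
                for block in iter(lambda: f.read(1 << 20), b""):
                    digest.update(block)
        except OSError:
            if not ignore_errors:
                raise
            skipped.append(path)
            continue
        sums[path] = digest.hexdigest()
    return sums, skipped
```

Reporting the skipped list at the end lets the user decide whether the resulting snapshot is acceptable or whether a later update should pick up the missing files.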
1. rm a dir [with subdirs] in WorkingCopyA
2. boar co
3. cd WorkingCopyB; boar update
What is the expected output?
In WorkingCopyB: same structure as WorkingCopyA
What do you see instead?
In WorkingCopyB: old empty dirs
What platform are you using? (Windows XP, Windows 7, Linux, ...)
Ubuntu natty
What version of Python are you using?
2.7
What version of boar are you using? (Mercurial change id or daily build
date)
BOAR_VERSION = "boar-daily.11-Jul-2011"
It's a nuisance to do it myself, because I always have to check whether everything
really is empty before deleting. I'd like to see boar handle this correctly.
Great tool!
Original issue reported on code.google.com by [email protected]
on 7 Oct 2011 at 3:49
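Until update handles this itself, the manual cleanup described above can be scripted. The sketch below removes only directories that are completely empty, working from the deepest level upwards; remove_empty_dirs is a hypothetical helper, and it does not know about any metadata boar may keep inside a workdir.

```python
import os


def remove_empty_dirs(top):
    """Delete directories under top that contain nothing, deepest first."""
    removed = []
    # topdown=False yields children before parents, so a directory that
    # only held empty subdirectories is itself empty when we reach it.
    for dirpath, dirnames, filenames in os.walk(top, topdown=False):
        if dirpath == top:
            continue  # never remove the workdir root itself
        if not os.listdir(dirpath):
            os.rmdir(dirpath)
            removed.append(dirpath)
    return removed
```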
Boar should be able to find identical blocks between files and only store them
once. Boar does currently only perform file level deduplication. That is,
identical files are only stored once in the repository, regardless of filename
or session.
This feature will reduce overhead when performing small changes to large files,
such as editing EXIF data in an image. Also, it will make it feasible to
version control large data files with frequent small changes, such as virtual
machine images.
Original issue reported on code.google.com by [email protected]
on 22 Mar 2011 at 11:48
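To illustrate the idea of block-level deduplication, here is a deliberately simple sketch using fixed-size blocks. It is not a description of how boar implements or plans to implement the feature; store_blocks, rebuild, and the 4096-byte block size are all assumptions made for the example.

```python
import hashlib


def store_blocks(data, store, block_size=4096):
    """Split data into fixed-size blocks and store each distinct block once.

    store maps a block's md5 hex digest to its bytes. The returned
    recipe is the list of digests needed to rebuild data.
    """
    recipe = []
    for start in range(0, len(data), block_size):
        block = data[start:start + block_size]
        key = hashlib.md5(block).hexdigest()
        store.setdefault(key, block)  # keep only the first copy
        recipe.append(key)
    return recipe


def rebuild(recipe, store):
    """Concatenate stored blocks back into the original bytes."""
    return b"".join(store[key] for key in recipe)
```

Fixed-size blocks only find duplicates that stay aligned to block boundaries. An edit that inserts or removes bytes shifts everything after it, so a practical implementation would more likely use content-defined boundaries, for example derived from a rolling checksum.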
Using /usr/bin/env python should be more platform independent, because python can
then be in other places, like the user's home directory.
Original issue reported on code.google.com by [email protected]
on 10 Mar 2011 at 9:20
Attachments:
What steps will reproduce the problem?
1. Import 2 directories where one directory's name is a prefix of the other's
2. Modify something in the longer-named directory
3. Go to the short-named directory and update
Run issue.sh to simulate this situation
What is the expected output? What do you see instead?
Boar should update normally. Instead it printed:
AssertionError: Offset was: workdir Path was: workdir_longer/file
What platform are you using? (Windows XP, Windows 7, Linux, ...)
Ubuntu 10
What version of Python are you using?
Python 2.6
What version of boar are you using? (Mercurial change id or daily build
date)
70b62d23db
Please provide any additional information below.
My attempt: issue9.patch
Original issue reported on code.google.com by [email protected]
on 11 Mar 2011 at 1:46
Attachments:
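The assertion message above suggests that a plain string-prefix test is being used to decide whether a path belongs to an offset; that is a guess, since the relevant boar code is not shown here. Comparing whole path components avoids the problem. The helper is_under is hypothetical and assumes "/"-separated session paths.

```python
def is_under(offset, path):
    """Return True if a session path lies at or below the directory offset.

    Comparing whole path components avoids the trap where "workdir" is
    a string prefix of "workdir_longer/file".
    """
    offset_parts = [p for p in offset.split("/") if p]
    path_parts = [p for p in path.split("/") if p]
    return path_parts[:len(offset_parts)] == offset_parts
```

By contrast, "workdir_longer/file".startswith("workdir") is True, which is exactly the false positive described in the steps above.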
boarmount now checks for sufficient arguments. Otherwise an ugly python
exception is shown, because sys.argv[1:3] barfs if there aren't enough
arguments.
Original issue reported on code.google.com by [email protected]
on 10 Mar 2011 at 10:34
Attachments:
What steps will reproduce the problem?
1. Update a directory with files that have special unicode character in their
names
What is the expected output? What do you see instead?
Boar should update normally. Instead we see UnicodeEncodeError
What platform are you using? (Windows XP, Windows 7, Linux, ...)
Windows 7. I tested on my Ubuntu and this doesn't occur; it seems that the
default locale in my Ubuntu is some kind of unicode instead of ascii.
What version of Python are you using?
2.6
What version of boar are you using? (Mercurial change id or daily build
date)
403cbb6bc635
Please provide any additional information below.
Original issue reported on code.google.com by [email protected]
on 14 Apr 2011 at 2:01
This is essentially just looking in the filename for the hash and seeing if the
hash and the computed hash match. Attached is a link to a python script that
checks crc32s after finding them in the filename.
http://agafix.org/anime-crc32-checksum-in-linux-v20/
Original issue reported on code.google.com by [email protected]
on 4 Mar 2011 at 9:08
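The check described above can be sketched with the standard re and zlib modules. The bracketed eight-hex-digit convention, as in "clip [CBF43926].bin", is an assumption about how the checksum appears in file names, and both helper names are hypothetical.

```python
import re
import zlib

# Eight hex digits in square brackets, e.g. "clip [CBF43926].bin".
CRC_IN_NAME = re.compile(r"\[([0-9A-Fa-f]{8})\]")


def crc_from_name(filename):
    """Return the CRC32 embedded in filename as an int, or None."""
    match = CRC_IN_NAME.search(filename)
    return int(match.group(1), 16) if match else None


def name_matches_content(filename, data):
    """Compare the CRC32 in the file name with the CRC32 of data.

    Returns None when the name carries no checksum.
    """
    expected = crc_from_name(filename)
    if expected is None:
        return None
    # Mask to 32 bits so the result is unsigned on every Python version.
    return (zlib.crc32(data) & 0xFFFFFFFF) == expected
```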
What steps will reproduce the problem?
1. execute the command "boar --repo=/whatever"
What is the expected output? What do you see instead?
There should be a helpful message telling that you need to issue a command word
as well. Instead, there is an ugly exception.
Original issue reported on code.google.com by [email protected]
on 2 Aug 2011 at 10:02
What steps will reproduce the problem?
1. Create a new session.
2. Check out the new session to two workdirs, workdir1 and workdir2.
3. Add some large files to workdir1 so that it will take a long time to commit.
4. Start committing the changes to workdir1.
5. Add some small files to workdir2 so that it will commit quickly.
6. Commit the changes to workdir2 before the workdir1 commit has completed.
What is the expected output? What do you see instead?
Both commits will succeed. The slow commit will be the last to finish, and will
be the latest revision of the session, hiding the changes committed in workdir2.
The commit to workdir2 should fail with an error message "This session is
currently being updated. Try again later." (Or, the commit could simply wait
until the other commit is finished, but the likely result then will also be an
error, "Your workdir is out of date. You need to update your workdir before you
can commit")
Original issue reported on code.google.com by [email protected]
on 22 Mar 2011 at 11:35
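One common way to get the "session is currently being updated" behaviour is an exclusive lock file. The sketch below is an illustration only: the lock file location, the SessionBusy exception, and both function names are assumptions, and nothing here is taken from boar's repository code.

```python
import os


class SessionBusy(Exception):
    """Raised when another commit already holds the session lock."""


def acquire_lock(lock_path):
    """Create lock_path atomically; fail if another commit holds it."""
    try:
        # O_CREAT | O_EXCL fails if the file already exists, so only one
        # process can succeed in creating it.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise SessionBusy(
            "This session is currently being updated. Try again later.")
    # Record the owner to help with diagnosing stale locks.
    os.write(fd, str(os.getpid()).encode("ascii"))
    os.close(fd)


def release_lock(lock_path):
    """Remove the lock so that other commits may proceed."""
    os.remove(lock_path)
```

A lock file left behind by a killed process would block later commits, so a real implementation also needs a way to detect and clear stale locks.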
I was just wondering if there is any purpose to the initial revision for any
session being empty. Is that just an artifact of the fact that when you
"create" a new session it is necessarily empty, or is there any use case or
technical reason why having an empty revision is a good idea?
Original issue reported on code.google.com by [email protected]
on 13 Jun 2011 at 7:26
What steps will reproduce the problem?
1. Create a repository on a FAT32 drive
2. Import a directory with a file with size > 4GB
What is the expected output? What do you see instead?
There is an error message. The operation should complete normally.
Due to how boar stores files internally, file size limits imposed by the file
system also affect boar. FAT32 is unfortunately still quite common as the
default file system on portable media such as USB memory sticks and external
HDDs. Boar should split large files so that a repository can always be stored
on FAT32. FAT32 max file counts and filename limits are not expected to cause
any problems.
There are possibly other file systems that should be supported by boar, but
this issue covers only FAT32.
Original issue reported on code.google.com by [email protected]
on 15 Aug 2011 at 2:03
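The splitting itself is simple arithmetic. The sketch below computes the byte ranges for the parts of a large file, using the FAT32 limit of 4 GiB minus one byte per file; split_ranges is a hypothetical helper and says nothing about how boar would name or track the parts.

```python
FAT32_MAX_FILE_SIZE = 2 ** 32 - 1  # largest file FAT32 can hold, in bytes


def split_ranges(total_size, max_part=FAT32_MAX_FILE_SIZE):
    """Return (offset, length) pairs covering total_size bytes in order."""
    if max_part <= 0:
        raise ValueError("max_part must be positive")
    ranges = []
    offset = 0
    while offset < total_size:
        length = min(max_part, total_size - offset)
        ranges.append((offset, length))
        offset += length
    return ranges
```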
I primarily use boar in a windows 7 environment, however the lack of support
for tracking symlinks/junctions makes it very difficult to use as a "backup"
solution.
The issue I have is that by not at least storing metadata about the existence
of junctions in the bloblist, information about the directory structure you are
ostensibly checking in is irrevocably lost.
You want to know that your data is safe when using any version control system.
Symlinks and junctions *are* important. They're usually there for a reason, and
if they're missing, things can break in nonobvious ways. By not having any
means to at least track or log their existence, boar is losing potentially
important data. When restoring a session that included symlinks or junctions
the user will not have any indication what files are missing or any place to
look for hints about recreating them.
Even if boar doesn't allow you to recreate them fully (due to filesystem or
permissions issues) perhaps simply providing a hint/placeholder file or
including an entry in the bloblist with symlink and reparse point (name ->
target) information would allow the user to recreate these structures on
whatever filesystem they are using with a fairly trivial script using mklink or
ln (or what have you).
Also, does boar store information about empty directories? Again, this comes
down to the issue of "losing" information. I realize boar is aimed at storing
binary data, but as a practical matter the directory structures (i.e. the
"where") in which blobs are stored matter. Even if a directory is empty, the
name or location of a directory can provide important information that is
otherwise irrevocably lost if not stored somehow.
Philosophically, I feel backup and especially archival tools should err on the
side of caution when it comes to being able to faithfully restore the data they
are entrusted with. If a user has a file in a directory the safest assumption
is because they want it there and as such any VCS that deals with files should
provide a mechanism for restoring or at least capturing information about it.
Any thoughts on this (even simply having an entry in the bloblist about
directories and symlinks/junctions)?
Original issue reported on code.google.com by [email protected]
on 19 Dec 2011 at 5:46
Definitely a low priority "convenience" feature... but tab completion for
commands and sessions would be great. Would be especially nice for sessions
with subpath/offset components :)
In a related vein, a --color=WHEN style switch for shells that support color
would be great too as far as the someday maybe features go ;)
Original issue reported on code.google.com by [email protected]
on 21 Dec 2011 at 5:29
The --ignore-errors switch is great for the import command. Given that ci is
simply a shortcut to perform an import for working dirs, the same
--ignore-errors switch would be desirable there for the same reason as it is on
import (having a large commit abort after processing for a long time, instead of
skipping locked/modified files, is annoying).
Thanks and hope you enjoy your holidays :)
Original issue reported on code.google.com by [email protected]
on 21 Dec 2011 at 10:52
Currently it lists all files which makes it hard to manage the repository.
I think it should be like svn list, which will help tremendously.
Original issue reported on code.google.com by [email protected]
on 2 Mar 2011 at 6:33
boarmount now checks for sufficient arguments. Otherwise an ugly python
exception is shown, because sys.argv[1:3] barfs if there aren't enough
arguments.
Original issue reported on code.google.com by [email protected]
on 10 Mar 2011 at 11:20
Attachments:
This might just be a matter of me having trouble with the appropriate syntax to
accomplish what I'd like, but ultimately, I'd like to be able to say "ignore
all files *except* .xyz".
For instance, if I wanted to only import video files or image files from.
The only solution I was able to find about doing this in svn (as you mentioned
it uses similar syntax) was here: http://www.thoughtspark.org/node/38
The syntax seemed rather clunky. Is there any chance you could perhaps
implement a "+/-" syntax for inclusion / exclusion of a file mask. The idea
being someone could type:
-*
+*.xyz
The masks get applied in sequential order, so first you "exclude" everything and then
you "include" files ending in ".xyz". This is just an example of a way the
inclusion/exclusion filter might work (this is how HTTRACK implements file
masks, but again, it's just one possible way to implement it).
Thanks
Original issue reported on code.google.com by [email protected]
on 23 Jun 2011 at 2:53
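The sequential "+/-" evaluation described above can be sketched as follows, with the last matching mask deciding the outcome. This is one possible reading of the proposal rather than an existing boar feature; is_included and its default parameter are hypothetical.

```python
import fnmatch


def is_included(path, masks, default=True):
    """Apply "+mask" / "-mask" rules in order; the last match decides."""
    included = default
    for mask in masks:
        sign, pattern = mask[:1], mask[1:]
        if sign not in ("+", "-"):
            raise ValueError("Mask must start with + or -: %r" % mask)
        if fnmatch.fnmatchcase(path, pattern):
            included = (sign == "+")
    return included
```

With masks = ["-*", "+*.xyz"], every file is excluded first and files ending in ".xyz" are then included again, which gives the "ignore all files except .xyz" behaviour.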
The clone command will update an older copy of a repository with all the
changes in a more recent version. It would be useful to be able to make boar
automatically keep a clone updated.
The boar "clone" command shall take a new flag, "--continuous" which shall make
boar simply repeat the clone operation every n seconds.
Original issue reported on code.google.com by [email protected]
on 18 Aug 2011 at 2:21
When an import or checkin is taking place, the incomplete data is stored under
tmp/ in the repository directory. If the operation is aborted, the data stays
in the tmp directory indefinitely. There should be a "cleanup" command to
delete old files from that directory.
A manual workaround is to make sure that no boar command is currently running,
and then simply delete the contents of the tmp directory.
Automatic cleanup might be implemented in the future, but this item only covers
an explicit cleanup command.
Original issue reported on code.google.com by [email protected]
on 1 Sep 2011 at 8:08
Unlike normal revision control software that deals mostly with text files,
boar is designed for large multimedia files or other types of binary files such
as MSOffice documents and other proprietary formats. While we do want to keep
some revisions (e.g. for important documents), we normally don't want to keep
ALL revisions, especially when major reorganization of repositories occurs. For
instance, we may want to move a large directory, say 100 GB, out of the
repository, and if we're almost certain that we don't ever want it back,
there's no point in keeping that directory in the history of the repository.
So with this reasoning, I propose a purge command that removes all history from
a certain revision and before. Boar should be able to make it as if a certain
revision in the middle is the initial import.
Original issue reported on code.google.com by [email protected]
on 13 Mar 2011 at 9:22
In function check_in_file in file workdir.py (line 464) all occurrences of the
"path" variable should be replaced with "abspath" or "sessionpath".
Original issue reported on code.google.com by [email protected]
on 1 Jan 2012 at 7:30
What steps will reproduce the problem?
1. Check out a non-existing subdir from a session, like "boar co
MySession/this_dir_does_not_exist"
What is the expected output? What do you see instead?
The operation completes without an error and an empty workdir is created. This
is confusing if the user intended to check out an existing directory.
This is an accidental feature. It is a very convenient way to create a new
directory in a session. But the behaviour is likely to confuse a user who
simply mistyped the name of the directory. There should be a notification
message explaining that a new directory will be created if you commit any
changes in this workdir.
Original issue reported on code.google.com by [email protected]
on 5 Feb 2012 at 10:55
Am I right that you want to do EXACTLY the same thing that is already possible
by using:
1. rdiff-backup
2. rdiff-backup-fs
?
If not, I have an issue for you: the description on the project homepage is so
limited that I can't judge whether I'm interested or not.
If you would like to do the same as rdiff-backup, but with incremental storage
space optimised using object-specific techniques, then this project is
VERY interesting. If not, this is simply a backup utility clone.
So, I would like an answer that is at least descriptive about plans for
the future, if you want help developing such a thing.
Original issue reported on code.google.com by [email protected]
on 26 May 2011 at 11:19
Boar uses the 128 bit md5 checksum algorithm. The odds against an accidental
collision (two different files having the same checksum) are truly astronomical
(if you have 10 000 000 000 files, the risk of at least one collision is about
10^-19). However, md5 does have weaknesses that make it possible to
construct collisions intentionally. This feature is therefore mostly a security
issue, since accidental collisions are rare enough.
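The 10^-19 figure can be checked with the usual birthday-bound approximation, p ≈ n(n-1) / (2 · 2^128). The snippet below is only a sanity check of that arithmetic; the function name `collision_probability` is made up for this illustration and is not part of boar.

```python
from fractions import Fraction


def collision_probability(num_files, hash_bits=128):
    """Birthday-bound approximation of the chance of at least one
    accidental collision among num_files uniformly random hashes."""
    pairs = num_files * (num_files - 1) // 2  # number of distinct file pairs
    return Fraction(pairs, 2 ** hash_bits)


if __name__ == "__main__":
    p = collision_probability(10 ** 10)
    # Prints a value on the order of 1e-19.
    print(float(p))
```

The approximation only holds while the result is much smaller than 1, which is clearly the case here.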
Collisions will cause problems, as boar currently assumes that files with the
same md5 checksum are always identical. Most likely, one of the files causing
the collision will be lost.
Boar should prevent such problems by storing an alternative checksum (maybe
some variant of SHA) for every stored file, and use this to make sure that
files with the same md5 checksum are truly identical. There will be no attempt
at making the boar repository store md5 collisions. If a collision is found
during an import or checkin, boar will abort the operation and print an error
message.
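A minimal sketch of the proposed check, assuming SHA-256 as the "variant of SHA" and an in-memory dictionary in place of the real repository index. `BlobIndex` and `ChecksumCollision` are hypothetical names used only for this illustration; they are not boar's actual classes.

```python
import hashlib


class ChecksumCollision(Exception):
    """Raised when two different contents share the same md5 checksum."""


class BlobIndex(object):
    """Hypothetical index mapping md5 -> sha256 for every stored blob."""

    def __init__(self):
        self._sha_by_md5 = {}

    def add(self, data):
        md5sum = hashlib.md5(data).hexdigest()
        sha = hashlib.sha256(data).hexdigest()
        # First time we see this md5: remember its secondary checksum.
        known_sha = self._sha_by_md5.setdefault(md5sum, sha)
        if known_sha != sha:
            # Same md5 but different sha256: abort instead of silently
            # treating the new content as identical to the stored one.
            raise ChecksumCollision("md5 collision detected for " + md5sum)
        return md5sum
```

Adding the same content twice returns the same md5 without error; an exception is raised only when the secondary checksum disagrees, which matches the "abort and print an error message" behaviour described above.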
Original issue reported on code.google.com by [email protected]
on 23 Mar 2011 at 8:34
What steps will reproduce the problem?
1. Create a new repo
2. Start some large writing operation, like importing a huge directory
3. Kill the boar process in some way that will not allow normal cleanup
procedures (kill -9, or yank the drive)
4. Execute the same operation again (or any other writing operation)
What is the expected output? What do you see instead?
The operation should resume normally. Instead, the following exception is
thrown:
Traceback (most recent call last):
File "C:\Python26\boar\boar", line 584, in <module>
return_code = main()
File "C:\Python26\boar\boar", line 563, in main
return cmd_clone(args[1:])
File "C:\Python26\boar\boar", line 446, in cmd_clone
repo2 = repository.Repo(repopath2)
File "C:\Python26\boar\blobrepo\repository.py", line 117, in __init__
self.repo_mutex.lock_with_timeout(60)
File "C:\Python26\boar\common.py", line 317, in lock_with_timeout
except MutexLocked:
NameError: global name 'MutexLocked' is not defined
A workaround is to make sure that no other process is accessing the repository,
and then remove all the mutex-* files in the tmp directory in the repo.
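The workaround could be scripted roughly as follows. This is a sketch under the assumptions stated in the report (lock files named mutex-* directly inside the repository's tmp directory); `remove_stale_mutexes` is a hypothetical helper, not a boar command, and it must only be run when no other process is using the repository.

```python
import os


def remove_stale_mutexes(repo_path):
    """Delete mutex-* files in <repo_path>/tmp and return the removed paths.

    Only safe when no other process is accessing the repository.
    """
    tmp_dir = os.path.join(repo_path, "tmp")
    removed = []
    for name in sorted(os.listdir(tmp_dir)):
        path = os.path.join(tmp_dir, name)
        # Leave everything that is not a mutex file untouched.
        if name.startswith("mutex-") and os.path.isfile(path):
            os.remove(path)
            removed.append(path)
    return removed
```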
Original issue reported on code.google.com by [email protected]
on 20 Jul 2011 at 8:46
When checking in a directory if boar finds .boar "working directory" metadata
folders it should intelligently leverage these to greatly increase the
efficiency of check ins of a parent directory that includes directories already
under version control within it.
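As a rough illustration of the first step only, nested work directories could be located by looking for .boar folders below the directory being checked in. The function `find_nested_workdirs` is hypothetical, and the sketch says nothing about how the metadata inside .boar would actually be reused.

```python
import os


def find_nested_workdirs(root):
    """Return directories below root (excluding root itself) that
    contain a .boar metadata folder."""
    root = os.path.abspath(root)
    found = []
    for dirpath, dirnames, _filenames in os.walk(root):
        if ".boar" in dirnames:
            # Do not descend into the metadata folder itself.
            dirnames.remove(".boar")
            if dirpath != root:
                found.append(dirpath)
    return sorted(found)
```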
== Reasoning ==
Given the way the logging/reporting features currently work (revision count,
etc) it seems to make the most sense to set up sessions on a "per project"
basis. Since file changes are usually localized on a per project basis it makes
sense to be able to "check in" changes to a project (ie work_john.doe_website,
rev 3) rather than checking in a giant global folder every time (ie websites,
rev 2042). When using a global session/folder that includes many subprojects
(like a folder called pictures, or websites, etc) it quickly becomes unwieldy
to try to track down revisions for a specific project since there is a lot of
background noise for other projects. That being said, it *is* nice to be able
to have a "global" view of such folders sometimes. Even if you have subfolders
in your pics folder or your website folder, etc to correspond to specific
subprojects, sometimes there are files that are *only* in the root of the pics
folder or are not specific to any particular project, or furthermore, if you
ever reorganize the structure of the pics/websites/etc folder it's nice to be
able to have it under version control so you can return to the old layout if
you decide you don't like the new one.
With all of this in mind, I realize that the "offset" function is designed to
allow the check-in of subfolders, but commits/etc are all still mashed
together under a single session, there is no way to filter on a "per
project/subfolder" basis atm. Also, it's nice to have a project exist as an
atomic work dir that you can freely move and plop down wherever you want on
your filesystem and be able to continue making check-ins (since paths are based
at the work dir level) without worrying that it is no longer an "offset" from
the master pics/websites/etc folder if you decide you want to move it to some new
parent directory.
So, ultimately, per project sessions are just fine, but they don't obviate the
usefulness of sessions to track much larger folders that are a superset of many
smaller sessions (and which also include files that *don't* exist in any other
session).
This brings me back around to the original feature request. If you just spent
several days checking in a bunch of sessions/directories to correspond with all
your photoshoots, or video projects, or whatever project category you happened
to use to draw session boundaries with, and at the end of it all you want to
create a "master" snapshot of the entire "photos" folder to catch/track any
stray files that aren't included in existing sessions (and to track the global
layout of your files within that folder), then it should *not* take another
several days to import directories for which there are already bloblist/cache
files (defeats the point of precomputing such things if they aren't used later).
Original issue reported on code.google.com by [email protected]
on 20 Jan 2012 at 2:11
Hi,
for my backups it is essential that I can store them encrypted.
Do you see any possibility of implementing that in the future?
Jan
Original issue reported on code.google.com by [email protected]
on 26 Jun 2011 at 10:18
What steps will reproduce the problem?
1. Import a directory with spaces in its name, e.g
boar import "a directory" "session/a directory"
where session is the session's name
2. Checkout the directory
boar co "session/a directory"
(assuming the repo option was set correctly)
What is the expected output? What do you see instead?
It should create a new directory "session/a directory"
Instead it outputs:
AssertionError: Offset was: a Path was: a directory/....
It seems that it fails on the first file.
What platform are you using? (Windows XP, Windows 7, Linux, ...)
Ubuntu 9.04
What version of Python are you using?
2.6.2
What version of boar are you using? (Mercurial change id or daily build
date)
boardaily 27-feb-2011
Please provide any additional information below.
If I try
boar import "a directory" "session/a directory" a_dir_name
it outputs: ERROR: too many arguments
Original issue reported on code.google.com by [email protected]
on 2 Mar 2011 at 6:22