fangfufu / httpdirfs

A filesystem which allows you to mount HTTP directory listings, with a permanent cache. Now with Airsonic / Subsonic support!

License: Other

Makefile 15.26% C 60.23% Shell 24.25% M4 0.26%
airsonic filesystem filesystem-utils funkwhale fuse-filesystem http http-client https libcurl libfuse mount subsonic subsonic-client

httpdirfs's People

Contributors

0mp, bikemike, chrysn, cyberjunky, edenist, fangfufu, hiliev, jcharaoui, jikamens, kianmeng, lk-me, mattiasrunge, moschlar, nwf-msr, vorlonofportland


httpdirfs's Issues

Crashes with "API function called from within callback"

I tagged 1.1.8 and readied a package for upload. Before sending the package to the archive, I always test it. After a few minutes of streaming an MKV movie file, I get this crash:

transfer_blocking(): 8, API function called from within callback

This happened 3 times back to back with different video files.

The mount was configured with default settings (no cache).

Todo #2

This is the long-term plan for HTTPDirFS:

  • Improve speed by returning from the download function once sufficient data has been downloaded, rather than waiting for the whole segment to finish. (See the sketch after this list.)
  • Implement releasedir() for FUSE.
  • Run through UBSan.
  • Update the documentation to state that range-request support is needed in most cases. (#98)
  • Better error messages, so the user knows what went wrong (e.g. the situation in #99).
  • Improve mutex locking (#91)
  • #95
  • #100
  • #102
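
For the first item, a rough sketch of one possible shape (hypothetical names, not the actual httpdirfs structures): the read path waits on a condition variable only until the requested range is covered, while the download thread keeps filling the rest of the segment in the background.

#include <pthread.h>
#include <sys/types.h>

/* Hypothetical per-segment state, for illustration only. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t more_data;   /* signalled by the download thread */
    off_t bytes_available;      /* bytes of this segment downloaded so far */
} SegState;

/* Download thread: record progress and wake up any waiting readers. */
static void seg_progress(SegState *s, off_t total_received)
{
    pthread_mutex_lock(&s->lock);
    s->bytes_available = total_received;
    pthread_cond_broadcast(&s->more_data);
    pthread_mutex_unlock(&s->lock);
}

/* FUSE read path: block only until the requested range is available. */
static void seg_wait_for(SegState *s, off_t needed)
{
    pthread_mutex_lock(&s->lock);
    while (s->bytes_available < needed)
        pthread_cond_wait(&s->more_data, &s->lock);
    pthread_mutex_unlock(&s->lock);
}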

Below are the super long term plans:

  • Add Joplin support.

Requirements for compilation with Debian 11

The README doesn't state the requirements for compilation on Debian 11 (Bullseye). I took the requirements for Debian 10, but:

cc  -O2 -Wall -Wextra -Wshadow -rdynamic -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -DVERSION=\"1.2.0\" `pkg-config --cflags-only-I gumbo libcurl fuse uuid expat` `pkg-config --libs-only-L gumbo libcurl fuse uuid expat` -c -o util.o src/util.c
Package uuid was not found in the pkg-config search path.
Perhaps you should add the directory containing `uuid.pc'
to the PKG_CONFIG_PATH environment variable
No package 'uuid' found
Package uuid was not found in the pkg-config search path.
Perhaps you should add the directory containing `uuid.pc'
to the PKG_CONFIG_PATH environment variable
No package 'uuid' found
src/util.c:4:10: fatal error: uuid/uuid.h: No such file or directory
    4 | #include <uuid/uuid.h>
      |          ^~~~~~~~~~~~~
compilation terminated.
make: *** [Makefile:20: util.o] Error 1

uuid-dev is missing.

The correct requirements for Debian 11 (Bullseye) are:

  libgumbo-dev libfuse-dev libssl-dev libcurl4-openssl-dev uuid-dev

Add permanent caching mode support

Rather than issuing range requests, use CURLOPT_RESUME_FROM, save the file to the hard drive, and record which segments have been saved.
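
A minimal sketch of the idea, assuming a caller that knows the local cache path and the resume offset (error handling trimmed); CURLOPT_RESUME_FROM_LARGE makes libcurl ask the server to start the transfer at the given offset:

#include <curl/curl.h>
#include <stdio.h>

static CURLcode resume_download(const char *url, const char *cache_path,
                                curl_off_t offset)
{
    FILE *fp = fopen(cache_path, "ab");   /* append to what has already been saved */
    if (!fp)
        return CURLE_WRITE_ERROR;

    CURL *curl = curl_easy_init();
    curl_easy_setopt(curl, CURLOPT_URL, url);
    curl_easy_setopt(curl, CURLOPT_RESUME_FROM_LARGE, offset);
    curl_easy_setopt(curl, CURLOPT_WRITEDATA, fp);   /* default write callback fwrite()s into fp */

    CURLcode res = curl_easy_perform(curl);
    curl_easy_cleanup(curl);
    fclose(fp);
    return res;
}

Recording which segments have been saved would still be needed on top of this, e.g. with the Seg_set()/Seg_exist() bookkeeping seen elsewhere in the cache code.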

Upload new version to Debian

This new version contains a significant new feature, the permanent cache system, along with various bug fixes and enhancements. I think we should package it and upload it to Debian.

I know the man page needs to be updated.

Is there anything I could do?

fuse: missing mountpoint parameter

Running on a Raspberry Pi 4.

Linux PiServer 5.10.17-v8+ #1414 SMP PREEMPT Fri Apr 30 13:23:25 BST 2021 aarch64 GNU/Linux

fusermount version: 2.9.9

Pulled down the project, installed the Debian 10 requirements, and built using 'make' in the project folder with no apparent issues:

make
rm -f ./.depend
cc -O2 -Wall -Wextra -Wshadow -rdynamic -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -DVERSION=\"1.2.0\" `pkg-config --cflags-only-I gumbo libcurl fuse uuid expat` -MM src/fuse_local.c src/link.c src/main.c src/util.c src/cache.c src/network.c src/sonic.c -MF ./.depend;
cc  -O2 -Wall -Wextra -Wshadow -rdynamic -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -DVERSION=\"1.2.0\" `pkg-config --cflags-only-I gumbo libcurl fuse uuid expat` `pkg-config --libs-only-L gumbo libcurl fuse uuid expat` -c -o main.o src/main.c
src/main.c: In function ‘parse_arg_list’:
src/main.c:173:35: warning: comparison is always true due to limited range of data type [-Wtype-limits]
                     &long_index)) != -1) {
                                   ^~
cc  -O2 -Wall -Wextra -Wshadow -rdynamic -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -DVERSION=\"1.2.0\" `pkg-config --cflags-only-I gumbo libcurl fuse uuid expat` `pkg-config --libs-only-L gumbo libcurl fuse uuid expat` -c -o network.o src/network.c
cc  -O2 -Wall -Wextra -Wshadow -rdynamic -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -DVERSION=\"1.2.0\" `pkg-config --cflags-only-I gumbo libcurl fuse uuid expat` `pkg-config --libs-only-L gumbo libcurl fuse uuid expat` -c -o fuse_local.o src/fuse_local.c
cc  -O2 -Wall -Wextra -Wshadow -rdynamic -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -DVERSION=\"1.2.0\" `pkg-config --cflags-only-I gumbo libcurl fuse uuid expat` `pkg-config --libs-only-L gumbo libcurl fuse uuid expat` -c -o link.o src/link.c
cc  -O2 -Wall -Wextra -Wshadow -rdynamic -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -DVERSION=\"1.2.0\" `pkg-config --cflags-only-I gumbo libcurl fuse uuid expat` `pkg-config --libs-only-L gumbo libcurl fuse uuid expat` -c -o cache.o src/cache.c
cc  -O2 -Wall -Wextra -Wshadow -rdynamic -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -DVERSION=\"1.2.0\" `pkg-config --cflags-only-I gumbo libcurl fuse uuid expat` `pkg-config --libs-only-L gumbo libcurl fuse uuid expat` -c -o util.o src/util.c
cc  -O2 -Wall -Wextra -Wshadow -rdynamic -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -DVERSION=\"1.2.0\" `pkg-config --cflags-only-I gumbo libcurl fuse uuid expat` `pkg-config --libs-only-L gumbo libcurl fuse uuid expat` -c -o sonic.o src/sonic.c
cc  -O2 -Wall -Wextra -Wshadow -rdynamic -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -DVERSION=\"1.2.0\" `pkg-config --cflags-only-I gumbo libcurl fuse uuid expat` `pkg-config --libs-only-L gumbo libcurl fuse uuid expat` -o httpdirfs main.o network.o fuse_local.o link.o cache.o util.o sonic.o -pthread -lgumbo -lcurl -lfuse -lcrypto -luuid -lexpat

But when I try to mount (I have changed my HTTP folder URL for this ticket):

./httpdirfs -f --cache https://myHTTPFolder/ /mnt/HTTP

I get

see httpdirfs -h for usage
fuse: missing mountpoint parameter

Not sure what to do, any ideas would be really appreciated.

Thanks

Why not add a resumable upload function?

When transferring or downloading with curl, a one-sided network failure leaves the transfer stuck. Adding a resumable-download (breakpoint resume) function would prevent it from getting stuck.
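
One possible approach (a sketch, not the project's actual design): use libcurl's low-speed timeout so a stalled transfer aborts instead of hanging forever, then retry with CURLOPT_RESUME_FROM_LARGE set to however many bytes have already arrived.

#include <curl/curl.h>

/* Hypothetical helper: make a transfer abort when stalled and resumable on retry. */
static void setup_resumable(CURL *curl, curl_off_t already_received)
{
    /* Abort if less than 1 byte/s is received for 30 consecutive seconds. */
    curl_easy_setopt(curl, CURLOPT_LOW_SPEED_LIMIT, 1L);
    curl_easy_setopt(curl, CURLOPT_LOW_SPEED_TIME, 30L);

    /* On retry, ask the server to continue where the previous attempt stopped. */
    curl_easy_setopt(curl, CURLOPT_RESUME_FROM_LARGE, already_received);
}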

Funkwhale

Hello.
I'm trying to use httpdirfs with Funkwhale.

This is the command I'm using:
httpdirfs -f --cache --sonic-username myuser --sonic-password APIPassword --sonic-id3 --no-range-check https://tanukitunes.com/ ./mnt

What I get is:

CacheSystem_init(): directory: /home/user/.cache/httpdirfs/https%3A%2F%2Ftanukitunes.com
--------------------------------------------
 LinkTable 0x22f1d70 for https://tanukitunes.com/rest/getArtists.view?u=myuser&t=zzzzzzzzzz&s=xxxxxxxxxx&v=1.13.0&c=HTTPDirFS-1.2.0
--------------------------------------------
0 H 0  https://tanukitunes.com/rest/getArtists.view?u=myuser&t=zzzzzzzzzz&s=xxxxxxxxxx&v=1.13.0&c=HTTPDirFS-1.2.0
1 D 0 ! 
2 D 0 # 
3 D 0 ( 
4 D 0 - 
5 D 0 . 
6 D 0 / 
7 D 0 0 
8 D 0 1 
9 D 0 2 
10 D 0 3 
11 D 0 4 
12 D 0 6 
13 D 0 7 
14 D 0 9 
15 D 0 : 
16 D 0 ? 
17 D 0 A 
18 D 0 B 
19 D 0 C 
20 D 0 D 
21 D 0 E 
22 D 0 F 
23 D 0 G 
24 D 0 H 
25 D 0 I 
26 D 0 J 
27 D 0 K 
28 D 0 L 
29 D 0 M 
30 D 0 N 
31 D 0 O 
32 D 0 P 
...
78 D 0 猫 
79 D 0 骨 
80 D 0 회 
--------------------------------------------
LinkTable_print(): Invalid link count: 0
--------------------------------------------

If I look inside the mnt directory, I can see only four subdirectories and an input/output error:

[user@host ~]$ ls -la mnt/
ls: reading directory 'mnt/': Input/output error
total 0
drwxr-xr-x. 1 user user 0 Jan  1  1970 '!'
drwxr-xr-x. 1 user user 0 Jan  1  1970 '#'
drwxr-xr-x. 1 user user 0 Jan  1  1970 '('
drwxr-xr-x. 1 user user 0 Jan  1  1970  -

Add support for LMS

Hello,

It would be great to support password authentication for servers that use API version <= 1.12.0.
Several implementations still stick to older versions, mainly to avoid storing passwords in plain text.
Examples:

You could imagine a secure fallback to the plain-password method if the following conditions are met (sketched after this list):

  • the first attempt returned error code 30 (Incompatible Subsonic REST protocol version. Server must upgrade.)
  • the URL scheme is https
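
A rough sketch of that fallback, with hypothetical helpers (sonic_request() performs the call and returns the Subsonic error code; real code would build the full URL). The legacy scheme simply sends p=<password> instead of the salted token:

#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical: performs the API call and returns the Subsonic error code
 * (0 on success, 30 for "server must upgrade"). */
int sonic_request(const char *query);

static bool url_is_https(const char *url)
{
    return strncmp(url, "https://", 8) == 0;
}

static int sonic_auth(const char *url, const char *user, const char *token,
                      const char *salt, const char *password)
{
    char query[1024];

    /* Preferred: salted token authentication (API >= 1.13.0). */
    snprintf(query, sizeof(query), "u=%s&t=%s&s=%s&v=1.13.0&c=HTTPDirFS",
             user, token, salt);
    int err = sonic_request(query);

    /* Fallback: plain password, only over https and only on error code 30. */
    if (err == 30 && url_is_https(url)) {
        snprintf(query, sizeof(query), "u=%s&p=%s&v=1.12.0&c=HTTPDirFS",
                 user, password);
        err = sonic_request(query);
    }
    return err;
}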

Improper locking error, unlocking an inactive lock?

Dear developers, thank you for looking into this. When Seg_exist(cf, dl_offset) == 1, doesn't the unlock statement PTHREAD_MUTEX_UNLOCK(&cf->w_lock); attempt to unlock cf->w_lock even though it was never acquired?

According to this,
If a thread attempts to unlock a mutex that it has not locked or a mutex which is unlocked, undefined behavior results.

long Cache_read(Cache *cf, char *const output_buf, const off_t len, const off_t offset_start) {
    if (Seg_exist(cf, dl_offset)) {
        ...;    // when Seg_exist(cf, dl_offset), it goes to this path
    } else {
        ...;
        PTHREAD_MUTEX_LOCK(&cf->w_lock);
    }
    ...;
    PTHREAD_MUTEX_UNLOCK(&cf->w_lock);    // unlocks an inactive lock
    ...;
}

httpdirfs/src/cache.c

Lines 995 to 1054 in 939e287

if (Seg_exist(cf, dl_offset)) {
    send = Data_read(cf, (uint8_t *) output_buf, len, offset_start);
    goto bgdl;
} else {
    /*
     * Wait for any other download thread to finish
     */
    lprintf(cache_lock_debug,
            "thread %ld: locking w_lock;\n", pthread_self());
    PTHREAD_MUTEX_LOCK(&cf->w_lock);
    if (Seg_exist(cf, dl_offset)) {
        /*
         * The segment now exists - it was downloaded by another
         * download thread. Send it off and unlock the I/O
         */
        send =
            Data_read(cf, (uint8_t *) output_buf, len, offset_start);
        lprintf(cache_lock_debug,
                "thread %x: unlocking w_lock;\n", pthread_self());
        PTHREAD_MUTEX_UNLOCK(&cf->w_lock);
        goto bgdl;
    }
}
/*
 * ------------------ Download the segment ---------------------
 */
uint8_t *recv_buf = CALLOC(cf->blksz, sizeof(uint8_t));
lprintf(debug, "thread %x: spawned.\n ", pthread_self());
long recv = Link_download(cf->link, (char *) recv_buf, cf->blksz,
                          dl_offset);
if (recv < 0) {
    lprintf(error, "thread %x received %ld bytes, \
which does't make sense\n", pthread_self(), recv);
}
/*
 * check if we have received enough data, write it to the disk
 *
 * Condition 1: received the exact amount as the segment size.
 * Condition 2: offset is the last segment
 */
if ((recv == cf->blksz) ||
    (dl_offset == (cf->content_length / cf->blksz * cf->blksz))) {
    Data_write(cf, recv_buf, recv, dl_offset);
    Seg_set(cf, dl_offset, 1);
} else {
    lprintf(error, "received %ld rather than %ld, possible network \
error.\n", recv, cf->blksz);
}
FREE(recv_buf);
send = Data_read(cf, (uint8_t *) output_buf, len, offset_start);
lprintf(cache_lock_debug,
        "thread %x: unlocking w_lock;\n", pthread_self());
PTHREAD_MUTEX_UNLOCK(&cf->w_lock);

Best,

Trailing slash in links not preserved

Even when the server puts a trailing slash on a directory in the link (as e.g. nginx and Python 3's http.server do), httpdirfs still requests the slash-less version.

Paraphrasing a wireshark capture:

GET /tests/
200 OK, text/html, <a href="foo/" title="foo">foo/</a>
GET /tests/foo
301 Moved Permanently, Location: http://.../test/foo/
GET /tests/foo/
200 OK, ...

This has three downsides:

  • It's a needless round trip, which adds up in usage patterns where the first, slow step is the client fetching the directory tree.
  • It depends on server behavior that is not specified, only conventional: while most servers do this, none is required to, since URIs with and without a trailing slash are technically distinct.
  • If an erroneous server blunders around redirects (e.g. nginx) and sends the client off to a bad location, there can even be a lock-up (see #95).

If there's any way to store the original URI with the path component for each file and directory, that should probably be done. If there is no way to store these, I don't have any concrete suggestions (as adding the trailing slash causes the same trouble on servers that choose not to use a trailing slash, although they're probably rare for practical reasons). If it can be arranged but comes at some cost, I can probably come up with examples of other trouble that crops up if the originally encoded URI is not preserved ;-) (read: it's a larger discussion with several related threads which I don't want to bring in here if this can be resolved easily anyway).

[edit: pointing to known nginx and newly reported httpdirfs issues]

Please add a Snap (Snapcraft) or Flatpak (Flathub) release

I use Arch on my home computer and a company-customized Ubuntu at work, but there is no release available for my Ubuntu.

Could you please release a Snap or Flatpak? I can't build it every time I change desks.

Thank you in advance.

I have a problem: fuse: mountpoint is not empty / fuse: if you are sure this is safe, use the 'nonempty' mount option

user1@debian:~$ sudo httpdirfs --cache https://cdimage.debian.org/debian-cd/ /mnt/book
libcurl SSL engine: OpenSSL/1.1.1d
CacheSystem_init(): directory: /root/.cache/httpdirfs/https%3A%2F%2Fcdimage.debian.org%2Fdebian-cd%2F
LinkTable_new(): disk_linktbl->num: 13, linktbl->num: 13
LinkTable_invalid_reset(): 0 invalid links
LinkTable_uninitialised_fillDone!

LinkTable 0x20820f0 for https://cdimage.debian.org/debian-cd/

0 H 0 https://cdimage.debian.org/debian-cd/
1 D 0 10.3.0-live https://cdimage.debian.org/debian-cd/10.3.0-live
2 D 0 10.3.0-live https://cdimage.debian.org/debian-cd/10.3.0-live
3 D 0 10.3.0 https://cdimage.debian.org/debian-cd/10.3.0
4 D 0 10.3.0 https://cdimage.debian.org/debian-cd/10.3.0
5 D 0 current-live https://cdimage.debian.org/debian-cd/current-live
6 D 0 current-live https://cdimage.debian.org/debian-cd/current-live
7 D 0 current https://cdimage.debian.org/debian-cd/current
8 D 0 current https://cdimage.debian.org/debian-cd/current
9 D 0 project https://cdimage.debian.org/debian-cd/project
10 D 0 project https://cdimage.debian.org/debian-cd/project
11 F 15353 ls-lR.gz https://cdimage.debian.org/debian-cd/ls-lR.gz
12 F 15353 ls-lR.gz https://cdimage.debian.org/debian-cd/ls-lR.gz

LinkTable_print(): Invalid link count: 0, https://cdimage.debian.org/debian-cd/.

fuse: mountpoint is not empty
fuse: if you are sure this is safe, use the 'nonempty' mount option

wishlist: Cache for directories / metadata only / small files only

For use cases like media (think Performous, though I haven't verified it's really beneficial there) it would be convenient to have the persistent cache store directory listings (maybe including metadata) but not the large files.

(Sure, one could go to some lengths to distinguish cache content that pays off to cache, but with the filesystem cache also involved, and given that it would be good to know early what to cache and what not, heuristics like "it's metadata" or "it's smaller than 4 KiB" are probably good enough for many applications.)

Ability to specify a single file directly (similar to httpfs)

Is it possible to add the functionality to show just one file, rather than a directory listing, at the mounted path? I have an ISO file at a path that doesn't support web directory listing, and I would like to mount and show only that ISO file when I specify it directly.

Example:

httpdirfs --no-range-check https://www.example.com/mycrazypath/rocky-linux-8.iso /cdrom/

And it should show:
ls /cdrom/*
rocky-linux-8.iso

Problem with HTTP basic authentication

Hello, I'm trying to use httpdirfs with a URL protected by basic auth, but it doesn't work right. When supplying the correct credentials using --username and --password, I get these messages:

libcurl SSL engine: OpenSSL/1.0.2l
LinkTable_new(https://example.com/);
link.c: LinkTable_new() cannot retrive the base URL, URL: https://example.com/, HTTP 401
Error: Network initialisation failed.

When using a URL with the credentials embedded, e.g. https://user:password@example.com/, the initial connection is established correctly but I can't browse to any subfolders:

libcurl SSL engine: OpenSSL/1.0.2l
LinkTable_new(https://user:password@example.com/);
LinkTable_new(https://user:password@example.com/foo);
link.c: LinkTable_new() cannot retrive the base URL, URL: https://user:password@example.com/foo, HTTP 401

This is on a Debian stretch system.
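
For reference, this is roughly what passing the credentials per request looks like in libcurl (a sketch; I don't know how httpdirfs wires up --username/--password internally):

#include <curl/curl.h>

static void set_basic_auth(CURL *curl, const char *username, const char *password)
{
    curl_easy_setopt(curl, CURLOPT_HTTPAUTH, CURLAUTH_BASIC);
    curl_easy_setopt(curl, CURLOPT_USERNAME, username);
    curl_easy_setopt(curl, CURLOPT_PASSWORD, password);
}

With that set on every handle, the embedded-credentials form of the URL should not be needed.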

Thank you for this useful piece of software!

macOS compilation

Hi,

Is there any chance to make httpdirfs work under macOS? If so, could you provide instructions of how to compile from the source?

Thanks.

Directories and files are shown twice

I'm mounting a default Apache directory listing as a user, like so:

mkdir tecneeq
httpdirfs -f --cache -u dude -p secret https://tecneeq.server.com:3412/ tecneeq

But strangely, every entry, be it file or directory, is listed twice (both work as expected):

[screenshot of the duplicated listing omitted]

My environment:

kkruse@nb12615:~$ httpdirfs -V
HTTPDirFS version 1.1.10
FUSE library version: 2.9.9
fusermount3 version: 3.10.2
using FUSE kernel interface version 7.19
kkruse@nb12615:~$ uname -sr
Linux 5.10.0-4-amd64
kkruse@nb12615:~$ cat /etc/debian_version 
bullseye/sid

404'ing directory breaks httpdirfs

When accessing a directory that is not there (say, because the index.html was faulty, or simply because the directory has just been deleted), httpdirfs breaks badly. To reproduce:

$ mkdir original
$ echo '<a href="foo/">foo!</a>' > original/index.html
$ python3 -m http.server --directory original 8234
(keep running)
$ mkdir mounted
$ httpdirfs -d http://localhost:8234 mounted
(keep running)
$ ls mounted
total 0
drwxr-xr-x 1 chrysn chrysn     0 Jan  1  1970 .
drwxrwxrwt 1 root   root   28700 Oct  3 14:59 ..
drwxr-xr-x 1 chrysn chrysn     0 Jan  1  1970 foo
$ ls mounted/foo
total 0
$ ls mounted/foo
(hangs indefinitely)

Once one process hangs, any other access to not-previously-cached content hangs too. (In the minimal example there's nothing else to access; I observed this in the more complex real scenario.) As usual with FUSE, once processes hang like that they cannot be killed even with the most deadly signals, and have to wait for httpdirfs itself to be killed.

The best behavior would probably be to return ENOENT on the access if that's still an option, or to behave as regular file systems do when a directory a process has already entered is removed. (But really, anything that does not make processes using the filesystem hang would be an improvement.)
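
For the ENOENT option, a sketch of what the FUSE 2 readdir handler could do, assuming a hypothetical helper that reports the HTTP status of the directory fetch:

#define FUSE_USE_VERSION 26
#include <errno.h>
#include <fuse.h>

/* Hypothetical: fetch the listing for `path` and return the HTTP status code. */
int fetch_dir_listing(const char *path);

static int httpdirfs_readdir(const char *path, void *buf, fuse_fill_dir_t filler,
                             off_t offset, struct fuse_file_info *fi)
{
    (void) offset;
    (void) fi;

    int status = fetch_dir_listing(path);
    if (status == 404)
        return -ENOENT;   /* the directory is gone: tell the caller */
    if (status != 200)
        return -EIO;      /* any other failure: fail instead of hanging */

    filler(buf, ".", NULL, 0);
    filler(buf, "..", NULL, 0);
    /* ... fill in the real entries here ... */
    return 0;
}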

Parsing problems?

Since upgrading to 1.1.6 on Debian Buster I'm seeing problems which I think are related to index parsing. Now, with several pages, some directories are detected as files and vice-versa. See http://security-cdn.debian.org/debian-security/ for an example.

Also, for some directories, HTTPDirFS completely hangs in the background and it's impossible to unmount the directory.

The same listings tested with version 1.0.1 work fine.

Trouble with special unicode chars

I get the following error when mounting my funkwhale instance:

Cache_read(): received 276, possible network error.
Cache_read(): thread 140144841582144: path_download(/2/2814/新しい日の誕生/8 - 新しい日の誕生.flac, 41943040-50331648);

This seems to prevent mpd from indexing the database at all.

httpdirfs missing mountpoint parameter, rpi 4b 64bit

Hello,
I can't get httpdirfs to work on a Raspberry Pi 4B. I got it working on my desktop Ubuntu, but on the RPi it fails. I'm mounting, for example, with "/mnt" at the end of the command. I saw a similar issue in bug report #62.
Here's the error:

print_version: HTTPDirFS version 1.2.3
print_version: libcurl SSL engine: OpenSSL/1.1.1l
see httpdirfs -h for usage
fuse: missing mountpoint parameter

Segfault with partially downloaded file and changed segment size

If a remote file is partially initialised in the cache (not fully downloaded) with one segment size (e.g. 8 MB), and the mountpoint is later reattached with a different segment size (e.g. 2 MB), trying to read the remote file again triggers a segmentation fault.

Single file mode does not work on a lot of pages

Using commit 40c750f:

Here's one example

user@pc:~/httpdirfs$ httpdirfs --single-file-mode 'https://en.wikipedia.org/wiki/Main_Page' ~/mnt/httpdirfs
print_version: HTTPDirFS version 1.2.3
print_version: libcurl SSL engine: OpenSSL/1.1.1f
LinkTable_print: --------------------------------------------
LinkTable_print:  LinkTable 0x5595c9c48db0 for https://en.wikipedia.org/wiki/Main_Page
LinkTable_print: --------------------------------------------
LinkTable_print: 0 H 0  https://en.wikipedia.org/wiki/Main_Page
LinkTable_print: 1 F 88221 Main_Page https://en.wikipedia.org/wiki/Main_Page
LinkTable_print: --------------------------------------------
LinkTable_print:  Invalid link count: 0
LinkTable_print: --------------------------------------------

Unless the mode is still intended for use only with Apache directory listings and similar?

Should I split out the sonicfs functionality?

I think in a previous email, @jcharaoui said that the sonicfs functionality should be split out into a separate binary. I sort of agreed, because the Unix philosophy emphasises building simple, short, clear, modular, and extensible code.

However, after having a closer look, I feel the sonicfs functionality is just too intertwined with the main code; a major rewrite would be required. Opinions are welcome on this matter.

The code for splitting out the sonicfs functionality is here:
0155a6f

Adding support for password protected directories?

It seems like it would be very easy to add. Curl already has support for usernames and passwords. (See here)

This would add a lot of flexibility for people who want to mount a directory securely, and would be a great addition to the project.

Hangs on redirection

One of the indexes I use with HTTPDirFS issues 301 redirections when querying URLs without a trailing slash. This causes HTTPDirFS 1.1.6 to hang when opening a subdirectory. The problem doesn't occur with 1.0.1:

$ curl -I https://example.com/foo
HTTP/2 301 
date: Sat, 20 Jul 2019 15:34:53 GMT
server: Apache
strict-transport-security: max-age=15768000; includeSubDomains; preload
x-frame-options: SAMEORIGIN
x-xss-protection: 1; mode=block
referrer-policy: no-referrer
location: https://example.com/foo/
content-type: text/html; charset=iso-8859-1
$ curl -I https://example.com/foo/
HTTP/2 200 
date: Sat, 20 Jul 2019 15:35:00 GMT
server: Apache
strict-transport-security: max-age=15768000; includeSubDomains; preload
x-frame-options: SAMEORIGIN
x-xss-protection: 1; mode=block
referrer-policy: no-referrer
content-type: text/html;charset=UTF-8
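
If the hang is related to libcurl not following the 301 itself, one thing worth checking (just a guess; I don't know whether httpdirfs already sets these) is whether redirects are followed with a sane bound:

#include <curl/curl.h>

/* Follow the 301 to the trailing-slash URL instead of treating it as content,
 * but cap the number of hops to avoid redirect loops. */
static void follow_redirects(CURL *curl)
{
    curl_easy_setopt(curl, CURLOPT_FOLLOWLOCATION, 1L);
    curl_easy_setopt(curl, CURLOPT_MAXREDIRS, 5L);
}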

Crash in blocking_multi_transfer() ?

I hit this when testing with cache mode off, and it caused HTTPDirFS to crash:

blocking_multi_transfer(): 8, API function called from within callback

I haven't been able to reproduce it so far, but it may be indicative of a problem.

Todo

This issue is mainly for me to keep track of what I want to do.

  • Split into two binaries and the associated documentation changes (httpdirfs and subsonicfs) (Abandoned)
  • Improve debug functions
  • Automatic generation of man page with proper description
  • Contact maintainer for a new Debian package

Sort out the multilingual support.

This piece of software was not designed with multilingual support in mind; supporting Unicode and proper percent-encoding is quite complicated.

A patch for Chinese support was provided. However, I rejected the patch because of the poor quality of the code.
#70

There is minimal support for percent-encoding in cache.c and link.c, done using curl_easy_escape and curl_easy_unescape. These are the results from Visual Studio Code's search:

src/cache.c:
  83      CURL* c = curl_easy_init();
  84:     char *escaped_url = curl_easy_escape(c, url, 0);
  85      char *full_path = path_append(cache_dir_root, escaped_url);

src/cache.c:
  679      if (!CONFIG.sonic_mode) {
  680:         fn = curl_easy_unescape(NULL, this_link->f_url + ROOT_LINK_OFFSET, 0,
  681                                  NULL);

src/link.c:
  301          CURL* c = curl_easy_init();
  302:         unescaped_linkname = curl_easy_unescape(c, this_link->linkname,
  303                                                  0, NULL);

  437      CURL* c = curl_easy_init();
  438:     unescaped_path = curl_easy_unescape(c, url + ROOT_LINK_OFFSET, 0, NULL);
  439      if (CACHE_SYSTEM_INIT) {
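
For context, the basic shape of those calls (a standalone example; strings returned here must be released with curl_free()):

#include <curl/curl.h>
#include <stdio.h>

int main(void)
{
    CURL *c = curl_easy_init();

    /* Percent-encode a path component, e.g. for a cache file name. */
    char *escaped = curl_easy_escape(c, "新しい日の誕生.flac", 0);
    printf("escaped:   %s\n", escaped);

    /* Decode it back, e.g. for display in the directory listing. */
    int outlen = 0;
    char *unescaped = curl_easy_unescape(c, escaped, 0, &outlen);
    printf("unescaped: %s (%d bytes)\n", unescaped, outlen);

    curl_free(escaped);
    curl_free(unescaped);
    curl_easy_cleanup(c);
    return 0;
}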

Reasonable cache defaults

--max-seg-count     The maximum number of download segments a file
                    can have. By default it is set to 1048576. This
                    means the maximum memory usage per file is 1MB
                    memory. This allows caching file up to 8TB in
                    size, assuming you are using the default segment
                    size.

By default the cache is disabled (good), but I don't think that allowing httpdirfs to allocate up to 8 TB of disk space by default is reasonable. For most users that would mean caching everything httpdirfs comes across...

Something in the area of 100 segments (so 800 MB with the default segment size) would make more sense, I think.

When copying files, the program leaks memory

After copying multiple files to the local desktop, I found that the memory usage of httpdirfs had increased, even after the copies had completed. I wonder whether the memory used by the copy threads is not being released.

Add support for Funkwhale

Hi there! I'm just discovering httpdirfs through the *Sonic support, and it's really interesting.

I've tried it with Funkwhale (an audio streaming server I work on), since Funkwhale implements a subset of the Subsonic API. However, it doesn't work, as httpdirfs relies on the non-ID3-based Subsonic endpoints (and Funkwhale only supports the ID3-based endpoints).

As per the Subsonic documentation:

File structure vs ID3 tags

Starting with version 1.8.0, the API provides methods for accessing the media collection organized according to ID3 tags, rather than file structure.

For instance, browsing through the collection using ID3 tags should use the getArtists, getArtist and getAlbum methods. To browse using file structure you would use getIndexes and getMusicDirectory.

Correspondingly, there are two sets of methods for searching, starring and album lists. Refer to the method documentation for details.
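
In code terms the difference is mostly which endpoint names get used; a minimal sketch (hypothetical helper, assuming a boolean id3 flag):

#include <stdbool.h>

/* Pick the Subsonic "browse" endpoint depending on whether the server is
 * browsed by file structure or by ID3 tags. */
static const char *sonic_browse_endpoint(bool id3, bool top_level)
{
    if (id3)
        return top_level ? "getArtists" : "getArtist";   /* plus getAlbum further down */
    return top_level ? "getIndexes" : "getMusicDirectory";
}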

Would you be interested in bringing compatibility with ID3-based Subsonic servers? Unfortunately, I cannot help with the code itself (I'm not a C dev), but we have an open demo server (https://demo.funkwhale.audio) you can use for your tests, and I'm available if you want to discuss this further.

I'd also be happy to add httpdirfs to https://funkwhale.audio/apps/ if we manage to make this work :)

Anyway, thank you for the great project!

Hangs when HTTP/2 is used

When HTTP/2 is used between libcurl and the server, HTTPDirFS hangs when requesting a subdirectory. The initial request for the root directory works fine; it seems to hang when trying to list a directory or request a file after the initial setup. For example, with CURLOPT_VERBOSE:

path_download(/foo, 0-4096);
* Expire in 0 ms for 6 (transfer 0x7f0788005500)
* Expire in 15000 ms for 2 (transfer 0x7f0788005500)
* 24 bytes stray data read before trying h2 connection
* 24 bytes stray data read before trying h2 connection
* 24 bytes stray data read before trying h2 connection
* 24 bytes stray data read before trying h2 connection
* 24 bytes stray data read before trying h2 connection
* 24 bytes stray data read before trying h2 connection
* 24 bytes stray data read before trying h2 connection
* Found bundle for host lib3.net: 0x561def466ad0 [can multiplex]
* Connection 0 seems to be dead!

The problem doesn't occur when using HTTP/1.1.
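
As a workaround while the underlying problem is investigated, forcing HTTP/1.1 on the curl handles avoids the multiplexed connection entirely (a sketch):

#include <curl/curl.h>

/* Work around the hang by never negotiating HTTP/2. */
static void force_http_1_1(CURL *curl)
{
    curl_easy_setopt(curl, CURLOPT_HTTP_VERSION, CURL_HTTP_VERSION_1_1);
}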
