
Strictly RFC 3986 compliant URI parsing and handling library written in C89; moved from SourceForge to GitHub

Home Page: https://uriparser.github.io/

License: Other


uriparser's Introduction


uriparser

uriparser is a strictly RFC 3986 compliant URI parsing and handling library written in C89 ("ANSI C"). uriparser is cross-platform, fast, supports both char and wchar_t, and is licensed under the New BSD license.

To learn more about uriparser, please check out https://uriparser.github.io/.

Example use from an existing CMake project

cmake_minimum_required(VERSION 3.5.0)

project(hello VERSION 1.0.0)

find_package(uriparser 0.9.2 CONFIG REQUIRED char wchar_t)

add_executable(hello
    hello.c
)

target_link_libraries(hello PUBLIC uriparser::uriparser)

Compilation

Compilation (standalone, GNU make, Linux)

# mkdir build
# cd build
# cmake -DCMAKE_BUILD_TYPE=Release ..  # see CMakeLists.txt for options
# make
# make test
# make install

Available CMake options (and defaults)

# rm -f CMakeCache.txt ; cmake -LH . | grep -B1 ':.*=' | sed 's,--,,'
// Choose the type of build, options are: None Debug Release RelWithDebInfo MinSizeRel ...
CMAKE_BUILD_TYPE:STRING=

// Install path prefix, prepended onto install directories.
CMAKE_INSTALL_PREFIX:PATH=/usr/local

// Path to qhelpgenerator program (default: auto-detect)
QHG_LOCATION:FILEPATH=

// Build code supporting data type 'char'
URIPARSER_BUILD_CHAR:BOOL=ON

// Build API documentation (requires Doxygen, Graphviz, and (optional) Qt's qhelpgenerator)
URIPARSER_BUILD_DOCS:BOOL=ON

// Build test suite (requires GTest >=1.8.0)
URIPARSER_BUILD_TESTS:BOOL=ON

// Build tools (e.g. CLI "uriparse")
URIPARSER_BUILD_TOOLS:BOOL=ON

// Build code supporting data type 'wchar_t'
URIPARSER_BUILD_WCHAR_T:BOOL=ON

// Enable installation of uriparser
URIPARSER_ENABLE_INSTALL:BOOL=ON

// Use of specific runtime library (/MT /MTd /MD /MDd) with MSVC
URIPARSER_MSVC_RUNTIME:STRING=

// Build shared libraries (rather than static ones)
URIPARSER_SHARED_LIBS:BOOL=ON

// Treat all compiler warnings as errors
URIPARSER_WARNINGS_AS_ERRORS:BOOL=OFF

uriparser's People

Contributors

1480c1, arichardson, begasus, codeinnovation, crawlserv, cynerd, dependabot[bot], esc, ffontaine, gaspardpetit, gperciva, gyh007, hartwork, hrw, jcunningham10, jensenrichardson, jibsen, kou, manisandro, myd7349, niclasr, sattvik, shehzan10, songweijia, spaceim, starmaker-dev, yescallop


uriparser's Issues

Can uriparser also parse IRIs?

Can uriparser also parse IRIs? Either way, it may be useful to mention this in the README.md and/or on the documentation website.

Move from SourceForge to GitHub

[>=0.9.2] Does not detect too old version of Google Test

Hi,

uriparser 0.9.3 produces this output from cmake:

-- Found GTest: /opt/local/lib/libgtest.a (Required is at least version "1.8.1")

I actually had Google Test 1.8.0 installed, and it didn't detect that it was too old; it proceeded on to build the code and run the tests.

In fact the tests succeeded even with Google Test 1.8.0, so maybe 1.8.1 is not required after all.

Get uriparser packages updated to 0.9.1

Get uriparser packages updated to 0.9.0

Get uriparser packages updated to 0.8.5

Release uriparser 0.9.3

Specific to 0.9.3:

  • Migrate API doc sync script from GNU Autotools to CMake

Malformed url can cause a bad hostText TextRange struct

I am finding that malformed URLs like //:%aa@ cause uriParseUriA to generate invalid TextRange results. Found via fuzzing with ASAN. I am not sure how to correctly fix uriparser, so I can only offer this defensive patch.

/* TODO(schwehr): When this is true, it indicates a bug in the underlying */
/*   parser that must be fixed. e.g. "//:%aa@" results in a bad hostText. */
static int URI_FUNC(TextRangeInvalid)(const URI_TYPE(TextRange) *range) {
  /* Okay to both be nullptr. */
  if (range->first == NULL && range->afterLast == NULL) return URI_FALSE;

  if (range->first == NULL && range->afterLast != NULL) return URI_TRUE;
  if (range->first != NULL && range->afterLast == NULL) return URI_TRUE;

  /* Smaller than empty string or swapped begin <-> end */
  if (range->first > range->afterLast) {
    return URI_TRUE;
  }

  return URI_FALSE;
}

As a quick band-aid, it is used here to keep the trouble from propagating up:

int URI_FUNC(ParseUriEx)(URI_TYPE(ParserState) * state, const URI_CHAR * first, const URI_CHAR * afterLast) {
	const URI_CHAR * afterUriReference;
	URI_TYPE(Uri) * uri;

	/* Check params */
	if ((state == NULL) || (first == NULL) || (afterLast == NULL)) {
		return URI_ERROR_NULL;
	}
	uri = state->uri;

	/* Init parser */
	URI_FUNC(ResetParserStateExceptUri)(state);
	URI_FUNC(ResetUri)(uri);

	/* Parse */
	afterUriReference = URI_FUNC(ParseUriReference)(state, first, afterLast);
	if (afterUriReference == NULL) {
		return state->errorCode;
	}
	if (afterUriReference != afterLast) {
		URI_FUNC(StopSyntax)(state, afterUriReference);
		return state->errorCode;
	}

  /* BEGIN MODIFICATION */
  if (URI_FUNC(TextRangeInvalid)(&uri->scheme)) {
    fprintf(stderr, "Bad scheme\n");
    return URI_ERROR_SYNTAX;
  }
  if (URI_FUNC(TextRangeInvalid)(&uri->userInfo)) {
    fprintf(stderr, "Bad userInfo\n");
    return URI_ERROR_SYNTAX;
  }
  if (URI_FUNC(TextRangeInvalid)(&uri->hostText)) {
    fprintf(stderr, "Bad hostText\n");
    return URI_ERROR_SYNTAX;
  }
  if (URI_FUNC(TextRangeInvalid)(&uri->portText)) {
    fprintf(stderr, "Bad portText\n");
    return URI_ERROR_SYNTAX;
  }
  if (URI_FUNC(TextRangeInvalid)(&uri->query)) {
    fprintf(stderr, "Bad query\n");
    return URI_ERROR_SYNTAX;
  }
  if (URI_FUNC(TextRangeInvalid)(&uri->fragment)) {
    fprintf(stderr, "Bad fragment\n");
    return URI_ERROR_SYNTAX;
  }
  /* END MODIFICATION */
	return URI_SUCCESS;
}

Bugs in uriRemoveBaseUri

When I test with the code below

    UriUriA base;
    UriParserStateA state;
    state.uri = &base;
    ASSERT_EQ(uriParseUriA(&state, "http://example2/x/y/z"), URI_SUCCESS);
    UriUriA source;
    state.uri = &source;
    ASSERT_EQ(uriParseUriA(&state, "http://example/x/abc"), URI_SUCCESS);
    UriUriA dest;
    ASSERT_EQ(uriRemoveBaseUriA(&dest, &source, &base, URI_FALSE), URI_SUCCESS);
    int size = 0;
    ASSERT_EQ(uriToStringCharsRequiredA(&dest, &size), URI_SUCCESS);
    char buffer[size + 1];
    ASSERT_EQ(uriToStringA(buffer, &dest, size + 1, &size), URI_SUCCESS);
    ASSERT_STREQ(buffer, "//example/x/abc");

It fails

Failure
      Expected: buffer
      Which is: "../abc"
To be equal to: "//example/x/abc"

Drop -DURI_SIZEDOWN and --(enable|disabled)-sizedown

Currently only controls inlining of functions:

include/uriparser/UriDefsConfig.h-
include/uriparser/UriDefsConfig.h-/* Function inlining, not ANSI/ISO C! */
include/uriparser/UriDefsConfig.h:#if (defined(URI_DOXYGEN) || defined(URI_SIZEDOWN))
include/uriparser/UriDefsConfig.h-# define URI_INLINE
include/uriparser/UriDefsConfig.h-#elif defined(__INTEL_COMPILER)

Modern compilers probably know best about inlining. Let's remove it.

Migrate to CMake / Overly complicated Windows building

I'd like to use this in my own project, but getting it running on Windows hasn't proven obvious, and there are no instructions (other than possibly just adding all the source and include files to my own project, which might be what is intended).

Is there any chance I could convince you to adopt a CMake/Meson/<literally anything else other than autotools> for a build system?

Get uriparser packages updated to 0.8.6

Extended-Length Path In Windows Encoded Differently Than Windows Path

An extended-length path in Windows is prefixed with "\\?\" (for example \\?\c:\windows). uriparser doesn't seem to recognize such a path as a Windows path: it is encoded, but not in the typical Windows way.

My current work-around is to remove "\\?\" before passing in the path to uriWindowsFilenameToUriStringA.
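The workaround described above can be sketched as a small helper. This is a minimal sketch, not part of uriparser: the function name skip_extended_length_prefix is hypothetical, and it only handles the plain "\\?\" prefix. The "\\?\UNC\server\share" form is deliberately not handled and would need separate treatment.

```c
#include <string.h>

/* Hypothetical helper (not a uriparser function): returns a pointer past a
 * leading "\\?\" extended-length prefix, or the original pointer if the
 * prefix is absent. No memory is allocated; the result aliases the input. */
static const char *skip_extended_length_prefix(const char *path) {
    static const char prefix[] = "\\\\?\\";        /* the 4 characters \\?\ */
    const size_t prefix_len = sizeof(prefix) - 1;  /* exclude the NUL */

    if (strncmp(path, prefix, prefix_len) == 0) {
        return path + prefix_len;
    }
    return path;
}
```

The returned pointer could then be passed to uriWindowsFilenameToUriStringA in place of the original path.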

Release uriparser 0.8.6

Release uriparser 0.9.1

Specific to 0.9.1:

[>=0.9.0] testrunner crashes when compiled with -DNDEBUG

# gdb -batch -ex run -ex bt --args ./testrunner
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[==========] Running 90 tests from 11 test cases.
[----------] Global test environment set-up.
[----------] 7 tests from FourSuite
[ RUN      ] FourSuite.AbsolutizeTestCases
[       OK ] FourSuite.AbsolutizeTestCases (1 ms)
[ RUN      ] FourSuite.RelativizeTestCases
[       OK ] FourSuite.RelativizeTestCases (0 ms)
[ RUN      ] FourSuite.GoodUriReferences
[       OK ] FourSuite.GoodUriReferences (0 ms)
[ RUN      ] FourSuite.BadUriReferences
[       OK ] FourSuite.BadUriReferences (0 ms)
[ RUN      ] FourSuite.CaseNormalizationTests
[       OK ] FourSuite.CaseNormalizationTests (0 ms)
[ RUN      ] FourSuite.PctEncNormalizationTests
[       OK ] FourSuite.PctEncNormalizationTests (0 ms)
[ RUN      ] FourSuite.PathSegmentNormalizationTests
[       OK ] FourSuite.PathSegmentNormalizationTests (0 ms)
[----------] 7 tests from FourSuite (1 ms total)

[----------] 2 tests from MemoryManagerCompletenessSuite
[ RUN      ] MemoryManagerCompletenessSuite.AllFunctionMembersRequired
munmap_chunk(): invalid pointer

Program received signal SIGABRT, Aborted.
0x00007ffff7a18b1b in raise () from /lib64/libc.so.6
#0  0x00007ffff7a18b1b in raise () from /lib64/libc.so.6
#1  0x00007ffff7a02535 in abort () from /lib64/libc.so.6
#2  0x00007ffff7a5e699 in __libc_message () from /lib64/libc.so.6
#3  0x00007ffff7a664a8 in malloc_printerr () from /lib64/libc.so.6
#4  0x00007ffff7a667f4 in munmap_chunk () from /lib64/libc.so.6
#5  0x00005555555cea17 in uriDefaultFree ()
#6  0x00005555555d56e8 in uriFreeUriMembersMmA ()
#7  0x000055555557cea2 in MemoryManagerCompletenessSuite_AllFunctionMembersRequired_Test::TestBody() ()
#8  0x00007ffff7f617fa in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) () from /usr/lib64/libgtest.so
#9  0x00007ffff7f56bba in testing::Test::Run() () from /usr/lib64/libgtest.so
#10 0x00007ffff7f56d08 in testing::TestInfo::Run() () from /usr/lib64/libgtest.so
#11 0x00007ffff7f56de5 in testing::TestCase::Run() () from /usr/lib64/libgtest.so
#12 0x00007ffff7f5729c in testing::internal::UnitTestImpl::RunAllTests() () from /usr/lib64/libgtest.so
#13 0x00007ffff7f61d0a in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) () from /usr/lib64/libgtest.so
#14 0x00007ffff7f57448 in testing::UnitTest::Run() () from /usr/lib64/libgtest.so
#15 0x00005555555c34d7 in RUN_ALL_TESTS() ()
#16 0x00005555555bf36d in main ()

pathHead is null after parsing "http:/"

On versions 0.8.4 and 0.8.5, the following code will fail on the second assert:

    const char* url = "http:/";
    UriParserStateA state;
    UriUriA uriStruct;
    state.uri = &uriStruct;
    int err = uriParseUriA(&state, url);
    assert(err == URI_SUCCESS);
    assert(uriStruct.pathHead != NULL); // this fails

I'm expecting the pathHead to be non-null, because the input string matches the (scheme ":" path-absolute) form of the URI grammar. Also, I'm expecting the first path-segment to be an empty string.

If I change the input string to "http:/path" or to "http:///", the above test passes.

Backport NDEBUG fix to 0.9.3

The NDEBUG issue is breaking for me, as I'm trying to update uriparser on Homebrew, but I do not want to disable tests. Could this be backported to a new tag so I can still work off of releases? If not, when would 0.9.4 be coming?

Process for reporting security bugs

Hi Uriparser team,

As part of our fuzzing efforts at Google, we are interested in understanding the process for reporting potential security issues to your project in a private manner. Could you please advise us if there is a private tracker for these kinds of bugs, or if you prefer them filed in a publicly visible way?

Thanks!

Parser will not identify absolute URLs correctly

This ticket is meant to copy bug 30 at SourceForge reported by Stefan Radomski (@sradomski).


Following snippet of code fails with version 0.8.4, despite given URL being absolute.

const char* url = "http://www.heise.de/index.html";
UriParserStateA state;
UriUriA uriStruct;
state.uri = &uriStruct;
int err = uriParseUriA(&state, url);
assert(err == URI_SUCCESS);
assert(uriStruct.absolutePath);

macOS: Configure does not work on Android with NDK 16 (clang)

I tried to run configure to cross-compile for Android and I get the following:

checking for a BSD-compatible install... /usr/local/bin/ginstall -c
checking whether build environment is sane... yes
checking for armv7a-linux-android-strip... no
checking for strip... strip
checking for a thread-safe mkdir -p... /usr/local/bin/gmkdir -p
checking for gawk... no
checking for mawk... no
checking for nawk... no
checking for awk... awk
checking whether make sets $(MAKE)... yes
checking whether make supports nested variables... yes
checking for armv7a-linux-android-gcc... NDK_ROOT/toolchains/llvm/prebuilt/darwin-x86_64/bin/clang
checking whether the C compiler works... no
configure: error: in `.../openFrameworks/scripts/apothecary/apothecary/build/uriparser':
configure: error: C compiler cannot create executables
See `config.log' for more details

The problem is that when checking the compiler, configure tries to pass the CFLAGS to the linker and the "--sysroot" argument is not accepted by clang when linking:

configure:3253: checking whether the C compiler works
configure:3275: NDK_ROOT/toolchains/llvm/prebuilt/darwin-x86_64/bin/clang -nostdlib --sysroot=NDK_ROOT/sysroot -fno-short-enums .... >&5
ld: unknown option: --sysroot=NDK_ROOT/sysroot
clang: error: linker command failed with exit code 1 (use -v to see invocation)

uriParseUriA returns URI_ERROR_SYNTAX for a valid uri?

Hi there!

I have the following sample code:

UriParserStateA state; 
UriUriA uri;
state.uri = &uri;
int res = uriParseUriA(&state, R"(http://counter.yadro.ru/hit;PLUSO?q;r;s1366*768*24;uhttp%3A//masterok-kirillovka.com.ua/rooms/;h%u041E%u043F%u0438%u0441%u0430%u043D%u0438%u0435%20%u043D%u043E%u043C%u0435%u0440%u043E%u0432%20%u043D%u0430%20%u0431%u0430%u0437%u0435%20%u043E%u0442%u0434%u044B%u0445%u0430%20%u041C%u0410%u0421%u0422%u0415%u0420%u041E%u041A%20%u0432%20%u041A%u0438%u0440%u0438%u043B%u043B%u043E%u0432%u043A%u0435;1)");
uriFreeUriMembersA(&uri);

I used a C++ raw string literal (the R"(...)" delimiter) here, but you get the idea, I guess. The value of res is URI_ERROR_SYNTAX; however, the URL is valid and working.

Could you please fix it or explain the reason?

Thank you!
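One likely reason, going by RFC 3986 itself: a percent-encoded octet must be "%" followed by exactly two hexadecimal digits, so sequences such as %u041E do not match the grammar even though many browsers and servers tolerate them. The following is a minimal sketch of a pre-check for that; the function name find_invalid_percent_escape is hypothetical and is not part of uriparser.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper (not a uriparser function): returns a pointer to the
 * first '%' that is not followed by two hexadecimal digits, or NULL if every
 * percent escape in the string has the "%" HEXDIG HEXDIG form. */
static const char *find_invalid_percent_escape(const char *text) {
    const char *p;

    for (p = text; *p != '\0'; p++) {
        if (*p != '%') {
            continue;
        }
        /* Short-circuit evaluation keeps us from reading past the NUL. */
        if (!isxdigit((unsigned char)p[1]) || !isxdigit((unsigned char)p[2])) {
            return p;
        }
        p += 2; /* skip the two digits just validated */
    }
    return NULL;
}
```

Applied to the URL above, the check would stop at the first "%u", which is consistent with a syntax error being reported for that input.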

Release uriparser 0.9.2

  • Changes in code:
    • Bump soname
    • Make sure that all change log items point to GitHub issues and/or pull requests
    • Bump version everywhere
    • Update web-wide unique release identifier
  • On SourceForge:
    • Upload files
    • Mark new uploads as default downloads (.zip for Windows, .tar.bz2 all else)
  • On GitHub:
    • Add and push Git tag
    • Make a new release, upload files
    • Associate GitHub issues and pull requests with the milestone of this release
    • Close milestone on GitHub
  • Update API docs at uriparser website — TODO_LINK
  • Announce release
    • uriparser website — TODO_LINK
    • uriparser-users — TODO_LINK
    • blog.hartwork.org — TODO_LINK
      • Hacker news — TODO_LINK
    • direct mail to distro maintainers
  • Bump Gentoo ebuild

Specific to 0.9.2:

  • Close issue #57

Uriparser uses malloc() directly (custom allocator support)

Currently uriparser uses malloc() directly. For use in some high performance systems, it would help to have an optional API call to allow overriding malloc with custom memory allocator routines. For example, see json_set_alloc_funcs() in the Jansson library. These usually take the form of malloc()-like function pointers, but it would be even more useful if the hooks take an additional void * context argument, since a custom allocator often needs access to the heap object to use.

The set_alloc_funcs() API could either take hold globally, or even better be applied on a parser state, for finer control.
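To make the request concrete, here is a minimal sketch of malloc()-like hooks that carry an extra void * context argument. All names (ExampleAllocator, ExampleStats, counting_alloc, counting_release, example_strdup) are hypothetical and only illustrate the shape of such an interface; they are not uriparser's API.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical hook table: each callback receives the caller's context. */
typedef struct ExampleAllocator {
    void *(*alloc)(void *context, size_t size);
    void (*release)(void *context, void *ptr);
    void *context;
} ExampleAllocator;

/* Hypothetical context object: counts calls so the effect is observable. */
typedef struct ExampleStats {
    size_t allocations;
    size_t frees;
} ExampleStats;

static void *counting_alloc(void *context, size_t size) {
    ExampleStats *stats = (ExampleStats *)context;
    void *ptr = malloc(size);
    if (ptr != NULL) {
        stats->allocations++;
    }
    return ptr;
}

static void counting_release(void *context, void *ptr) {
    ExampleStats *stats = (ExampleStats *)context;
    if (ptr != NULL) {
        stats->frees++;
    }
    free(ptr);
}

/* Library-side code would allocate only through the hook table. */
static char *example_strdup(const ExampleAllocator *allocator, const char *text) {
    const size_t size = strlen(text) + 1;
    char *copy = (char *)allocator->alloc(allocator->context, size);
    if (copy != NULL) {
        memcpy(copy, text, size);
    }
    return copy;
}
```

Passing the hook table per call (or storing it on the parser state) rather than setting it globally corresponds to the finer-grained option mentioned above.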

Integrate Clang Code formatting checks into CI

May need post-trusty OS images for recent Clang, hence a move from Travis to CircleCI(?), a check by CI and mass-application to current code at last.

The following Travis CI repos could be of use, if we stick to Ubuntu trusty:

  • llvm-toolchain-trusty-6.0
  • llvm-toolchain-trusty-7
  • ubuntu-toolchain-r-test

Please get in touch about details before starting work on this matter. Thanks!

MinGW does not have function asprintf

environment: mingw
Error information: https://ci.appveyor.com/project/KangLin/uriparser/builds/20608226/job/jwmnthamfvohopxy

i686-w64-mingw32-g++ -DHAVE_CONFIG_H -I. -I..   -I../include  -DGTEST_HAS_PTHREAD=0 -isystem C:/projects/uriparser/install-gtest/include -g -O2 -MT test/uriparser_test-FourSuite.o -MD -MP -MF test/.deps/uriparser_test-FourSuite.Tpo -c -o test/uriparser_test-FourSuite.o `test -f 'test/FourSuite.cpp' || echo '../'`test/FourSuite.cpp
i686-w64-mingw32-g++ -DHAVE_CONFIG_H -I. -I..   -I../include  -DGTEST_HAS_PTHREAD=0 -isystem C:/projects/uriparser/install-gtest/include -g -O2 -MT test/uriparser_test-MemoryManagerSuite.o -MD -MP -MF test/.deps/uriparser_test-MemoryManagerSuite.Tpo -c -o test/uriparser_test-MemoryManagerSuite.o `test -f 'test/MemoryManagerSuite.cpp' || echo '../'`test/MemoryManagerSuite.cpp
mv -f test/.deps/uriparser_test-MemoryManagerSuite.Tpo test/.deps/uriparser_test-MemoryManagerSuite.Po
i686-w64-mingw32-g++ -DHAVE_CONFIG_H -I. -I..   -I../include  -DGTEST_HAS_PTHREAD=0 -isystem C:/projects/uriparser/install-gtest/include -g -O2 -MT test/uriparser_test-VersionSuite.o -MD -MP -MF test/.deps/uriparser_test-VersionSuite.Tpo -c -o test/uriparser_test-VersionSuite.o `test -f 'test/VersionSuite.cpp' || echo '../'`test/VersionSuite.cpp
../test/VersionSuite.cpp: In member function 'virtual void VersionSuite_EnsureVersionDefinesInSync_Test::TestBody()':
../test/VersionSuite.cpp:36:70: error: 'asprintf' was not declared in this scope
    URI_VER_MAJOR, URI_VER_MINOR, URI_VER_RELEASE, URI_VER_SUFFIX_ANSI);
                                                                      ^
make[2]: *** [Makefile:1062: test/uriparser_test-VersionSuite.o] Error 1
make[2]: *** Waiting for unfinished jobs....
mv -f test/.deps/uriparser_test-FourSuite.Tpo test/.deps/uriparser_test-FourSuite.Po
make[2]: Leaving directory '/c/projects/uriparser/build-automake'
make[1]: *** [Makefile:1601: check-am] Error 2

Move -DURI_NO_ANSI and -DURI_NO_UNICODE to config.h?

Related to #47

Potentially, we could use two files:

  • a private one for things like HAVE_REALLOC that matter during compilation
  • a public one that gets installed and has things like URI_NO_ANSI that matter during and after compilation

Incomplete test coverage for UriFile.c

I redid the tests for UriFile.c for my own use in GoogleTest. In doing so, I noticed that test.cpp doesn't fully cover the compilation unit. You are welcome to take any of my test code and adapt for uriparser. If you aren't interested in the coverage, feel free to close the issue.

The checks with CheckUriUriStringToWindowsFilenameA are a bit chaotic and could use some improvement. The code is only tested with ASCII input as I don't build with wchar_t support.

I do one test file for each compilation unit rather than a catch all test.cpp.

Code is copyright Google and donated to uriparser under the uriparser license.

Here is what I have:

// Test UriFile.c
//
// A = ASCII
//
// Many of these test examples show very strange behavior.

#include <cstddef>
#include <cstring>
#include <vector>

#include <gtest/gtest.h>
#include "uriparser/Uri.h"
#include "uriparser/UriBase.h"

namespace {

TEST(UriUnixFilenameToUriStringA, Nullptr) {
  const char uri[] = "/bin/bash";
  std::vector<char> buf(10);

  ASSERT_EQ(URI_ERROR_NULL, uriUnixFilenameToUriStringA(nullptr, &buf[0]));
  ASSERT_EQ(URI_ERROR_NULL, uriUnixFilenameToUriStringA(uri, nullptr));
  ASSERT_EQ(URI_ERROR_NULL, uriUnixFilenameToUriStringA(nullptr, nullptr));
}

size_t UnixFilenameToUriStringSize(const char *filename) {
  // "file://" plus each character could be expanded to a 3 character percent
  // representation and finishing up with a NUL termination sentinel.
  return 7 + 3 * strlen(filename) + 1;
}

void CheckUriUnixFilenameToUriStringA(const char *filename, const char *uri) {
  std::vector<char> buf(UnixFilenameToUriStringSize(filename));
  ASSERT_EQ(URI_SUCCESS, uriUnixFilenameToUriStringA(filename, &buf[0]));
  EXPECT_STREQ(uri, &buf[0]) << "For: \"" << filename << "\"";
}

TEST(UriUnixFilenameToUriStringA, WhiteSpace) {
  CheckUriUnixFilenameToUriStringA("", "");
  CheckUriUnixFilenameToUriStringA(" ", "%20");
  CheckUriUnixFilenameToUriStringA("a b", "a%20b");
  CheckUriUnixFilenameToUriStringA("\t", "%09");
  CheckUriUnixFilenameToUriStringA("\v", "%0B");
  CheckUriUnixFilenameToUriStringA("\n", "%0A");
  CheckUriUnixFilenameToUriStringA("\r", "%0D");
  CheckUriUnixFilenameToUriStringA("\r\n", "%0D%0A");
}

TEST(UriUnixFilenameToUriStringA, Slashes) {
  CheckUriUnixFilenameToUriStringA("/", "file:///");
  CheckUriUnixFilenameToUriStringA("/a", "file:///a");
  CheckUriUnixFilenameToUriStringA("/b/", "file:///b/");
  CheckUriUnixFilenameToUriStringA("c", "c");
  CheckUriUnixFilenameToUriStringA("d/e", "d/e");
  CheckUriUnixFilenameToUriStringA("f/", "f/");

  CheckUriUnixFilenameToUriStringA("//", "file:////");

  // DOS style backslash.
  CheckUriUnixFilenameToUriStringA("\\", "%5C");
}

TEST(UriUnixFilenameToUriStringA, Dots) {
  // "." and ".." are not interpreted in any way.
  CheckUriUnixFilenameToUriStringA(".", ".");
  CheckUriUnixFilenameToUriStringA("..", "..");
  CheckUriUnixFilenameToUriStringA("...", "...");
  CheckUriUnixFilenameToUriStringA("/.", "file:///.");
  CheckUriUnixFilenameToUriStringA("/./", "file:///./");
  CheckUriUnixFilenameToUriStringA("/..", "file:///..");
  CheckUriUnixFilenameToUriStringA("/../", "file:///../");
  CheckUriUnixFilenameToUriStringA("/../a", "file:///../a");
  CheckUriUnixFilenameToUriStringA("/../.b", "file:///../.b");
  CheckUriUnixFilenameToUriStringA("../.c", "../.c");
  CheckUriUnixFilenameToUriStringA("/.././d.e", "file:///.././d.e");
}

TEST(UriUnixFilenameToUriStringA, WrongWay) {
  CheckUriUnixFilenameToUriStringA("file://", "file%3A//");
}

size_t UriStringToUnixFilenameSize(const char *uri) {
  // Skip removing the 7.
  return strlen(uri) + 1;
}

TEST(UriUriStringToUnixFilenameA, Nullptr) {
  const char uri[] = "/a/b";
  std::vector<char> buf(UriStringToUnixFilenameSize(uri));
  EXPECT_EQ(URI_ERROR_NULL, uriUriStringToUnixFilenameA(nullptr, &buf[0]));
  EXPECT_EQ(URI_ERROR_NULL, uriUriStringToUnixFilenameA(uri, nullptr));
  EXPECT_EQ(URI_ERROR_NULL, uriUriStringToUnixFilenameA(nullptr, nullptr));
}

void CheckUriUriStringToUnixFilenameA(const char *uri, const char *filename) {
  std::vector<char> buf(UriStringToUnixFilenameSize(uri));
  ASSERT_EQ(URI_SUCCESS, uriUriStringToUnixFilenameA(uri, &buf[0]));
  EXPECT_STREQ(filename, &buf[0]) << uri;
}

TEST(UriUriStringToUnixFilenameA, NoFile) {
  CheckUriUriStringToUnixFilenameA("", "");
  CheckUriUriStringToUnixFilenameA("a", "a");
  CheckUriUriStringToUnixFilenameA("%0A", "\n");
}

TEST(UriUriStringToUnixFilenameA, WithFile) {
  CheckUriUriStringToUnixFilenameA("file://", "");
  CheckUriUriStringToUnixFilenameA("file:///", "/");
  CheckUriUriStringToUnixFilenameA("file://a", "a");
  CheckUriUriStringToUnixFilenameA("file:///b", "/b");
  CheckUriUriStringToUnixFilenameA("file:///c/", "/c/");
  CheckUriUriStringToUnixFilenameA("file:///d/e", "/d/e");
}

TEST(UriUriStringToUnixFilenameA, Dots) {
  CheckUriUriStringToUnixFilenameA("file://.", ".");
  CheckUriUriStringToUnixFilenameA("file://..", "..");
  CheckUriUriStringToUnixFilenameA("file:///.", "/.");
  CheckUriUriStringToUnixFilenameA("file:///..", "/..");
}

TEST(UriWindowsFilenameToUriStringA, Nullptr) {
  const char uri[] = "c:/foo";
  std::vector<char> buf(10);

  ASSERT_EQ(URI_ERROR_NULL, uriWindowsFilenameToUriStringA(nullptr, &buf[0]));
  ASSERT_EQ(URI_ERROR_NULL, uriWindowsFilenameToUriStringA(uri, nullptr));
  ASSERT_EQ(URI_ERROR_NULL, uriWindowsFilenameToUriStringA(nullptr, nullptr));
}

size_t WindowsFilenameToUriStringSize(const char *filename) {
  // "file:///" plus each character could be expanded to a 3 character percent
  // representation and finishing up with a NUL termination sentinel.
  return 8 + 3 * strlen(filename) + 1;
}

void CheckUriWindowsFilenameToUriStringA(const char *filename,
                                         const char *uri) {
  std::vector<char> buf(WindowsFilenameToUriStringSize(filename));
  ASSERT_EQ(URI_SUCCESS, uriWindowsFilenameToUriStringA(filename, &buf[0]));
  EXPECT_STREQ(uri, &buf[0]) << "For: \"" << filename << "\"";
}

TEST(UriWindowsFilenameToUriStringA, WhiteSpace) {
  CheckUriWindowsFilenameToUriStringA("", "");
  CheckUriWindowsFilenameToUriStringA(" ", "%20");
  CheckUriWindowsFilenameToUriStringA("a b", "a%20b");
  CheckUriWindowsFilenameToUriStringA("\t", "%09");
  CheckUriWindowsFilenameToUriStringA("\v", "%0B");
  CheckUriWindowsFilenameToUriStringA("\n", "%0A");
  CheckUriWindowsFilenameToUriStringA("\r", "%0D");
  CheckUriWindowsFilenameToUriStringA("\r\n", "%0D%0A");
}

TEST(UriWindowsFilenameToUriStringA, Slashes) {
  // DOS style backslash.
  CheckUriWindowsFilenameToUriStringA("\\", "/");
  CheckUriWindowsFilenameToUriStringA("\\\\", "file://");  // ?
  CheckUriWindowsFilenameToUriStringA("a:\\", "file:///a:/");
  CheckUriWindowsFilenameToUriStringA("b:\\c", "file:///b:/c");

  CheckUriWindowsFilenameToUriStringA("a:\\\\", "file:///a://");

  // Unix
  CheckUriWindowsFilenameToUriStringA("/", "%2F");
  CheckUriWindowsFilenameToUriStringA("/a", "%2Fa");
  CheckUriWindowsFilenameToUriStringA("/b/", "%2Fb%2F");
  CheckUriWindowsFilenameToUriStringA("c", "c");
  CheckUriWindowsFilenameToUriStringA("d/e", "d%2Fe");
  CheckUriWindowsFilenameToUriStringA("f/", "f%2F");
}

size_t UriStringToWindowsFilenameSize(const char *uri) {
  // Skip removing the 5.
  return strlen(uri) + 1;
}

TEST(UriUriStringToWindowsFilenameA, Nullptr) {
  const char uri[] = "/a/b";
  std::vector<char> buf(UriStringToWindowsFilenameSize(uri));
  EXPECT_EQ(URI_ERROR_NULL, uriUriStringToWindowsFilenameA(nullptr, &buf[0]));
  EXPECT_EQ(URI_ERROR_NULL, uriUriStringToWindowsFilenameA(uri, nullptr));
  EXPECT_EQ(URI_ERROR_NULL, uriUriStringToWindowsFilenameA(nullptr, nullptr));
}

void CheckUriUriStringToWindowsFilenameA(const char *uri,
                                         const char *filename) {
  std::vector<char> buf(UriStringToWindowsFilenameSize(uri));
  ASSERT_EQ(URI_SUCCESS, uriUriStringToWindowsFilenameA(uri, &buf[0]));
  EXPECT_STREQ(filename, &buf[0]) << "For: \"" << uri << "\"";
}

TEST(UriUriStringToWindowsFilenameA, NoFile) {
  CheckUriUriStringToWindowsFilenameA("", "");
  CheckUriUriStringToWindowsFilenameA("a", "a");
  CheckUriUriStringToWindowsFilenameA("%0A", "\n");
}

TEST(UriUriStringToWindowsFilenameA, WithFile) {
  CheckUriUriStringToWindowsFilenameA("file://", "\\\\");
  CheckUriUriStringToWindowsFilenameA("file:///", "");
  CheckUriUriStringToWindowsFilenameA("file://a", "\\\\a");
  CheckUriUriStringToWindowsFilenameA("file:///b", "b");
  CheckUriUriStringToWindowsFilenameA("file:///c/", "c\\");
  CheckUriUriStringToWindowsFilenameA("file:///d/e", "d\\e");
}

TEST(UriUriStringToWindowsFilenameA, Dots) {
  CheckUriUriStringToWindowsFilenameA("file://.", "\\\\.");
  CheckUriUriStringToWindowsFilenameA("file://..", "\\\\..");
  CheckUriUriStringToWindowsFilenameA("file:///.", ".");
  CheckUriUriStringToWindowsFilenameA("file:///..", "..");
}

TEST(UriUriStringToWindowsFilenameA, Slashes) {
  // DOS style backslash.
  CheckUriUriStringToWindowsFilenameA("/", "\\");
  CheckUriUriStringToWindowsFilenameA("file://", "\\\\");
  CheckUriUriStringToWindowsFilenameA("file:///a:/", "a:\\");
  CheckUriUriStringToWindowsFilenameA("file:///b:/c", "b:\\c");

  CheckUriWindowsFilenameToUriStringA("file:///a://",
                                      "file%3A%2F%2F%2Fa%3A%2F%2F");

  // Unix
  CheckUriUriStringToWindowsFilenameA("%2F", "\\");
  CheckUriUriStringToWindowsFilenameA("%2Fa", "\\a");
  CheckUriUriStringToWindowsFilenameA("%2Fb%2F", "\\b\\");
  CheckUriUriStringToWindowsFilenameA("c", "c");
  CheckUriUriStringToWindowsFilenameA("d/e", "d\\e");
  CheckUriUriStringToWindowsFilenameA("f/", "f\\");
}

}  // namespace

IPv6 hostText Display Issue

It's quite possible this is user error. I'm seeing the '[' omitted from hostText while the ']' is still present. The code comments seem to indicate that '[' and ']' shouldn't be in hostText at all, but since the output also shows the port with its ':' separator, it seems the brackets would be required. The output of parsing https://[::1]:9090 and https://127.0.0.1:9090 with uriparser can be seen below:

https://[::1]:9090
::1]:9090
::1
https://127.0.0.1:9090
127.0.0.1:9090
127.0.0.1

A simple and very boiled down chunk of code showing the problem:

#include <arpa/inet.h>
#include <uriparser/Uri.h>

int main()
{
    UriUriA u;
    const char *errorPosition;
   
    uriParseSingleUriA(&u, "https://[::1]:9090", &errorPosition);
    fprintf(stdout,"%s\n",u.scheme);
    fprintf(stdout,"%s\n",u.hostText);
    if (u.hostData.ip6)
    {   
        char ip6str[INET6_ADDRSTRLEN] = ""; 
        inet_ntop(AF_INET6, u.hostData.ip6->data, ip6str, sizeof(ip6str));
        fprintf(stdout,"%s\n",ip6str);
    }   

    uriParseSingleUriA(&u, "https://127.0.0.1:9090", &errorPosition);
    fprintf(stdout,"%s\n",u.scheme);
    fprintf(stdout,"%s\n",u.hostText);
    if (u.hostData.ip4)
    {   
        char ip4str[INET_ADDRSTRLEN] = ""; 
        inet_ntop(AF_INET, u.hostData.ip4->data, ip4str, sizeof(ip4str));
        fprintf(stdout,"%s\n",ip4str);
    }   
    return 0;
}

Note: I'm running uriparser from Fedora 29 repositories. It reports the version as 0.9.0-1.
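For context on the output above: a text range in uriparser is a pair of pointers (first and afterLast, as in the TextRange code quoted in an earlier issue) into the original string, not a NUL-terminated string, so printing it with a plain %s runs on to the end of the input. The sketch below shows one way to copy such a range out. To stay self-contained it uses a hypothetical stand-in struct, ExampleTextRange, with the same two members; copy_text_range is likewise a hypothetical helper, not a uriparser function.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a uriparser text range: two pointers into the
 * parsed string, with afterLast pointing one past the final character. */
typedef struct ExampleTextRange {
    const char *first;
    const char *afterLast;
} ExampleTextRange;

/* Copies the range into dest as a NUL-terminated string, truncating if
 * needed. Returns the number of characters copied (excluding the NUL). */
static size_t copy_text_range(const ExampleTextRange *range, char *dest,
                              size_t dest_size) {
    size_t length;

    if (dest_size == 0) {
        return 0;
    }
    if (range->first == NULL || range->afterLast == NULL
            || range->afterLast < range->first) {
        dest[0] = '\0'; /* unset or invalid range: produce an empty string */
        return 0;
    }
    length = (size_t)(range->afterLast - range->first);
    if (length > dest_size - 1) {
        length = dest_size - 1;
    }
    memcpy(dest, range->first, length);
    dest[length] = '\0';
    return length;
}
```

An alternative that avoids the copy is printf("%.*s", (int)length, range->first), which prints only the characters inside the range.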

Release uriparser 0.9.0

Specific to 0.9.0:

  • Adjust one-line project description up here regarding C89
  • Adjust project description on website regarding C89
  • Create downstream tracking ticket — #38

Possible incorrect output from `uriUriStringToUnixFilenameA()`

I have a Java program that makes some URIs and sends them to a C++ program (using uriparser) to do something with. Below is an example of what Java is giving me:

ubuntu@DEV:~$ jshell
|  Welcome to JShell -- Version 10.0.1
|  For an introduction type: /help intro

jshell> Path p = Paths.get("/home/ubuntu/")
p ==> /home/ubuntu

jshell> p.toUri()
$2 ==> file:///home/ubuntu/

jshell> p.resolve("somefolder/").toUri()
$3 ==> file:///home/ubuntu/somefolder/

jshell> p.toUri().resolve("somefolder/")
$4 ==> file:/home/ubuntu/somefolder/

I'm running under the assumption that Java is doing the right thing here. Putting those URIs into uriparse:

ubuntu@DEV:~/Desktop/urib/build$ ./uriparse file:///home/ubuntu/somefolder/
uri:          file:///home/ubuntu/somefolder/
scheme:       file
hostText:     
 .. pathSeg:  home
 .. pathSeg:  ubuntu
 .. pathSeg:  somefolder
 .. pathSeg:  
absolutePath: false
              (always false for URIs with host)
ubuntu@DEV:~/Desktop/urib/build$ ./uriparse file:/home/ubuntu/somefolder/
uri:          file:/home/ubuntu/somefolder/
scheme:       file
 .. pathSeg:  home
 .. pathSeg:  ubuntu
 .. pathSeg:  somefolder
 .. pathSeg:  
absolutePath: true

Now, for the actual issue:

If I stick those URIs into uriUriStringToUnixFilenameA(), I get:

  • file:///home/ubuntu/somefolder/ -> /home/ubuntu/somefolder/, and
  • file:/home/ubuntu/somefolder/ -> file:/home/ubuntu/somefolder/

My other code chokes on the second output as it's expecting a valid path.

Does this look like a bug in uriparser, or am I simply using it wrong?

I've attached a sample program here that shows this behaviour.

If this is indeed a bug I have a patch ready, but I'd want to be sure first.

EDIT: This is using current master (20ae776)
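Until the behavior is settled, a caller could normalize both spellings before handing the string on. The following is a minimal sketch under stated assumptions: file_uri_path is a hypothetical helper, not a uriparser function; it accepts only a lowercase "file:" scheme with no authority or an empty one, and it does not percent-decode the result.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a uriparser function): returns a pointer to the
 * absolute path inside "file:///path" or "file:/path", or NULL for anything
 * else (other schemes, non-empty authority, relative paths). */
static const char *file_uri_path(const char *uri) {
    if (strncmp(uri, "file:", 5) != 0) {
        return NULL;
    }
    uri += 5;
    if (uri[0] == '/' && uri[1] == '/') {
        /* Authority form: only the empty authority ("file:///...") is accepted. */
        uri += 2;
        return (uri[0] == '/') ? uri : NULL;
    }
    /* No authority at all: "file:/path". */
    return (uri[0] == '/') ? uri : NULL;
}
```

With this helper, both URIs from the jshell session map to the same path, /home/ubuntu/somefolder/.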

Please tag for 0.9.2

Hi, I would like to use 0.9.2, because of the cmake build availability. Could you tag the repo so I can pull this in?
