
tony19 / named-regexp

Named capture groups for Java 5/6 regular expressions

Home Page: http://tony19.github.io/named-regexp/index.html

License: Apache License 2.0

Java 99.67% Shell 0.33%
regex android java

named-regexp's Issues

Broken backward compatibility of `namedGroups()` method

Hi!

I've just found a possible bug in the library: the following snippet works with 0.2.6 and throws an exception with 0.2.7:

        final String regex = "/teamDrawer/(?<roomId>.*)";
        final String url = "/teamDrawer/12345";

        final Matcher matcher = Pattern.compile(regex).matcher(url);
        final Integer count = matcher.namedGroups().size();
        final Integer mapCount = matcher.namedGroups().get(0).size();
        final String value = matcher.namedGroups().get(0).get("roomId");

The root cause is that .namedGroups() can only be called once in version 0.2.7; any further call returns an empty list. Maybe the !matcher.hitEnd() check prevents it from filling up the list of maps again.

--

I'm using it with Kotlin, and it turned out that it was being called several times under the hood:

        Pattern.compile(pattern).matcher(url).let {
            when {
                it.matches() && it.groupCount() > 0 -> Router.Parameters.Match(it.namedGroups().firstOrNull() ?: mapOf())
                ... -> ...
            }
        }
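
For illustration, the "only callable once" symptom is what a stateful find() loop produces when the matcher is not rewound between calls. The sketch below uses java.util.regex (Java 7+ named groups), not the library's own classes, and assumes that namedGroups() iterates with find() internally; `FindLoopExhaustion` and `collect` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FindLoopExhaustion {
    // Hypothetical helper: walks all matches with find() and collects one named group.
    static List<String> collect(Matcher m, String groupName, boolean resetFirst) {
        if (resetFirst) {
            m.reset(); // rewind so that repeated calls see the same matches
        }
        List<String> values = new ArrayList<String>();
        while (m.find()) {
            values.add(m.group(groupName));
        }
        return values;
    }

    public static void main(String[] args) {
        Matcher m = Pattern.compile("/teamDrawer/(?<roomId>.*)").matcher("/teamDrawer/12345");
        System.out.println(collect(m, "roomId", false)); // first pass finds the match
        System.out.println(collect(m, "roomId", false)); // matcher is exhausted: empty list
        System.out.println(collect(m, "roomId", true));  // reset() makes the call repeatable
    }
}
```

If the library behaves the same way, caching the first result of namedGroups() on the caller's side would be one possible workaround until a fix is released.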

Up to date release

There is no release available that includes your latest commits, e.g. the namedGroups() fix for when groupNames() is empty.

Getting runtime crash in release build

java.lang.RuntimeException: An error occured while executing doInBackground()
at android.os.AsyncTask$3.done(AsyncTask.java:304)
at java.util.concurrent.FutureTask.finishCompletion(FutureTask.java:355)
at java.util.concurrent.FutureTask.setException(FutureTask.java:222)
at java.util.concurrent.FutureTask.run(FutureTask.java:242)
at android.os.AsyncTask$SerialExecutor$1.run(AsyncTask.java:231)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1112)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:587)
at java.lang.Thread.run(Thread.java:818)
Caused by: java.lang.NoClassDefFoundError: Failed resolution of: Lcom/google/code/regexp/Pattern;
at com.android.beewisesdk.PopulateMessagesAsyncTask.populate_messages(BW.java:880)
at com.android.beewisesdk.PopulateMessagesAsyncTask.doInBackground(BW.java:320)
at com.android.beewisesdk.PopulateMessagesAsyncTask.doInBackground(BW.java:251)
at android.os.AsyncTask$2.call(AsyncTask.java:292)
at java.util.concurrent.FutureTask.run(FutureTask.java:237)
at android.os.AsyncTask$SerialExecutor$1.run(AsyncTask.java:231) 
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1112) 
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:587) 
at java.lang.Thread.run(Thread.java:818) 
Caused by: java.lang.ClassNotFoundException: Didn't find class "com.google.code.regexp.Pattern" on path: DexPathList[[zip file "/data/app/com.citrus.citruspay-1/base.apk"],nativeLibraryDirectories=[/data/app/com.citrus.citruspay-1/lib/arm, /vendor/lib, /system/lib]]
at dalvik.system.BaseDexClassLoader.findClass(BaseDexClassLoader.java:56)
at java.lang.ClassLoader.loadClass(ClassLoader.java:511)
at java.lang.ClassLoader.loadClass(ClassLoader.java:469)

Runs out of memory, when no group is present in the regex

Matcher.namedGroups() is faulty: when I have no groups in my pattern, the while loop never exits and causes an out-of-memory error.

Details: nextIndex is only updated in the for loop, but when parentPattern.groupNames() is empty, that loop is never entered.

My use case is dynamic regexes (I'm writing a regex-capable router that extracts parameters from URLs), so the patterns may or may not contain groups. Without your lib I cannot determine whether a pattern has groups, so I cannot fall back to the built-in regex when there are none.

Edit: I just discovered groupCount(), but it would still be nice if that check weren't needed.
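
A minimal sketch of a collection loop whose progress does not depend on the group names, written against java.util.regex rather than the library's internals. `NamedGroupCollector`, `collect`, and the explicit `groupNames` parameter are assumptions made for illustration; the library's real loop may look different.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class NamedGroupCollector {
    // Hypothetical stand-in for namedGroups(): one map per match, keyed by group name.
    static List<Map<String, String>> collect(Pattern pattern, List<String> groupNames, String input) {
        List<Map<String, String>> result = new ArrayList<Map<String, String>>();
        Matcher m = pattern.matcher(input);
        int nextIndex = 0;
        // find(int) throws if the start index exceeds the input length, so guard it.
        while (nextIndex <= input.length() && m.find(nextIndex)) {
            Map<String, String> groups = new LinkedHashMap<String, String>();
            for (String name : groupNames) {
                groups.put(name, m.group(name)); // null when the group did not participate
            }
            result.add(groups);
            // Advance outside the name loop so that an empty groupNames list cannot
            // stall the search; step past empty matches explicitly.
            nextIndex = (m.end() > m.start()) ? m.end() : m.end() + 1;
        }
        return result;
    }
}
```

With this shape, a pattern without groups simply yields one empty map per match, and a pattern that can match the empty string still terminates because nextIndex grows on every iteration.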

Compiler error with Java 21

Describe the bug

The project fails to compile at the Java 21 language level.

Reproduction

JAVA_HOME=<path to JDK 21> mvn install

Logs

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.12.1:compile (default-compile) on project named-regexp: Compilation failure: Compilation failure:
[ERROR] ./named-regexp/src/main/java/com/google/code/regexp/Matcher.java:[314,38] namedGroups() in com.google.code.regexp.Matcher cannot implement namedGroups() in java.util.regex.MatchResult
[ERROR]   return type java.util.List<java.util.Map<java.lang.String,java.lang.String>> is not compatible with java.util.Map<java.lang.String,java.lang.Integer>
[ERROR] ./named-regexp/src/main/java/com/google/code/regexp/MatchResult.java:[44,38] namedGroups() in com.google.code.regexp.MatchResult clashes with namedGroups() in java.util.regex.MatchResult
[ERROR]   return type java.util.List<java.util.Map<java.lang.String,java.lang.String>> is not compatible with java.util.Map<java.lang.String,java.lang.Integer>

Validations

  • Follow our Code of Conduct
  • Read the Contributing Guide.
  • Check that there isn't already an issue that reports the same bug to avoid creating a duplicate.
  • Check that this is a concrete bug. For Q&A, please open a GitHub Discussion instead.
  • The provided reproduction is a minimal reproducible of the bug.

Named group index is incorrect after parens in a character class

I was going to log this bug on your Jira, but the Jira captcha is broken for new signups (https://jira.atlassian.com/browse/JRA-34421).

Parentheses in a character class don't need to be escaped in a regex string. The Pattern#countOpenParens function does not account for this and counts parentheses in a character class as groups. This throws the index off for all named groups following the character class.

Here is a unit test that will fail due to this:

@Test
public void testIndexOfNamedGroupAfterNonEscapedParenInCharacterClass() {
    Pattern p = Pattern.compile("(a)(?<foo>[()])(?:c)(?<named>x)");
    assertEquals(2, p.indexOf("named"));
}
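
One way such a count could skip character classes is sketched below. This is not the library's countOpenParens; `GroupCounter` and its methods are hypothetical, and the scan deliberately ignores \Q...\E quoting and a literal ']' placed first in a class.

```java
final class GroupCounter {
    // Counts capturing groups in a regex, ignoring escaped characters and
    // anything inside a character class such as [()].
    static int countCapturingGroups(String regex) {
        int count = 0;
        int classDepth = 0; // > 0 while inside [...] (classes may nest in Java)
        for (int i = 0; i < regex.length(); i++) {
            char c = regex.charAt(i);
            if (c == '\\') {
                i++; // skip the escaped character
            } else if (c == '[') {
                classDepth++;
            } else if (c == ']' && classDepth > 0) {
                classDepth--;
            } else if (c == '(' && classDepth == 0 && isCapturing(regex, i)) {
                count++;
            }
        }
        return count;
    }

    // "(" and "(?<name>" capture; "(?:", "(?=", "(?<=", "(?<!" and similar do not.
    static boolean isCapturing(String regex, int open) {
        if (open + 1 >= regex.length() || regex.charAt(open + 1) != '?') {
            return true;
        }
        return open + 3 < regex.length()
                && regex.charAt(open + 2) == '<'
                && regex.charAt(open + 3) != '='
                && regex.charAt(open + 3) != '!';
    }
}
```

Counting only the part of the pattern that precedes `(?<named>` in the test above gives 2, which is the index the test expects.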

Exposing java.util.regex.Pattern.CASE_INSENSITIVE as Pattern.CASE_INSENSITIVE

Hi,

I noticed that the wrapper Pattern class doesn't expose the constants defined in the java.util.regex.Pattern class.

So we end up writing code like below:

Pattern pattern = Pattern.compile(patternString, java.util.regex.Pattern.CASE_INSENSITIVE);

Can we expose those constants through the Pattern class itself, like this:

Pattern pattern = Pattern.compile(patternString, Pattern.CASE_INSENSITIVE);

In the above, Pattern is com.google.code.regexp.Pattern.
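
A minimal sketch of how the flags could be re-exported. `PatternFlags` is a hypothetical holder class used here for illustration; in the library the constants would presumably live on its own Pattern class. The values are taken directly from java.util.regex.Pattern, so they stay interchangeable with the JDK flags.

```java
final class PatternFlags {
    private PatternFlags() {
    }

    // Each constant aliases the JDK value, so either spelling can be passed to compile().
    static final int UNIX_LINES = java.util.regex.Pattern.UNIX_LINES;
    static final int CASE_INSENSITIVE = java.util.regex.Pattern.CASE_INSENSITIVE;
    static final int COMMENTS = java.util.regex.Pattern.COMMENTS;
    static final int MULTILINE = java.util.regex.Pattern.MULTILINE;
    static final int LITERAL = java.util.regex.Pattern.LITERAL;
    static final int DOTALL = java.util.regex.Pattern.DOTALL;
    static final int UNICODE_CASE = java.util.regex.Pattern.UNICODE_CASE;
    static final int CANON_EQ = java.util.regex.Pattern.CANON_EQ;

    public static void main(String[] args) {
        java.util.regex.Pattern p =
                java.util.regex.Pattern.compile("abc", PatternFlags.CASE_INSENSITIVE);
        System.out.println(p.matcher("ABC").matches());
    }
}
```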

Dependency Dashboard

This issue lists Renovate updates and detected dependencies. Read the Dependency Dashboard docs to learn more.

This repository currently has no open or pending branches.

Detected dependencies

github-actions
.github/workflows/ci.yml
  • actions/checkout v4@1d96c772d19495a3b5c517cd2bc0cb401ea0529f
  • actions/setup-java v4
maven
pom.xml
  • junit:junit 4.13.2
  • org.apache.maven.plugins:maven-compiler-plugin 3.13.0
  • org.sonatype.plugins:nexus-staging-maven-plugin 1.6.13
  • org.apache.maven.plugins:maven-gpg-plugin 3.2.4
  • org.apache.maven.plugins:maven-javadoc-plugin 3.6.3
  • org.apache.maven.plugins:maven-gpg-plugin 3.2.4
  • org.apache.maven.plugins:maven-source-plugin 3.3.1
  • org.jacoco:jacoco-maven-plugin 0.8.12
  • org.apache.maven.plugins:maven-site-plugin 3.12.1
  • org.jacoco:jacoco-maven-plugin 0.8.12
  • org.apache.maven.plugins:maven-jxr-plugin 3.3.2
  • org.codehaus.mojo:findbugs-maven-plugin 3.0.5
  • org.apache.maven.plugins:maven-pmd-plugin 3.21.2
  • org.apache.maven.plugins:maven-checkstyle-plugin 3.3.1


Escaped characters get "un-escaped" when using Matcher.appendReplacement

I don't know whether this is still maintained, but there's a bug in Matcher.appendReplacement that removes the escaping of characters like "\s". Here's a code example:

Pattern pattern = Pattern.compile("[0-9]");
Matcher matcher = pattern.matcher("12345");
StringBuffer buffer = new StringBuffer();
while (matcher.find()) {
    matcher.appendReplacement(buffer, matcher.group() + "\\s");
}
matcher.appendTail(buffer);
System.out.println(buffer);

This should print "1\s2\s3\s4\s5\s" but prints "1s2s3s4s5s".
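
For comparison, java.util.regex.Matcher.appendReplacement also treats a backslash in the replacement string as an escape character, so the same input yields "1s2s3s4s5s" there as well. In the JDK, Matcher.quoteReplacement is the usual way to keep the backslash literal. The sketch below uses only java.util.regex; `ReplacementEscaping` and its method are hypothetical names, and whether the library is meant to mirror the JDK here is not stated in this issue.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ReplacementEscaping {
    // Appends suffix after every digit; quote=true keeps '\' and '$' literal.
    static String appendAfterEachDigit(String input, String suffix, boolean quote) {
        Matcher matcher = Pattern.compile("[0-9]").matcher(input);
        StringBuffer buffer = new StringBuffer();
        while (matcher.find()) {
            String replacement = matcher.group() + suffix;
            matcher.appendReplacement(buffer,
                    quote ? Matcher.quoteReplacement(replacement) : replacement);
        }
        matcher.appendTail(buffer);
        return buffer.toString();
    }

    public static void main(String[] args) {
        System.out.println(appendAfterEachDigit("12345", "\\s", false)); // 1s2s3s4s5s
        System.out.println(appendAfterEachDigit("12345", "\\s", true));  // 1\s2\s3\s4\s5\s
    }
}
```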

Named group index is incorrect after character class boundaries '[' and ']' in quote

When checking for parentheses in a character class, quoted '[' and ']' characters need to be excluded. Otherwise the index can be off for all named groups following the quoted characters.

Here is a unit test that will fail due to this:

@Test
public void testIndexOfNamedGroupAfterNonEscapedParenInQuote() {
    // "\\Q[\\E" quotes a literal '[', so it must not open a character class
    Pattern p = Pattern.compile("\\Q[\\E(?<foo>x)(?<bar>y)");
    assertEquals(1, p.indexOf("bar"));
}
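
One possible approach is to drop \Q...\E sections before scanning the pattern for brackets and parentheses, since quoted text can never contain groups. The sketch below is an assumption about how that could be done, not the library's code; `QuoteStripper` and `stripQuoted` are hypothetical names.

```java
final class QuoteStripper {
    // Removes \Q...\E literal sections so that brackets and parentheses inside
    // them cannot be mistaken for regex syntax by a later scanning pass.
    static String stripQuoted(String regex) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < regex.length()) {
            if (regex.startsWith("\\Q", i)) {
                int end = regex.indexOf("\\E", i + 2);
                // An unterminated \Q quotes everything up to the end of the pattern.
                i = (end < 0) ? regex.length() : end + 2;
            } else if (regex.charAt(i) == '\\' && i + 1 < regex.length()) {
                // Copy other escapes as a pair, so an escaped backslash followed
                // by 'Q' is not misread as the start of a quote.
                out.append(regex, i, i + 2);
                i += 2;
            } else {
                out.append(regex.charAt(i));
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(stripQuoted("\\Q[\\E(?<foo>x)(?<bar>y)")); // (?<foo>x)(?<bar>y)
    }
}
```

After stripping, only one group opens before `(?<bar>`, which matches the index of 1 that the test expects.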

OOME when group matches empty string

The following code snippet leads to an endless loop in matcher.namedGroups() and eventually to an OutOfMemoryError:

com.google.code.regexp.Matcher matcher = com.google.code.regexp.Pattern.compile("(?<foo>.*)").matcher("bar");
System.out.println(matcher.namedGroups());

Tested with v0.2.6

I would expect that

  1. the expression (?<foo>.*) matched the whole String bar once, because .* is greedy (as opposed to .*?)
  2. no endless loop / OOME occurs, for any given pattern
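
For comparison, a plain find() loop in java.util.regex terminates on this pattern, but it reports two matches rather than one: the greedy match "bar" and then an empty match at the end of the input. The sketch below only demonstrates that JDK behaviour; `EmptyMatchDemo` and `allMatches` are hypothetical names, and nothing here is taken from the library's implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class EmptyMatchDemo {
    // Returns the text of every match that successive find() calls report.
    static List<String> allMatches(String regex, String input) {
        List<String> matches = new ArrayList<String>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        // Prints [bar, ] : the whole input, then an empty match at index 3.
        System.out.println(allMatches("(?<foo>.*)", "bar"));
    }
}
```

A caller that wants exactly one result for the whole input can use matches() instead of iterating with find().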

Infinite loop when using Matcher.namedGroups

It seems that if the pattern matches an empty string, an infinite loop occurs when using Matcher.namedGroups, even if the matched input isn't an empty string. Here's a code example:

Pattern pattern = Pattern.compile("(?<first>A)?(?<second>B)?");
Matcher matcher = pattern.matcher("AB");
if(matcher.matches()) {
    matcher.namedGroups();
}
