nightowl888 / j2n

Java-like Components for .NET

License: Apache License 2.0

C# 99.00% PowerShell 0.98% Shell 0.01% Batchfile 0.01%
text analysis bytebuffer character atomicinteger atomiclong atomicboolean collections-java hacktoberfest linkedhashset

j2n's Introduction

J2N - Java-like Components for .NET

J2N is a library that helps bridge the gap between .NET and Java.

Our Goals

  • Java-like behaviors
  • .NET-like APIs
  • Be the de facto library to use when porting from Java to .NET
  • Provide high quality, high performance components that can be used in a wide range of .NET applications

Basically, if you are looking for a "JDK.NET", this is about as close as you can get. While we recommend using purely .NET components where possible when porting from Java, there are some Java features that have no .NET counterpart or the .NET counterpart is lacking behaviors that are not easy to reproduce without reinventing the wheel. Even if you prefer to reinvent the wheel by designing your own ".NETified" component, you may still need a Java-like component to compare your component against in tests.

That is why we created J2N. If you like this idea, please be sure to star our repository on GitHub.

Our Focus

  1. Text analysis: code points, normalizing behaviors between different "character sequence" types, tokenizing, etc.
  2. I/O: Reading and writing types in both big-endian and little-endian byte order and providing specialized behaviors for interop with Java-centric file formats.
  3. Collections: .NET's cupboard is a little bare when it comes to specialized collections, so we fill in some gaps.
  4. Equality: Compare collections for structural equality with behaviors that are specific to each collection family, and provide .NET equality comparers for other types that differ in behavior.
  5. Localization: Bridge the gap between .NET's culture-aware and Java's culture-neutral defaults.

NuGet

Install-Package J2N

Contributing

We love getting contributions! If you need something from the JDK that we don't have, this is the right place to submit it. Basically, the following are things that would be a good fit for this library:

  1. Components in the JDK that have no direct counterpart in .NET, or the counterpart is lacking features
  2. Features that make J2N easier to work with in .NET such as extension methods and adapters
  3. Features that make .NET interoperate with Java better

Building and Testing

To build the project from source, see the Building and Testing documentation.

Saying Thanks

If you find this library to be useful, please star us on GitHub and consider a sponsorship so we can continue bringing you great free tools like this one.

j2n's People

Contributors

introfog, nightowl888

j2n's Issues

Bug: Inconsistent behavior of IndexOf() and LastIndexOf()

We need our own implementations of IndexOf() and LastIndexOf() on StringBuilder because they are not provided in .NET.

The J2N implementations we have account for Ordinal and OrdinalIgnoreCase of StringComparison, but all other options call ToString() on the StringBuilder and then cascade the call to string.IndexOf() and string.LastIndexOf().

However, in the JDK and Apache Harmony, the overloads that allow you to specify startIndex/fromIndex do not throw any exceptions on the bounds. Instead, anything over Length - 1 is corrected to be equivalent to Length - 1, and anything below zero returns -1. So, this behavior is inconsistent between Java and .NET.

However, we have no way to re-implement IndexOf() and LastIndexOf() on String or ReadOnlySpan<char> so the most logical thing to do is to change the Ordinal and OrdinalIgnoreCase to throw when Length is passed so it is consistent with the rest of the .NET behavior. The only downside is that this means that any logic written in Java that is sloppy with the bounds will need to be corrected. However, we could simply provide a note in the API docs that the behavior is inconsistent with Java and consistent with .NET.

Whatever the fix, it will require a breaking behavior change and cannot be addressed without a major version bump.
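
To make the difference concrete, here is a minimal sketch of the lenient bounds handling described above, next to the strict .NET behavior. LastIndexOfLenient is a hypothetical helper written for this illustration, not a J2N API; it only covers the char overload.

```csharp
using System;

public static class LenientIndexSketch
{
    // Hypothetical helper (not part of J2N): mimics the lenient bounds
    // described above instead of throwing like string.LastIndexOf().
    public static int LastIndexOfLenient(string text, char value, int fromIndex)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));
        if (fromIndex < 0 || text.Length == 0) return -1;  // below zero: nothing to search
        int start = Math.Min(fromIndex, text.Length - 1);  // clamp to the last valid index
        return text.LastIndexOf(value, start);
    }
}
```

By contrast, calling "abcabc".LastIndexOf('b', 100) directly throws ArgumentOutOfRangeException in .NET, which is the behavior the proposed fix would standardize on.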

SIGSEGV while using LurchTable in Xamarin.iOS project

Hi! I've been attempting to use J2N LurchTable in an iOS app developed using Xamarin. The app crashes with a SIGSEGV on any iOS device.
Error: J2N.Collections.Concurrent.AddInfo``2:CreateValue <0x0006c>
However it does work as expected in the iOS simulator. I am guessing this might be a x64 (Simulator) vs ARM64 (device) issue but I can't see any reason from the source code.

J2N version: 2.0.0-beta-0002

The original crash occurred while trying to use DirectoryTaxonomyReader from the Lucene.Net project. However, I have narrowed it down to the LurchTable class. J2N version 1.00-beta-0001 crashes the same way. The un-refactored LurchTable from Lucene.Net.Support crashes as well.

Bug: StringBuilderExtensions.Reverse() has issue with reversing surrogates

The Reverse() method has an issue as follows:

https://github.com/NightOwl888/J2N/blob/v2.0.0/src/J2N/Text/StringBuilderExtensions.cs#L1153

char c1 = materializeString ? readOnlyText![i + 1] : text[i];

This line should read

char c1 = materializeString ? readOnlyText![i + 1] : text[i + 1];

Note that we are using the wrong index when not materializing the string. This only occurs when the string length is greater than 16384 characters, so we probably have no tests to catch this scenario.
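
A regression test for this needs an input longer than 16384 characters that contains a surrogate pair. The sketch below shows one way to build such an input and an independent reference reversal to compare against. Both ReverseByCodePoint and BuildLongInput are hypothetical names introduced here, not J2N members.

```csharp
using System;
using System.Text;

public static class ReverseTestSketch
{
    // Hypothetical reference implementation: reverses the text while keeping
    // each surrogate pair (high, low) in its original order.
    public static string ReverseByCodePoint(string s)
    {
        var sb = new StringBuilder(s.Length);
        for (int i = s.Length - 1; i >= 0; i--)
        {
            if (i > 0 && char.IsLowSurrogate(s[i]) && char.IsHighSurrogate(s[i - 1]))
            {
                sb.Append(s[i - 1]).Append(s[i]);
                i--; // the pair consumed two chars
            }
            else
            {
                sb.Append(s[i]);
            }
        }
        return sb.ToString();
    }

    // Builds an input longer than 16384 chars that contains a surrogate pair.
    public static string BuildLongInput()
    {
        return new string('x', 16384) + "\uD83D\uDE00" + "y";
    }
}
```

The expected string from ReverseByCodePoint(BuildLongInput()) could then be compared with the result of calling Reverse() on a StringBuilder holding the same content.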

Conflict with AsReadOnly() extensions

Newer versions of .NET include an AsReadOnly() extension method, which conflicts with the extension of the same name in this library.

My suggestion is to add a conditional compilation directive (#if) so the compiler avoids this conflict.
(This was detected while targeting lucenenet to 8.0.)
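
A minimal sketch of the conditional compilation approach follows. It assumes the conflict is with the AsReadOnly() extension on IList<T> that the BCL added in .NET 7, and it uses the SDK-defined NET7_0_OR_GREATER symbol. The class name is hypothetical, and the actual J2N signature that conflicts may differ.

```csharp
using System.Collections.Generic;
using System.Collections.ObjectModel;

public static class HypotheticalListExtensions
{
#if !NET7_0_OR_GREATER
    // Only compiled for targets where the BCL does not already provide
    // AsReadOnly() for IList<T>, so the two extensions never compete.
    public static ReadOnlyCollection<T> AsReadOnly<T>(this IList<T> list)
    {
        return new ReadOnlyCollection<T>(list);
    }
#endif
}
```

With this guard, a call such as list.AsReadOnly() binds to the library's extension on older targets and to the BCL's extension on newer ones.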

Implement SubList(), SubMap(), Head() and Tail() functionality in List<T>, SortedSet<T>, and SortedDictionary<TKey, TValue>

Java has the ability to get a view of sorted collections without allocating additional memory. Unlike LINQ, these are not cut down interfaces that are read-only, but can be edited (which modifies the original collection). .NET has such functionality, but only on SortedSet<T> with the GetViewBetween(T, T) method.

So far, the GetViewBetween(T, T) method of SortedSet<T> has been duplicated along with a second overload that accepts boolean parameters indicating whether the upper and lower bounds should be inclusive. This allows us to match the behavior of Java by making the upper bound exclusive, but still keep compatibility with .NET where both bounds are inclusive.

public virtual J2N.Collections.Generic.SortedSet<T> GetViewBetween (T lowerValue, T upperValue);
public virtual J2N.Collections.Generic.SortedSet<T> GetViewBetween (T lowerValue, bool lowerValueInclusive, T upperValue, bool upperValueInclusive);
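
As a point of reference, the editable-view behavior can be seen with the BCL's System.Collections.Generic.SortedSet<T>. This sketch uses only the BCL type (not the J2N one) and shows that both bounds are inclusive and that edits to the view write through to the original set.

```csharp
using System.Collections.Generic;

public static class SortedSetViewSketch
{
    public static SortedSet<int> Demo()
    {
        var set = new SortedSet<int> { 1, 3, 5, 7, 9 };

        // Both bounds are inclusive in .NET, so the view holds 3, 5 and 7.
        SortedSet<int> view = set.GetViewBetween(3, 7);

        view.Add(4);     // also adds 4 to the underlying set
        view.Remove(5);  // also removes 5 from the underlying set

        return set;      // now contains 1, 3, 4, 7, 9
    }
}
```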

However, we are missing several members from Java that can be useful, some of which are used by Lucene.NET. In particular, SubList is used frequently, but currently the implementation has no tests and is only partially implemented.

SortedMap<K,V> headMap(K toKey)
SortedMap<K,V> subMap(K fromKey, K toKey)
SortedMap<K,V> tailMap(K fromKey)

public List<E> subList(int fromIndex, int toIndex)

SortedSet<E> headSet(E toElement)
SortedSet<E> tailSet(E fromElement)

Proposed API

public class SortedDictionary<TKey, TValue>
{
    public virtual J2N.Collections.Generic.SortedDictionary<TKey, TValue> GetHeadView (TKey upperKey);
    public virtual J2N.Collections.Generic.SortedDictionary<TKey, TValue> GetHeadView (TKey upperKey, bool upperKeyInclusive);
    public virtual J2N.Collections.Generic.SortedDictionary<TKey, TValue> GetViewBetween (TKey lowerKey, TKey upperKey);
    public virtual J2N.Collections.Generic.SortedDictionary<TKey, TValue> GetViewBetween (TKey lowerKey, bool lowerKeyInclusive, TKey upperKey, bool upperKeyInclusive);
    public virtual J2N.Collections.Generic.SortedDictionary<TKey, TValue> GetTailView (TKey lowerKey);
    public virtual J2N.Collections.Generic.SortedDictionary<TKey, TValue> GetTailView (TKey lowerKey, bool lowerKeyInclusive);
}
public class SortedSet<T>
{
    public virtual J2N.Collections.Generic.SortedSet<T> GetHeadView (T upperValue);
    public virtual J2N.Collections.Generic.SortedSet<T> GetHeadView (T upperValue, bool upperValueInclusive);
    public virtual J2N.Collections.Generic.SortedSet<T> GetViewBetween (T lowerValue, T upperValue); // already exists
    public virtual J2N.Collections.Generic.SortedSet<T> GetViewBetween (T lowerValue, bool lowerValueInclusive, T upperValue, bool upperValueInclusive); // already exists
    public virtual J2N.Collections.Generic.SortedSet<T> GetTailView (T lowerValue);
    public virtual J2N.Collections.Generic.SortedSet<T> GetTailView (T lowerValue, bool lowerValueInclusive);
}
public class List<T>
{
    public virtual J2N.Collections.Generic.List<T> GetView (int index, int count); // completed
}

Structural Equality: Treat ImmutableArray<T> the same as Array

Since ImmutableArray<T> is yet another array implementation of IStructuralEquatable that could be compared against Array, it must be treated like an Array, not compared using IStructuralEquatable.

There are several places in the codebase where we do this:

Unfortunately, we have some challenges to deal with, since if we don't know the generic closing type of ImmutableArray<T>, we will need to do some work to materialize it before we can compare it.

Task: Add cross-OS command-line build script

Currently all building and testing is done on Azure DevOps. However, building and testing on the command line is undocumented.

While it is possible to build using dotnet build, dotnet pack, dotnet test and other commands, it would be simpler for potential contributors if there were a wrapper script to launch these commands.

The technology used for the build script is not that important as long as it runs cross-OS, but a way to get this done without adding any additional technology would simply be to make the script in MSBuild.

The build-pack-and-publish-libraries.yml file can be used as a template for the tasks in the build script.

However it is done, the README.md page should be updated with documentation on how to build/test from the command line.

Exception thrown in List<T> or IDictionary<TKey, TValue> when using interpolation

A simple interpolation attempt:

var x = new J2N.Collections.Generic.List<string>()
{
    "nothing",
    "else",
    "matters"
};
string y = $"{x}";

Yields the following exception:

  Message: 
    System.ArgumentNullException : Value cannot be null.
    Parameter name: format
  Stack Trace: 
    String.FormatHelper(IFormatProvider provider, String format, ParamsArray args)
    CollectionUtil.ToString[T](IFormatProvider provider, String format, ICollection`1 collection)
    StringBuilder.AppendFormatHelper(IFormatProvider provider, String format, ParamsArray args)
    String.FormatHelper(IFormatProvider provider, String format, ParamsArray args)
    String.Format(String format, Object arg0)
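
The stack trace suggests that a null format reaches String.Format: when an interpolation hole has no format specifier, an IFormattable argument receives null as its format. The sketch below shows one possible guard. FormattableListSketch is a hypothetical type for illustration, and treating the format as a composite format string with a "{0}" default is an assumption, not J2N's confirmed design.

```csharp
using System;
using System.Collections.Generic;

public sealed class FormattableListSketch<T> : List<T>, IFormattable
{
    public string ToString(string format, IFormatProvider formatProvider)
    {
        // Interpolation such as $"{x}" passes a null format; fall back to
        // a default instead of handing null to string.Format().
        if (format == null) format = "{0}";

        var parts = new List<string>(Count);
        foreach (T item in this)
            parts.Add(string.Format(formatProvider, format, item));
        return "[" + string.Join(", ", parts) + "]";
    }

    public override string ToString()
    {
        return ToString(null, null);
    }
}
```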

AtomicReferenceArray.ToString() doesn't use volatile reads

In addition to changing the ToString() method to use volatile reads, the ideal solution would be to provide an overload of ToString() so the end user can pass in the culture (which will be used to format each array element with AppendFormat()).
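
A rough sketch of both ideas, written against a plain array rather than the real AtomicReferenceArray<T> internals (which are not shown here). The class and method names are hypothetical, and the bracketed output format is an assumption.

```csharp
using System;
using System.Text;
using System.Threading;

public static class VolatileArrayFormattingSketch
{
    // Formats each element with the supplied provider, reading every slot
    // with Volatile.Read so the latest published reference is observed.
    public static string ToString<T>(T[] array, IFormatProvider provider) where T : class
    {
        var sb = new StringBuilder("[");
        for (int i = 0; i < array.Length; i++)
        {
            if (i > 0) sb.Append(", ");
            T item = Volatile.Read(ref array[i]);
            sb.AppendFormat(provider, "{0}", item);
        }
        return sb.Append(']').ToString();
    }
}
```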

Complete nullable reference type support

Nullable reference type support has been added for around 30% of the types so far, mostly in J2N.Collections, but there are still several types that it hasn't been added to:

  • J2N.Collections.Concurrent.LurchTable<TKey, TValue>
  • J2N.Collections.Generic.DebugView.ICollectionDebugView
  • J2N.Collections.Generic.DebugView.IDictionaryDebugView
  • J2N.Collections.Generic.LinkedDictionary<TKey, TValue>
  • J2N.IO.MemoryMappedFiles.MemoryMappedFileExtensions
  • J2N.IO.MemoryMappedFiles.MemoryMappedViewByteBuffer
  • J2N.IO.MemoryMappedFiles.ReadOnlyMemoryMappedViewByteBuffer
  • J2N.IO.MemoryMappedFiles.ReadWriteMemoryMappedViewByteBuffer
  • J2N.IO.Buffer
  • J2N.IO.BufferOverflowException
  • J2N.IO.BufferUnderflowException
  • J2N.IO.ByteBuffer
  • J2N.IO.ByteOrder
  • J2N.IO.CharArrayBuffer
  • J2N.IO.CharBuffer
  • J2N.IO.CharSequenceAdapter
  • J2N.IO.CharToByteBufferAdapter
  • J2N.IO.DataInputStream
  • J2N.IO.DataOutputStream
  • J2N.IO.DoubleArrayBuffer
  • J2N.IO.DoubleBuffer
  • J2N.IO.DoubleToByteBufferAdapter
  • J2N.IO.Endianness
  • J2N.IO.HeapByteBuffer
  • J2N.IO.IDataInput
  • J2N.IO.IDataOutput
  • J2N.IO.Int16ArrayBuffer
  • J2N.IO.Int16Buffer
  • J2N.IO.Int16ToByteBufferAdapter
  • J2N.IO.Int32ArrayBuffer
  • J2N.IO.Int32Buffer
  • J2N.IO.Int32ToByteBufferAdapter
  • J2N.IO.Int64ArrayBuffer
  • J2N.IO.Int64Buffer
  • J2N.IO.Int64ToByteBufferAdapter
  • J2N.IO.InvalidMarkException
  • J2N.IO.ReadOnlyBufferException
  • J2N.IO.ReadOnlyCharArrayBuffer
  • J2N.IO.ReadOnlyDoubleArrayBuffer
  • J2N.IO.ReadOnlyHeapByteBuffer
  • J2N.IO.ReadOnlyInt16ArrayBuffer
  • J2N.IO.ReadOnlyInt32ArrayBuffer
  • J2N.IO.ReadOnlyInt64ArrayBuffer
  • J2N.IO.ReadOnlySingleArrayBuffer
  • J2N.IO.ReadWriteCharArrayBuffer
  • J2N.IO.ReadWriteDoubleArrayBuffer
  • J2N.IO.ReadWriteHeapByteBuffer
  • J2N.IO.ReadWriteInt16ArrayBuffer
  • J2N.IO.ReadWriteInt32ArrayBuffer
  • J2N.IO.ReadWriteInt64ArrayBuffer
  • J2N.IO.ReadWriteSingleArrayBuffer
  • J2N.IO.SingleArrayBuffer
  • J2N.IO.SingleBuffer
  • J2N.IO.SingleToByteBufferAdapter
  • J2N.IO.StreamTokenizer
  • J2N.Runtime.CompilerServices.IdentityEqualityComparer<T>
  • J2N.Text.CharArrayCharSequence
  • J2N.Text.CharArrayExtensions
  • J2N.Text.CharSequenceComparer
  • J2N.Text.IAppendable
  • J2N.Text.ICharSequence
  • J2N.Text.ICharacterEnumerator
  • J2N.Text.IStructuralFormattable
  • J2N.Text.ParsePosition
  • J2N.Text.StringArrayExtensions
  • J2N.Text.StringBuffer
  • J2N.Text.StringBuilderCharSequence
  • J2N.Text.StringBuilderExtensions
  • J2N.Text.StringCharSequence
  • J2N.Text.StringCharacterEnumerator
  • J2N.Text.StringExtensions
  • J2N.Text.StringFormatter
  • J2N.Text.StringTokenizer
  • J2N.Threading.Atomic.AtomicBoolean
  • J2N.Threading.Atomic.AtomicInt32
  • J2N.Threading.Atomic.AtomicInt64
  • J2N.Threading.Atomic.AtomicReference<T>
  • J2N.Threading.Atomic.AtomicReferenceArray<T>
  • J2N.Threading.ThreadJob
  • J2N.ArrayExtensions
  • J2N.AssemblyExtensions
  • J2N.BitConversion
  • J2N.Character
  • J2N.DoubleExtensions
  • J2N.IntegralNumberExtensions
  • J2N.MathExtensions
  • J2N.PropertyExtensions
  • J2N.Randomizer
  • J2N.SingleExtensions
  • J2N.Time
  • J2N.TypeExtensions
  • J2N.TypeInfoExtensions

ArrayEqualityComparer<T> is using the wrong comparisons for float and double

This is a bug that was introduced accidentally when adding references to System.Collections.Generic. We are inadvertently using System.Collections.Generic.EqualityComparer<T> to do comparisons rather than J2N.Collections.Generic.EqualityComparer<T>, which is the same except that float, double and string use the comparison rules of Java instead of .NET.
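
To illustrate why the choice of comparer matters, here is a sketch of a double comparer that follows the rules of Java's Double.equals() (bitwise comparison with NaN canonicalized). JavaStyleDoubleComparerSketch is a hypothetical type; the exact rules used by J2N.Collections.Generic.EqualityComparer<T> are not shown in this issue, so treat this as an approximation.

```csharp
using System;
using System.Collections.Generic;

public sealed class JavaStyleDoubleComparerSketch : IEqualityComparer<double>
{
    private const long CanonicalNaN = 0x7FF8000000000000L;

    // Mirrors Java's Double.doubleToLongBits(): all NaN values share one bit pattern.
    private static long ToBits(double value)
    {
        return double.IsNaN(value) ? CanonicalNaN : BitConverter.DoubleToInt64Bits(value);
    }

    // 0.0 and -0.0 have different bit patterns, so they are NOT equal here.
    public bool Equals(double x, double y)
    {
        return ToBits(x) == ToBits(y);
    }

    public int GetHashCode(double value)
    {
        long bits = ToBits(value);
        return unchecked((int)(bits ^ (long)((ulong)bits >> 32)));
    }
}
```

System.Collections.Generic.EqualityComparer<double>.Default treats 0.0 and -0.0 as equal, so swapping one comparer for the other silently changes results for such values.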

ReadOnlyDictionary IDictionary.this[object] has inconsistent behavior with other collections that implement IDictionary

All dictionaries in System.Collections.Generic return null from IDictionary.this[object] when the key doesn't exist, however ReadOnlyDictionary throws a KeyNotFoundException. This is due to the fact that the IDictionary<TKey, TValue> instance that ReadOnlyDictionary wraps doesn't necessarily need to implement IDictionary, so the call is cascaded to IDictionary<TKey, TValue>.this[TKey], which is expected to throw KeyNotFoundException.

We need to break from the behavior in .NET and return null when the key does not exist to make this behavior consistent across all IDictionary implementations.
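
One way to get the null-returning behavior without requiring the wrapped instance to implement IDictionary is to route the lookup through TryGetValue. The helper below is a hypothetical sketch of that idea, not the actual ReadOnlyDictionary code.

```csharp
using System.Collections.Generic;

public static class DictionaryLookupSketch
{
    // Returns the boxed value for the key, or null when the key is missing
    // or is not of the dictionary's key type, instead of throwing.
    public static object GetOrNull<TKey, TValue>(IDictionary<TKey, TValue> dictionary, object key)
    {
        if (key is TKey typedKey && dictionary.TryGetValue(typedKey, out TValue value))
            return value;
        return null;
    }
}
```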

Task: J2N.Character.CodePointAt(char[], int, int): Convert limit parameter to length (a .NET convention)

This is a breaking API change that will require a major version bump.

Current API:

namespace J2N
{
    public static class Character
    {
        public static int CodePointAt(this char[] seq, int index, int limit)
    }
}

Desired API:

namespace J2N
{
    public static class Character
    {
        public static int CodePointAt(this char[] seq, int index, int length)
    }
}

Of course, the business logic needs to change to accommodate this fix.
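
The sketch below shows how the two conventions could relate. It assumes that length counts the usable characters starting at index, so that limit equals index + length; the issue does not state this, so treat it as one possible reading. Both method names are hypothetical.

```csharp
using System;

public static class CodePointSketch
{
    // Java-style: 'limit' is the index just past the last usable char.
    public static int CodePointAtWithLimit(char[] seq, int index, int limit)
    {
        if (seq == null) throw new ArgumentNullException(nameof(seq));
        if (index < 0 || index >= limit || limit > seq.Length)
            throw new ArgumentOutOfRangeException(nameof(index));

        char high = seq[index];
        if (char.IsHighSurrogate(high) && index + 1 < limit && char.IsLowSurrogate(seq[index + 1]))
            return char.ConvertToUtf32(high, seq[index + 1]);
        return high;
    }

    // .NET-style (assumed semantics): 'length' is counted from 'index'.
    public static int CodePointAtWithLength(char[] seq, int index, int length)
    {
        return CodePointAtWithLimit(seq, index, index + length);
    }
}
```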

Task: Investigate using AzurePipelines.TestLogger for direct test logging via API

Our current test logging is done to a local file on the build server via TRX format, then the TRX file is uploaded to Azure Pipelines using the PublishTestResults@2 task. This works and allows us to add a title per file upload, allowing us to add the test project name, target framework, OS, and build platform on the top level element so we can see where the test failed.

This works well, but because we have so many different test runs and the tasks have to run after the tests are completed, it takes a lot of extra time to push the test results (mostly because it runs a bunch of task conditions only to find out that the task is being skipped because it is not the current configuration).

When we were on TeamCity, the test results automatically were added to the job in real time (including counting up the statistics). This allowed us to start investigating a problem as soon as it appeared in the portal instead of waiting for the whole run to finish and the file to upload before seeing anything. This could mean the difference between seeing the problem 5 minutes into the test cycle instead of waiting 20 to 30 minutes.

The AzurePipelines.TestLogger may do the trick, but I don't think it can report the test project name, target framework, OS, and build platform. So, we may need to fork it and see if we can find a way to add that info, and possibly submit a PR to them.

However, first we need to hook it up to see what information it currently provides.

Once we have a solution for this, it can also be used on ICU4N, RandomizedTesting, Spatial4n, and Morfologik.Stemming.

Not sure it can be used on Lucene.NET, though - we would need approval and we currently don't have enough permissions to set up a new Azure DevOps pipeline, much less use a plugin for it. It would work for our "unofficial" pipelines, though.

Structural Equality: Support additional collection types that are identifiable as List, Set, or Dictionary

Currently, we only support IList<T>, ISet<T> and IDictionary<TKey, TValue> to identify collections from the BCL when comparing for structural equality. There are other types that should be included for completeness, and they should be comparable among similar interface groups of List, Set, or Dictionary.

List

  • IReadOnlyList<T>
  • IList

Set

  • IReadOnlySet<T>

Dictionary

  • IReadOnlyDictionary<TKey, TValue>
  • IDictionary
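
As a sketch of how the wider set of interfaces might be mapped onto the three families, the hypothetical detector below checks the interfaces an object implements. The names and the check order are assumptions. Note that arrays also implement IList, so a real implementation would need to handle them before reaching this step.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

public enum CollectionFamily { None, List, Set, Dictionary }

public static class CollectionFamilySketch
{
    private static readonly Type[] DictionaryTypes = { typeof(IDictionary<,>), typeof(IReadOnlyDictionary<,>) };
    private static readonly Type[] ListTypes = { typeof(IList<>), typeof(IReadOnlyList<>) };
    private static readonly Type[] SetTypes =
    {
        typeof(ISet<>),
#if NET5_0_OR_GREATER
        typeof(IReadOnlySet<>), // only available on .NET 5 and later
#endif
    };

    public static CollectionFamily Classify(object obj)
    {
        if (obj == null) return CollectionFamily.None;

        // Open generic definitions of every generic interface the object implements.
        Type[] generic = obj.GetType().GetInterfaces()
            .Where(i => i.IsGenericType)
            .Select(i => i.GetGenericTypeDefinition())
            .ToArray();

        if (obj is IDictionary || generic.Intersect(DictionaryTypes).Any()) return CollectionFamily.Dictionary;
        if (generic.Intersect(SetTypes).Any()) return CollectionFamily.Set;
        if (obj is IList || generic.Intersect(ListTypes).Any()) return CollectionFamily.List;
        return CollectionFamily.None;
    }
}
```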

Add IsCSharpIdentifier() and IsCSharpIdentifierPart() methods to Character class

Similar methods were part of the JDK implementation. The rules for how to implement these methods for C# are documented here.

The spot reserved for them to match the Apache Harmony's implementation's order is here.

See this usage example for a real-world perspective of how these methods work together to detect a valid class name.

NOTES:

  • We also should have overloads for IsJavaIdentifier and IsJavaIdentifierStart, since this library is a bridge between Java and .NET and the original implementations might come in handy.
  • The Apache Harmony Character class is 10 years old, so we should take a look at the current JavaDocs to ensure the implementation is up to date.

Note there is also a port of the Java identifier code to C# in Spatial4n which might come in useful for working out some of the more complex rules, but we should prefer the implementation style of Apache Harmony and avoid the Regex class, if possible. The Regex class documentation might come in handy for some clues about how to handle certain character classes, see Character classes in regular expressions.

Also, the Spatial4n implementation has some shortcomings:

  • I suspect it actually detects Java identifiers rather than C# identifiers because of the link over to the Javadoc, which might not be the right choice for Spatial4n
  • The Unicode support is broken - it is ignoring characters outside of the range c < 0x00d800 && c > 0x00dfff

In the latter case, the code point would have to be converted to a surrogate pair before being passed into CharUnicodeInfo.GetUnicodeCategory(), as was done in other methods of the Character class, such as GetType(). However, since the Apache Harmony implementation uses the GetType() method directly, using that example will avoid this issue.
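
A minimal sketch of what such checks could look like, based on the Unicode categories the C# language specification allows in identifiers (letters, letter numbers and underscore at the start; additionally marks, decimal digits, connector punctuation and format characters elsewhere). The class and method names are hypothetical, and this uses CharUnicodeInfo directly rather than the Character.GetType() method mentioned above.

```csharp
using System;
using System.Globalization;

public static class CSharpIdentifierSketch
{
    public static bool IsCSharpIdentifierStart(int codePoint)
    {
        if (codePoint == '_') return true;
        if (!IsValidScalar(codePoint)) return false;
        switch (GetCategory(codePoint))
        {
            case UnicodeCategory.UppercaseLetter:
            case UnicodeCategory.LowercaseLetter:
            case UnicodeCategory.TitlecaseLetter:
            case UnicodeCategory.ModifierLetter:
            case UnicodeCategory.OtherLetter:
            case UnicodeCategory.LetterNumber:
                return true;
            default:
                return false;
        }
    }

    public static bool IsCSharpIdentifierPart(int codePoint)
    {
        if (IsCSharpIdentifierStart(codePoint)) return true;
        if (!IsValidScalar(codePoint)) return false;
        switch (GetCategory(codePoint))
        {
            case UnicodeCategory.NonSpacingMark:
            case UnicodeCategory.SpacingCombiningMark:
            case UnicodeCategory.DecimalDigitNumber:
            case UnicodeCategory.ConnectorPunctuation:
            case UnicodeCategory.Format:
                return true;
            default:
                return false;
        }
    }

    // Lone surrogates and out-of-range values are never identifier characters.
    private static bool IsValidScalar(int codePoint)
    {
        return codePoint >= 0 && codePoint <= 0x10FFFF && (codePoint < 0xD800 || codePoint > 0xDFFF);
    }

    // Converts the code point to a string (a surrogate pair when needed) first.
    private static UnicodeCategory GetCategory(int codePoint)
    {
        return CharUnicodeInfo.GetUnicodeCategory(char.ConvertFromUtf32(codePoint), 0);
    }
}
```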

Create lower-level implementation of LinkedDictionary<TKey, TValue>

The LinkedDictionary<TKey, TValue> is a composite type that internally uses a LinkedList<T> and a Dictionary<TKey, TValue>. This is a stopgap to get the behavior we need, but we could achieve better performance by using a lower-level hashtable structure to manage a single collection of items rather than having 2 collections.

The main differences between Dictionary<TKey, TValue> and LinkedDictionary<TKey, TValue> are:

  • LinkedDictionary<TKey, TValue> maintains insertion order across edits
  • LinkedDictionary<TKey, TValue> allows a null key
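
For readers unfamiliar with the composite approach, here is a simplified sketch of how a linked list and a dictionary can be combined to preserve insertion order. InsertionOrderedMapSketch is a hypothetical illustration, not the actual LinkedDictionary<TKey, TValue> source, and it omits null-key support.

```csharp
using System.Collections.Generic;

public sealed class InsertionOrderedMapSketch<TKey, TValue>
{
    // The list keeps insertion order; the dictionary maps each key to its list node.
    private readonly LinkedList<KeyValuePair<TKey, TValue>> order =
        new LinkedList<KeyValuePair<TKey, TValue>>();
    private readonly Dictionary<TKey, LinkedListNode<KeyValuePair<TKey, TValue>>> index =
        new Dictionary<TKey, LinkedListNode<KeyValuePair<TKey, TValue>>>();

    public void Set(TKey key, TValue value)
    {
        var pair = new KeyValuePair<TKey, TValue>(key, value);
        if (index.TryGetValue(key, out var node))
            node.Value = pair;                 // update in place, position unchanged
        else
            index[key] = order.AddLast(pair);  // new keys go to the end
    }

    public bool Remove(TKey key)
    {
        if (!index.TryGetValue(key, out var node)) return false;
        order.Remove(node);                    // O(1) because we hold the node
        return index.Remove(key);
    }

    public IEnumerable<TKey> Keys
    {
        get { foreach (var pair in order) yield return pair.Key; }
    }
}
```

Every entry costs a dictionary slot plus a separately allocated list node, which is the overhead a single lower-level hashtable structure would avoid.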

Add Serializable support for LurchTable<TKey, TValue>

LurchTable<TKey, TValue> was not made serializable, however all of the other J2N collections are. Given the fact that we already have examples of dictionaries being serialized, this should be fairly straightforward to do. However, there may be additional state on LurchTable that needs to be handled that isn't part of other dictionaries in this library, so it needs analysis.

Accessibility of some members of J2N.Collections.BitSet does not match Apache Harmony

Some members of BitSet were inadvertently made not inheritable, when they should be marked virtual to match the accessibility of Apache Harmony. These members are:

public bool Get(int position)
public BitSet Get(int position1, int position2)
public void Set(int position)
public void Set(int position, bool value)
public void Set(int position1, int position2)
public void Set(int position1, int position2, bool value)
public void Clear()
public void Clear(int position)
public void Clear(int position1, int position2)
public void Flip(int position)
public void Flip(int position1, int position2)
public bool Intersects(BitSet bitSet)
public void And(BitSet bitSet)
public void AndNot(BitSet bitSet)
public void Or(BitSet bitSet)
public void Xor(BitSet bitSet)
public int Count (This is for compatibility, we should have a member with the same value named Capacity, and this hidden from public view)
public int Length
