
json-avro-converter


This project is no longer in active development and is currently in maintenance mode. We will no longer actively review issues, merge pull requests, or release new versions. If you are interested in developing it further, please fork this repository.

This project is a JSON to Avro conversion tool designed to make migration to Avro easier. It includes a simple command line validator.

Motivation

Apache Avro ships with some very advanced and efficient tools for reading and writing binary Avro, but its support for JSON-to-Avro conversion is unfortunately limited: if your schema contains optional fields (unions), each such field must be wrapped with an explicit type declaration in the JSON. This tool is meant to help with migrating projects from JSON to Avro without having to modify the JSON data, provided it conforms to the schema.

JSON2Avro Converter

Features

  • conversion of binary JSON to binary Avro
  • conversion of binary JSON to GenericData.Record
  • conversion of binary JSON to Avro generated Java classes
  • conversion of binary Avro to binary JSON
  • optional field support (unions do not require wrapping)
  • unknown fields that are not declared in the schema are ignored

Dependencies

dependencies {
    compile group: 'tech.allegro.schema.json2avro', name: 'converter', version: '0.2.15'
}

Basic usage

import tech.allegro.schema.json2avro.converter.AvroConversionException;
import tech.allegro.schema.json2avro.converter.JsonAvroConverter;
import org.apache.avro.generic.GenericData;
import org.apache.avro.Schema;

// Avro schema with one string field: username
String schema =
            "{" +
            "   \"type\" : \"record\"," +
            "   \"name\" : \"Acme\"," +
            "   \"fields\" : [{ \"name\" : \"username\", \"type\" : \"string\" }]" +
            "}";

String json = "{ \"username\": \"mike\" }";

JsonAvroConverter converter = new JsonAvroConverter();

// conversion to binary Avro
byte[] avro = converter.convertToAvro(json.getBytes(), schema);

// conversion to GenericData.Record
GenericData.Record record = converter.convertToGenericDataRecord(json.getBytes(), new Schema.Parser().parse(schema));

// conversion from binary Avro to JSON
byte[] binaryJson = converter.convertToJson(avro, schema);

// exception handling
String invalidJson = "{ \"username\": 8 }";    

try {
    converter.convertToAvro(invalidJson.getBytes(), schema);    
} catch (AvroConversionException ex) {
    System.err.println("Caught exception: " + ex.getMessage());
}
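
The binary Avro produced above can also be decoded into Avro-generated Java classes using plain Avro APIs. A minimal sketch, assuming a class Acme has been generated from the schema above with the Avro compiler (the generated class and its getter are assumptions here, not part of this library):

import java.io.IOException;
import org.apache.avro.io.DatumReader;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.specific.SpecificDatumReader;

// decode the byte[] produced by convertToAvro into the generated Acme class
DatumReader<Acme> reader = new SpecificDatumReader<>(Acme.class);
Acme acme = reader.read(null, DecoderFactory.get().binaryDecoder(avro, null)); // throws IOException
System.out.println(acme.getUsername());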

Advanced usage

If some Avro types are not handled by the library, you can extend it by adding your own AvroTypeConverter. An AvroTypeConverter reads a JSON value and converts it to an Avro value. This can be useful when some logical types are missing. An AvroTypeConverter can also be used to define a custom converter for a specific path.

import java.util.Deque;

import org.apache.avro.Schema;

// AvroTypeConverter is assumed to be importable from the converter's types package
import tech.allegro.schema.json2avro.converter.types.AvroTypeConverter;

public class CustomFieldConverter implements AvroTypeConverter {
    @Override
    public Object convert(Schema.Field field, Schema schema, Object jsonValue, Deque<String> path, boolean silently) {
        return "custom-" + jsonValue;
    }

    @Override
    public boolean canManage(Schema schema, Deque<String> path) {
        return "customField".equals(path.getLast());
    }
}

To use the converter, register it with the JsonAvroConverter by constructing the converter as follows:

new JsonAvroConverter(new CompositeJsonToAvroReader(new CustomFieldConverter()))
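
For illustration, a minimal sketch of the custom converter in action; the schema and field name below are hypothetical and only exist to exercise CustomFieldConverter:

// hypothetical schema containing the field handled by CustomFieldConverter
String customSchema =
            "{" +
            "   \"type\" : \"record\"," +
            "   \"name\" : \"AcmeCustom\"," +
            "   \"fields\" : [{ \"name\" : \"customField\", \"type\" : \"string\" }]" +
            "}";

JsonAvroConverter customConverter =
            new JsonAvroConverter(new CompositeJsonToAvroReader(new CustomFieldConverter()));

// the converter prefixes the JSON value with "custom-" before it is written into the record
GenericData.Record customRecord = customConverter.convertToGenericDataRecord(
            "{ \"customField\": \"value\" }".getBytes(),
            new Schema.Parser().parse(customSchema));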

Validator

A command line tool for validating your JSON/Avro documents against a schema.

Build

To bundle the tool into a single executable JAR:

./gradlew :validator:shadowJar
java -jar validator/build/libs/json2avro-validator-{version}.jar --help

Usage

Running the validator with the --help option will print a help message listing all possible arguments. A sample Avro schema and messages can be found in:

  • schema: validator/src/test/resources/user.avcs
  • JSON message: validator/src/test/resources/user.json
  • Avro message: validator/src/test/resources/user.avro

JSON to Avro

You can validate your JSON to Avro conversion:

java -jar json2avro-validator.jar -s user.avcs -i user.json

If everything processes correctly, the validator will exit with a zero status code.

Avro to JSON

You can convert Avro binary data into JSON by setting the -m avro2json mode option:

java -jar json2avro-validator.jar -s user.avcs -i user.avro -m avro2json

JSON to Avro to JSON

If you would like to know what messages will look like after encoding and decoding, invoke:

java -jar json2avro-validator.jar -s user.avcs -i user.json -m json2avro2json

License

json-avro-converter is published under Apache License 2.0.

Contributors

1123, a10y, adamdubiel, arkadius, bgalek, cristaloleg, deejay1, druminski, janisz, jewertow, jghoman, kyleboyer-optum, lukaszjackowski, mmadoo, piotrrzysko, rafal-spryszynski-allegro, sorhus, szczygiel-m


json-avro-converter's Issues

field [fieldName] is expected to be one of these: RECORD, NULL, for nested record with non defined nullable values

I got:

Could not evaluate union, field [fieldName] is expected to be one of these: RECORD, NULL. If this is a complex type, check if offending field: trafficSource.adwordsClickInfo adheres to schema.

when I have nested records where some of the 'nullable' fields are not specified.

schema sample:

{
    "type": "record",
    "name": "Root",
    "fields": [
        {
            "name": "field1",
            "type": [
                "long",
                "null"
            ]
        },
        {
            "name": "nestedRecord",
            "type": [
                {
                    "type": "record",
                    "namespace": "root",
                    "name": "NestedRecord",
                    "fields": [
                        {
                            "name": "nested1",
                            "type": [
                                "long",
                                "null"
                            ]
                        },
                        {
                            "name": "nested2",
                            "type": [
                                "long",
                                "null"
                            ]
                        }
                    ]
                },
                "null"
            ]
        }
    ]
}

and a JSON string such as:

{
    "field1" : 10999859003, 
    "nestedRecord": 
    { 
        "nested1" : 123321321 
    }
}

I think that during the recursion it is not able to skip missing values, even though at level 0 missing values are skipped correctly.

Thank you

Problem decoding union of map: Failed to convert JSON to Avro

Example schema:

protocol ScoreData {
  record ScoreData {
    string id;
    union {null, map<double>} multiScore = null;
  }
}

Example payload:

"{\"id\":\"myKey\",\"multiScore\":{\"map\":{\"a\":10.0}}}"

Exception:

tech.allegro.schema.json2avro.converter.AvroConversionException: Failed to convert JSON to Avro: Could not evaluate union, field multiScore is expected to be one of these: NULL, MAP. If this is a complex type, check if offending field: multiScore adheres to schema.

AFAICT, this is syntactically valid JSON for a union{null, map}. I've tried with avro4s too and it decodes things without issues.

Version: 0.2.10

Replacement Project

Hey guys, I don't know if this is something rude to do, but since this Repo is effectively deprecated, I've decided to create my own implementation.

I implemented this project a few months ago after encountering a conversion problem in this json-avro-converter library and finding no help elsewhere.

https://github.com/LeoFuso/record-mapper/

I'll be maintaining it; your issues, questions, recommendations, requests, and MRs are more than welcome! <3

importing schemas with custom avro types

avro-maven-plugin allows importing schemas, meaning that I can have one Avro schema split into multiple files, reusing schemas. Is it somehow possible to do the same here? I would like to convert JSON to Avro, where I specify a root record Avro schema that references multiple schemas defining custom types. Is this possible, or will it be possible in the future?

Dockerfile

Is it possible to add a Dockerfile for this, please?

JsonEncoder is not public in avro

Hi,

In the current implementation, NoWrappingJsonEncoder uses the constructors of JsonEncoder. However, since Avro 1.5 the constructors are not public; instead, an instance should be created through the supplied EncoderFactory.

Fixing this will be much appreciated.

Thanks!
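
For reference, a minimal sketch of the EncoderFactory-based construction the reporter refers to. This is plain Avro API (schema is assumed to be an org.apache.avro.Schema and out a java.io.OutputStream), not the library's current code; wiring it into NoWrappingJsonEncoder would require changes inside the library:

import org.apache.avro.io.EncoderFactory;
import org.apache.avro.io.JsonEncoder;

// since Avro 1.5, JsonEncoder instances should be obtained from the factory
// rather than via the (now non-public) constructors; jsonEncoder throws IOException
JsonEncoder encoder = EncoderFactory.get().jsonEncoder(schema, out);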

Problem with converting json -> avro -> json for union of records

Hi,

Why does the test below fail?

Is it correct that JSON containing "fieldB" is converted to the other record type (recordA) with the default "fieldA" value?

    def 'should not fail'() {

        given:
        def schema = '''
        {
          "type": "record",
          "name": "testSchema",
          "fields": [
            {
              "name": "someMainField",
              "type": [
                {
                  "type": "record",
                  "name": "recordA",
                  "fields": [
                    {
                      "name": "fieldA",
                      "type": "string",
                      "default": ""
                    }
                  ]
                },
                {
                  "type": "record",
                  "name": "recordB",
                  "fields": [
                    {
                      "name": "fieldB",
                      "type": "string",
                      "default": ""
                    }
                  ]
                }
              ]
            }
          ]
        }
        '''

        def json = '''
        {  
           "someMainField":{  
              "fieldB":"B"
           }
        }
        '''

        when:
        def result = toMap(converter.convertToJson(converter.convertToAvro(json.bytes, schema), schema))

        then:
        !(result["someMainField"] as Map).containsKey("fieldA") // why fail?
        (result["someMainField"] as Map).containsKey("fieldB") // why fail?
    }

Validator fails when field value is null.

Given the following schema

{
    "namespace": "org.srem",
    "type": "record",
    "name": "test_case",
    "doc": "test_schema",
    "fields": [
        {
            "name": "event_id",
            "type": "string",
            "doc": "ID of the event in UUID4 format"
        },
        {
            "name": "payload",
            "doc": "Variable payload depending on client",
            "type": [
                "null",
                {
                    "type": "array",
                    "items": "string"
                },
                {
                    "type": "map",
                    "values": [
                        "null",
                        "string",
                        "int"
                    ]
                }
            ]
        },
        {
            "name": "__metadata",
            "type": [
                "null",
                {
                    "type": "map",
                    "values": "string"
                }
            ],
            "default": null,
            "doc": "Field used in Hermes internals to propagate metadata like hermes-id"
        }
    ]
}

and the following json data

{
    "event_id": "e130a7ca802a494ebf1997b1e340df71",
    "payload": {
        "foo": null
    }
}

the validator fails with an NPE:

2017-01-12 17:06:09.839 DEBUG t.a.s.j.v.schema.avro.AvroValidator - Converting JSON to AVRO
2017-01-12 17:06:09.883 ERROR t.a.s.j.validator.ValidatorRunner - Document could not be validated
tech.allegro.schema.json2avro.validator.schema.ValidatorException: Error occurred when validating the document
	at tech.allegro.schema.json2avro.validator.schema.avro.AvroValidator.convertJsonToAvro(AvroValidator.java:74) ~[json2avro-validator-0.2.5-SNAPSHOT.jar:na]
	at tech.allegro.schema.json2avro.validator.schema.avro.AvroValidator.validate(AvroValidator.java:46) ~[json2avro-validator-0.2.5-SNAPSHOT.jar:na]
	at tech.allegro.schema.json2avro.validator.ValidatorRunner.run(ValidatorRunner.java:27) ~[json2avro-validator-0.2.5-SNAPSHOT.jar:na]
	at tech.allegro.schema.json2avro.validator.ValidatorRunner.main(ValidatorRunner.java:38) ~[json2avro-validator-0.2.5-SNAPSHOT.jar:na]
Caused by: java.lang.NullPointerException: null
	at java.util.HashMap.merge(HashMap.java:1224) ~[na:1.8.0_111]
	at java.util.stream.Collectors.lambda$toMap$58(Collectors.java:1320) ~[na:1.8.0_111]
	at java.util.stream.ReduceOps$3ReducingSink.accept(ReduceOps.java:169) ~[na:1.8.0_111]
	at java.util.Iterator.forEachRemaining(Iterator.java:116) ~[na:1.8.0_111]
	at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801) ~[na:1.8.0_111]
	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481) ~[na:1.8.0_111]
	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471) ~[na:1.8.0_111]
	at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708) ~[na:1.8.0_111]
	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[na:1.8.0_111]
	at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499) ~[na:1.8.0_111]
	at tech.allegro.schema.json2avro.converter.JsonGenericRecordReader.readMap(JsonGenericRecordReader.java:96) ~[json2avro-validator-0.2.5-SNAPSHOT.jar:na]
	at tech.allegro.schema.json2avro.converter.JsonGenericRecordReader.lambda$read$4(JsonGenericRecordReader.java:70) ~[json2avro-validator-0.2.5-SNAPSHOT.jar:na]
	at tech.allegro.schema.json2avro.converter.JsonGenericRecordReader.onValidType(JsonGenericRecordReader.java:132) ~[json2avro-validator-0.2.5-SNAPSHOT.jar:na]
	at tech.allegro.schema.json2avro.converter.JsonGenericRecordReader.read(JsonGenericRecordReader.java:70) ~[json2avro-validator-0.2.5-SNAPSHOT.jar:na]
	at tech.allegro.schema.json2avro.converter.JsonGenericRecordReader.readUnion(JsonGenericRecordReader.java:103) ~[json2avro-validator-0.2.5-SNAPSHOT.jar:na]
	at tech.allegro.schema.json2avro.converter.JsonGenericRecordReader.read(JsonGenericRecordReader.java:71) ~[json2avro-validator-0.2.5-SNAPSHOT.jar:na]
	at tech.allegro.schema.json2avro.converter.JsonGenericRecordReader.lambda$null$0(JsonGenericRecordReader.java:55) ~[json2avro-validator-0.2.5-SNAPSHOT.jar:na]
	at java.util.Optional.ifPresent(Optional.java:159) ~[na:1.8.0_111]
	at tech.allegro.schema.json2avro.converter.JsonGenericRecordReader.lambda$readRecord$1(JsonGenericRecordReader.java:55) ~[json2avro-validator-0.2.5-SNAPSHOT.jar:na]
	at java.util.LinkedHashMap$LinkedEntrySet.forEach(LinkedHashMap.java:671) ~[na:1.8.0_111]
	at tech.allegro.schema.json2avro.converter.JsonGenericRecordReader.readRecord(JsonGenericRecordReader.java:53) ~[json2avro-validator-0.2.5-SNAPSHOT.jar:na]
	at tech.allegro.schema.json2avro.converter.JsonGenericRecordReader.read(JsonGenericRecordReader.java:45) ~[json2avro-validator-0.2.5-SNAPSHOT.jar:na]
	at tech.allegro.schema.json2avro.converter.JsonGenericRecordReader.read(JsonGenericRecordReader.java:36) ~[json2avro-validator-0.2.5-SNAPSHOT.jar:na]
	at tech.allegro.schema.json2avro.converter.JsonAvroConverter.convertToGenericDataRecord(JsonAvroConverter.java:47) ~[json2avro-validator-0.2.5-SNAPSHOT.jar:na]
	at tech.allegro.schema.json2avro.converter.JsonAvroConverter.convertToAvro(JsonAvroConverter.java:38) ~[json2avro-validator-0.2.5-SNAPSHOT.jar:na]
	at tech.allegro.schema.json2avro.validator.schema.avro.AvroValidator.convertJsonToAvro(AvroValidator.java:70) ~[json2avro-validator-0.2.5-SNAPSHOT.jar:na]
	... 3 common frames omitted

This translates fine to avro using python avro 1.8.1, so the validator shouldn't fail on this.

Converting classes that have avro fields to json

I have a case where I want to serialize a normal, non-Avro class to JSON.
I'm using Jackson's ObjectMapper, which works for most classes,
but fails when a class has Avro members.

Does your library help with this?

Failed to convert JSON to Avro

I'm getting the following exception:

Caused by: org.apache.avro.AvroRuntimeException: Field deviceid type:LONG pos:0 not set and has no default value

for the JSON entry "device_id": 6854.

the schema for this field looks like this:
{"name":"deviceid","type":"long","doc":" The id of the device.","aliases":["device_id","device/id"]}

Is this because the codebase does not take aliases into account?

Feature request: support JSONL

Nice tool, superior to avro-tools for our JSON format.
New feature: for conversion (and maybe validation), support multi-line JSON (JSONL) formatted files. See http://jsonlines.org/

For each JSON object, in json2avro mode, convert the object to an Avro object, and store just one Avro schema in the file header (see the sketch after the example below).
Also clean up the logging output: log to stderr, or add an output file option so we can capture the Avro output.

cat 1000object.json | java -jar json-avro-converter-snapshot.jar --mode json2avro2json |wc -l
should produce 1000
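
A rough sketch of what such a JSONL pipeline could look like on top of the existing converter API. The file names and the overall flow are illustrative assumptions, not an existing feature of the tool:

import java.io.File;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

import org.apache.avro.Schema;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;

import tech.allegro.schema.json2avro.converter.JsonAvroConverter;

public class JsonlToAvro {
    public static void main(String[] args) throws Exception {
        Schema schema = new Schema.Parser().parse(new File("user.avcs"));
        JsonAvroConverter converter = new JsonAvroConverter();

        // one Avro container file: the schema is stored once in the header,
        // and every JSONL line becomes one record in the file
        try (DataFileWriter<GenericRecord> writer =
                     new DataFileWriter<>(new GenericDatumWriter<>(schema))) {
            writer.create(schema, new File("objects.avro"));
            for (String line : Files.readAllLines(Paths.get("1000object.jsonl"), StandardCharsets.UTF_8)) {
                if (line.isEmpty()) {
                    continue; // skip blank lines
                }
                writer.append(converter.convertToGenericDataRecord(
                        line.getBytes(StandardCharsets.UTF_8), schema));
            }
        }
    }
}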

Convert LogicalType BigDecimal to Json

Hi all,

First of all, thank you for the amazing work you have put into this library!

I would like some help with the following problem:

As seen below, I am working on a requirement to expose some data using the Avro serialization format over REST:

...
final GenericRecord value = stateStoreQuery.get(key);
final byte[] valueBytes = new JsonAvroConverter().convertToJson(value);
final String response = new String(valueBytes);
return ResponseEntity.ok(response);

Almost all data types are converted correctly, except decimals. I am assuming this is related to the fact that my Avro POJO uses BigDecimal to represent decimal numbers. I know that this is a logical type in Avro rather than a primitive.

For example the number 0.6060 initialized as new BigDecimal("0.6060") comes back as �¬.

What I am trying to detect, unsuccessfully so far, is whether the cause of the problem is:

  • the default instance of ObjectMapper used in JsonGenericRecordReader, which may not support converting BigDecimal to String;
  • the fact that the library does not support logical types, which I doubt, as it uses GenericDatumWriter under the hood;
  • that I explicitly need to load the logical types, given that BigDecimals are encoded as Avro BYTES.

Any help will be much appreciated!

Thanks in advance!

0.2.11 version not allowing extra attributes in request json

We have recently upgraded the library version in our code base from 0.2.9 to 0.2.11 and are facing issues after that.

Features Deprecated in 0.2.11

In version 0.2.9, when the input JSON contained extra attributes that are not part of the Avro schema, the conversion did not fail and the extra attributes were part of the response JSON.

In version 0.2.11, when the input JSON contains extra attributes that are not part of the Avro schema, the conversion now fails. The class AvroTypeConverterWithStrictJavaTypeCheck is creating a problem, as it checks that the fields in the JSON match the fields in the Avro schema.

Why is the upgraded version 0.2.11 not backward compatible?

Is there any provision by which version 0.2.11 can safeguard against these failures and behave compatibly with 0.2.9?

issue when schema contains some logical-type

I have an Avro schema with some logical types, e.g. to manage dates or BigDecimals.
The date schema is:

 {
          "type" : "long",
          "logicalType" : "timestamp-millis"
        }

and the BigDecimal format is:

{
  "type" : "bytes",
  "logicalType" : "decimal",
  "precision" : 15,
  "scale" : 5
}

My JSON fields are formatted like this:

"date": "2021-11-16 15:51:11",
"bigDecimals": "1.00000"

Using your library I can't convert this JSON to Avro, because logical types are not supported, and I can't extend the library because it doesn't allow that.
So before calling the library I have to convert the date to a long, and after getting the Record I have to fix the BigDecimals, because the default bytes conversion doesn't work for them.

I think it would be nice to:

  1. allow extending the library (adding custom deserializers; see the sketch below)
  2. add basic logical type deserialization
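
The AvroTypeConverter extension point shown in the Advanced usage section above could cover request 1. A minimal sketch of a timestamp-millis converter under that interface; the date format, the UTC assumption, and the registration are illustrative, not part of the library:

import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Deque;

import org.apache.avro.Schema;

public class TimestampMillisConverter implements AvroTypeConverter {

    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    @Override
    public Object convert(Schema.Field field, Schema schema, Object jsonValue, Deque<String> path, boolean silently) {
        // parse the "2021-11-16 15:51:11" style string and return epoch millis (a long),
        // which is what the underlying long/timestamp-millis schema expects
        return LocalDateTime.parse(jsonValue.toString(), FORMAT)
                .toInstant(ZoneOffset.UTC)
                .toEpochMilli();
    }

    @Override
    public boolean canManage(Schema schema, Deque<String> path) {
        return schema.getType() == Schema.Type.LONG
                && schema.getLogicalType() != null
                && "timestamp-millis".equals(schema.getLogicalType().getName());
    }
}

Such a converter would be registered the same way as CustomFieldConverter above: new JsonAvroConverter(new CompositeJsonToAvroReader(new TimestampMillisConverter())).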

IllegalAccessError on latest avro

I am using this lib (with Avro 1.8.2) on Amazon EMR. When I run on the master node this issue happens; any ideas?

java.lang.IllegalAccessError: tried to access method org.apache.avro.io.JsonEncoder.<init>(Lorg/apache/avro/Schema;Ljava/io/OutputStream;)V from class org.apache.avro.io.NoWrappingJsonEncoder
	at org.apache.avro.io.NoWrappingJsonEncoder.<init>(NoWrappingJsonEncoder.java:12)
	at tech.allegro.schema.json2avro.converter.JsonAvroConverter.convertToJson(JsonAvroConverter.java:93)

not able to cast string to int

I'm seeing error messages like:

Could not evaluate union, field fieldname-here is expected to be one of these: NULL, INT. If this is a complex type, check if offending field: fieldname-here adheres to schema.

when the input is "1" (a string).
However, I don't think this is consistently the case; or at least, this wasn't the case in the past, unless I really missed something.

Problem decoding union of INT or STRING : Failed to validate JSON to Avro

I have a problem validating JSON from Kafka:

We receive data, for example, as follows:

"slotid": {
  "string": "xyz"
}

The validation shows this error:

org.opentest4j.AssertionFailedError: Unexpected exception thrown: tech.allegro.schema.json2avro.converter.AvroConversionException: Failed to convert JSON to Avro: Could not evaluate union, field slotid is expected to be one of these: NULL, STRING. If this is a complex type, check if offending field: slot.slotid adheres to schema.

But it works for input like this:

"slotid": "xyz" 

In the schema, we have:

"name" : "slotid",
"type" : [ "null", "string" ],
"default" : null

The same applies to other types like INT, MAP, etc.:

"source": {
  "map": {
    "originCreationTime": "xxx",
    "originId": "yyy"
  }
}

Field [fieldName] is expected to be type: java.util.List

I got this exception message when running it on my JSON string built from a TableRow object.
The record is null and the schema type is:

 {
            "name": "customDimensions",
            "type": {
                "type": "array",
                "items": {
                    "type": "record",
                    "namespace": "root",
                    "name": "CustomDimensions",
                    "fields": [
                        {
                            "name": "index",
                            "type": [
                                "long",
                                "null"
                            ]
                        },
                        {
                            "name": "value",
                            "type": [
                                "string",
                                "null"
                            ]
                        }
                    ]
                }
            }
        },
...

The problem is that when I export a BigQuery table with that schema, even if the record is null it works fine and gives me an empty array; but when I try this in a Dataflow job, writing the TableRow object into a GenericRecord using this great utility, it fails.
I think Google internally does the transformation to Proto first (as documented here) and then to Avro, so in these steps it probably checks whether the value List is null and then serializes an empty array. I personally applied this check in:
Object read(Schema.Field field, Schema schema, Object value, Deque<String> path, boolean silently)
for the ARRAY case.
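
The check described above boils down to something like the following sketch; it is illustrative only (the library's actual read() method is not reproduced here) and assumes that substituting an empty list for a null array value is acceptable for the schema at hand:

import java.util.Collections;

// illustrative only: in the ARRAY branch of the reader, a null JSON value could be
// mapped to an empty list instead of failing with "expected to be type: java.util.List"
final class NullArrayFallback {
    static Object nullToEmptyList(Object jsonValue) {
        return jsonValue == null ? Collections.emptyList() : jsonValue;
    }
}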

Thanks for help anyway.

Generated GenericRecords with Enums can't be written using GenericDatumWriter

Hey guys,

I'm using tech.allegro.schema.json2avro:converter:0.2.3 and org.apache.avro:avro:1.8.1, trying to convert some JSON to Avro for some code of mine. I want to use the Java API, and since I see no other way to get this to work, I write the generic data out to an Avro-encoded stream, then read it in through the specific datum reader (I know, my heart aches too).

Took this example from another bug, but it illustrates it all the same:

val schema = """{
               |  "type": "record",
               |  "name": "testSchema",
               |  "fields": [
               |    {
               |      "name": "eventType",
               |      "type": {
               |        "name": "EventType",
               |        "type": "enum",
               |        "symbols": ["user"]
               |      }
               |    }
               |  ]
               |}""".stripMargin

val json = """{"eventType":"user"}"""
val generic = new JsonAvroConverter().convertToAvro(json.getBytes, schema)
val out: ByteArrayOutputStream = new ByteArrayOutputStream()
val encoder: Encoder = EncoderFactory.get().binaryEncoder(out, null)
val writer: GenericDatumWriter[GenericRecord]  = new GenericDatumWriter(getSchema());
writer.write(generic, encoder)

throws
Not an enum: user
org.apache.avro.AvroTypeException: Not an enum: user
at org.apache.avro.generic.GenericDatumWriter.writeEnum(GenericDatumWriter.java:164)
at org.apache.avro.generic.GenericDatumWriter.writeWithoutConversion(GenericDatumWriter.java:106)

Dug into the code a bit and found this line:

private Object ensureEnum(Schema schema, Object value, Deque<String> path) {
    List<String> symbols = schema.getEnumSymbols();
    if (symbols.contains(value)) {
        return value;
    }
    throw enumException(path, symbols.stream().map(String::valueOf).collect(joining(", ")));
}

The problem is that this returns a String, not a GenericData.EnumSymbol, so when this generic object is later written out, it throws the aforementioned error.
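
For reference, a hedged sketch of the change the reporter implies, mirroring the quoted method with the one line adjusted (GenericData.EnumSymbol comes from org.apache.avro.generic; this is not the library's actual fix):

private Object ensureEnum(Schema schema, Object value, Deque<String> path) {
    List<String> symbols = schema.getEnumSymbols();
    if (symbols.contains(value)) {
        // wrap the symbol so that GenericDatumWriter.writeEnum accepts it later on
        return new GenericData.EnumSymbol(schema, value.toString());
    }
    throw enumException(path, symbols.stream().map(String::valueOf).collect(joining(", ")));
}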

Cannot convert enum types

Hi,
I'm using tech.allegro.schema.json2avro:converter:0.2.2 and org.apache.avro:avro:1.8.0 in order to convert some JSON that needs to comply with an Avro schema that uses enums, but the library throws an exception during the conversion.

For example:

val schema = """{
               |  "type": "record",
               |  "name": "testSchema",
               |  "fields": [
               |    {
               |      "name": "eventType",
               |      "type": {
               |        "name": "EventType",
               |        "type": "enum",
               |        "symbols": ["user"]
               |      }
               |    }
               |  ]
               |}""".stripMargin

val json = """{"eventType":"user"}"""
val avro = new JsonAvroConverter().convertToAvro(json.getBytes, schema)

throws:

Not an enum: user
org.apache.avro.AvroTypeException: Not an enum: user
    at org.apache.avro.generic.GenericDatumWriter.writeEnum(GenericDatumWriter.java:164)
    at org.apache.avro.generic.GenericDatumWriter.writeWithoutConversion(GenericDatumWriter.java:106)

Field is expected to be type: java.util.Map

When I try to convert input JSON with a schema that has some embedded records, I get the exception Field <fieldname> is expected to be type: java.util.Map when calling convertToGenericDataRecord, where <fieldname> is the name of the embedded record field.

Stacktrace:

tech.allegro.schema.json2avro.converter.AvroConversionException: Failed to convert JSON to Avro
 at tech.allegro.schema.json2avro.converter.JsonGenericRecordReader.read(JsonGenericRecordReader.java:47)
 at tech.allegro.schema.json2avro.converter.JsonGenericRecordReader.read(JsonGenericRecordReader.java:36)
 at tech.allegro.schema.json2avro.converter.JsonAvroConverter.convertToGenericDataRecord(JsonAvroConverter.java:47)

Next releases

Hello @adamdubiel and the allegro team!

Firstly, thanks for creating a library that makes a Scala engineer's life easier.

I have a question about the library: I want to include it in our project, but I'm not sure if it is still being actively developed; I see that the last release was in September 2019.

Could you share your plans for future maintenance and releases?

Thanks,

Michał Matusiak

Support schema type BYTES

Currently the method tech.allegro.schema.json2avro.converter.JsonGenericRecordReader.read(Field, Schema, Object, Deque<String>, boolean) does not support the Avro type BYTES. I am currently working on a project where this is a requirement, though.

My suggestion would be to support the type BYTES by converting a Base64 string to bytes. This is, I think, the most natural way JSON can represent binary data.
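
A minimal sketch of what that could look like via the AvroTypeConverter extension point from the Advanced usage section (illustrative only; not an existing feature of the library):

import java.nio.ByteBuffer;
import java.util.Base64;
import java.util.Deque;

import org.apache.avro.Schema;

public class Base64BytesConverter implements AvroTypeConverter {

    @Override
    public Object convert(Schema.Field field, Schema schema, Object jsonValue, Deque<String> path, boolean silently) {
        // decode the Base64 string from JSON into the ByteBuffer that Avro's BYTES type expects
        return ByteBuffer.wrap(Base64.getDecoder().decode(jsonValue.toString()));
    }

    @Override
    public boolean canManage(Schema schema, Deque<String> path) {
        return schema.getType() == Schema.Type.BYTES;
    }
}

As with the other converters, it would be registered through new JsonAvroConverter(new CompositeJsonToAvroReader(new Base64BytesConverter())).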

getting error tech.allegro.schema.json2avro.converter.AvroConversionException: Failed to convert JSON to Avro Path in schema: --> payload__id

Here is the Avro schema:
{
"namespace" : "com.fetch.test",
"name" : "outer_record",
"type" : "record",
"fields" : [ {
"name" : "payload__id",
"type" : "string"
}, {
"name" : "payload_createdDate",
"type" : "long"
}, {
"name" : "payload_lastUpdate",
"type" : "long"
}, {
"name" : "payload_firstName",
"type" : "string"
}, {
"name" : "payload_lastName",
"type" : "string"
}, {
"name" : "payload_email",
"type" : "string"
}, {
"name" : "payload_birthDate",
"type" : "long"
}, {
"name" : "payload_phoneNumber",
"type" : "string"
}, {
"name" : "payload_last4DigitsOfPhoneNumber",
"type" : "string"
}, {
"name" : "payload_signUpPlatform",
"type" : "string"
}, {
"name" : "payload_bonusPointScheduleId",
"type" : "string"
}, {
"name" : "payload_allowedScopes",
"type" : {
"type" : "array",
"items" : "string"
}
}, {
"name" : "payload_role",
"type" : "string"
}, {
"name" : "payload_lifetimeSavingsFromCoupons",
"type" : "double"
}, {
"name" : "payload_deactivatedReason",
"type" : [ "null" ]
}, {
"name" : "payload_userTosAcceptance",
"type" : {
"type" : "array",
"items" : "string"
}
}, {
"name" : "payload_lastPointsViewed",
"type" : [ "null" ]
}, {
"name" : "payload_monthlyBudget",
"type" : [ "null" ]
}, {
"name" : "payload___deleted",
"type" : "boolean"
}, {
"name" : "payload___db",
"type" : "string"
}, {
"name" : "payload___rs",
"type" : "string"
}, {
"name" : "payload___collection",
"type" : "string"
}, {
"name" : "payload___op",
"type" : "string"
}, {
"name" : "payload___ts_ms",
"type" : "long"
}, {
"name" : "_id",
"type" : "string"
} ]
}

Getting this exception:

org.apache.kafka.streams.errors.StreamsException: Exception caught in process. taskId=0_5, processor=KSTREAM-SOURCE-0000000000, topic=mongoDev.Fetch_V2_Dev.Users, partition=5, offset=188, stacktrace=tech.allegro.schema.json2avro.converter.AvroConversionException: Failed to convert JSON to Avro
at tech.allegro.schema.json2avro.converter.JsonGenericRecordReader.read(JsonGenericRecordReader.java:53)
at tech.allegro.schema.json2avro.converter.JsonGenericRecordReader.read(JsonGenericRecordReader.java:41)
at tech.allegro.schema.json2avro.converter.JsonAvroConverter.convertToGenericDataRecord(JsonAvroConverter.java:66)
at topicFlatten.flattenJSON.parseJson(flattenJSON.java:32)
at topicFlatten.jsonTopicFlattenStream.lambda$flattenJSONStream$0(jsonTopicFlattenStream.java:62)
at org.apache.kafka.streams.kstream.internals.AbstractStream.lambda$withKey$1(AbstractStream.java:106)
at org.apache.kafka.streams.kstream.internals.KStreamMapValues$KStreamMapProcessor.process(KStreamMapValues.java:40)
at org.apache.kafka.streams.processor.internals.ProcessorAdapter.process(ProcessorAdapter.java:71)
at org.apache.kafka.streams.processor.internals.ProcessorNode.lambda$process$2(ProcessorNode.java:181)
at org.apache.kafka.streams.processor.internals.metrics.StreamsMetricsImpl.maybeMeasureLatency(StreamsMetricsImpl.java:884)
at org.apache.kafka.streams.processor.internals.ProcessorNode.process(ProcessorNode.java:181)
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forwardInternal(ProcessorContextImpl.java:281)
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:260)
at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:219)
at org.apache.kafka.streams.processor.internals.SourceNode.process(SourceNode.java:86)
at org.apache.kafka.streams.processor.internals.StreamTask.lambda$process$1(StreamTask.java:731)
at org.apache.kafka.streams.processor.internals.metrics.StreamsMetricsImpl.maybeMeasureLatency(StreamsMetricsImpl.java:879)
at org.apache.kafka.streams.processor.internals.StreamTask.process(StreamTask.java:731)
at org.apache.kafka.streams.processor.internals.TaskManager.process(TaskManager.java:1177)
at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:753)
at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:583)
at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:556)
Caused by: Path in schema: --> payload__id
at org.apache.avro.generic.GenericData.getDefaultValue(GenericData.java:1173)
at org.apache.avro.data.RecordBuilderBase.defaultValue(RecordBuilderBase.java:142)
at org.apache.avro.generic.GenericRecordBuilder.getWithDefault(GenericRecordBuilder.java:250)
at org.apache.avro.generic.GenericRecordBuilder.build(GenericRecordBuilder.java:227)
at org.apache.avro.generic.GenericRecordBuilder.build(GenericRecordBuilder.java:32)
at tech.allegro.schema.json2avro.converter.types.RecordConverter.convertValue(RecordConverter.java:37)
at tech.allegro.schema.json2avro.converter.types.RecordConverter.convertValue(RecordConverter.java:14)
at tech.allegro.schema.json2avro.converter.types.AvroTypeConverterWithStrictJavaTypeCheck.convert(AvroTypeConverterWithStrictJavaTypeCheck.java:22)
at tech.allegro.schema.json2avro.converter.CompositeJsonToAvroReader.read(CompositeJsonToAvroReader.java:92)
at tech.allegro.schema.json2avro.converter.JsonGenericRecordReader.read(JsonGenericRecordReader.java:49)
... 21 more
