Coder Social home page Coder Social logo

pigeon_map's Introduction

Changelog

  • ver. 0.5.1 fixing stuff broken by new dart release ("script" doesn't return path as string)

  • ver. 0.5.0:

  1. (BREAKING CHANGE) changed signature of noSuchAttribute: now I pass a single parameter (Invocation) to make noSuchAttribute more consistent with noSuchMethod
  2. allow undeclared attributes of any kind (not only primitive types) to occur in JSON: when such attribute is encountered while parsing, library calls noSuchAttribute
  3. updated last section (about build.dart) to reflect new interface of build.dart (introduced in latest release of dart SDK)
  • ver. 0.4.13: dependency on deprecated dart:json removed; dependency on json package added.
  • ver. 0.4.12: added support for noSuchAttribute - see text below for details

Overview

Pigeon library provides implementation of memory- and performance-efficient map-backed data structures. These structures are efficiently serializable with no extra boilerplate. Serialization doesn't rely on reflection and withstands dart2javascript conversion and minification.

Library consists of the following components:

  1. pigeon map: something that looks like a map, but stores values in array of slots (much more efficient than standard map memory/performance-wise
  2. pigeon struct - inherits pigeon map, but adds getters/setters for attributes so it looks like a normal class. In other words, pigeon struct is something that looks like a struct and a map at the same time.
  3. various serializers/deserializers (JSON and Pigeonson are supported in this version)
  4. generator of pigeon structs. Very simple one. All it does is this: you write
@Prototype
class Person {
  String firstName;
  String lastName;
  int yearOfBirth;
}

and program generates something like this:

class Person extends PigeonStruct {
  Person() {/*...*/} 
  String get firstName => getValue(0);
  void set firstName(val) => setValue(0,val);
  String get lastName => getValue(1);
  void set lastName(val) => setValue(1,val);
  int get yearOfBirth => getValue(2);
  void set yearOfBirth(val) => setValue(2,val);
}

Pigeon Map

Pigeon map is an implementation of standard Map interface for the case of attribute names known in advance. E.g. if you are going to produce a lot of maps with just 3 attributes: "foo", "bar" and "baz", you do the following:

var nameSet = new NameSet(["foo", "bar", "baz"]);
var list = [];
for (int i=0; i<=1000; i++) { 
  var map = new PigeonMap(nameSet);  // same name set
  map["foo"]=i;
  map["bar"]="abcde"
  map["baz"]=true;
  list.add(map);
}

PigeonMap internally allocates an array of 3 values, which is as memory-efficient as it gets. Though performance is not an immediate goal, pigeon map in fact faster than standard native map (by as much as 3 times in VM) - mostly because the overhead of map creation is very low for PigeonMap, and very high for native map. But even ignoring map creation time, PigeonMap is faster by 20-30%.

[ UPDATE: upon compiling to javascript, it loses speed - code produced by dart2js is inefficient. This is a general problem - not specific to pigeon map ]

Note that PigeonMap implements standard map interface. E.g. you can put only "foo" value in a map:

var map = new PigeonMap(nameSet);
map["foo"]=1;

print(map.containsKey("bar")); // prints false.
print(map.length); // prints 1

How it works

Obviously, if we find the function such that f("foo") = 0, f("bar") = 1, f("baz") =2, the problem is solved. Such function is "officially" called minimal perfect hash (see wikipedia). Minimal perfect hash can be obtained in a trivial manner by auxiliary hashmap that maps keys to positions directly. However, such method will be obviously slower than native hashMap (because it will use native hashMap as a subroutine).

To find a better hash function, we can first obtain "real" 31-bit hashCodes of the words from given set. These codes are almost always unique for the small set (see "birthday problem" in wikipedia). If not, algo switches into slow mode (binary search).

Now let's consider an abstract problem. We have an array of N unique 31-bit numbers h[i], and we want to find a function f such as f(h[i]) produces unique (!) values in the range, say, 0..255. Obvious way to do that is just to take 8 bits 0..7 from each h[i]. If they are unique, we are done. If not, we try bits 1..8, then 2..9 etc - after a number of attempts, we have a good chance to succeed (see below)

Now that we have unique small hashCodes, the task is trivial.

Suppose sh("foo")=0x15, sh("bar")=0xFE, sh("baz") = 0x2A, where sh(x) is a "small hash code".

We create a table of 256 entries initialized to -1, and set

table[0x15]=0; table[0xFE]=1; table[0x2A]=2

which is what we tried to achieve.

In general, the size of the table should be approximately a square of the number of names in the set (see wikipedia). (Consequently, bit size of small hash map is a log2 of table size). During initialization, NameSet counts names, takes a square of this count, rounds up to nearest power of 2 - this is the table size. I found experimentally that with this choice of table size, the chance of NOT finding a mapping is about 1/10000.

One notable property of PigeonMaps is that the keys in NameSet and keys used for reading data are normally constant strings. For constant strings, expenses for hashCode calculations are essentially zero, and comparison for equality is satisfied by identity check. Since hashCode and equality is all that's needed here, and collisions are prevented, finding a slot for value boils down to shift and mask in typical case. (Please refer to the source code for details, otherwise it sounds cryptic)

NOTE: PigeonMap supports up to 64 keys for NameSet. It's enough for most practical purposes. You can have as many NameSets as you wish.

Use cases

Example: you execute sql statement select foo, bar, baz, name, address from customer;

Database driver has to return result set somehow (it can be large!) PigeonMap is a solution: create (automatically, by parsing "select" statement or retrieving ResultSetMetadata) name set containing 5 attributes (foo, bar, baz, name, address) and return array of PigeonMaps - they are memory/performance efficient. The main benefit of using PigeonMaps instead of regular maps: they use significantly less memory.

Other use cases involve PigeonStructs.

Pigeon Struct

PigeonStruct is extension of PigeonMap that provides getters/setters for attributes, so it looks like a hybrid between map and struct. Good for value objects, DAO, DTO (those terms mean more or less the same).

Code generation is simple: create a file containing prototypes of your data structures (say, proto_person.dart), and main function that calls "preprocess":

//>person.dart  // first line should start with //> 
import '../lib/generator.dart'; //-
//+ import '../lib/pigeon.dart';
@Prototype
class Person {
  String firstName;
  String lastName;
  int yearOfBirth = 2000;
  //asis+
  String toString() => "$firstName $lastName ($yearOfBirth)";
  //asis-
}
// other @Prototype classes
main() {        //- lines that contain //- will not be copied into generated file
  preprocess(); //-
}               //-

Press CTRL/R to run it - generator will scan the file and generate another file here all @Prototypes are replaced by generated code. You file otherwise may contain any code you like - it will be copied into generated file as is (only @Prototypes get replaced) (Alternatively, you can include generation logic in build.dart -see example below).

Important rules:

  1. Prototype file should have prefix "proto_" in its name to be recognized by build.dart

  2. First line should be "//>" directive. e.g //>person.dart or //>../generated/person.dart or whatever.

  3. Lines that contain "//-" will not be copied into generated file

  4. Lines that start with "//+" will be stripped of "//+" and copied into generated file. E.g., if generated file is supposed to be part of library, include line //+part of mylibrary; in the source

  5. Initializers are supported : in the example above, yearOfBirth is 2000 by default; "default default" is null, as usual.

  6. If you want to add methods to the generated class, place them between //asis+ and //asis- comments

  7. You can (optionally) override noSuchAttribute (place it in "asis" block), which is called if you try to access the key not defined in name set. Function will be called with a single parameter (Invocation) - as defined by language spec (consistent with noSuchMethod).

Please see more complex example in test/proto_media.dart. When you run this file, it generates test/media.dart.

Since generated object inherits from PigeonMap, all properties of PigeonMap hold.

NOTE: program also generates small amount of metadata for mapping of struct keys to their types. Since keys are the same as attribute names, program uses another PigeonMap to efficiently store metadata.

JSON Serialization

Since PigeonStruct is map-backed, all serializers that normally work for maps will be able to work with PigeonMaps and PigeonStructs. For deserialization, there's additional problem of creating correct PigeonStruct types (e.g. object of type Person while deserializing encoded Person). Implementation supports JSON serialization both ways without any extra boilerplate. Supported data types can be defined recursively:

  • primitive types (int, double, bool, String) are supported

  • if T - supported type, then List<T> and Map<String,T> are supported types

  • typed_data classes are supported (Uint8List, Int8List, Uint16List, Int16List, Uint32List, Int32List, Uint64List, Int64List, Float32List, Float64List)

  • class extending PigeonStruct is supported if all its attribute types are supported.

  • DateTime is supported, but only as an attribute of PigeonStruct (this means that if you have, say, List, it won't work). This limitation is due to insufficient support of custom classes in current JSON parser.

To get JSON string, simply call toJsonString() method of any PigeonStruct. To parse JSON string, call new Person.fromJsonString(str). Performance of JSON toJsonString/fromJSONString is the same as for default JSON library, but you get typed structures for the same money. (The reason why PigeonMaps are not faster for JSON is this: 95% of time is spent in parsing proper, not in map read/write)

Binary serialization

There's (very) fast binary serializer and (not so fast) deserializer. They support format similar to Bson, but optimized for pigeon structs.

Example of invocation

bytes = person.toPgsonMessage();  // "bytes" is Uint8List
new Person.fromPgsonMessage(bytes);

Deserializer is approx. 1.6-1.7 times slower than serializer (this ratio is normal for serializers).

Compared with corresponding Json methods discussed above, Pgson is faster by factor of 6-8(*) in serialization, and by factor of 2.5 in deserialization. Generated byte message is approx. 2 times shorter than corresponding Json string.

(*) depends on type of processor and other factors

Example of build.dart

If you don't like pressing CTRL/R for generation, you can include preprocessing into build.dart. This script will be called each time you save your prototype file. Here is an example of build.dart

import 'dart:io';
import "lib/generator.dart";

void main(args) {
  for (String arg in args) {
    if (arg.startsWith('--changed=')) {
      String fileName = arg.substring('--changed='.length);
      if (fileName.contains("proto_")) {
         preprocess(fileName);        
      } 
    }
  }
}

pigeon_map's People

Contributors

tatumizer avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar

pigeon_map's Issues

draft

I use this ISSUE simply as a draft copy of posts I'm going to send elsewhere.

Support for missing NameSet entries

Alex, thanks for this extremely useful package, it is working very well right out of the box. I wanted to echo Andrés Viedma's comments from the Dart-Misc mailing list and ask if it would be easy for you to add support for missing fields in Pigeon Structs.

I'm not sure about Andrés but I would be alright with these fields deserializing into uninitialized null values. I took a quick try at hacking it in but I was unable to cope with all the downstream effects of my edits. During serialization, it would be preferable for me to have null struct fields disappear entirely, although your package may be doing this already, I have not tested.

The use case for me is with wrapper messages that can hold a number of different content types. My current workaround is to use a list for each field but If I need to store a large number of these structs as JSON, having all of these fields with no values will get expensive. However, if null fields disappear when serialized and non-existent fields are set as null during deserialization then my storage and transfer costs can be minimized.

The other workaround that comes to mind is using the standard dart library to deserialize the JSON into a map, check for existence, and then convert fields into PigeonStructs if they exist but then I'm paying for two parses.

Please let me know if that makes sense, I can provide an example, or if there's some easier approach you know of that I'm not seeing. Thanks again!

dart - mirrors

This is a notebook on dart mirrors

    print("location ${declaration.location} isClass=${declaration is ClassMirror}");
    if (declaration is ClassMirror) {
    
      if (declaration.simpleName.name == "X") {
        declaration.invoke(#staticMethod, []);
      }
    } else if (declaration is MethodMirror) { // for top-level functions
       if (declaration.simpleName == Symbol("foo")) {
         library.invoke(Symbol("foo"), []);
      }
    } 

extension SymbolNameExt on Symbol {
  String get name {
    var str = this.toString();
    if (!str.startsWith('Symbol("')) throw 'Weird symbol $str - BUG!';
    return str.substring(str.indexOf('"') + 1, str.lastIndexOf('"'));
  }
}

reading annotations

for (var meta in declaration.metadata) {
      if (meta.type == reflectClass(ExampleAnnotation)) {
        //print("simple name: ${declaration.simpleName.name}");
        //print(meta.getField(#value));
      }
    }
}

computing effective file name with a default directory

extension on Options {
  String get effectiveFileName {
    if (this.destinationFilePath != null) return this.destinationFilePath!;
    String scriptPath = Platform.script.toFilePath();
    String dirname = path.dirname(scriptPath);
    String mainFileName = path.basenameWithoutExtension(scriptPath);
    return '$dirname/$mainFileName.g.dart';
  }
}

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.