
ipfs-geoip's Issues

Previous Data

@jbenet here:

[geoip lookups] should work in previous releases too. we should be able to regenerate the exact data. the database has versions. so does the codebase.

Is there any way we can do this?

Update the dataset

I think the current dataset is quite old. We could also use directory sharding if js-ipfs supports reading it.

Dataset Updating Plan

While planning to move to the new dataset (see #63), I found some problems I need help with!

Field, information, field and more information

First of all, right now, we have the following data for each location:

{
  "country_code": "US",
  "country_name": "United States",
  "region_code": "CA",
  "city": "Mountain View",
  "postal_code": "94040",
  "latitude": 37.3860,
  "longitude": -122.0838,
  "metro_code": "807",
  "area_code": "650",
  "planet": "Earth"
}

The new datasets contain much more than that:

is_anonymous_proxy
is_satellite_provider
postal_code
latitude
longitude
accuracy_radius
locale_code
continent_code
continent_name
country_iso_code
country_name
subdivision_1_iso_code
subdivision_1_name
subdivision_2_iso_code
subdivision_2_name
city_name
metro_code
time_zone
is_in_european_union

I am pretty sure we don't need all of those fields, so the first goal of this issue is to define which information we want to provide through this package.
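To make the field question concrete, here is a minimal sketch of reducing a row of the new dataset to the record shape we publish today. The mapping between column names and record keys is my assumption, and `FIELD_MAP` and `toLegacyRecord` are hypothetical names, not part of ipfs-geoip.

```javascript
// Assumed mapping from new-dataset column names to the record keys used today.
// Columns that are not listed here (time_zone, accuracy_radius, ...) are dropped.
const FIELD_MAP = {
  country_iso_code: 'country_code',
  country_name: 'country_name',
  subdivision_1_iso_code: 'region_code',
  city_name: 'city',
  postal_code: 'postal_code',
  latitude: 'latitude',
  longitude: 'longitude',
  metro_code: 'metro_code'
}

// Hypothetical helper: keep only the mapped columns and skip empty values.
function toLegacyRecord (row) {
  const record = { planet: 'Earth' }
  for (const [column, key] of Object.entries(FIELD_MAP)) {
    if (row[column] !== undefined && row[column] !== '') {
      record[key] = row[column]
    }
  }
  return record
}
```

Adding a field later would then only mean adding one entry to the map, at the cost of a larger record for every client.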

IPv6

The second issue is: how to support IPv6 (#60)? The newest dataset has an IPv6 table too! Just like for IPv4, we are provided with CIDR blocks that tell us the address range each entry covers. However, unlike IPv4, an IPv6 address is 128 bits wide, so it has no equivalent of the 32-bit "int long" form and we can't keep the same structure as we have now for IPv4.

Knowing this, how would you suggest tackling this issue? How should we organize the information so that we can fetch it quickly?
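One possible direction, as a sketch only: JavaScript's BigInt can hold a 128-bit value, so an IPv6 CIDR block can still be turned into a numeric [first, last] range like the IPv4 one. The helpers below (`parseIPv6`, `cidrToRange`, `toKey`) are hypothetical and not part of ipfs-geoip, and the parser does not handle the dotted IPv4-mapped notation.

```javascript
// Convert a textual IPv6 address (optionally using "::") into a 128-bit BigInt.
function parseIPv6 (address) {
  const parts = address.split('::')
  if (parts.length > 2) throw new Error(`invalid IPv6 address: ${address}`)
  const head = parts[0] ? parts[0].split(':') : []
  const tail = parts[1] ? parts[1].split(':') : []
  const missing = 8 - head.length - tail.length
  const compressed = parts.length === 2
  if (compressed ? missing < 1 : missing !== 0) {
    throw new Error(`invalid IPv6 address: ${address}`)
  }
  // "::" stands for the run of all-zero groups that was left out.
  const groups = [...head, ...Array(missing).fill('0'), ...tail]
  return groups.reduce((acc, group) => {
    if (!/^[0-9a-fA-F]{1,4}$/.test(group)) {
      throw new Error(`invalid IPv6 group: ${group}`)
    }
    return (acc << 16n) | BigInt(parseInt(group, 16))
  }, 0n)
}

// Turn "address/prefix" into the first and last address of the block.
function cidrToRange (cidr) {
  const [address, prefix] = cidr.split('/')
  if (!/^\d+$/.test(prefix) || Number(prefix) > 128) {
    throw new Error(`invalid IPv6 CIDR: ${cidr}`)
  }
  const hostBits = BigInt(128 - Number(prefix))
  const first = (parseIPv6(address) >> hostBits) << hostBits
  const last = first | ((1n << hostBits) - 1n)
  return { first, last }
}

// Fixed-width hex key: string order matches numeric order,
// so keys stay sortable even where BigInt cannot be stored directly.
function toKey (value) {
  return value.toString(16).padStart(32, '0')
}

// Range check for a single lookup.
function contains (range, ip) {
  return ip >= range.first && ip <= range.last
}
```

For example, `contains(cidrToRange('2001:db8::/32'), parseIPv6('2001:db8::1'))` is true. Whether the tree should store BigInt values, fixed-width strings or raw bytes is a separate question.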

Languages?

The new dataset provides translations for just some languages. Are they worth including or shall we keep just the English ones for now?


Also, I am thinking about setting up a way of updating the geoip database automatically, since they update it every Tuesday. It would be great, as we wouldn't need to think much about this (perhaps just merging a PR with the newer CID).


Ping @lidel

which GeoLite dataset?

the readme says to generate the tree from path/GeoLite-Blocks.csv, but it's not clear which file that is: the page http://dev.maxmind.com/geoip/legacy/geolite/ lists several possibilities.

Would be good to include a mapping (on the readme or another file) of:

  • the original import csv (src url + IPFS url -- let's back it up!)
  • the generated geo-ip tree root

that way we can make sure to back them all up as we increase versions.

maybe the list of refs can be -- itself -- an ipfs node. that way we can just back up that root to backup all versions ever. (( we need to come up with a good way of doing this that's friendly with git, github, and ipfs -- i've been using "published-version" files, but this isn't the best thing ever))

formats

wonder how to reconcile json / protobuf dichotomies in ipfs. json is nice for the human readability, and ease of use. protobuf may be better for lookups in datastructures.


btw, @krl awesome work here

Cleanup Configs to Generate Tree-Shakable ESM

This relates to:

The way we're generating ESM right now transpiles src into ESM that exports the interfaces required for performing geo-ip lookups. This works well for all agents that support module types and allow import/export syntax (e.g. browsers, Node, etc.), except for the dependency issues in #100.

However, this takes away the ability to tree-shake the module when ipfs-geoip is included as a dependency of, say, ipfs-webui, because we're unable to bundle it properly, e.g. https://github.com/ipfs-shipyard/ipfs-geoip/actions/runs/3287072521/jobs/5415859864#step:5:124

Action items:

  • Clean up configs to build ESM that is valid in both Node-like and browser contexts
  • Establish that imports are tree-shakeable
  • Set up better defaults to check this in aegir.

Would love to hear thoughts on this @SgtPooki, @lidel

Add locale (i18n) support

The new source dataset format introduced in #80 provides country and city names in languages other than English
(at the time of writing this, we have names in: de, en, es, fr, ja, pt-BR, ru and zh-CN).

We could add support for passing an optional language code to the lookup method.

Details of how to modify the b-tree format remain TBD.

Open questions:

  • should we have a separate b-tree for each language, or should we keep all translations in a single tree?
    • if it is a single tree, how do we ensure the client is not fetching strings that it does not need?
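To illustrate the single-tree option, here is a rough sketch of how a lookup could resolve a requested language against a record that carries all translations. The record shape (names keyed by locale code) and the `localizedName` helper are assumptions for discussion, not the current format or API.

```javascript
// Hypothetical record: names keyed by the locale codes found in the dataset.
const germany = {
  country_code: 'DE',
  country_name: { en: 'Germany', de: 'Deutschland', 'pt-BR': 'Alemanha' }
}

// Resolve a name: exact locale first, then any locale that shares the
// base language (e.g. "pt" or "pt-PT" -> "pt-BR"), then English.
function localizedName (names, lang = 'en') {
  const has = (key) => Object.prototype.hasOwnProperty.call(names, key)
  if (has(lang)) return names[lang]
  const base = lang.split('-')[0].toLowerCase()
  const related = Object.keys(names).find(
    (key) => key.split('-')[0].toLowerCase() === base
  )
  if (related !== undefined) return names[related]
  return names.en
}
```

With this shape, `localizedName(germany.country_name, 'pt')` returns 'Alemanha' and an unknown language falls back to 'Germany'. It also shows the cost raised above: the client fetches every translation even when it needs only one.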

Move to ipfs-shipyard?

We now have an org to incubate projects that are not part of the core implementation of the protocol or discussion of the spec. That org is ipfs-shipyard, created from ipfs/team-mgmt#448.

Short description:

IPFS Shipyard is a venue for the community to pursue and collaborate on research experiments, products, code libraries and more around the IPFS project. It is where innovation in userland happens and where we discover and form new primitives to push to the core of IPFS.

Anyone opposing?

Update to work with latest js-ipfs-http-client

@SgtPooki noted that this library does not work with the latest version of https://www.npmjs.com/package/ipfs-http-client.

👉 We need ipfs-geoip to work with the latest ipfs-http-client so we can use it in ipfs-webui and have no regressions on the Peers screen.

Some thoughts:

Switch to dag-cbor

Problem

This library is very old, and remembers the time before we had dag-cbor.
It stores stringified JSON in the data field of dag-pb nodes, which is not only inefficient and technical debt, but also an antipattern these days.

Solution
