chapinb / chickadee

Yet another IP address enrichment tool

Home Page: https://chapinb.com/chickadee/

License: MIT License

Python 98.27% Shell 0.17% Makefile 0.61% Batchfile 0.95%
python forensics infosec geoip geolocation incident-response security

chickadee's Introduction

Chickadee

Yet another IP address enrichment tool.

         _          _
        ('<        >')
       \(_)________( \
        (___________)\\        _____ _     _      _             _
           (     )     \      / ____| |   (_)    | |           | |
            |   |            | |    | |__  _  ___| | ____ _  __| | ___  ___
            |   |            | |    | '_ \| |/ __| |/ / _` |/ _` |/ _ \/ _ \
            |   |            | |____| | | | | (__|   < (_| | (_| |  __/  __/
           _|   |_            \_____|_| |_|_|\___|_|\_\__,_|\__,_|\___|\___|
          (_______)


Supported IP address resolvers:

Documentation

This project's documentation is available in the docs/ folder, or hosted on GitHub at https://chapinb.com/chickadee/.

Specific documentation:

Known bugs

Below is a list of known bugs. Please report any new bugs you find, or submit a PR to patch any of the bugs below or ones you found on your own. No one is perfect :)

  • IPv6 addresses expressed in expanded form in the source document are not properly deduplicated against discovered IPv6 addresses in compressed form.
  • While you can provide multiple input files in the same invocation, IPs are deduplicated only within each input item. For example, if you provide a file and a folder as two inputs to the same invocation, chickadee will dedupe all IPs within the single file, then separately dedupe all IPs within the files in the directory. This means the same output may contain duplicate resolutions in this case.
  • JSON and CSV output will show column/field names even if a value is not present. Please open an issue if this does not support your use case.
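
The IPv6 deduplication bug above comes down to comparing two spellings of the same address. A minimal sketch of one possible fix, using the standard library `ipaddress` module to reduce every address to one canonical string before deduplicating. The function names `normalize_ip` and `dedupe_ips` are hypothetical and are not part of chickadee:

```python
import ipaddress


def normalize_ip(value):
    """Return the canonical text form of an IPv4 or IPv6 address."""
    # str() on an ipaddress object yields the compressed, lowercase form,
    # so expanded and compressed spellings map to the same string.
    return str(ipaddress.ip_address(value))


def dedupe_ips(candidates):
    """Collapse equivalent spellings, keeping first-seen order."""
    seen = {}
    for candidate in candidates:
        seen.setdefault(normalize_ip(candidate), None)
    return list(seen)
```

Normalizing at the point where an address is first recorded would let the rest of the pipeline keep comparing plain strings.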

Contributing

Please create a fork of the repository, make your changes, and submit a pull request for review!

You can always use the issues tab to suggest features and identify bugs.

chickadee's People

Contributors

chapinb, deepsourcebot, dependabot[bot], snyk-bot


chickadee's Issues

Cannot parse data from stdin

โฏ echo 'hi1.1.1.1.1world' | chickadee
2020-10-27 19:23:33,923 chickadee.py ERROR chickadee file_handler:341 - Failed to parse <_io.TextIOWrapper name='<stdin>' mode='r' encoding='UTF-8'>
2020-10-27 19:23:33,923 chickadee.py ERROR chickadee file_handler:342 - Error message: 'bytes' object has no attribute 'read'
2020-10-27 19:23:33,923 chickadee.py INFO chickadee run:260 - Extracted 0 distinct IPs
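
The error message suggests a bytes object reached code that expected a file-like object. A rough sketch of a handler that works on any text stream, including `sys.stdin`, is shown here. The regular expression is a simplified stand-in, not chickadee's own pattern, and `extract_ipv4` is a hypothetical name:

```python
import io
import ipaddress
import re

# Simplified candidate pattern; validation is delegated to ipaddress.
IPV4_CANDIDATE = re.compile(r"(?:\d{1,3}\.){3}\d{1,3}")


def extract_ipv4(stream):
    """Collect distinct, valid IPv4 addresses from a text stream."""
    found = set()
    for line in stream:  # works for sys.stdin, open files, io.StringIO
        for candidate in IPV4_CANDIDATE.findall(line):
            try:
                found.add(str(ipaddress.IPv4Address(candidate)))
            except ipaddress.AddressValueError:
                continue  # e.g. an octet above 255
    return found


# Stand-in for piped input; pass sys.stdin in a real command-line tool.
sample = io.StringIO("hi1.1.1.1.1world\n")
distinct = extract_ipv4(sample)
```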

Config File

Allow users to create a config file with defaults for fields and output formats. Either provide at the command line or search for it within the home directory.

Order of parameter parsing

  1. Defaults
  2. Config file
  3. Command line arguments
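
One way to get this precedence is to layer the three sources with `collections.ChainMap`, where earlier maps win. This is only a sketch: the setting names and default values are made up for illustration, and it assumes unset command line options arrive as `None`:

```python
from collections import ChainMap

# Hypothetical defaults, for illustration only.
DEFAULTS = {"fields": "query,country", "output_format": "json"}


def resolve_settings(cli_args, config_file, defaults=DEFAULTS):
    """Merge settings: command line beats config file beats defaults."""
    # Drop options the user did not set so they do not mask lower layers.
    cli_set = {key: value for key, value in cli_args.items() if value is not None}
    return dict(ChainMap(cli_set, config_file, defaults))


settings = resolve_settings(
    {"fields": None, "output_format": "csv"},  # from the command line
    {"fields": "query,isp"},                   # from the config file
)
```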

Fix field bug

Currently, backends contains a bug that prevents specifying custom fields.

Add update statement

Add a log statement to encourage the user to update their version of the script. Should check against PyPI.
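
A possible shape for this check, sketched with the standard library only. It assumes the package is published on PyPI under the name `chickadee` and that versions are dot-separated integers such as 20200802.0. All function names here are hypothetical:

```python
import json
import logging
import urllib.request

logger = logging.getLogger("chickadee.update")


def parse_version(text):
    """Turn '20200802.0' into (20200802, 0) for comparison."""
    return tuple(int(part) for part in text.split("."))


def fetch_latest_version(package="chickadee", timeout=5):
    """Ask the PyPI JSON API for the newest released version."""
    url = f"https://pypi.org/pypi/{package}/json"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)["info"]["version"]


def warn_if_outdated(installed, latest):
    """Log a warning and return True when a newer release exists."""
    if parse_version(installed) < parse_version(latest):
        logger.warning(
            "Version %s is installed but %s is available; consider upgrading.",
            installed,
            latest,
        )
        return True
    return False
```

The network call is kept in its own function so the comparison can be tested offline and so a failed lookup can be ignored without affecting a run.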

Can't install via pip on macOS?

Until #32 is landed in Homebrew, I'm trying to do a manual install on macOS 13.4, but hitting a wall 😢

Here is what I'm trying... I tried to follow the Installation docs as closely as possible

Versions

macOS = 13.4
python3 = 3.11.3
pip = 23.1.2
virtualenv = 20.23.0

Commands

N.B. on my system WORKON_HOME=~/.virtualenvs

cd $WORKON_HOME
git clone https://github.com/chapinb/chickadee.git
cd chickadee/
virtualenv -p python3 venv3
source venv3/bin/activate
pip install .

The error output

Processing /Users/luke/.virtualenvs/chickadee
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... error
  error: subprocess-exited-with-error

  × Preparing metadata (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> [20 lines of output]
      Traceback (most recent call last):
        File "/Users/luke/.virtualenvs/chickadee/venv3/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
          main()
        File "/Users/luke/.virtualenvs/chickadee/venv3/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
          json_out['return_val'] = hook(**hook_input['kwargs'])
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/Users/luke/.virtualenvs/chickadee/venv3/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 149, in prepare_metadata_for_build_wheel
          return hook(metadata_directory, config_settings)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/private/var/folders/n1/kwxhk1g96vl592mftk5lwntc0000gn/T/pip-build-env-tii_4j18/overlay/lib/python3.11/site-packages/poetry/core/masonry/api.py", line 42, in prepare_metadata_for_build_wheel
          builder = WheelBuilder(poetry)
                    ^^^^^^^^^^^^^^^^^^^^
        File "/private/var/folders/n1/kwxhk1g96vl592mftk5lwntc0000gn/T/pip-build-env-tii_4j18/overlay/lib/python3.11/site-packages/poetry/core/masonry/builders/wheel.py", line 60, in __init__
          super().__init__(poetry, executable=executable)
        File "/private/var/folders/n1/kwxhk1g96vl592mftk5lwntc0000gn/T/pip-build-env-tii_4j18/overlay/lib/python3.11/site-packages/poetry/core/masonry/builders/builder.py", line 85, in __init__
          self._module = Module(
                         ^^^^^^^
        File "/private/var/folders/n1/kwxhk1g96vl592mftk5lwntc0000gn/T/pip-build-env-tii_4j18/overlay/lib/python3.11/site-packages/poetry/core/masonry/utils/module.py", line 69, in __init__
          raise ModuleOrPackageNotFound(
      poetry.core.masonry.utils.module.ModuleOrPackageNotFound: No file/folder found for package chickadee
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

More unit testing

  • multi-lingual support
  • large IP sets (testing rate limiting)
  • testing API key usage
  • testing of stdin handling

Cache recent resolver results

Since some APIs are rate limited, speed up resolutions by caching recent results from resolvers.

Considerations:

  • Where should we store the cache?
    • Platform specific. In common program/application data, preferably per user.
  • What structure/format to use for the cache?
    • Consider leveraging JSON to serialize the data
    • In addition to the results, it should hold the time it was cached and associated resolver name (ie ip-api)
  • How long to keep a result cached?
    • Default to 24-hours?
  • Allow ignoring cache on execution
  • Allow clearing cache on execution
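
A minimal sketch of how these considerations could fit together: a JSON file that stores each result with its cache time and resolver name, a 24-hour default lifetime, and a way to clear it. The class name, key layout, and file location are assumptions, not an existing chickadee API:

```python
import json
import time
from pathlib import Path

DEFAULT_TTL = 24 * 60 * 60  # 24 hours, in seconds


class ResolverCache:
    """JSON-backed cache keyed by resolver name and IP address."""

    def __init__(self, path, ttl=DEFAULT_TTL, clock=time.time):
        self.path = Path(path)
        self.ttl = ttl
        self.clock = clock
        self.entries = {}
        if self.path.exists():
            self.entries = json.loads(self.path.read_text(encoding="utf-8"))

    @staticmethod
    def _key(resolver, ip):
        return f"{resolver}:{ip}"

    def get(self, resolver, ip):
        """Return a cached result, or None if missing or expired."""
        entry = self.entries.get(self._key(resolver, ip))
        if entry is None or self.clock() - entry["cached_at"] > self.ttl:
            return None
        return entry["result"]

    def put(self, resolver, ip, result):
        self.entries[self._key(resolver, ip)] = {
            "resolver": resolver,
            "cached_at": self.clock(),
            "result": result,
        }

    def save(self):
        self.path.write_text(json.dumps(self.entries), encoding="utf-8")

    def clear(self):
        """Forget all entries and remove the cache file if present."""
        self.entries = {}
        if self.path.exists():
            self.path.unlink()
```

Ignoring the cache on a given run can then be as simple as not calling `get`, while still calling `put` so later runs benefit.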

Add more unittesting

Unit tests to include:

  • testing of argument and logging
  • testing of output writing
  • multi-lingual support
  • large IP sets (testing rate limiting)
  • testing API key usage
  • testing for config file parsing
  • testing for parameter inheritance

Function Name Printed on Execution

When running the script, the function name is printed out and is slightly confusing, as in the following example:

2019-10-10 17:15:00,075 chickadee.py INFO chickadee entry 219 Starting Chickadee
2019-10-10 17:15:06,721 chickadee.py INFO chickadee entry 226 Chickadee complete

Would also like to see output here for "X number of lines parsed, ### being processed after deduplication"

Create documentation

Things to include in the documentation:

  • installation instructions
  • example usage
    • read from args, read from file, read from dir
    • outputting to files, to jq, to csvlook
    • changing the fields to support different analysis types, quick vlookups
  • module documentation for library usage
  • contribution guide
  • site hosting the documentation

Use YAML/TOML for config file format

Is your feature request related to a problem? Please describe.
It is difficult to store complex objects in the current .ini format. Using YAML/TOML will allow easier storage

Describe the solution you'd like
Use a YAML/TOML format to specify the same features in a hierarchical format that is easy to implement and parse.

Describe alternatives you've considered
Attempted .ini formats without luck

Additional context
N/A

Chickadee passes fields as string instead of list to resolver

Describe the bug
A clear and concise description of what the bug is.

To Reproduce
Steps to reproduce the behavior. Please share (as you can):

  1. Sample IP addresses causing issue: 1.1.1.1
  2. Arguments used to invoke chickadee: chick = Chickadee(); chick.run('1.1.1.1')
  3. Error message or chickadee.log file: Returns [{"count": 0}]

Version (please complete the following information):

  • OS: Windows and macOS
  • Version: d826822
  • Python version: 3.7.2

Additional context
Appears when you provide an IP address via API call.
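
A defensive option is to normalize the fields value at the library boundary, so callers may pass either a comma-separated string or a list. This is a sketch with a hypothetical helper name, not the fix that was applied in the repository:

```python
def normalize_fields(fields):
    """Accept 'a,b', ['a', 'b'], or None and always return a list."""
    if fields is None:
        return []
    if isinstance(fields, str):
        fields = fields.split(",")
    # Trim whitespace and drop empty entries such as a trailing comma.
    return [name.strip() for name in fields if name.strip()]
```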

Handle 429 status codes (too many requests)

Requirements:

  • Catch 429 error codes, delay until the next minute and try again with the same dataset.
  • Put this logic in a while loop.
  • Log a message to the end-user to let them know what's happening.
  • Create a unit test with a data set of 18k distinct IP addresses
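
The first three requirements could be sketched as below. The `send` callable is a stand-in for whatever performs the HTTP request and is assumed to return a `(status_code, body)` tuple. The attempt cap is an addition so the loop cannot run forever:

```python
import logging
import time

logger = logging.getLogger("chickadee.ratelimit")


def seconds_until_next_minute(now):
    """Seconds remaining until the next minute boundary."""
    return 60 - (now % 60)


def query_with_retry(send, payload, sleep=time.sleep, clock=time.time,
                     max_attempts=5):
    """Call send(payload); on HTTP 429, wait for the next minute and retry."""
    attempts = 0
    while True:
        attempts += 1
        status, body = send(payload)
        if status != 429:
            return status, body
        if attempts >= max_attempts:
            raise RuntimeError(f"still rate limited after {attempts} attempts")
        delay = seconds_until_next_minute(clock())
        logger.warning("Rate limited (HTTP 429); retrying in %.0f seconds", delay)
        sleep(delay)
```

Passing `sleep` and `clock` as parameters lets a unit test exercise the retry path without actually waiting.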

Allow specification of meta fields

Allow users to specify syntax to concatenate fields into a single field on output. Ie region|country would result in a field of region_country and value of Texas, US.
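
A small sketch of the idea, assuming `|` separates the source fields in the spec, `_` joins them in the new field name, and `, ` joins the values (as in the Texas, US example). The function name and the `separator` parameter are hypothetical:

```python
def apply_meta_fields(record, meta_specs, separator=", "):
    """Add one combined field per spec such as 'region|country'."""
    enriched = dict(record)  # leave the caller's record untouched
    for spec in meta_specs:
        names = spec.split("|")
        enriched["_".join(names)] = separator.join(
            str(record.get(name, "")) for name in names
        )
    return enriched


row = apply_meta_fields({"region": "Texas", "country": "US"}, ["region|country"])
```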

Refactor code to use interfaces

Is your feature request related to a problem? Please describe.

This project is difficult to integrate into other code bases as a library:

  • The entry points are unclear
  • The steps required to complete an action are not intuitive
  • You need to review the source code in addition to the documentation to determine how to leverage the library

Describe the solution you'd like

A potential solution is to use interfaces and restructure the library to have clear workflows for common use cases. Some use cases include:

  • Resolving an IP address against one or more resolvers
  • Extracting an IP address from an input using a parser
  • Leveraging utilities that can convert IP addresses to integers, detect that they are BOGON, etc.
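
To make the proposal concrete, here is one hypothetical layout using abstract base classes. None of these class or function names exist in the library today. They only illustrate what a clearer entry point might look like:

```python
import ipaddress
from abc import ABC, abstractmethod


class Parser(ABC):
    """Extracts IP addresses from some input."""

    @abstractmethod
    def extract(self, data):
        """Return an iterable of IP address strings found in data."""


class Resolver(ABC):
    """Enriches IP addresses using one data source."""

    @abstractmethod
    def resolve(self, ips):
        """Return a list of result dicts, one per IP address."""


def enrich(data, parser, resolvers):
    """Single workflow: parse once, then query every resolver."""
    ips = list(parser.extract(data))
    return {type(resolver).__name__: resolver.resolve(ips) for resolver in resolvers}


def ip_to_int(value):
    """Utility: convert an IPv4 or IPv6 address to its integer form."""
    return int(ipaddress.ip_address(value))
```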

Describe alternatives you've considered

A clear and concise description of any alternative solutions or features you've considered.

Additional context

Some useful references:

Add progress bars

This could be useful in cases where:

  • The user chooses to install tqdm (otherwise limit dependencies)
  • There are a large number of IPs to query
  • There are a large number of folders/files to recurse
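
Keeping tqdm optional can be done with a guarded import and a no-op fallback, roughly as follows. `resolve_all` and `resolve_one` are placeholder names for illustration:

```python
try:
    from tqdm import tqdm  # optional dependency
except ImportError:
    def tqdm(iterable, **kwargs):
        """Fallback: return the iterable unchanged, with no progress bar."""
        return iterable


def resolve_all(ips, resolve_one):
    """Resolve each IP, showing a progress bar when tqdm is installed."""
    return [resolve_one(ip) for ip in tqdm(ips, desc="Resolving", unit="IP")]
```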

Add cfwho.com API support

Allows submission of one or more comma-delimited IPs per request. Rate limited to 1 request per second. The maximum number of IPs per request is unknown, so we should aim for 5-15 IPs per request.

https://cfwho.com/api/v1/1.1.1.1

{
    "success": true,
    "results": [
        {
            "asn": [
                "13335"
            ],
            "cidr": "1.1.1.0/24",
            "netname": "APNIC-LABS",
            "location": "AU",
            "ip": "1.1.1.1",
            "ip_class": "4",
            "contacts": {
                "abuse": [
                    "[email protected]"
                ]
            },
            "services": {
                "abusix": [
                    "[email protected]"
                ],
                "bgpview": [
                    "[email protected]"
                ],
                "rdap": []
            }
        }
    ],
    "results_info": {
        "count": 1,
        "cached": 0,
        "bypass": 1,
        "version": "20191218.1000"
    }
}

https://cfwho.com/api/v1/1.1.1.1,2.2.2.2,3.3.3.3

{
    "success": true,
    "results": [
        {
            "asn": [
                "13335"
            ],
            "cidr": "1.1.1.0/24",
            "netname": "APNIC-LABS",
            "location": "AU",
            "ip": "1.1.1.1",
            "ip_class": "4",
            "contacts": {
                "abuse": [
                    "[email protected]"
                ]
            },
            "services": {
                "abusix": [
                    "[email protected]"
                ],
                "bgpview": [
                    "[email protected]"
                ],
                "rdap": []
            }
        },
        {
            "asn": [
                "3215"
            ],
            "cidr": "2.2.0.0/16",
            "netname": "FR-TELECOM-20100712",
            "location": "FR",
            "ip": "2.2.2.2",
            "ip_class": "4",
            "contacts": {
                "abuse": [
                    "[email protected]",
                    "[email protected]",
                    "[email protected]"
                ]
            },
            "services": {
                "abusix": [
                    "[email protected]"
                ],
                "bgpview": [
                    "[email protected]",
                    "[email protected]"
                ],
                "rdap": []
            }
        },
        {
            "asn": [
                "None"
            ],
            "cidr": null,
            "netname": "AT-88-Z",
            "location": "US",
            "ip": "3.3.3.3",
            "ip_class": "4",
            "contacts": {
                "abuse": [
                    "[email protected]"
                ]
            },
            "services": {
                "abusix": [
                    "[email protected]"
                ],
                "bgpview": [],
                "rdap": []
            }
        }
    ],
    "results_info": {
        "count": 3,
        "cached": 0,
        "bypass": 1,
        "version": "20191218.1000"
    }
}
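
Based only on the URL format and response shape shown above, a resolver could batch IPs and index the results roughly like this. The batch size of 10 is an arbitrary pick inside the suggested 5-15 range, the helper names are hypothetical, and the HTTP call and one-second pause between requests are left out:

```python
def chunk(items, size=10):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_url(ips, base="https://cfwho.com/api/v1/"):
    """Build the request URL for a comma-delimited batch of IPs."""
    return base + ",".join(ips)


def index_results(response):
    """Map each returned IP to its result entry; empty on failure."""
    if not response.get("success"):
        return {}
    return {entry["ip"]: entry for entry in response.get("results", [])}
```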

Add support for ip-api.com pro key

Include option to use "single" API (versus batch).

If single API and pro key provided, enable a threaded query approach for improved resolve time.
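
The threaded approach could be as small as a thread pool mapped over the IPs. In this sketch `resolve_one` stands in for a function that performs one single-IP request, and the worker count is an arbitrary default:

```python
from concurrent.futures import ThreadPoolExecutor


def resolve_threaded(ips, resolve_one, max_workers=8):
    """Resolve IPs concurrently; results come back in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(resolve_one, ips))
```

Threads suit this workload because each request spends most of its time waiting on the network.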

Code cleanup

Spend some time reducing complexity, adding more documentation, and cleaning up confusing variable names and functionality.

Issue with frequency information in single IP on command line

Partial error:

$ chickadee -p -s 1.1.1.1
[...]
2020-01-14 23:15:00,303 chickadee.py INFO chickadee resolve:181 - Resolved IPs
Traceback (most recent call last):
  File "/usr/local/bin/chickadee", line 8, in <module>
    sys.exit(entry())
  File "/usr/local/lib/python3.7/site-packages/libchickadee/chickadee.py", line 360, in entry
    res = chickadee.run(x)
  File "/usr/local/lib/python3.7/site-packages/libchickadee/chickadee.py", line 85, in run
    results = self.resolve(result_dict)
  File "/usr/local/lib/python3.7/site-packages/libchickadee/chickadee.py", line 187, in resolve
    query = str(result.get('query', ''))
AttributeError: 'list' object has no attribute 'get'

VirusTotal rate limiter does not store last request time

Describe the bug

The sleeper function checks the last request time to determine how long to sleep for. This value is only set at initialization and does not update per-request.
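
For comparison, a limiter that records the time of every request might look like the sketch below. The class is hypothetical and is not the code in the VirusTotal backend:

```python
import time


class RateLimiter:
    """Enforce a minimum interval between consecutive requests."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self.clock = clock
        self.sleep = sleep
        self.last_request = None  # no request made yet

    def wait(self):
        """Sleep if needed, then record the time of this request."""
        if self.last_request is not None:
            remaining = self.min_interval - (self.clock() - self.last_request)
            if remaining > 0:
                self.sleep(remaining)
        # Updating here, on every call, is the step the bug report says is missing.
        self.last_request = self.clock()
```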

Version (please complete the following information):

  • OS: Any (macOS)
  • Version: 20200802.0
  • Python version: 3.7.7

Additional context

Move parsers to use base class

Create a base class for the parsers.

Should contain check_ips()

    def check_ips(self, data):
        """Check data for IP addresses. Results stored in ``self.ips``.

        Args:
            data (str): String to search for IP address content.

        Returns:
            None
        """
        # Count every IPv4 match as found.
        for ipv4 in IPv4Pattern.findall(data):
            self.ips[ipv4] = self.ips.get(ipv4, 0) + 1
        # Apply strip_ipv6 once per match, then count the stripped form.
        for ipv6 in IPv6Pattern.findall(data):
            stripped = strip_ipv6(ipv6)
            self.ips[stripped] = self.ips.get(stripped, 0) + 1

Add frequency count in output

Would require additional handling for inputs. Currently IPs go to a set, would need to track number of instances identified. Possibly with a dictionary where keys are IPs and value is count?
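
The dictionary idea maps directly onto `collections.Counter`. A sketch of turning a list of extracted IPs into per-IP records with a count. The record keys mirror the `query` and `count` fields seen in chickadee output, but the function itself is hypothetical:

```python
from collections import Counter


def count_ips(found):
    """Return one record per distinct IP with its number of occurrences."""
    counts = Counter(found)
    return [{"query": ip, "count": total} for ip, total in counts.items()]


records = count_ips(["1.1.1.1", "2.2.2.2", "1.1.1.1"])
```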

Reduce log messages

Describe the bug
When used as a library, Chickadee outputs one message per request saying "API key found". In some cases, this fills up log files.
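
One low-impact option is a logging filter that lets each distinct message through only once. This is a sketch of that idea, not the change made in the project. Lowering the message to DEBUG or logging it only at initialization would be alternatives:

```python
import logging


class OncePerMessageFilter(logging.Filter):
    """Suppress log records whose level and text were already emitted."""

    def __init__(self):
        super().__init__()
        self._seen = set()

    def filter(self, record):
        key = (record.levelno, record.getMessage())
        if key in self._seen:
            return False  # drop the repeat
        self._seen.add(key)
        return True
```

Attaching it with `logger.addFilter(OncePerMessageFilter())` affects only records logged directly on that logger.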

Certain fields not working via config file

In previous versions of chickadee, the mobile, proxy, & hosting fields were displayed by default. These bad boys have gone away as of version 20201125.0. I decided to create a config file with the fields I'd like, and while the file is being picked up and parsed correctly, the fields within are not showing. The fields will show if I add them using the -f parameter.

To Reproduce

Showing the fields config in the chickadee.ini file, then attempting to look up 1.1.1[.]1 using this config file. The fields don't show. Forcing the fields via -f works:

fields = status,message,continent,continentCode,country,countryCode,region,regionName,city,district,zip,lat,lon,timezone,currency,isp,org,as,asname,reverse,mobile,proxy,hosting,query
$ chickadee -c ~/.chickadee.log 1.1.1.1 | jq
2020-12-05 13:34:52,631 chickadee.py INFO chickadee run:259 - Extracted 1 distinct IPs
{
  "status": "success",
  "country": "Australia",
  "countryCode": "AU",
  "region": "QLD",
  "regionName": "Queensland",
  "city": "South Brisbane",
  "zip": "4101",
  "lat": -27.4766,
  "lon": 153.0166,
  "timezone": "Australia/Brisbane",
  "isp": "Cloudflare, Inc",
  "org": "APNIC and Cloudflare DNS Resolver project",
  "as": "AS13335 Cloudflare, Inc.",
  "query": "1.1.1.1",
  "count": 1
}
$ chickadee -c ~/.chickadee.log -f status,message,continent,continentCode,country,countryCode,region,regionName,city,district,zip,lat,lon,timezone,currency,isp,org,as,asname,reverse,mobile,proxy,hosting,query 1.1.1.1 | jq
2020-12-05 13:35:14,644 chickadee.py INFO chickadee run:259 - Extracted 1 distinct IPs
{
  "status": "success",
  "continent": "Oceania",
  "continentCode": "OC",
  "country": "Australia",
  "countryCode": "AU",
  "region": "QLD",
  "regionName": "Queensland",
  "city": "South Brisbane",
  "district": "",
  "zip": "4101",
  "lat": -27.4766,
  "lon": 153.0166,
  "timezone": "Australia/Brisbane",
  "currency": "AUD",
  "isp": "Cloudflare, Inc",
  "org": "APNIC and Cloudflare DNS Resolver project",
  "as": "AS13335 Cloudflare, Inc.",
  "asname": "CLOUDFLARENET",
  "mobile": false,
  "proxy": false,
  "hosting": true,
  "query": "1.1.1.1",
  "message": null,
  "reverse": null
}

Another example with the fields I personally want in each run (versus all fields as shown above):

fields = query,count,status,country,countryCode,region,regionName,city,zip,lat,lon,timezone,isp,org,as,mobile,proxy,hosting
$ chickadee -c ~/.chickadee.log 1.1.1.1 | jq
2020-12-05 13:37:05,946 chickadee.py INFO chickadee run:259 - Extracted 1 distinct IPs
{
  "status": "success",
  "country": "Australia",
  "countryCode": "AU",
  "region": "QLD",
  "regionName": "Queensland",
  "city": "South Brisbane",
  "zip": "4101",
  "lat": -27.4766,
  "lon": 153.0166,
  "timezone": "Australia/Brisbane",
  "isp": "Cloudflare, Inc",
  "org": "APNIC and Cloudflare DNS Resolver project",
  "as": "AS13335 Cloudflare, Inc.",
  "query": "1.1.1.1",
  "count": 1
}
$ chickadee -c ~/.chickadee.log -f query,count,status,country,countryCode,region,regionName,city,zip,lat,lon,timezone,isp,org,as,mobile,proxy,hosting 1.1.1.1 | jq
2020-12-05 13:37:47,858 chickadee.py INFO chickadee run:259 - Extracted 1 distinct IPs
{
  "status": "success",
  "country": "Australia",
  "countryCode": "AU",
  "region": "QLD",
  "regionName": "Queensland",
  "city": "South Brisbane",
  "zip": "4101",
  "lat": -27.4766,
  "lon": 153.0166,
  "timezone": "Australia/Brisbane",
  "isp": "Cloudflare, Inc",
  "org": "APNIC and Cloudflare DNS Resolver project",
  "as": "AS13335 Cloudflare, Inc.",
  "mobile": false,
  "proxy": false,
  "hosting": true,
  "query": "1.1.1.1",
  "count": 1
}

And finally a default run with no config file provided (or available within the lookup path structure), showing that mobile, proxy, and hosting do not display:

2020-12-05 13:38:49,706 chickadee.py INFO chickadee run:259 - Extracted 1 distinct IPs
{
  "status": "success",
  "country": "Australia",
  "countryCode": "AU",
  "region": "QLD",
  "regionName": "Queensland",
  "city": "South Brisbane",
  "zip": "4101",
  "lat": -27.4766,
  "lon": 153.0166,
  "timezone": "Australia/Brisbane",
  "isp": "Cloudflare, Inc",
  "org": "APNIC and Cloudflare DNS Resolver project",
  "as": "AS13335 Cloudflare, Inc.",
  "query": "1.1.1.1",
  "count": 1
}

Version (please complete the following information):

  • OS: macOS 10.15.7 19H2
  • Version: 20201125.0
  • Python version 3.9.0

Additional context
none :)
