nordic-institute / x-road-metrics

X-Road Metrics is a tool for collecting, storing and analysing reporting data and metrics from an X-Road® ecosystem.

License: MIT License

Python 90.46% Shell 1.57% JavaScript 1.74% CSS 0.58% HTML 3.81% R 1.84%
x-road monitoring hacktoberfest

x-road-metrics's People

Contributors

carohauta, ignasgro, mvjseppa, petkivim, raits, toomasmolder, vitalistupin, wisecrow

x-road-metrics's Issues

As a Metric user I want logs to report correct application version

Currently, application logs report that the application version is 1.0.0, even though the software has been updated to 1.1.0.

Software version:
1.1.0

Log example:

{"level": "INFO", "timestamp": 1678447142, "local_timestamp": "2023-03-10 13:19:02 +0200", "module": "collector", "activity": "collector_end", "msg": "Total collected: 1, Total error: 1, Total time: 00:00:00", "version": "1.0.0"}

Source example:
https://github.com/nordic-institute/X-Road-Metrics/blob/develop/collector_module/opmon_collector/__init__.py#L23
The problem exists in other modules as well.

Solution:
Bump versions in __init__.py files.

As an Administrator I would like to be able to rerun the database initialisation in case something goes wrong the first time

Currently, the xroad-metrics-init-postgresql script cannot be re-run if there were problems during the initialisation. One example: the users have already been created and must be manually dropped for the script to succeed.

Related scripts are available here:

The ticket this issue was created from can be found here: https://nordic-institute.atlassian.net/browse/OPMONDEV-149

Acceptance criteria:

  • The script is checked and updated so that it can be re-run if it fails halfway through; ideally, the changes can be rolled back
  • At least the case where the usernames already exist is fixed
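One way to make the user-creation step re-runnable is to guard each CREATE ROLE with an existence check. A minimal sketch (the role name is interpolated directly, so it is assumed to come from trusted configuration, not user input):

```python
def create_role_if_missing_sql(role: str) -> str:
    # Wrap CREATE ROLE in a DO block so re-running the init script does
    # not fail when the role already exists. The role name must come
    # from trusted configuration, since it is interpolated directly.
    return f"""
DO $$
BEGIN
    IF NOT EXISTS (SELECT FROM pg_roles WHERE rolname = '{role}') THEN
        CREATE ROLE {role} LOGIN;
    END IF;
END
$$;
"""
```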

Opendata Operational monitoring Delay!

X-Road monitoring data is collected from Estonian X-Road members' Security Servers, made available by the X-Road Center (Republic of Estonia Information System Authority, Riigi Infosüsteemi Amet, RIA), and published as open data with a delay of 10 days from the actual transaction execution time.

Is there any way to change this behavior and configure opendata as a real-time operational monitoring solution? Or is there another operational monitoring product that provides real-time monitoring instead of the 10-day delay?

As an Administrator I want to be able to configure Metrics components to use TLS client authentication for PostgreSQL so that the connection is more secure

Currently, there is no way to configure the components to connect to PostgreSQL using client TLS, even if PostgreSQL is configured to accept it. We should extend the settings so that the PostgreSQL connections can also be configured to use TLS. Currently, the following modules require access to PostgreSQL:

More information about the X-Road Metrics system architecture is available here.

The ticket this issue was created from is available here: https://nordic-institute.atlassian.net/browse/OPMONDEV-150

Acceptance criteria:

  • Client TLS can be configured in settings.yaml for PostgreSQL connections in all Metrics modules that use PostgreSQL
  • The old configuration continues to work without TLS
  • Documentation is updated to inform users about this option
  • The example settings.yaml files of all affected modules are updated
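For illustration, hypothetical settings keys (the key names below are assumptions, not the modules' actual schema) could be mapped to the standard libpq TLS parameters like this:

```python
def postgres_conn_params(settings: dict) -> dict:
    # Map hypothetical settings.yaml keys to libpq/psycopg2 connection
    # parameters. sslcert/sslkey enable TLS client authentication;
    # omitting them preserves the old non-TLS behaviour.
    params = {
        'host': settings['host'],
        'dbname': settings['database-name'],
        'user': settings['user'],
    }
    if settings.get('ssl-mode'):
        params['sslmode'] = settings['ssl-mode']            # e.g. 'verify-full'
    if settings.get('ssl-root-cert'):
        params['sslrootcert'] = settings['ssl-root-cert']   # CA certificate
    if settings.get('ssl-client-cert'):
        params['sslcert'] = settings['ssl-client-cert']     # client certificate
        params['sslkey'] = settings['ssl-client-key']       # client private key
    return params
```

The resulting dict could be passed to psycopg2.connect(**params); sslmode, sslrootcert, sslcert and sslkey are standard libpq parameter names.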

Problem with collector configuration

Hi!
I have a setup with a CS (ip: 10.114.80.16), a management Security Server (ip: 10.114.10.17), SS1 (ip: 10.114.10.18), SS2 (ip: 10.114.10.19), and a server for Metrics (ip: 10.114.80.20) on which the different modules (collector, corrector, reports) and MongoDB are installed.
I have the collector's settings.yaml configured like this:
(screenshot)

Where should I obtain the "tls-client-certificate" and "tls-client-key" files, and should "tls-server-certificate" then be set to 'true'?
I also get the following error when running "xroad-metrics-collector update":

File "/usr/lib/python3/dist-packages/pymongo/topology.py", line 208, in _select_servers_loop
raise ServerSelectionTimeoutError(
pymongo.errors.ServerSelectionTimeoutError: SSL handshake failed: 127.0.0.1:27017: EOF occurred in violation of protocol (_ssl.c:1131)

But when I run "xroad-metrics-collector list", it works correctly.
(screenshot)

This is how I currently have MongoDB configured:
(screenshot)

As a Metric user I want opendata DB initialization to not fail

Currently, the anonymizer fails to initialize an empty database because of a typo made while refactoring the code.

Software version:
1.1.0

Source:
https://github.com/nordic-institute/X-Road-Metrics/blob/develop/anonymizer_module/opmon_anonymizer/iio/postgresql_manager.py#L112

Error message:

Failed initializing postgresql database connector.
ERROR: Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/opmon_anonymizer/main.py", line 93, in setup_writer
    writer = OpenDataWriter(settings, logger)  
File "/usr/lib/python3/dist-packages/opmon_anonymizer/iio/opendata_writer.py", line 42, in __init__
    self.db_manager = PostgreSqlManager(settings['postgres'], schema, index_columns, logger)
File "/usr/lib/python3/dist-packages/opmon_anonymizer/iio/postgresql_manager.py", line 43, in __init__
    self._ensure_table(table_schema, index_columns)
File "/usr/lib/python3/dist-packages/opmon_anonymizer/iio/postgresql_manager.py", line 95, in _ensure_table
    if not self._table_exists(cursor):  
File "/usr/lib/python3/dist-packages/opmon_anonymizer/iio/postgresql_manager.py", line 112, in _table_exists
    """, (cursor._table_name,))
AttributeError: 'psycopg2.extensions.cursor' object has no attribute '_table_name'

Solution:
cursor._table_name should be renamed back to self._table_name, as it was before the change:
65a1357#diff-2de518dd9301590a33d32a07fdcdf45bcaba485b18e61d65f29e5f8dcfd0b3e5L110

Additional problems:

Source:

"GRANT USAGE ON SCHEMA public TO %s;", (readonly_user,)

"GRANT SELECT ON %s TO %s;", (self._table_name, readonly_user)

The "%s" notation should be used for escaping data, not identifiers such as table names and user names. The constructed SQL silently fails because of "except Exception: pass":

GRANT USAGE ON SCHEMA public TO 'opendata_inst';
GRANT SELECT ON 'logs' TO 'opendata_inst';

Solution:
Identifiers should be properly escaped: https://www.psycopg.org/docs/sql.html#module-usage
If possible, the exception handler should not silently swallow all errors.

The same problem with identifiers occurs in other places; for example, it is not possible to display opendata due to an SQL error caused by this bug:

"SELECT min(requestindate), max(requestindate) FROM %s;", (self._table_name,)
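The linked psycopg documentation recommends psycopg2.sql.Identifier for this. The idea can be illustrated with a minimal stand-in that applies PostgreSQL's double-quote rules (in real code, prefer the library helper):

```python
def quote_ident(name: str) -> str:
    # Minimal PostgreSQL identifier quoting: wrap in double quotes and
    # double any embedded double quotes. psycopg2.sql.Identifier does
    # this (and more) and should be preferred in production code.
    return '"' + name.replace('"', '""') + '"'

grant = f'GRANT SELECT ON {quote_ident("logs")} TO {quote_ident("opendata_inst")};'
```

Unlike the string-literal form produced by "%s", the quoted identifiers are valid targets for GRANT.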

As an Administrator I want to be able to configure Metrics components to use TLS client authentication for MongoDB so that the connection is more secure

Currently, there is no way to configure the components to connect to MongoDB using client TLS, even if MongoDB is configured to accept it. We should extend the settings so that MongoClient can also be configured to use TLS. Currently, the following modules require access to MongoDB:

More information about the X-Road Metrics system architecture is available here.

The ticket this issue was created from is available here: https://nordic-institute.atlassian.net/browse/OPMONDEV-147

Acceptance criteria:

  • Client TLS can be configured in settings.yaml for MongoDB connections in all Metrics modules that use MongoDB
  • The old configuration continues to work without TLS
  • Documentation is updated to inform users about this option
  • The example settings.yaml files of all affected modules are updated
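A sketch mapping hypothetical settings keys (the key names are assumptions) to standard MongoDB connection-string options; tls, tlsCAFile and tlsCertificateKeyFile are the documented URI option names understood by MongoClient:

```python
def mongo_uri(settings: dict) -> str:
    # Build a MongoDB connection URI from hypothetical settings keys,
    # appending TLS options only when they are configured so that the
    # old plain configuration keeps working unchanged.
    uri = f"mongodb://{settings['user']}:{settings['password']}@{settings['host']}/"
    options = []
    if settings.get('tls'):
        options.append('tls=true')
        if settings.get('tls-ca-file'):
            options.append('tlsCAFile=' + settings['tls-ca-file'])
        if settings.get('tls-client-cert-key-file'):
            # Client certificate and private key in one PEM file, as
            # expected by the tlsCertificateKeyFile URI option.
            options.append('tlsCertificateKeyFile=' + settings['tls-client-cert-key-file'])
    return uri + ('?' + '&'.join(options) if options else '')
```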

As a Metrics user I want the networking module to be updated so that it doesn't require excessive hardware resources

Sources:

Currently, the Networking module’s prepare_data process loads all entries from the PostgreSQL database into memory and aggregates the results there. This increases the amount of memory that must be made available to the module.

Suggestions:

  1. The prepare_data process should query already-aggregated data from PostgreSQL, for example via count() and GROUP BY. If needed, the required indexes should be added to the database.

This suggestion would lower the required amount of memory for the module since the data queried from the database is already aggregated and ready to be used.
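For illustration, an aggregation pushed into the database could look like this (the table and column names are assumptions based on the opendata schema, not verified against prepare_data):

```python
# Aggregate request counts per client/service pair inside PostgreSQL
# instead of fetching raw rows and counting them in memory.
AGGREGATED_QUERY = """
    SELECT clientmembercode, servicemembercode, count(*) AS request_count
    FROM logs
    GROUP BY clientmembercode, servicemembercode;
"""
```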

The ticket this issue is created from can be found here: https://nordic-institute.atlassian.net/browse/OPMONDEV-152

Acceptance criteria:

As a Developer I want to investigate adding indexes to the PostgreSQL database to make queries faster

X-Road Metrics has two data storages:

  1. MongoDB - used for storing raw metrics data that’s collected from Security Servers.
  2. PostgreSQL - used for storing anonymised metrics data that’s published as open data.

We have indexes for MongoDB, but we should also investigate if any indexes could be added to the PostgreSQL database to make queries faster. 

More information about the X-Road Metrics system architecture is available here.

The ticket this issue is created from is available here: https://nordic-institute.atlassian.net/browse/OPMONDEV-110

Acceptance criteria:

  • The PostgreSQL tables we use and the queries run against them are investigated. The following modules appear to be affected:
  • If there are places where adding an index would be beneficial, they are added during the setup process.
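As a hypothetical example: if queries frequently filter or aggregate on requestindate, a b-tree index could be created during setup (the actual benefit should be confirmed with EXPLAIN ANALYZE first):

```python
# Hypothetical index on the opendata logs table; whether it helps must
# be verified against the real query plans before adding it to setup.
INDEX_DDL = 'CREATE INDEX IF NOT EXISTS idx_logs_requestindate ON logs (requestindate);'
```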

As a Metrics user I want the opendata module to be updated so that it doesn't require excessive hardware resources

Sources:

Currently, the opendata module reads the entire PostgreSQL response into memory, then builds the TGZ archive in memory and forwards it via the HttpResponse method. This creates a very high (200+ GB) memory requirement, since all the data is held in memory.

Possible solutions:

  1. Instead of storing the entire PostgreSQL response in memory, use a cursor to query sets of data
  2. Compress the data using a Python iterator, instead of compressing all the entries at once
  3. Forward the response via StreamingHttpResponse, which would allow sending the response in smaller chunks

These changes should reduce the required memory from 200+ GB to a more reasonable level, meaning that the hardware requirements could be greatly reduced.
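Points 2 and 3 can be sketched with a generator that compresses rows incrementally; each yielded chunk could then be handed to Django's StreamingHttpResponse, with the rows supplied by a server-side cursor as in point 1 (the 64 KB flush threshold is an arbitrary choice):

```python
import gzip
import io

def gzip_chunks(rows, flush_at=64 * 1024):
    # Compress rows as they arrive from a server-side cursor instead of
    # building the whole archive in memory; yields compressed chunks
    # suitable for a streaming HTTP response.
    buf = io.BytesIO()
    gz = gzip.GzipFile(fileobj=buf, mode='wb')
    for row in rows:
        gz.write(row)
        if buf.tell() > flush_at:
            yield buf.getvalue()
            buf.seek(0)
            buf.truncate()
    gz.close()  # flush the remaining compressed data into buf
    yield buf.getvalue()
```

Usage could then look like StreamingHttpResponse(gzip_chunks(rows), content_type='application/gzip'), assuming rows is an iterable of already-encoded byte strings.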

The ticket this issue is created from can be found here: https://nordic-institute.atlassian.net/browse/OPMONDEV-151

Acceptance criteria:

Any Help for collector configuration

Hi!

When I try to run xroad-metrics-collector collect, I get an error, but xroad-metrics-collector list returns correctly.

collector collect returns:
(screenshot: Screenshot 2024-04-26 at 3 51 18 PM)

Collector settings:
(screenshot: Screenshot 2024-04-26 at 4 12 51 PM)

As a Metric user I want networking ssl_mode configuration parameter to work

Currently, the ssl_mode configuration parameter is ignored by the networking application:
https://github.com/nordic-institute/X-Road-Metrics/blob/develop/networking_module/prepare_data.R#L63-L83

It seems that odbc-postgresql ignores the PGSSLMODE environment variable, so the sslmode parameter must be put into the connection string instead.

For example the following code would fix the problem:

#if ("ssl_mode" %in% names(settings$postgres)) {
#    Sys.setenv(PGSSLMODE = settings$postgres$ssl_mode)
#}
if ("ssl_root_cert" %in% names(settings$postgres)) {
    Sys.setenv(PGSSLROOTCERT = settings$postgres$ssl_root_cert)
}

tryCatch(
  con <- odbcDriverConnect(
    connection = paste0(
      "Driver={PostgreSQL UNICODE}",
      ";Server=", settings$postgres$host,
      ";Port=", settings$postgres$port,
      ";Database=opendata_", settings$postgres$suffix,
      ";Uid=", settings$postgres$user,
      ";Pwd=", settings$postgres$password,
      ";UseDeclareFetch=1",
      ";Fetch=", fetchsize,
      ";sslmode=", settings$postgres$ssl_mode,
      ";"
    )
  ),

As a Metric user I want corrector queries to use DB indexes

Currently, the query executed by get_faulty_raw_documents() is slow because it does not use indexes:
https://github.com/nordic-institute/X-Road-Metrics/blob/develop/corrector_module/opmon_corrector/database_manager.py#L110-L118

It can be solved for new installations by adding a new index on raw_messages in xroad-metrics-init-mongo:
https://github.com/nordic-institute/X-Road-Metrics/blob/develop/collector_module/opmon_mongodb_maintenance/create_indexes.py#L32

[('corrected', ASC), ('xRequestId', ASC)]

Or by running indexing command manually:

db.raw_messages.createIndex({"corrected": 1, "xRequestId": 1})

Additional notes: It would be great to mention the need to create the new indexes in the release notes.

As a Metric user I want opendata DB primary key to use bigint instead of integer

Opendata DB primary key sequence runs out of integer (int4) values in larger X-Road instances. Bigint should be used instead.

Software version:
1.1.1, 1.2.0

Host OS and version:
Ubuntu 20.04

The schema in the anonymizer defines bigint as the id column type:

But the id column type is discarded during schema object creation:

Then, during table creation, the id column type is hardcoded to SERIAL:

cursor.execute(f'CREATE TABLE {self._table_name} (id SERIAL PRIMARY KEY{column_schema});')

Example error log of anonymizer:

File "/usr/lib/python3/dist-packages/opmon_anonymizer/iio/postgresql_manager.py", line 58, in add_data
    cursor.execute(query)
psycopg2.errors.SequenceGeneratorLimitExceeded: nextval: reached maximum value of sequence "logs_id_seq" (2147483647)

Proposed solution:

Replace id SERIAL PRIMARY KEY with id BIGINT PRIMARY KEY GENERATED ALWAYS AS IDENTITY in postgresql_manager.py, because SERIAL types are soft-deprecated:
https://wiki.postgresql.org/wiki/Don't_Do_This#Don.27t_use_serial
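A sketch of the proposed change as a helper that builds the DDL (the table_name/column_schema names mirror the snippet above; the table name is assumed to come from trusted configuration):

```python
def create_table_ddl(table_name: str, column_schema: str) -> str:
    # BIGINT GENERATED ALWAYS AS IDENTITY replaces the SERIAL primary
    # key, so the id sequence cannot hit the int4 limit of 2147483647.
    return (f'CREATE TABLE {table_name} '
            f'(id BIGINT PRIMARY KEY GENERATED ALWAYS AS IDENTITY{column_schema});')
```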

Hotfix for existing installations (NB! This is a slow operation that locks tables, may take hours to complete, and requires at least 50% free drive space):

ALTER TABLE logs ALTER COLUMN id TYPE BIGINT;
ALTER SEQUENCE logs_id_seq AS BIGINT MAXVALUE 9223372036854775807;
