Comments (14)
@utsavanand2 The error you're seeing occurs when a charging station connects to the server, while the server already holds an open connection with the same charging station. The error occurs on websocket level btw.
Normally, this shouldn't happen, but we decided to handle this edge case, since duplicate IDs can lead to other issues instead.
If there's network issues and the websocket close message never gets through to the other side, but for some reason the new connect message does, then:
- the connection gets accepted by the server
- the server kills the connection immediately afterwards, leading to the client logging
disconnected from serverwebsocket: close 1008 (policy violation): a connection with this ID already exists
I don't remember why it was decided to send a 1008 error instead of denying a websocket upgrade in the first place, but there was probably a reason. I'm looking through documentation and RFCs to get wiser.
from ocpp-go.
The issue described in #141 is completely separate and not linked to this at all.
#133 might indeed be related though. I'll look into this further and hopefully find a root cause soon.
from ocpp-go.
Hey,
the initialization snippet looks good to me, although 8s seems a bit low.
The timeout you are setting simply makes sure an outgoing request is canceled, if no response is received from the charging station within those 8 seconds. This allows you to overcome the issue where charging stations don't reply quickly enough (e.g. because of network issues), but it also means that message is lost. If the charging station eventually replies, your CSMS will never process that response.
The only case, where automatic retransmission is currently supported, is if the websocket layer recognizes that the other endpoint disconnected. This is typically achieved with the ping-pong mechanism. The ocpp-j layer currently has no other way of knowing, whether the other endpoint is having network issues, which is why the timeout is meant to be a last resort.
It's very hard to just guess the cause of the issue, since you're apparently dealing with intermittent network connections.
Some information that would help:
- Are you using websocket ping-pong in the first place? If so, what's the ping-pong timeout?
- Are you manually retransmitting messages, when they're canceled due to the timeout?
- Can you provide a log dump of both affected endpoints (the part where the issue arises)? If you have debug logs enabled on the CSMS, that should also provide more information about the message flow
from ocpp-go.
Hi @lorenzodonini, I switched back to the previous release where we did have some issues with the stations with network issues, but we didn't face a complete downtime for all charging stations. But with the new release of v0.16.0, even the 30 second timeout is giving us this trouble, at this point my belief is the issue is not with the timeout but rather us switching from v0.16.0 from v0.15.0 because we switched to the custom dispatcher the same time we upgraded to the new release.
1.> No we're not using websocket ping-pong, we just rely on the heartbeats to keep the connection alive.
2.> No we do not retransmit the messages.
3.> I do have the logs but unfortunately the logs are not on the debug level, so they are not of much use except the fact that the quantity of logs drastically reduces over a time period of 24 hrs after which it seems like it continues to run but doesn't accept any new requests.
I have built on top of your examples for the charge-point and I tried connecting to the production server when I started seeing this issue, pasting the logs of the charge-point-emulator below: (the server here is using the 30second default)
tsavanand@Utsavs-MacBook-Pro charge-point-simulator % ./charge-point-simulator
INFO[2022-08-21T16:41:55+05:30] connected to central system at ws://csms.daheimladen.com
INFO[2022-08-21T16:42:25+05:30] <nil> ocpp message (2596996162): GenericError - request timed out, no response received from server
INFO[0040] sending status notification
INFO[0050] sending status notification
ERRO[2022-08-21T16:42:55+05:30] disconnected from serverread tcp 192.168.0.208:53483->35.246.163.54:80: i/o timeout
INFO[0060] sending status notification
ERRO[2022-08-21T16:43:00+05:30] disconnected from serverwebsocket: close 1008 (policy violation): a connection with this ID already exists
INFO[0070] sending status notification
ERRO[2022-08-21T16:43:06+05:30] disconnected from serverwebsocket: close 1008 (policy violation): a connection with this ID already exists
ERRO[2022-08-21T16:43:11+05:30] disconnected from serverwebsocket: close 1008 (policy violation): a connection with this ID already exists
INFO[0080] sending status notification
ERRO[2022-08-21T16:43:17+05:30] disconnected from serverwebsocket: close 1008 (policy violation): a connection with this ID already exists
ERRO[2022-08-21T16:43:22+05:30] disconnected from serverwebsocket: close 1008 (policy violation): a connection with this ID already exists
INFO[0090] sending status notification
INFO[0100] sending status notification
INFO[0110] sending status notification
INFO[0120] sending status notification
INFO[0130] sending status notification
INFO[2022-08-21T16:44:08+05:30] ocpp message (2854263694): GenericError - request timed out, no response received from server
I'll switch our csms service back to v0.15.0, I believe this should fix our problems until we can find out and resolve the issue as it was the last working release for us. If I find anything of interest, I would share it with you here.
I also want to know if you have any plans for opening up your repository to Github Sponsors, as I use this project in production, and being an open source contributor in the past myself for some cloud native projects, I know for a fact that maintaining open source projects is hard, companies benefit from it with no intent of giving back (atleast most of the times). So I would love to pitch in personally as a Github Sponsor for this project ❤️
Thanks a lot for your time and efforts.
from ocpp-go.
INFO[2022-08-21T16:42:25+05:30] <nil> ocpp message (2596996162): GenericError - request timed out, no response received from server
if I understand this right, the charge point sent a request to the CSMS, which never replied, triggering the timeout? Can you verify, whether the CSMS ever received the request?
Btw to enable debug logs you can do this:
ws.SetLogger(logger)
ocppj.SetLogger(logger)
// the logger needs to be set to debug level
Any logger that adheres to the interface defined in logging/log.go
will do.
If you could provide a detailed message flow between the two endpoints with debug logs enabled, I may be able to identify where the issue is arising at least.
I will definitely have another look at the changes introduced in v0.16 and double-check if something broke. I hope v0.15 serves you well in the meantime.
I'm very happy to know you're using this in production. Yes, maintaning OSS can be tough and I do plan on opening the repo up to Github Sponsors. Before I do so, I just wanted to polish a few things (i.e. publish a documentation site, make sure nothing major is broken). Stay tuned.
from ocpp-go.
Issue #133 looks similar. Also is there a relation with the issue #141 ?
from ocpp-go.
Hi @lorenzodonini Upon reading the logs from here and conducting some experiments with the library and my dummy ocpp client, I noticed that we get this error when the charging station client abruptly crashes and disconnects from the server:
ERRO[2022-10-01T01:09:46+05:30] disconnected from serverwebsocket: close 1008 (policy violation): a connection with this ID already exists
Could this be the reason for the issue we're facing? I see that this check was in v0.15.0 as well and is not new in v0.16.0, but could be that for some other change in v0.16.0 is causing us to hit this policy violation more easily.
In #133 the author of the issue also says "after getted BootNotification request more than 10 times from chargepoint ,ths ocpp server is stucked". I'm carefully reviewing the changes in the code myself, but I would really appreciate your thoughts on this as well :))
I have another query regarding the changes which might relate to our issue:
I was not able to understand the need to increase the size of the requestChannel
from 1 to 20 in v0.16.0
from ocpp-go.
@lorenzodonini Yes I saw the check on the websocket file where a connection to the same ID was ressponded with the aforementioned code. But again I'm a bit puzzled because this check was there in v0.15.0 and around 3000 connected stations are working fine with it in our production environment.
In the last months, I've tried to rollout v0.16.0 about 2 to 3 times and after 24-48 hrs the stations would not respond to RFID cards, or RemoteStart and Stops.
The last time the ocpp server became unresponsive with v0.16.0, I ran the ocpp-go
client library against my production server and the logs I mentioned above is what I received.
I have enabled debug logs on the staging area where some stations are running without any issues with happy users, but the same is causing issues when I promote this library update to production with around 3000 active charging stations.
from ocpp-go.
Hi @lorenzodonini :)) Good afternoon, I'll be publishing the beta version of the csms server to production, which means we will again go from v0.15.0 to v0.16.0, this time with debug logs enabled. The motive is to get some debug logs which will help us find the issue to the problem.
I've also forked your repo and added some error logs that might help if we encounter errors like last time (for e.g., disconnected from serverwebsocket: close 1008 (policy violation): a connection with this ID already exists
)
https://github.com/utsavanand2/ocpp-go/commit/8b82ac15fa7c5563c942b112ddf3b16e18bc0aca
I've decided to do this promotion on Friday evening and monitor it through the weekend.
Is there any other section of the code or file which you would like me to pay attention to and add more debug or error level logs which might aid you as well?
from ocpp-go.
@lorenzodonini I ran the update on the production environment multiple times since Friday, I have the debug logs so should I post them here?
One more question, have you tried using the ocpp-go client library as a tool for loadtesting the ocpp server? I will be trying to create multiple instances of the charge point.
I do have a couple of observations, the error disconnected from serverwebsocket: close 1008 (policy violation): a connection with this ID already exists
seems to be coming after the charge point has connected, did not get a response from the CSMS, so probably tries to reconnect again without dropping the connection, but again this is not the primary concern, the concern is that the server seems to give up quickly when a lot of charging stations try to connect.
from ocpp-go.
@lorenzodonini I ran some tests yesterday by keeping the library at v0.16.0 but replacing the dispatcher file with the version used in v0.15.0 which doesn't use the context timeout, and it worked in production :)
from ocpp-go.
@utsavanand2 Thanks for collecting some debug logs. Feel free to upload them to any cloud and either send me a link via email or paste it here (maybe obfuscate sensitive data, if any).
I didn't do an extensive load testing, but scaling a sample setup to 100 clients seems to work just fine in my tests. I expect there is some race condition or incorrect state handling that leads to the issue you're experiencing.
Narrowing it down to the new context timeouts definitely helps. I'll have a deeper look asap and hope to have a fix soon.
from ocpp-go.
Hi @lorenzodonini thanks a lot! Can you please drop me an email at [email protected], I would love to discuss this more in detail, and drop a link to the logs via email :)
from ocpp-go.
@lorenzodonini I can be wrong but I was wondering if we should handle the case for when the context is cancelled in waitForTimeout
Line 575 in d4afed5
from ocpp-go.
Related Issues (20)
- Central-system 1.6 example - "fatal error: concurrent map writes" when connecting multiple chargers rapidly HOT 1
- Code Generation HOT 1
- TriggerMessage.Req validation disallows passing connector 0 HOT 6
- ChangeConfigurationRequest doesn't allow empty values
- Bad Request response for https via curl request HOT 1
- Release new version? HOT 2
- Fix vet report HOT 2
- Issue with DataTransfer on both ocpp 1.6 and 2.0.1 HOT 2
- Send Command To Charging Station
- I call remote start transaction but got error "ocpp message (2193907606): InternalError - An internal error occurred and the receiver was not able to process the requested Action successfully" HOT 1
- SecurityEventNotification.req HOT 13
- Decouple websocket server and message processing HOT 7
- Parses an OCPP-J message,Occurrence anomaly HOT 4
- Is their any way to bypass the SubjectAlternateName check while using tls? HOT 1
- Fetch chargePointId inside SetBasichandler HOT 6
- Fetch raw ocpp message HOT 4
- Bug with validation for connectorId of GetCompositeSchedule.conf #275
- shoudl idTag validator be altered? HOT 3
- Configurable session timeout
- Authorize request missing connectorId? HOT 4
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from ocpp-go.