Comments (20)
Hi Riandy,
Thank for looking at the pull request.
- Since in current version user could only delete messages by using
range
, essentially server will delete messages with seq id 1 until 300, one by one.
The user sends a delete request for a range {low: 1, hi: 301}. The rethinkdb adapter will receive the request to delete messages 1 <= seq < 301 and will delete the whole range at once: https://github.com/tinode/chat/blob/better-delete/server/db/rethinkdb/adapter.go#L979
The adapter will receive the ranges to delete just like they were sent by the client. It's up to the adapter to execute the delete strategy. For instance, all SQL databases can do an atomic DELETE FROM messages WHERE seq BETWEEN 1 AND 301
. Mongo DB can do it with db.collection.deleteMany()
- But if user could also delete messages by using
clear
, it just need to set the clear value on User A's subscription to topic X.
Yes, I removed this option. That was the old soft-delete method. It was only good for "hide all messages with IDs below clear
". It could not be used to delete individual messages by IDs, which is one of the most common use cases.
on DynamoDB which update operation is restricted to only 1 update per request, it is really troublesome.
Yes, I can see how this could be a problem.
What if instead we tell the user which items were actually deleted:
User asks to delete {low: 1, hi: 301}
(or just {hi: 301}
), DynamoDb adapter deletes 1 to 150 then fails. The adapter returns a "partial success", the server responds to the client with {ctrl code=207 params={what="del", delseq={low: 1, hi: 151}}}
? This way the client understands that the operation was only partially successful.
Alternatively, since this is an unusual limitation specific to DynamoDB, you can solve it in the DynamoDB adapter by creating a separate table just for the clear
or keeping a custom field like dynamo_db_clear
on subscriptions
in the existing schema. When reading the messages in MessageGetAll
you can just lookup the clear
value from that place. It will be a cheap lookup by the primary key. The adapter owns the storage schema. You can modify it for you case in any way you see fit. It's an optimization specific to the adapter. Adapter should own it.
At a high level there are only three ways of dealing with non-atomic backend operations:
- Have client issue only such requests which can be performed atomically by the server.
- Have server report partial success. If the operation executed partially, let the client know which parts were executed and which failed. Let the client clean up the mess. For instance, the client can just reissue the request.
- Same as above but try to clean up the mess on the server.
from chat.
Hi Ryandy,
Yep, hard deleting the message right now cannot be detected by offline users. The right approach is probably not to delete the message but to delete the Content only. I.e. make hard deleting the same as soft-deleting + setting Content to null. What do you think?
from chat.
Actually it should be setting Content and Head to null, not just Content.
from chat.
Hmm…
Suppose there is a grpABCDEF
which has Last SeqId = 10
& having 3 members: userA
, userB
, & userC
where userA
is topic owner. Suppose all the messages has been delivered to all members devices.
In other side, the Tinode client on each user’s device purposely cached the retrieved messages so when user load the topic it could only fetch the latest messages.
Suppose userA hard-delete messages with SeqId = 5
& SeqId = 6
while userB
is online & userC
is offline. Since it is hard-delete operation, it will be announced to all members of topic.
Since userB
is online, he will receive {pres}
notification so the client could remove the cache. But since userC
is offline, he won’t receive anything so the deleted messages would still appear on userC
device while they shouldn't.
So your suggested solution will not work because client simply doesn’t reload the messages from the beginning every time user load the topic. So I think there should be some kind of notification to offline user about the hard-delete action so the client could remove the cache.
If I remember correctly, Tinode used to have notification feature on me
topic. So why did this feature removed especially for announcing hard-delete action?
from chat.
It will work the same way it works for soft deleted messages.
Imagine that in your scenario userB accesses topic from two different devices (for instance, android phone and web) has soft-deleted the message seq=6 from the web while android was offline. The android will find out that the message is deleted by sending {sub topic= grpABCDEF get=data[ifmodifiedsince=timestamp_of_the_cache]}
. Then it will get the list of messages updated after the timestamp, including the soft-deleted message 6, marked as deleted.
I propose to do the same for hard-deleted messages.
from chat.
I just checked the code and actually that's how it works right now for messages deleted by list: https://github.com/tinode/chat/blob/master/server/db/rethinkdb/adapter.go#L907
I'm going to add null-ing the Head too.
from chat.
Like this: fc51910
from chat.
The android will find out that the message is deleted by sending {sub topic= grpABCDEF get=data[ifmodifiedsince=timestamp_of_the_cache]}
Ah I see, I forgot that Tinode now already support browse options based on time. In my adapter I forgot to implement it so I was still using since
& before
. Hahahaha…
Btw, I found bug on rethinkdb adapter though. If I send {sub topic=grpABCDEF get=data[after:timestamp_of_the_cache]} (see here), it will return following error:
topic: error loading topics gorethink: Cannot order by index `Topic_SeqId` after calling BETWEEN on index `Topic_UpdatedAt`. in:
r.DB("tinode").Table("messages").Between(["grpxaV0xERsHTg", {$reql_type$="TIME", epoch_time=1.510110138515e+09, timezone="+00:00"}], ["grpxaV0xERsHTg", r.MaxVal()], index="Topic_UpdatedAt").OrderBy(index=r.Desc("Topic_SeqId")).Limit(24)
Also I think when we delete message, we should also update its UpdatedAt
field. Otherwise we won’t be able to get it since the UpdatedAt
value still the same like CreatedAt
value. Notice that index on messages table is Topic_UpdatedAt
not Topic_DeletedAt
.
from chat.
I added the new time-based options exactly because since
and before
did not work for deleted messages. I did not want to break old code so I did not remove them.
I'll fix the bugs tomorrow. It should be easy.
from chat.
And yes, you are right about UpdatedAt
. I'll fix that too.
from chat.
Here is another idea.
Message SeqId
can be viewed as a mutation number. A topic has a set of messages. When the set is mutated (changed) by adding a message to it (someone sent {pub}
), the SeqId
is incremented by 1. Because messages are sent one by one, every message can be assigned a unique ID which is also a mutation index.
What if I do the same with deletions? Deletion is also a mutation after all. In addition to SeqID
I will create another Id, say DelId
. Any time someone soft- or hard-deletes messages the DelId
is incremented by 1 and the delId
is assigned to the deleted messages. Then I'll add what="del"
(in addition to "desc", "sub" and "data") to {sub}
and {get}
messages and will report DelId
to users. The server will report the list of deleted message IDs in another {meta}
message.
I think this is a cleaner and better approach than dealing with UpdatedAt
.
What do you think?
from chat.
Hmm…, yeah, I think that’s better approach👍
from chat.
Hmmm..., wait. After some thought, wouldn't it simpler especially from client perspective if we just use UpdatedAt
approach?
If we use UpdatedAt
approach, client just simply store the last message of the timestamp for next request. On the next request client could just use this stored timestamp to fetch the latest messages including deleted messages. For every messages retrieved, client then just need to check the DeletedAt
field. If it is nil, append to cache, but if it's not then delete from cache.
But if we used DelID
approach, client also need to keep track of last DelID
so it doesn't need to retrieve full list of deleted message ids every time it make request to server.
On the server side, if use UpdatedAt approach we just need to change a bit our adapter & keep existing architecture. But if we use DelID
approach, we might change a lot existing architecture, not only just the adapter.
What do you think?
from chat.
If we use UpdatedAt approach, client just simply store the last message of the timestamp for next request.
Suppose you want to browse messages back in time. How do you know that there are no more older messages? You would need to know the timestamp of the very first message. Or I would have to tell you somehow "there are no more messages". Right now it's not a problem because the first message has SeqId=1.
On the next request client could just use this stored timestamp to fetch the latest messages including deleted messages
Yes, that's how it supposed to work now. There is a problem though. Once connected, the client requests subscriptions from me
. Each subscription has a seq
. The client compares that to the stored value. If the one from me
is greater than the stored value, that means there are new messages in that subscription. But there is no equivalent way to find out if messages were deleted. So I have to add either UpdatedAt
or DelId
to subscriptions.
But if we used DelID approach, client also need to keep track of last DelID so it doesn't need to retrieve full list of deleted message ids every time it make request to server.
Yes. Without it it would have to keep track of the UpdatedAt
. It's equivalent.
I don't mind implementing UpdatedAt
. It's just does not seem as clean, because for some things browsing happens by id, for some by timestamp. It would be nice to use one or the other for everything.
from chat.
Regardless of how it's implemented, by DelId
or UpdatedAt
, what do you think about adding specific command for getting deleted messages, i.e. {get what=del}
. The server would respond to that with a single {meta}
message instead of a bunch of empty {data}
messages?
from chat.
But there is no equivalent way to find out if messages were deleted.
Hoo…, I see…
Yes, you’re right, currently we need to subscribe to the topic first to know whether or not there is a deleted messages.
It's just does not seem as clean, because for some things browsing happens by id, for some by timestamp.
Hmm…, yeah you’re right. It is much cleaner to use DelId
.
what do you think about adding specific command for getting deleted messages, i.e. {get what=del}
Yup, I think that’s a great idea.
Then that means in MessagesGetAll()
we need to only return messages which hasn’t been deleted, right? Since it will be redundant.
from chat.
Then that means in MessagesGetAll()
Correct. Plus another, similar method will be needed to return IDs of deleted messages only.
from chat.
Please take a look #44
from chat.
Hello, Gene
Yeah, I think it's great (y)
But I think it would be nice if user could also perform delete messages by using clear
beside range
though. For example imagine following scenario:
- User A have a topic named topic X which has large number of messages, for example 300.
- User A want to clear messages in topic X.
- Since in current version user could only delete messages by using
range
, essentially server will delete messages with seq id 1 until 300, one by one. - But if user could also delete messages by using
clear
, it just need to set the clear value on User A's subscription to topic X.
I don't know whether or not step 3 on RethinkDB is atomic, but on DynamoDB which update operation is restricted to only 1 update per request, it is really troublesome. Because if the error happens in the middle of delete process (for example when server processing message with seq id 150), then the changes which have been made to the messages so far would not be able to be reverted.
So I think it would be nice if user could also delete messages by using clear
.
What do you think, Gene?
from chat.
Thanks a lot for the suggestions, Gene👍
These are really awesome ideas😎👍
from chat.
Related Issues (20)
- Unable to `{sub}` with `on_behalf_of` HOT 2
- generate error HOT 1
- Can't connect to external database HOT 4
- [Question] External config is not working as expected HOT 2
- crash when removing a topic with RethinkDB HOT 3
- A feature request for status values. HOT 2
- error: The data couldn’t be read because it isn’t in the correct format in IOS HOT 1
- Can't initialize postgres database (does not use configured database to init) HOT 1
- Repetitive sub<->leave requests on same topic results in some leave requests with no response HOT 3
- Error starting gRPC call. HttpRequestException: No connection could be made because the target machine actively refused it. (127.0.0.1:6061) SocketException: No connection could be made because the target machine actively refused it. HOT 3
- Support S3 Compatible Storage Providers HOT 7
- External jwt token authenticator HOT 2
- Tinode not triggering password reset emails HOT 3
- NOT WORKING VIDEO CALL OTHER INTERNET HOT 1
- Cluster Not Working, Error: [ cluster: call failed two gob: unknown type id or corrupted data ] HOT 1
- In-app unread message counter does not account for deleted messages HOT 1
- How to create just a p2p topics? HOT 1
- Cannot initialize postgresql database HOT 7
- Drafty should correctly parse after fix for https://github.com/tinode/tinode-js/issues/74
- Cannot setup TLS HOT 1
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from chat.