\chapter{DHT Group Chats}

This document details the groupchat implementation, giving a high-level overview of all the important features and aspects, as well as some important low-level implementation details. This documentation reflects what is currently implemented at the time of writing; it is not speculative. For detailed API docs see the groupchats section of the tox.h header file.

\section{Features}

\begin{itemize}
  \item Private messages
  \item Action messages (/me)
  \item Public groups (peers may join via a public key)
  \item Private groups (peers require a friend invite)
  \item Permanence (a group cannot `die' as long as at least one peer retains their group credentials)
  \item Persistence across client restarts
  \item Ability to set peer limits
  \item Moderation (kicking, banning, silencing)
  \item Permanent group names (set on creation)
  \item Topics (may only be set by moderators and the founder)
  \item Password protection
  \item Self-repairing (auto-rejoin on disconnect, group split protection, state syncing)
  \item Identity separation from the Tox ID
  \item Ability to ignore peers
  \item Unique nicknames which can be set on a per-group basis
  \item Peer statuses (online, away, busy) which can be set on a per-group basis
  \item Custom parting/exit messages
\end{itemize}

\section{Group roles}

There are four distinct roles which are hierarchical in nature (higher roles have all the privileges of lower roles).

\begin{itemize}
  \item \textbf{Founder} - The group's creator. May set all other peers' roles to anything except founder. May also set the group password, toggle the privacy state, and set the peer limit.
  \item \textbf{Moderator} - Promoted by the founder. May kick, ban, and set the user and observer roles for peers below this role. May also set the topic.
  \item \textbf{User} - Default non-founder role. May communicate with other peers normally.
  \item \textbf{Observer} - Demoted by moderators and the founder.
May observe the group and ignore peers; may not communicate with other peers or with the group.
\end{itemize}

\section{Group types}

A group has one of two types: private or public. The type can be set on creation, and may also be toggled by the group founder at any point after creation. (\emph{Note: password protection is completely independent of the group type.})

\subsection{Public}

Anyone may join the group using the Chat ID. If the group is public, information about peers inside the group, including their IP addresses and group public keys (but not their Tox IDs), is visible to anyone with access to a node storing their DHT announcement. See the \href{#dht-announcements}{DHT Announcements} section for details.

\subsection{Private}

The only way to join a private group is by having someone in your friend list send you an invite. If the group is private, no peer/group information (mentioned in the Public section) is present in the DHT; the DHT is not used for any purpose at all. If a public group is set to private, all DHT information related to the group will expire within a few minutes.

\section{Cryptography}

Groupchats use the \href{https://en.wikipedia.org/wiki/NaCl_(software)}{NaCl/libsodium cryptography library} for all cryptography-related operations. All group communication is end-to-end encrypted. Message confidentiality, integrity, and repudiability are guaranteed via \href{https://en.wikipedia.org/wiki/Authenticated_encryption}{authenticated encryption}, and \href{https://en.wikipedia.org/wiki/Forward_secrecy}{perfect forward secrecy} is also provided.

One of the most important security improvements from the old groupchat implementation is the removal of a message-relay mechanism that uses a group-wide shared key. Instead, connections are 1-to-1 (a complete graph), meaning an outbound message is sent once per peer, and encrypted/decrypted using a key unique to each peer. This prevents MITM attacks that were previously possible.
This additionally ensures that private messages are truly private.

Groups make use of 13 unique keys in total (counting each keypair as two keys): two permanent keypairs (encryption and signature), two group keypairs (encryption and signature), one session keypair (encryption), one shared symmetric key (encryption), and one temp DHT keypair (encryption).

The Tox ID/Tox public key is not used for any purpose. As such, peers cannot be matched with their Tox ID, whether inside a given group or in the group DHT. In other words, there is no way of identifying a peer aside from their IP address, nickname, and group public key. (\emph{Note: group nicknames can be different from the client's main nickname that their friends see.})

\subsection{Permanent keypairs}

When a peer creates or joins a group they generate two permanent keypairs: an encryption keypair and a signature keypair, both of which are unique to the group. The two public keys are the only guaranteed way to identify a peer, and both keypairs will persist for as long as a peer remains in the group (even across client restarts). If a peer exits the group these keypairs will be lost forever.

The permanent encryption keypair is used only for the initial handshake when connecting to another peer, and for no other encryption operations. For usage details on the signature key, see the \href{#moderation}{Moderation} section.

\subsection{Session keypair/shared symmetric key}

When two peers establish a connection they each generate a session encryption keypair and send each other the resulting public key. Using their own session secret key and the other's session public key, they both derive the same symmetric encryption key. This symmetric key will be used for all further encryption operations between them for the current session (i.e. until one of them disconnects).

The purpose of this extra key exchange is to prevent an adversary from decrypting messages from previous sessions in the event that a secret encryption key becomes compromised.
This is known as forward secrecy.

\subsection{Group keypairs}

The group founder generates two additional permanent keypairs when the group is created: an encryption keypair and a signature keypair. The public signature key is considered the \textbf{Chat ID} and is used as the group's permanent identifier, allowing other peers to join public groups via the DHT. Every peer in the group holds a copy of the group's public encryption key along with the public signature key/Chat ID.

The group secret keys are similar to the permanent keypairs in that they will persist across client restarts, but will be lost forever if the founder exits the group. This is particularly important as administration-related functionality will not work without these keys. See the \href{#founders}{Founders} section for usage details.

\subsection{Temporary DHT keypair}

All group-related DHT procedures make use of toxcore's temp DHT keypair. This keypair is generated when the Tox object is initialized and does not persist across client restarts. See the \href{#dht-announcements}{DHT Announcements} section for further details.

\section{Founders}

The peer who creates the group is the group's founder. Founders have a set of admin privileges, including:

\begin{itemize}
  \item Promoting and demoting moderators
  \item The ability to kick/ban moderators
  \item Setting the peer limit
  \item Setting the group's privacy state
  \item Setting group passwords
\end{itemize}

\subsection{Shared state}

Groups contain a data structure called the \textbf{shared state} which is given to every peer who joins the group. This structure holds all data pertaining to the group that only the group founder may modify. This includes things like the group name, the group type, the peer limit, and the password. Additionally, the shared state holds a copy of the group founder's public encryption and signature keys, which is how other peers in the group are able to verify the identity of the group founder.
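
As a rough illustration, the fields listed above could be modelled as the following Haskell record. This is only a sketch: the type and field names, the use of \verb'String' for the name and password, the 32-bit peer limit, and the \verb'isFounder' helper are assumptions made for the example and are not taken from the implementation.

\begin{verbatim}
import Data.Word (Word8, Word32)

-- Stand-in for a public key; the real key encoding is not shown here.
newtype PublicKey = PublicKey [Word8]
  deriving (Eq, Show)

data GroupType = Public | Private
  deriving (Eq, Show)

-- Hypothetical model of the founder-controlled group data.
data SharedState = SharedState
  { groupName     :: String
  , groupType     :: GroupType
  , peerLimit     :: Word32         -- width is an assumption
  , password      :: Maybe String   -- Nothing when no password is set
  , founderEncKey :: PublicKey      -- founder's public encryption key
  , founderSigKey :: PublicKey      -- founder's public signature key
  } deriving (Eq, Show)

-- Assumed check: a peer is treated as the founder when its public
-- signature key matches the one recorded in the shared state.
isFounder :: SharedState -> PublicKey -> Bool
isFounder st key = founderSigKey st == key
\end{verbatim}
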
The shared state is signed by the founder using the group secret signature key. As the founder is the only peer who holds this secret key, this ensures that the shared state may be safely shared by untrusted peers, even in the absence of the founder.

When the founder modifies the shared state, they increment the shared state version, sign the new shared state data with the group secret signature key, and broadcast the new shared state data along with its signature to the entire group. When a peer receives this broadcast, they use the group public signature key to verify that the data was signed with the group secret signature key, and also verify that the new version is not older than the current version.

\subsection{Moderation}

The founder has the ability to promote other peers to the moderator role. Moderators have all the privileges of normal users, and additionally have the power to kick, ban, and unban, as well as give peers below the moderator role the roles of user and observer (see the \href{#group-roles}{Group roles} section). Moderators can also modify the group topic. Moderators have no power over one another; only the founder can kick, ban, or change the role of a moderator.

\subsection{Kicks/bans}

When a peer is kicked or banned from the group, their chat instance and all its associated data will be destroyed. This includes all public and secret keys. Additionally, the peer will not receive any notifications; it will simply appear to them as if the group is inactive.

\subsection{Moderator list}

Each peer holds a copy of the \textbf{moderator list}, which is an array of public signature keys of peers who currently have the moderator role (including those who are offline). A hash (SHA-256) of this list, called the \textbf{\texttt{mod\_list\_hash}}, is stored in the shared state, which is itself signed by the founder using the group secret signature key.
This allows the moderator list to be shared between untrusted peers, even in the absence of the founder, while maintaining moderator verifiability.

When the founder modifies the moderator list, they update the \verb'mod_list_hash', increment the shared state version, sign the new shared state, broadcast the new shared state data along with its signature to the entire group, then broadcast the new moderator list to the entire group. When a peer receives this moderator list (having already verified the new shared state), they create a hash of the new list and verify that it is identical to the \verb'mod_list_hash'.

\subsection{Sanctions list}

Each peer holds a copy of the \textbf{sanctions list}. This list holds two sublists: the \textbf{ban list} (banned peers) and the \textbf{observer list} (peers with the observer role). The ban list contains entries of peers who have been banned, including their last used nickname, IP address/port, and a unique ID. The observer list contains entries of peers who have been demoted to the observer role, holding just their public encryption key. All entries additionally contain a timestamp of the time the entry was made, the public signature key of the peer who set the sanction, and a signature of the entry's data, created by that peer using their secret signature key.

Individual entries are verified by ensuring that the entry's public signature key belongs to the founder or is present in the moderator list, and then verifying that the entry's data was signed by the owner of that key.

Although each individual entry can be verified, we still need a way to verify that the list as a whole is complete and identical for every peer; otherwise any peer would be able to remove entries arbitrarily, or replace the list with an older version. Therefore each peer holds a copy of the \textbf{sanctions list credentials}.
This is a data structure that holds the version, a hash (SHA-256) of all sanctions list entries plus the version, the public signature key of the last peer to have modified the sanctions list, and a signature of the hash, created by the owner of that key.

When a moderator or founder modifies the sanctions list, they will increment the version, create a new hash, sign the hash+version with their secret signature key, and replace the old public signature key with their own. They will then broadcast the new changes (not the entire list) to the entire group along with the new credentials. When a peer receives this broadcast, they will verify that the new credentials version is not older than the current version and verify that the changes were made by a moderator or the founder. If an entry is being added, they will also verify that the entry was signed by the signature key of the entry's creator.

When the founder kicks, bans, or demotes a moderator, they will first go through the sanctions list and re-sign each entry made by that moderator with their own founder key, then re-broadcast the sanctions list to the entire group. This is necessary to guarantee that all sanctions list entries and the credentials are signed by a current moderator or the founder at all times.

\textbf{Note:} \emph{The sanctions list is not saved to the Tox save file, meaning that if the group ever becomes empty, the sanctions list will be reset. This is in contrast to the shared state and moderator list, which are both saved and will persist even if the group becomes empty.}

\section{Topics}

Founders and moderators have the ability to set the \textbf{topic}, which is simply an arbitrary string of characters. The integrity of a topic is maintained in a manner similar to that of sanctions entries, using a data structure called \textbf{\texttt{topic\_info}}. This is a struct which contains the topic, a version, and the public signature key of the peer who set it.
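
The struct just described could be pictured as the Haskell record below. This is a sketch only: the field names, the \verb'String' topic, the list-of-bytes key stand-in, and the 32-bit version width are assumptions made for the example, not the implementation's actual layout.

\begin{verbatim}
import Data.Word (Word8, Word32)

-- Stand-in for a public signature key.
newtype PublicKey = PublicKey [Word8]
  deriving (Eq, Show)

-- Hypothetical model of topic_info.
data TopicInfo = TopicInfo
  { topic        :: String      -- arbitrary string of characters
  , topicVersion :: Word32      -- width is an assumption
  , setterKey    :: PublicKey   -- key of the peer who set the topic
  } deriving (Eq, Show)
\end{verbatim}
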
When a peer modifies the topic, they will increment the version, sign the new topic+version with their secret signature key, replace the public key with their own, then broadcast the new \verb'topic_info' data along with the signature to the entire group. When a peer receives this broadcast, they will first check that the public signature key of the setter either belongs to the founder or is in the moderator list. They will then verify the signature using the setter's public signature key, and finally they will ensure that the version is not older than the current topic version.

If the moderator who set the current topic is kicked, banned, or demoted, the founder will re-sign the topic using their own signature key, and rebroadcast it to the entire group.

\section{State syncing}

Peers send four unsigned 32-bit integers along with their ping packets: their peer count\footnote{We use a ``real'' peer count, which is the number of confirmed peers in the peerlist (that is, peers with whom you have successfully completed a handshake and exchanged peer info).}, their shared state version, their sanctions credentials version, and their topic version. If a peer receives a ping in which any of these values is greater than their own, this indicates that they may be out of sync with the rest of the group. In this case they will do one of two things: if they already have a sync request flagged for this peer, they will send a sync request. Otherwise they will set the flag and wait until the next ping arrives (this waiting is to correct for false positives in the case of high network latency). The flag is reset after a sync request is sent, or whenever a ping is received in which all data is in sync.
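
The flag logic described above can be sketched as a small pure function in Haskell. All names here (\verb'SyncInfo', \verb'behind', \verb'onPing', \verb'Action') are hypothetical, and the sketch makes one simplifying assumption: any ping that does not show a greater value than our own is treated as being in sync. The \verb'Bool' argument and result are the per-peer sync flag before and after the ping.

\begin{verbatim}
import Data.Word (Word32)

-- The four values carried alongside a ping (field names are made up).
data SyncInfo = SyncInfo
  { peerCount          :: Word32
  , sharedStateVersion :: Word32
  , sanctionsVersion   :: Word32
  , topicVersion       :: Word32
  } deriving (Eq, Show)

-- True if any value in the received ping is greater than our own.
behind :: SyncInfo -> SyncInfo -> Bool
behind ours theirs = or
  [ peerCount theirs          > peerCount ours
  , sharedStateVersion theirs > sharedStateVersion ours
  , sanctionsVersion theirs   > sanctionsVersion ours
  , topicVersion theirs       > topicVersion ours
  ]

data Action = SendSyncRequest | NoAction
  deriving (Eq, Show)

-- Given the current flag for this peer, our values and the values in
-- the ping, return the new flag and what to do.
onPing :: Bool -> SyncInfo -> SyncInfo -> (Bool, Action)
onPing flagged ours theirs
  | not (behind ours theirs) = (False, NoAction)         -- reset flag
  | flagged                  = (False, SendSyncRequest)  -- second ping
  | otherwise                = (True, NoAction)          -- flag and wait
\end{verbatim}

Under these assumptions, two consecutive pings from a peer with a newer topic version first set the flag and then trigger the sync request, which matches the waiting behaviour described in the text.
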
\section{Group syncing}

In order to prevent entirely separate subgroups with the same Chat ID from being created, be it due to network issues or a malicious MITM attempt, it's necessary for groups to periodically search the DHT for announced nodes that match the group's Chat ID but are not present in the group. If an unknown node is found, an attempt will be made to connect with it. If successful, the state sync mechanism will merge the subgroups shortly afterwards.

Since we don't want to spam the DHT with a redundant number of requests that grows linearly with the size of the group, peers will take turns doing the search. Peers decide independently if it's their turn to search. Each peer has the same base timer $T$, and every interval of $T$ they will do a search with a probability $P$ which is inversely proportional to the number of peers $N$. For example, if $N=1$ then $P=1.0$; if $N=4$ then $P=0.25$; if $N=100$ then $P=0.01$, and so on. With $N$ peers each searching independently with probability $P = 1/N$, the expected number of searches per interval is $N \cdot \frac{1}{N} = 1$, so a given group will do one search per interval of $T$ on average regardless of its size. This scheme also ensures that a full spectrum of the network is searched. Moreover, because peers act independently rather than in coordination, malicious peers have little exploit potential (e.g. attempting to stop the group from searching the DHT).

In addition, peers who join a group via the DHT will attempt to connect to any nodes that are not in their freshly synced peer list.

\section{DHT Announcements}

Groupchats make use of the Tox DHT network in order to allow for groups that can be joined by anyone who possesses the Chat ID. As all of the information stored in or passed through the DHT can be viewed by any of the involved nodes, these types of groups are considered to be public. Private groups, in contrast, do not make use of the DHT for any purpose, and as such require a friend invite in order to join.
\subsection{Announcement requests}

When peers create or successfully join a public group they send an \textbf{announcement request} to $K$ of their close DHT nodes, containing information about themselves and the group that they're announcing. The information in this request includes the announcer's group public encryption key and IP address/port, as well as the Chat ID of the group. The DHT attempts to store this announcement in the node that's closest to the Chat ID (\textbf{closeness} is calculated by the DHT's close function). DHT nodes can store up to $N$ announcements each, after which they will replace the oldest announcements first. See the \href{#redundancy}{Redundancy} section for details on how DDoS attacks are mitigated.

\subsection{Get nodes requests}

When peers attempt to join a public group using the Chat ID they send a \textbf{get nodes request} to $K$ of their close nodes, containing their IP/port, their group public encryption key, and the Chat ID. Those nodes will then check if any of their announcement entries match the supplied Chat ID. If not, they will relay the message to $K$ of their own close nodes, who will repeat the process (note that the close function guarantees that each successive relay will bring us closer to the Chat ID until we either find one of its entries, or have traversed the entire DHT network). Once a node finds an entry with the queried Chat ID it will send a \textbf{send nodes response} to the original node who made the request. The response will contain at least one entry (possibly more), each holding the group public encryption key and the IP address/port of a peer who had previously made an announcement request for that Chat ID. With this information the requester will automatically initiate the handshake protocol and attempt to join the group.

\subsection{Redundancy}

DHT nodes will periodically send ping requests to all of their announcement entries in order to ensure that they are still present in the network/group.
When a peer goes offline or leaves a group, they no longer respond to these ping requests, and the nodes holding their entries will discard them.

There are scenarios in which an announcement may be dropped from the network, such as if the sole node holding the entry goes offline, or in the case of a DDoS attack which attempts to push all old entries out of the DHT. In order to ensure that those announcements are not permanently lost, announcers will periodically check when they last received a ping request for a given announcement. After a certain amount of time without receiving a ping request they will assume that their entry is no longer in the DHT network and re-announce themselves. This ensures that every peer present in a group has an active announcement in the DHT at all times, and it also ensures that a group cannot become `lost'.

\begin{code}
module Network.Tox.Application.GroupChats where
\end{code}
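
The re-announce rule from the Redundancy section can be sketched in Haskell as follows. The sketch is set verbatim so that it is not compiled into the module. Every name in it is hypothetical, times are plain seconds, and the timeout is a parameter because the text does not state its value.

\begin{verbatim}
type Seconds = Integer

-- One of our own announcements, with the time a DHT node last
-- pinged us for it (field names are made up).
data Announcement = Announcement
  { chatId       :: String   -- stand-in for the real Chat ID bytes
  , lastPingedAt :: Seconds
  } deriving (Eq, Show)

-- An announcement is presumed lost once no ping request has arrived
-- for longer than the timeout.
presumedLost :: Seconds -> Seconds -> Announcement -> Bool
presumedLost timeout now a = now - lastPingedAt a > timeout

-- The announcements that should be re-announced at time 'now'.
toReannounce :: Seconds -> Seconds -> [Announcement] -> [Announcement]
toReannounce timeout now = filter (presumedLost timeout now)
\end{verbatim}

A client following this sketch would call \verb'toReannounce' on a timer and send a fresh announcement request for each result.
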