Web rich text editor compatible with Android device input

pubuzhixing
Nov 6, 2023

Rich text editors have always posed challenges in the front-end world, but these challenges intensify when rich text is used on Android devices.

This article covers how we handle input compatibility issues on Android devices in the rich text editor framework Slate. Before getting to the solution, I will briefly describe what makes input on Android devices special and outline the overall compatibility approach, and then go through some details of the implementation.

Slate is an excellent open-source rich text editor framework, and one I have always liked. It has an elegant architecture and a simple API, and there is a lot to learn from it.

Officially, Slate’s view layer supports only slate-react. Because our team’s front-end stack is Angular, we implemented an Angular view layer, slate-angular, with reference to slate-react, and it is also open source ( https://github.com/worktile/slate-angular ).

The Android input compatibility solution described in this article was developed in slate-angular and draws on the ideas in slate-react’s implementation. However, our implementation is simpler than slate-react’s and easier to port to other editor libraries.

History

As early as 2021, the Slate community was discussing the scenario of supporting rich text content input on Android devices (in the slate-react library). Initially, when support was first introduced, there were significant differences in input handling between Android devices and other desktop browsers. Therefore, a separate AndroidEditable component was extracted to handle input delegation on Android devices. Gradually, the community discovered that maintaining two Editable components in one framework posed significant problems. Therefore, in 2022, a major refactoring of the overall solution was carried out, primarily completed by https://github.com/BitPhinix (who is also the initiator of the slate-yjs project). AndroidEditable and the basic Editable component were merged, which was a major undertaking referenced in the “Android input handling rewrite” ( https://github.com/ianstormtaylor/slate/pull/4988 ). The new approach proposed by this rewrite, including RestoreDom and AndroidInputManager, is highly instructive. 👍🏻

Because initially our company did not have enough demand for rich text input scenarios on Android devices, slate-angular has not kept up with the compatibility handling for Android devices in slate-react (😂 luckily we didn’t follow, otherwise we would have encountered many difficulties). However, this year (2023), our company has confirmed the need to support rich text editing on Android devices, so we have conducted research on this technology and achieved compatibility in slate-angular.

So far, testing shows that input works correctly on Android API 31 and API 33 with the Gboard, Sougou, Baidu, and Weixin input methods.

Having trouble with input on Android?

Initially, I thought slate-angular had no input issues on Android devices, because I had tested it on my phone and could type normally without any special handling. However, I then came across an issue on GitHub: Weird behavior with the android keyboard “Gboard”. That is when I noticed the difference: I was using the Sougou input method on my Android phone, so the problem seemed to depend on the input method. This left me with a question.

Later, a colleague working on mobile told me that they were also seeing issues with the Sougou input method. I then investigated in detail and found that Sougou was indeed affected, but only after switching to “English” mode; in “Chinese” mode there were no problems in most cases.

Root Cause of the Issue

As the above suggests, input problems on Android devices depend on the input method, and in most cases they occur only in “English” input mode.

I’m not sure if you’ve noticed the difference between inputting Chinese and English on your phone. When entering English on a phone, the input method provides a predictive or suggestive function, underlining the word being typed, as shown in the image (with the focus on the end of the word “editable”).

At the top of the input method, there will be optional words that can be replaced as a whole to achieve auto-completion, spelling correction, and so on. This prompt function should be available in all input methods. The problem with rich text editing lies in the underline and automatic replacement/completion.

On the front end, this behavior on Android devices (English mode) surfaces as the composition event family (compositionstart/compositionupdate/compositionend). A little-known fact is that the browser’s default behavior during composition input cannot be prevented, even if event.preventDefault() is called. As a result, it is difficult to keep the data model and the rendered interface consistent, which is the core mechanism of the Slate framework.
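
A minimal sketch of how this can be detected at runtime. As far as I understand the Input Events specification, beforeinput events with inputType insertCompositionText are not cancelable, and calling preventDefault() on a non-cancelable event leaves defaultPrevented as false. The helper tryPreventDefault, the CancelableInputEvent type, and the fakeInputEvent test double are hypothetical names for illustration only; they are not part of Slate or slate-angular.

```typescript
// Minimal structural type so the sketch runs without a DOM.
// A real InputEvent has all of these members.
interface CancelableInputEvent {
  inputType: string;
  cancelable: boolean;
  defaultPrevented: boolean;
  preventDefault(): void;
}

// Hypothetical helper: try to block the browser's default edit and report
// whether that actually worked. When it returns false, the DOM will change
// anyway and has to be repaired afterwards.
function tryPreventDefault(event: CancelableInputEvent): boolean {
  event.preventDefault();
  return event.defaultPrevented;
}

// Test double that mimics the DOM rule: preventDefault() has no effect
// on an event that is not cancelable.
function fakeInputEvent(inputType: string, cancelable: boolean): CancelableInputEvent {
  return {
    inputType,
    cancelable,
    defaultPrevented: false,
    preventDefault() {
      if (this.cancelable) {
        this.defaultPrevented = true;
      }
    },
  };
}

console.log(tryPreventDefault(fakeInputEvent('insertText', true))); // true
console.log(tryPreventDefault(fakeInputEvent('insertCompositionText', false))); // false
```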

The English input behavior on Android devices has its own peculiarities. Each time an English letter is input, the data carried by beforeinput includes all the letters input before. For example, if the user wants to input the word ‘editable’, when the user inputs ‘e’ and then ‘d’, the data carried by the beforeinput event is ‘ed’. When the next letter ‘i’ is input, the data carried is ‘edi’. This causes extra trouble in data processing, especially for frameworks like Slate that map the user’s input to modifications of the data model: the corresponding Slate update has to delete first and then insert, and it must know exactly how many characters to delete before inserting.
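
The ‘e’, ‘ed’, ‘edi’ sequence can be sketched on a plain string to show why the range to replace matters. This is only a simplified model under stated assumptions: OffsetRange, CompositionStep, replaceRange, applySteps, and appendNaively are hypothetical names, and the offsets stand in for the target range that the browser reports for each event.

```typescript
// Character offsets standing in for a real editor range (assumption).
interface OffsetRange {
  start: number;
  end: number;
}

interface CompositionStep {
  data: string; // cumulative text carried by beforeinput
  targetRange: OffsetRange; // what the new data replaces
}

// Delete the characters covered by `range`, then insert `data` there.
function replaceRange(text: string, range: OffsetRange, data: string): string {
  return text.slice(0, range.start) + data + text.slice(range.end);
}

// Correct handling: every step replaces the text composed so far.
function applySteps(steps: CompositionStep[]): string {
  let text = '';
  for (const step of steps) {
    text = replaceRange(text, step.targetRange, step.data);
  }
  return text;
}

// Naive handling: treat every event as a plain insertion.
function appendNaively(steps: CompositionStep[]): string {
  return steps.map(step => step.data).join('');
}

const steps: CompositionStep[] = [
  { data: 'e', targetRange: { start: 0, end: 0 } },
  { data: 'ed', targetRange: { start: 0, end: 1 } },
  { data: 'edi', targetRange: { start: 0, end: 2 } },
];

console.log(applySteps(steps)); // "edi"
console.log(appendNaively(steps)); // "eededi"
```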

Once the browser on an Android device enters the composition input state (triggering the compositionstart event), the keyCode of keyboard events is always 229. In this case, it is impossible to recognize whether the Backspace key or Enter key is pressed, causing significant obstacles in determining the intent of the input behavior.

This is similar to Chinese input, which also goes through composition events. The difference is that during Chinese composition the intermediate results are meaningless: we only care about the final composed text, so it is usually enough to insert the final content in compositionend. English input on Android devices does not work that way. Each keystroke may already be the final result, so it cannot all be handled in compositionend.

I believe anyone who has worked on rich text editors or is familiar with the Slate framework knows that the hardest problem in editors is Chinese input. There is no good solution other than working around the browser’s default behavior. Android input is a similar problem and even more complicated than Chinese composition input, and different input methods behave differently.

Approach

Regarding how Slate recognizes user interaction intent and converts it into modifications to the data model, this will not be reiterated here. Instead, let’s discuss the compatibility handling approach for Android device input.

Recognition of Interaction Intent:

The overall approach relies on determining the interaction intent based on the inputType in the beforeinput event. In some scenarios, the user’s input behavior cannot be fully determined by inputType alone, so it is combined with the accompanying data from the beforeinput event for evaluation.
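
As a rough illustration of this idea, the decision can be written as a pure function of inputType and data. This is a simplified sketch, not the slate-angular code: the Intent type and the recognizeIntent function are hypothetical names, and only the three input types discussed in this article are covered.

```typescript
// Hypothetical description of what the editor should do with an input.
type Intent =
  | { kind: 'insertBreak' }
  | { kind: 'insertText'; text: string }
  | { kind: 'delete' }
  | { kind: 'unhandled' };

// Decide the intent from inputType, falling back on `data` where
// inputType alone is ambiguous (insertCompositionText).
function recognizeIntent(inputType: string, data: string | null): Intent {
  switch (inputType) {
    case 'insertCompositionText':
      if (data && data.indexOf('\n') !== -1) {
        // A line feed inside the composition data means Enter was pressed.
        return { kind: 'insertBreak' };
      }
      // Non-empty data replaces the composed word; empty data removes it.
      return data ? { kind: 'insertText', text: data } : { kind: 'delete' };
    case 'deleteContentBackward':
      return { kind: 'delete' };
    case 'insertText':
      return data !== null ? { kind: 'insertText', text: data } : { kind: 'unhandled' };
    default:
      return { kind: 'unhandled' };
  }
}

console.log(recognizeIntent('insertCompositionText', 'ed\n').kind); // "insertBreak"
console.log(recognizeIntent('insertCompositionText', 'edi').kind); // "insertText"
console.log(recognizeIntent('insertCompositionText', null).kind); // "delete"
```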

RestoreDOM:

As explained earlier, once the browser enters the Composition state, the DOM updates generated by user input (default browser behavior) cannot be prevented. We can only wait for the browser rendering to complete and then write code to restore the interface, ensuring correct rendering and consistency with the model data.

To deal with this, we drew on the RestoreDOM concept in slate-react. When content input is detected on an Android device, the restoreDom processing cycle begins. Within restoreDom, a MutationObserver watches for DOM updates over a short window, on the assumption that the next DOM update comes from the browser’s default rendering of this input. When an update is detected, the affected DOM is restored according to the mutation type: added nodes are removed and removed nodes are reinserted, while characterData mutations are left untouched in our implementation so that the composition is not interrupted.

RestoreDOM also supports passing a processing function. After the DOM is restored, this function is executed immediately to perform the actual modification of the editor data. The specific operations to be performed are determined by the caller based on the recognized intent.

Process:

User input -> DOM update (default behavior) -> Restore DOM -> Data transformation -> DOM rendering (data-driven)

One open question is whether DOM updates unrelated to the current input can occur while we are observing. This is a blind spot that I cannot rule out at the moment, especially in collaborative editing scenarios, and the actual behavior still needs to be verified.

Code Details

I: Intercept and Handle “beforeinput” Event

if (IS_ANDROID) {
    let targetRange: Range | null = null;
    // Prefer the range that the browser reports this input will replace.
    const [nativeTargetRange] = event.getTargetRanges();
    if (nativeTargetRange) {
        targetRange = AngularEditor.toSlateRange(editor, nativeTargetRange);
    }
    // COMPAT: SelectionChange event is fired after the action is performed, so we
    // have to manually get the selection here to ensure it's up-to-date.
    const window = AngularEditor.getWindow(editor);
    const domSelection = window.getSelection();
    if (!targetRange && domSelection) {
        targetRange = AngularEditor.toSlateRange(editor, domSelection);
    }
    // Last resort: the selection currently stored in the editor.
    targetRange = targetRange ?? editor.selection;
    if (type === 'insertCompositionText') {
        if (data && data.toString().includes('\n')) {
            // A line feed in the composition data means Enter was pressed.
            restoreDom(editor, () => {
                Editor.insertBreak(editor);
            });
        } else {
            if (targetRange) {
                if (data) {
                    // Replace the composed word with the new cumulative text.
                    restoreDom(editor, () => {
                        Transforms.insertText(editor, data.toString(), { at: targetRange });
                    });
                } else {
                    // Empty data: the composed text was removed.
                    restoreDom(editor, () => {
                        Transforms.delete(editor, { at: targetRange });
                    });
                }
            }
        }
        return;
    }
    if (type === 'deleteContentBackward') {
        // Gboard cannot prevent the default action, so restoreDom must be used.
        // Sougou Keyboard can prevent the default action (only in Chinese input mode).
        // To avoid odd behavior with Sougou Keyboard, use restoreDom only when the
        // range is not collapsed (which identifies Gboard).
        if (targetRange && !Range.isCollapsed(targetRange)) {
            restoreDom(editor, () => {
                Transforms.delete(editor, { at: targetRange });
            });
            return;
        }
    }
    if (type === 'insertText') {
        restoreDom(editor, () => {
            if (typeof data === 'string') {
                Editor.insertText(editor, data);
            }
        });
        return;
    }
}

I will not walk through the code line by line. The main idea is to infer the user’s input intent from contextual state.

Key point: event.getTargetRanges() tells us which range the current input applies to, that is, which existing characters the incoming text will replace. For the editor data, we must know which characters to delete before inserting. Support varies between input methods. If the range cannot be obtained this way, we fall back to window.getSelection(), and then to the editor’s current selection. If none of these yields a range, the front end cannot determine what to modify and can only ignore the input!

II: restoreDom function

export function restoreDom(editor: Editor, execute: () => void) {
    const editable = EDITOR_TO_ELEMENT.get(editor);
    let observer: MutationObserver | null = new MutationObserver(mutations => {
        // Revert the browser's own DOM changes, newest first.
        mutations.reverse().forEach(mutation => {
            if (mutation.type === 'characterData') {
                // We don't want to restore the DOM for characterData mutations
                // because this interrupts the composition.
                return;
            }

            // Put back nodes that the browser removed.
            mutation.removedNodes.forEach(node => {
                mutation.target.insertBefore(node, mutation.nextSibling);
            });

            // Remove nodes that the browser added.
            mutation.addedNodes.forEach(node => {
                mutation.target.removeChild(node);
            });
        });
        disconnect();
        execute();
    });
    const disconnect = () => {
        observer?.disconnect();
        observer = null;
    };
    observer.observe(editable, { subtree: true, childList: true, characterData: true, characterDataOldValue: true });
    // Fallback: if no mutation has been reported by the time this timer fires,
    // stop observing and apply the model change anyway.
    setTimeout(() => {
        if (observer) {
            disconnect();
            execute();
        }
    }, 0);
}

This function is the core of the Android compatibility handling. It uses a MutationObserver to watch for DOM changes and reverts the updates that would make the interface inconsistent with the editor data, and it controls when the editor actually applies the data model modification. If no DOM update arrives within one setTimeout period, the observer is disconnected automatically so that it does not affect normal DOM updates.

Currently, there is no major issue with controlling this timing. After entering content on Android, the timing of DOM updates is after the Promise period and before the setTimeout period. The triggering time of MutationObserver happens to be between these two. Using setTimeout can add an extra layer of protection to prevent MutationObserver from monitoring standard DOM updates.

Problem Record

This section is a record of specific issues encountered during the compatibility handling of input on Android devices. I apologize for not recording the exceptions in video format, but only in text. Those who do not need it temporarily can skip it. If there is a need for compatibility with input issues on Android devices, you can refer to it.

① Moving the focus unexpectedly triggers text insertion

This problem is caused by handling logic in the earlier code. In certain scenarios, text insertion has to be handled in the compositionend event because some browsers do not fire beforeinput. With the composition behavior of Android browsers, however, moving the focus also triggers compositionend, so a condition must be added to prevent incorrect insertion on Android devices.

② Abnormal behavior when pressing the Enter key

③ Abnormal behavior when pressing the Backspace key at the end of a word

Both of these inputs end up triggering beforeinput with inputType set to insertCompositionText. slate-angular did not handle this input type before, and because of the composition behavior of Android input methods described earlier, the browser’s default behavior cannot be intercepted. It is therefore necessary to recognize the input intent and undo the effects of the default browser behavior.

Recognizing the user’s line break behavior is a challenge. We currently follow the logic in slate-react and inspect the data carried by the beforeinput event: 1. If it contains ‘\n’, the input is treated as a line break; 2. If it is non-empty and does not contain ‘\n’, it is treated as an insertion (a deletion during composition also arrives this way, because the data is the shortened word); 3. If the data is empty, it is treated as a plain deletion.

④ Pressing the Backspace key in the middle of a word will display the deletion of two letters in the interface

⑤ Unable to correctly delete content by pressing the Backspace key after selecting all text

These two problems are similar and are still issues with intercepting the default browser behavior. The difference is that the input type of the beforeinput event they trigger is deleteContentBackward. Therefore, special handling needs to be done for deleteContentBackward type input on Android devices. Initially, when dealing with problem ④, it was only necessary to execute Editor.deleteBackward(editor) in the callback of the restoreDom function. However, problem ⑤ occurred, so in the end, the DOM selection corresponding to the event was obtained (converted to Slate selection targetRange) before calling restoreDom, and then Transforms.delete(editor, { at: targetRange }) was executed in the callback of restoreDom to solve these two problems.

⑥ Abnormal handling of data when pressing the Backspace key in English mode of Sougou Keyboard (an additional segment of combined text is added)

This problem only occurs with the Sougou Keyboard. The main reason is that when handling the insertCompositionText type input of beforeinput, the real selection cannot be obtained through event.getTargetRanges(). From the browser’s behavior, it can be seen that after deleting a word, the browser does lose the composition selection (the word no longer has an underline), but the input method’s predictive input state still exists, so the two states are inconsistent. Essentially, it is because the way slate-angular modifies the DOM disrupts the input method’s predictive input function.

The solution is to modify the DOM modification method of slate-angular, replacing the rendering of the last level of string with a component (adding SlateDefaultStringComponent). In the component, the DOM is updated based on the state. If the text to be rendered is exactly the same as the text in the DOM (the default browser modification behavior takes effect), the browser state will be maintained and no rendering update will be performed, thus not interrupting the input method’s predictive input state.
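
The core of that check can be sketched as follows. This is a simplified model rather than the actual SlateDefaultStringComponent: TextHolder and syncText are hypothetical names, and a plain object stands in for the DOM text node so the sketch runs without a browser.

```typescript
// Anything with a textContent property; a DOM Text node fits this shape.
interface TextHolder {
  textContent: string | null;
}

// Write `text` into the node only when it differs from what is already
// there. Returns true when the DOM was actually touched.
function syncText(node: TextHolder, text: string): boolean {
  if (node.textContent === text) {
    // The browser's default behavior already produced this text, so leave
    // the node alone and keep the input method's predictive state intact.
    return false;
  }
  node.textContent = text;
  return true;
}

const node: TextHolder = { textContent: 'edi' };
console.log(syncText(node, 'edi')); // false, nothing rewritten
console.log(syncText(node, 'edit')); // true, DOM updated from the model
console.log(node.textContent); // "edit"
```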

⑦ Abnormal focus update when pressing the Space key at the end of a word (when input is in the predictive input state) in Sougou Keyboard

The root cause of this problem is that in Sougou Keyboard, pressing the Space key at the end of a word triggers the beforeinput event twice. The input type of the first time is insertCompositionText, which is used to update the combined input text, and the input type of the second time is insertText, which is used to insert a space. Because the insertCompositionText input type is specially handled on Android devices and its execution cycle is placed in restoreDom, this causes the execution timing of the insertText handling to be earlier than the execution cycle of insertCompositionText, resulting in abnormal focus. The solution is to also handle insertText specially on Android devices and place it in the execution cycle of restoreDom.
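
The ordering problem can be illustrated with a small model. This is a sketch under stated assumptions, not how restoreDom is implemented: DeferredQueue stands in for "work that runs later, inside the restoreDom cycle", and pressSpace is a hypothetical function that replays the two beforeinput events fired by the Space key.

```typescript
type Task = () => void;

// Stand-in for deferred execution: tasks run later, in the order queued.
class DeferredQueue {
  private tasks: Task[] = [];

  defer(task: Task): void {
    this.tasks.push(task);
  }

  flush(): void {
    const pending = this.tasks;
    this.tasks = [];
    pending.forEach(task => task());
  }
}

// Replay the two events and return the order in which the editor
// actually applied them.
function pressSpace(deferInsertText: boolean): string[] {
  const applied: string[] = [];
  const queue = new DeferredQueue();

  // First beforeinput: insertCompositionText is always deferred.
  queue.defer(() => {
    applied.push('insertCompositionText');
  });

  // Second beforeinput: insertText, handled immediately or deferred.
  const insertSpace: Task = () => {
    applied.push('insertText');
  };
  if (deferInsertText) {
    queue.defer(insertSpace);
  } else {
    insertSpace();
  }

  queue.flush();
  return applied;
}

console.log(pressSpace(false)); // [ 'insertText', 'insertCompositionText' ]
console.log(pressSpace(true)); // [ 'insertCompositionText', 'insertText' ]
```

Deferring both handlers keeps them in the order the events were fired, which is what placing insertText in the restoreDom cycle is meant to achieve.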

⑧ Unable to obtain targetRange through event.getTargetRanges() on Android API 31

Obtain it through compatibility measures. At this time, the selectionchange event has already been triggered, so targetRange can be obtained through window.getSelection().

⑨ When using Sougou keyboard, after entering an English letter at the beginning of the first line, entering the first letter of subsequent content will result in the repetition of the first letter.

For example, if the user wants to enter “love”, after entering “l” and then “o”, the result will be “llo”. Similar problems can also occur when entering content through suggestions.

This is actually an error in the data layer, where there is duplicate data in the editor’s data layer.

The problem is that the underlying editor renders different DOM structures for an empty string and for text. The data-driven DOM change interrupts the input method’s predictive suggestion on screen: the interface shows the suggestion as interrupted after typing “l”, but inside the input method it is not, so the two states diverge. When “o” is typed next, the input method sends “lo”, which is inserted after the existing “l” and produces “llo”.

This problem is similar to issue ⑥, and there is no good solution. The only way is to operate on the DOM more carefully, avoiding abrupt deletion or insertion of DOM to complete the update.

Summary

This is all for this sharing session. The main purpose is to introduce the root cause and compatibility approach of content input issues on Android, as well as to document the problems encountered during the compatibility process, so that everyone can refer to them when dealing with Android compatibility. If you have any questions, please leave a comment for discussion.

Android input compatibility is like a leaky pipe: wherever a leak appears, you have to patch it. There is no particularly good solution, but try to keep the handling approach consistent.

References:

https://w3c.github.io/uievents/tools/key-event-viewer.html

https://discuss.prosemirror.net/t/contenteditable-on-android-is-the-absolute-worst/3810
