I’m trying to create new functionality using existing in-line comments in a rich text field.
My goal is to build automations that aggregate comments together with the text they refer to in documents or reports; e.g. to comment on AI output and then feed those comments back to the AI to improve it.
My strategy is to write a script that fetches the comment itself as well as the referenced text that was selected when the comment was made (the yellow underlined text that gets highlighted when hovering over it).
In the JSON of the rich text field, the comment text itself is clearly visible, but the referenced (commented-upon) text is only indicated with indices. For example:
"comments": [
{
"from": 142,
"to": 181,
"id": "0c6c7927-dcab-4014-8552-ab22be4f4fd7",
"body": {
"doc": {
"type": "doc",
"content": [
{
"type": "paragraph",
"attrs": {
"guid": "551cc0fa-c40e-4f2d-a102-81a9e897ccf5"
},
"content": [
{
"type": "text",
"text": "Comment one"
}
]
}
]
},
"comments": []
},
"date": 1746570173161,
"author": {
"id": "9e75b117-a9fa-44a1-96bd-4cebe6b6bbed"
},
"thread": null,
"state": "open",
"detached": false
},
I tried to write a script that reconstructs the referenced text from the “from” and “to” indices by mapping them to the characters in the rich text field.
When I extract the field as Markdown, the indices are offset against the Markdown characters, so the start and end characters are wrong and I get the wrong referenced text. What I suspect is that I need to:
- Map the rich text JSON to Markdown indices.
- Use \n per blockquote paragraph.
- Capture snippets for all nodes.
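Below is a minimal sketch of what I have in mind. It assumes (I have not confirmed this) that “from” and “to” are ProseMirror-style document positions, where entering or leaving a block node counts as one position and each text character counts as one. The function name `text_between`, the `LEAF_TYPES` set, and `sample_doc` are my own placeholders, not part of any API.

```python
from typing import Any

# Assumption: node types that have no content and occupy exactly one position.
# The real set depends on the editor's schema; these names are placeholders.
LEAF_TYPES = {"hard_break", "image", "horizontal_rule"}


def text_between(doc: dict[str, Any], start: int, end: int,
                 block_separator: str = "\n") -> str:
    """Collect the text between two positions of a ProseMirror-style JSON doc.

    Position 0 is just inside the root node. Every non-leaf node costs one
    position when entered and one when left; text costs one per character.
    Caveat: JavaScript counts UTF-16 code units, Python counts code points,
    so characters outside the BMP (e.g. some emoji) would shift the result.
    """
    parts: list[str] = []
    need_separator = False  # set when a block closes after text was collected

    def walk(node: dict[str, Any], pos: int) -> int:
        """Visit a node starting at pos and return the position after it."""
        nonlocal need_separator
        node_type = node.get("type")
        if node_type == "text":
            text = node.get("text", "")
            lo, hi = max(start, pos), min(end, pos + len(text))
            if lo < hi:  # this text node overlaps the requested range
                if need_separator:
                    parts.append(block_separator)
                    need_separator = False
                parts.append(text[lo - pos:hi - pos])
            return pos + len(text)
        if node_type in LEAF_TYPES:
            return pos + 1
        pos += 1  # opening token of a block such as paragraph or blockquote
        for child in node.get("content", []):
            pos = walk(child, pos)
        if parts:
            need_separator = True  # later text starts on a new line
        return pos + 1  # closing token

    pos = 0  # the root doc node itself has no opening token
    for child in doc.get("content", []):
        pos = walk(child, pos)
    return "".join(parts)


sample_doc = {
    "type": "doc",
    "content": [
        {"type": "paragraph",
         "content": [{"type": "text", "text": "Hello world"}]},
        {"type": "paragraph",
         "content": [{"type": "text", "text": "Second line"}]},
    ],
}

print(text_between(sample_doc, 7, 12))  # prints "world"
```

With the comment JSON above, the call would be `text_between(field_doc, comment["from"], comment["to"])`, where `field_doc` is the rich text field's own document JSON. Working on the JSON tree directly avoids the Markdown offsets, because Markdown adds and removes characters that the positions do not count.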
Could you please give some pointers or a code example showing how this is done?