Edited
Mar 21, 2021
Diff = require("diff")
diffTest = Diff.diffWords("this and that", "that and this")
diffTestWithSpace = Diff.diffWordsWithSpace(
  "this and that",
  "that and this \nand the other"
)
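The two calls above return arrays of change objects. As a rough illustration of that shape, the sketch below renders such an array as inline text. It is written as plain JavaScript rather than as Observable cells. The sample array is hand-written, not actual `Diff` output, and `renderChanges` is a hypothetical helper; the only assumption about the library is that each change carries a `value` string plus `added` and `removed` flags that are truthy only for insertions and deletions.

```javascript
// Hand-written sample in the shape of jsdiff change objects (not real output).
const sampleChanges = [
  { value: "this", removed: true },
  { value: "that", added: true },
  { value: " and " }
];

// Hypothetical helper: wrap insertions in {+ +} and deletions in [- -].
function renderChanges(changes) {
  return changes
    .map((c) => {
      if (c.added) return `{+${c.value}+}`;
      if (c.removed) return `[-${c.value}-]`;
      return c.value; // unchanged text passes through as-is
    })
    .join("");
}

console.log(renderChanges(sampleChanges)); // [-this-]{+that+} and 
```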
diffTest2 = Diff.diffWords(
`Here<cb/> is a brief example of text marked up for <i>rtwtr</i> followed by markup conventions and details.
<ub>
The unit above is defined by the full-stop sentence-level punctuation. But this unit has more than one sentence in it. Nonetheless, it will match the corresponding single-sentence unit in the b_file.
</ub>
And this sentence has been<tb/> divided into two units by so-called thought-break tags, the final one here, redundant.<tb/>
There are five units, count ’em, in this section.
<pb/>
<sb/>
`,
`In this table<cb/> is a short example of text marked up for <i>rtwtr</i> followed by more detailed explanation of <i>rtwtr</i> markup conventions.
<ub>
The unit above is one sentence and so is this one; it will match the corresponding multi-sentence unit in the a_file simply due to its full-stop.
</ub>
And this sentence is on its own.<pb/>
Its thought was divided and it will even get its own paragraph in the rendered visualization.
There are five units in both texts.
<pb/>
<sb/>
`
)
function pushLineObj(arr, text, type) {
  // Append one { type, text } line record to arr.
  arr.push({ type, text });
}
tagFriendlyWordDiff = (a, b) => {
  // Word-level diff that keeps whitespace as part of the token stream.
  const diffs = Diff.diffWordsWithSpace(a, b);
  const lines = [];
  // True when the previous span ended with a "<" whose tag continues in the current span.
  let misplacedOpenTag = false;
  // Any ASCII punctuation character immediately followed by a linefeed.
  const punctPlusLinefeed = /[!"#$%&'()*+,\-./:;<=>?@[\]^_`{|}~]\n/;
  for (let i = 0; i < diffs.length; i++) {
    let type = " ";
    if (diffs[i].added) type = "+";
    if (diffs[i].removed) type = "-";
    let span = diffs[i].value;
    // Give the stripped "<" back to the span that holds the rest of its tag.
    // (If this span has no ">", the "<" stays dropped.)
    if (misplacedOpenTag && span.includes(">")) span = "<" + span;
    // A trailing "<" opens a tag whose remainder landed in the next span.
    // endsWith avoids a false positive on an empty span and checks the last "<", not the first.
    misplacedOpenTag = span.endsWith("<");
    if (misplacedOpenTag) span = span.slice(0, -1);
    //span = span.replaceAll("\n", ""); // only do this within verse?
    // Split after each punctuation + linefeed pair, dropping the linefeed itself.
    let badPattern = span.search(punctPlusLinefeed);
    while (badPattern != -1) {
      pushLineObj(lines, span.slice(0, badPattern + 1), type);
      span = span.slice(badPattern + 2);
      badPattern = span.search(punctPlusLinefeed);
    }
    pushLineObj(lines, span, type);
  }
  return lines;
}
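The tag-repair step in `tagFriendlyWordDiff` can be tried on its own. The following is a simplified sketch, not the notebook's code: `reattachOpenBrackets` is a hypothetical helper, the input spans are hand-written rather than real `Diff` output, and using `endsWith` to detect the stray bracket is an assumption about the intent.

```javascript
// Hypothetical helper: move a "<" stranded at the end of one span
// onto the start of the next span that closes the tag.
function reattachOpenBrackets(spans) {
  const repaired = [];
  let pending = false; // did the previous span end with a stray "<"?
  for (let span of spans) {
    if (pending && span.includes(">")) span = "<" + span;
    pending = span.endsWith("<");
    if (pending) span = span.slice(0, -1); // strip it; the next span may reclaim it
    repaired.push(span);
  }
  return repaired;
}

// Hand-written spans: the word diff has separated "<" from "br/>".
console.log(reattachOpenBrackets(["wood", "<", "br/>\n"]));
// [ 'wood', '', '<br/>\n' ]
```

The empty string in the middle is what remains of the span that held only the bracket; a caller could filter such empties out before rendering.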
RiTa = require("rita")
PUNCT = /^[\p{P}|\+|-|<|>|\^|\$|\ufffd|`]*$/u
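The notebook does not show where `PUNCT` is used. One plausible use, sketched here as an assumption, is separating word tokens from punctuation and tag fragments. The token list is hand-written. Note that the `|` characters inside the character class are literal, so `PUNCT` also matches a bare `|`, and because of the `*` quantifier it matches the empty string too.

```javascript
// Same pattern as the PUNCT cell, declared with const so it runs outside Observable.
const PUNCT = /^[\p{P}|\+|-|<|>|\^|\$|\ufffd|`]*$/u;

// Hand-written tokens; "\u2014" is the dash used in the attribution lines.
const sampleTokens = ["I", "found", "myself", ",", "<", "br", "/>", "\u2014"];

// Keep only tokens that are not made up entirely of punctuation.
const wordTokens = sampleTokens.filter((t) => !PUNCT.test(t));

console.log(wordTokens); // [ 'I', 'found', 'myself', 'br' ]
```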
afile = `<verse>
<pb/>
<ub>
Along the journey of our life half way<br/>
I found myself again in a dark wood<br/>
wherein the straight road no longer lay<br/>
<br/>
<br/>
— Dale, 1996
</ub>
</verse>
`
bfile = `<verse>
<pb/>
<ub>
At the midpoint in the journey of our life<br/>
I found myself astray in a dark wood<br/>
For the straight path had vanished.<br/>
<br/>
<br/>
— Creagh and Hollander, 1986
</ub>
</verse>
`
normalized = tagFriendlyWordDiff(afile, bfile)
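`normalized` is an array of `{ type, text }` records, where `type` is `"+"`, `"-"`, or `" "`. A minimal sketch of turning such records into a readable listing follows; `formatLines` is a hypothetical helper and the sample records are hand-written, not the actual result for `afile` and `bfile`.

```javascript
// Hypothetical helper: one output line per record, prefixed with its type marker.
function formatLines(lines) {
  return lines.map(({ type, text }) => `${type} ${text}`).join("\n");
}

// Hand-written records in the shape produced by pushLineObj.
const sampleLines = [
  { type: " ", text: "I found myself " },
  { type: "-", text: "again" },
  { type: "+", text: "astray" },
  { type: " ", text: " in a dark wood<br/>" }
];

console.log(formatLines(sampleLines));
```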
tokenize = function(value) {
  // Runs of whitespace other than newlines form one token; each newline, bracket, and quote is its own token; \b splits at word boundaries.
  let tokens = value.split(/([^\S\r\n]+|[()[\]{}'"\r\n]|\b)/);

  // Rejoin the splits that \b makes inside words. This is primarily the extended Latin character set, which \b treats as non-word characters.
  for (let i = 0; i < tokens.length - 1; i++) {
    // If the next token is empty and the tokens on either side of it are all word characters, merge them.
    if (
      !tokens[i + 1] &&
      tokens[i + 2] &&
      extendedWordChars.test(tokens[i]) &&
      extendedWordChars.test(tokens[i + 2])
    ) {
      tokens[i] += tokens[i + 2];
      tokens.splice(i + 1, 2);
      i--;
    }
  }
  return tokens;
}
extendedWordChars = /^[a-zA-Z\u{C0}-\u{FF}\u{D8}-\u{F6}\u{F8}-\u{2C6}\u{2C8}-\u{2D7}\u{2DE}-\u{2FF}\u{1E00}-\u{1EFF}]+$/u
tokenize("↵<pb/>↵This ")
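Why the merge loop in `tokenize` is needed shows up with an accented word: without the `u` flag, `\b` treats `ï` as a non-word character, so the split breaks the word apart. The sketch below reproduces the two steps in plain JavaScript. `splitPattern` is the same pattern as in `tokenize`; `wordChars` is an abbreviated stand-in for `extendedWordChars` (an assumption that Latin-1 letters are enough for this example), and `mergeWordFragments` is a hypothetical name for the loop.

```javascript
// Same split pattern as the tokenize cell.
const splitPattern = /([^\S\r\n]+|[()[\]{}'"\r\n]|\b)/;
// Abbreviated stand-in for extendedWordChars (assumption: Latin-1 range only).
const wordChars = /^[a-zA-Z\u{C0}-\u{FF}]+$/u;

// Rejoin word fragments that are separated only by an empty boundary token.
function mergeWordFragments(tokens) {
  const out = tokens.slice();
  for (let i = 0; i < out.length - 1; i++) {
    if (!out[i + 1] && out[i + 2] && wordChars.test(out[i]) && wordChars.test(out[i + 2])) {
      out[i] += out[i + 2];
      out.splice(i + 1, 2);
      i--; // re-check the merged token against what follows
    }
  }
  return out;
}

const raw = "na\u00EFve".split(splitPattern);
console.log(raw); // [ 'na', '', 'ï', '', 've' ]
console.log(mergeWordFragments(raw)); // [ 'naïve' ]
```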
a_file = `❡&nbsp;&nbsp;&nbsp;THE FUTURE OF LANGUAGE
<pb/>
This “writing through” of Vilém Flusser’s ‘The Future of Writing,’ reconfiguring it so as to become John Cayley’s ‘The Future of Language,’ will not consider problems concerning any possible future for the teaching or philosophizing of any particular art of language in the face of the growing importance of non- or anti-linguistic messages in our surroundings, although those problems have already become significant in the so-called developed countries.
<ub>
Instead, it proposes to consider a tendency that underlies those problems: namely, the tendency to deny or distrust the fundamental linearity of language – including as it is perceived during processes of reading – and toward multi-dimensional codes such as photographs, films, TV, screen-based graphic design in the service of social and socialized media, and, generally, a conception of art and aesthetics that is dominated by visuality, by so-called “fine” as “visual” or “plastic” art even as and when this world of art embraces the conceptualism or “post-medium condition” which could, in principle if not in practice, be extended to the arts of language.
This distrust and denial may be observed everywhere if one glances even superficially at the codified world that surrounds us.
</ub>
<ub>
Literature is $50bn behind art.
The MoMAs in every province and metropolis are stuffed to their gills with hipsters, gleeful families, and young “artists” while fewer and fewer deserted book malls provide desultory subterranean spaces for retiree reading groups.
</ub>
The “future” of language, or rather, of those gestures which align symbols to produce our shared, collective, readable utterances, must be seen against the background of a long-standing tendency to distrust their alignment.
<sb/>

The translation from surface into line implies a radical change of dimensionality with respect to the grasp of meaning.
The eye that deciphers an image scans the surface, and it thus establishes reversible and arbitrary spatial relations between the elements of the image.
It may go back and forth while deciphering the image.
This spatiality of relations that prevails within the image characterizes the world for those who use images for the understanding of the world, who “imagine” it.
For them, all the things in the world are related to each other in such a reversible, spatial equivalence, and the world is structured by “eternal return.”
It is just as true to say that night follows day as that day follows night, that sowing follows reaping as that reaping follows sowing, that life follows death as that death follows life.
The crowing of the cock calls the sun to rise just as much as the rising sun calls the cock to crow.
In such a world, circular time orders all things, “assigns them their just place,” and if a thing is displaced it will be readjusted by time itself.
Because to live is to displace things, life in such a world is a series of “unjust” acts that will be revenged in time.
This demands that we propitiate the order of the world, the “gods” of which it is full.
In sum: the “imagined” world may be a world of myth, of magic, an ahistorical world.
`
b_file = `❡&nbsp;&nbsp;&nbsp;THE FUTURE OF WRITING
<pb/>
This essay will not consider the problems concerning the future of teaching the art of writing in the face of the growing importance of nonliterate messages in our surroundings, although those problems will become ever more important both in the so-called developed countries and in societies where illiteracy is still widespread.
<ub>
Instead, it proposes to consider a tendency that underlies those problems: namely, the tendency away from linear codes such as writing and toward two-dimensional codes such as photographs, films, and TV, a tendency that may be observed if one glances even superficially at the codified world that surrounds us.
</ub>
<ub>
</ub>
The future of writing, of that gesture which aligns symbols to produce texts, must be seen against the background of that tendency.
<sb/>

The translation from surface into line implies a radical change of meaning.
The eye that deciphers an image scans the surface, and it thus establishes reversible relations between the elements of the image.
It may go back and forth while deciphering the image.
This reversibility of relations that prevails within the image characterizes the world for those who use images for the understanding of the world, who “imagine” it.
For them, all the things in the world are related to each other in such a reversible way, and their world is structured by “eternal return.”
It is just as true to say that night follows day as that day follows night, that sowing follows reaping as that reaping follows sowing, that life follows death as that death follows life.
The crowing of the cock calls the sun to rise just as much as the rising sun calls the cock to crow.
In such a world, circular time orders all things, “assigns them their just place,” and if a thing is displaced it will be readjusted by time itself.
Because to live is to displace things, life in such a world is a series of “unjust” acts that will be revenged in time.
This demands that man propitiate the order of the world, the “gods” of which it is full.
In sum: the “imagined” world is the world of myth, of magic, the prehistorical world.
`
tagFriendlyWordDiff(a_file, b_file)
Diff.diffWordsWithSpace(a_file, b_file)
s = "\n>\nInstead,"
s.search(/[!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~]\n/)
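For the test string `s`, the search finds the `>` at index 1, because it is the first punctuation character directly followed by a linefeed. The sketch below shows how the same index drives the splitting loop in `tagFriendlyWordDiff`: the text up to and including the punctuation becomes one piece, and the linefeed is skipped. `splitAfterPunctLinefeed` is a hypothetical helper that isolates that loop, and the hyphen in the pattern is escaped here for readability, which does not change what the pattern matches.

```javascript
// Any ASCII punctuation character immediately followed by a linefeed.
const punctPlusLinefeed = /[!"#$%&'()*+,\-./:;<=>?@[\]^_`{|}~]\n/;

// Hypothetical helper: cut the text after each punctuation + linefeed pair,
// keeping the punctuation and dropping the linefeed.
function splitAfterPunctLinefeed(text) {
  const pieces = [];
  let rest = text;
  let at = rest.search(punctPlusLinefeed);
  while (at !== -1) {
    pieces.push(rest.slice(0, at + 1)); // up to and including the punctuation
    rest = rest.slice(at + 2);          // skip the linefeed
    at = rest.search(punctPlusLinefeed);
  }
  pieces.push(rest);
  return pieces;
}

const sample = "\n>\nInstead,";
console.log(sample.search(punctPlusLinefeed)); // 1
console.log(splitAfterPunctLinefeed(sample)); // [ '\n>', 'Instead,' ]
```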
