Published; edited May 8, 2019
embed(sentimentChart)
NUM_POINTS = 500
md`Another way to potentially improve the quality, at the cost of increased run time, is to use a finer chunking level below. If you change the parameter 'doc' from 'ntile' to 'page', associations will be computed by physical page rather than by percentile through the book.`
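To make the difference between the two chunking levels concrete, here is a minimal sketch that does not use the Hathi bindings at all. The helper names `ntileChunks` and `pageChunks`, the midpoint rule for assigning a page to a slice, and the token counts are all assumptions made for illustration; the real `chunks` method may divide the book differently.

```javascript
// Hypothetical helper (not part of the Hathi bindings): pool per-page token
// counts into `ntiles` equal-width slices of the book. A page is assigned to
// the slice that contains its midpoint, measured in cumulative tokens.
function ntileChunks(pageTokenCounts, ntiles) {
  const total = pageTokenCounts.reduce((a, b) => a + b, 0);
  const chunks = new Array(ntiles).fill(0);
  if (total === 0) return chunks; // nothing to distribute
  let seen = 0;
  for (const count of pageTokenCounts) {
    const midpoint = seen + count / 2;
    const index = Math.min(ntiles - 1, Math.floor((midpoint / total) * ntiles));
    chunks[index] += count;
    seen += count;
  }
  return chunks;
}

// Hypothetical helper: page chunking keeps one chunk per physical page.
function pageChunks(pageTokenCounts) {
  return pageTokenCounts.slice();
}

const pages = [100, 300, 200, 400];      // made-up token counts per page
const byHalves = ntileChunks(pages, 2);  // two slices through the book
const byPage = pageChunks(pages);        // four chunks, one per page
```

With page chunking the number of chunks grows with the length of the book, which is where the extra run time comes from; ntile chunking fixes the number of chunks regardless of length.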
embed(sentimentChart)
html`${volume.repr()}<br>${selfSimilarityPlot}`
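function do_chunking(volume, chunking) {
  // Chunk relative to the settings in this particular Observable notebook.
  // Note: `volume` is not used here; chunks are read from the `filled_library` cell.
  if (chunking === 'page') {
    // One chunk per physical page, restricted to the 'body' section.
    return filled_library.chunks({doc: 'page', what: 'word', sections: ['body']})
  } else if (chunking === '10ths') {
    // Ten slices through the book.
    return filled_library.chunks({doc: 'ntile', ntiles: 10, what: 'word'})
  } else {
    // Default: one hundred slices (percentiles) through the book.
    return filled_library.chunks({doc: 'ntile', ntiles: 100, what: 'word'})
  }
}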
zipfPlot = md`### Zipf plot

This is a fairly unremarkable plot, but it can be nice to see whether the distribution of word frequencies follows Zipf's law. Since George Zipf originally wrote about the word counts in Joyce's *Ulysses*, that would be a sensible book to [look at here](https://observablehq.com/@bmschmidt/book-visualizations-sandbox?htid=inu.30000111060913).

`
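As a rough illustration of what a Zipf plot checks, the sketch below ranks words by count and fits a straight line to log(count) against log(rank); under Zipf's law the slope should be close to -1. The helper names `rankFrequency` and `logLogSlope` and the toy counts are assumptions for this example, not code from the notebook.

```javascript
// Hypothetical helper: turn a word -> count map into rows sorted by
// descending count, with rank 1 for the most frequent word.
function rankFrequency(counts) {
  return Object.entries(counts)
    .sort((a, b) => b[1] - a[1])
    .map(([word, count], i) => ({word, rank: i + 1, count}));
}

// Hypothetical helper: ordinary least-squares slope of log(count) on log(rank).
function logLogSlope(rows) {
  const xs = rows.map(r => Math.log(r.rank));
  const ys = rows.map(r => Math.log(r.count));
  const n = rows.length;
  const meanX = xs.reduce((a, b) => a + b, 0) / n;
  const meanY = ys.reduce((a, b) => a + b, 0) / n;
  let num = 0;
  let den = 0;
  for (let i = 0; i < n; i++) {
    num += (xs[i] - meanX) * (ys[i] - meanY);
    den += (xs[i] - meanX) ** 2;
  }
  return num / den;
}

// Made-up counts chosen so that count = 1200 / rank exactly.
const toyCounts = {the: 1200, of: 600, and: 400, to: 300, a: 240};
const rows = rankFrequency(toyCounts);
const slope = logLogSlope(rows);
```

Real books will not fit this cleanly; the fit is usually worst for the very highest and lowest ranks, so the slope is only a rough summary of the plot.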
pos_lookup = new Object({"PRP$": "Possessive pronoun", "VBG": "Verb, gerund or present participle", "FW": "Foreign word", "VBN": "Verb, past participle", "VBP": "Verb, non-3rd person singular present", "WDT": "Wh-determiner", "JJ": "Adjective", "WP": "Wh-pronoun", "VBZ": "Verb, 3rd person singular present", "DT": "Determiner", "RP": "Particle", "NN": "Noun, singular or mass", "VBD": "Verb, past tense", "POS": "Possessive ending", "TO": "to", "PRP": "Personal pronoun", "RB": "Adverb", "NNS": "Noun, plural", "NNP": "Proper noun, singular", "VB": "Verb, base form", "WRB": "Wh-adverb", "CC": "Coordinating conjunction", "LS": "List item marker", "PDT": "Predeterminer", "RBS": "Adverb, superlative", "RBR": "Adverb, comparative", "CD": "Cardinal number", "EX": "Existential there", "IN": "Preposition or subordinating conjunction", "WP$": "Possessive wh-pronoun", "MD": "Modal", "NNPS": "Proper noun, plural", "JJS": "Adjective, superlative", "JJR": "Adjective, comparative", "SYM": "Symbol", "UH": "Interjection",
// My amendments begin here
".": "Punctuation", ",": "Punctuation", "NE": "Noun, unknown", "``": "Punctuation",
"CARD": "Number", "''": "Punctuation", ":": "Punctuation", "#": "Punctuation",
"-LRB-": "Unknown", "-RRB-": "Unknown", "$": "Unknown"})
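One way a lookup table like this can be used is to collapse raw part-of-speech tags into readable labels before counting, so that several tags sharing a label (for example the punctuation tags) are pooled. The sketch below is an assumption about usage rather than code from this notebook: `posExcerpt` copies a few entries from the table above, and `describeTag`, `countByLabel`, and the tag counts are hypothetical.

```javascript
// A small excerpt of the lookup table, copied here so the example stands alone.
const posExcerpt = {
  "NN": "Noun, singular or mass",
  "VBD": "Verb, past tense",
  ".": "Punctuation",
  ",": "Punctuation"
};

// Hypothetical helper: return the readable label for a tag, or the raw tag
// itself when the table has no entry for it.
function describeTag(tag, lookup = posExcerpt) {
  return Object.prototype.hasOwnProperty.call(lookup, tag) ? lookup[tag] : tag;
}

// Hypothetical helper: sum counts by readable label, pooling tags that share one.
function countByLabel(tagCounts, lookup = posExcerpt) {
  const totals = {};
  for (const [tag, count] of Object.entries(tagCounts)) {
    const label = describeTag(tag, lookup);
    totals[label] = (totals[label] || 0) + count;
  }
  return totals;
}

// Made-up tag counts; "XYZ" stands in for a tag the table does not cover.
const totals = countByLabel({"NN": 5, ".": 3, ",": 2, "XYZ": 1});
```

Falling back to the raw tag keeps unmapped tags visible in a chart instead of silently dropping them or labelling them `undefined`.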
md`# General supporting code

The Volume and Library classes are imported from a different notebook.

A new 'Library' instance is created, and its 'fetch_all' method is called to asynchronously fill in the requested id.

`
import {Volume, Library} from '@bmschmidt/javascript-bindings-to-the-hathi-features-data'
filled_library = new Library().fetch_all([htid])
md`# Imports

Here are the other JavaScript imports required to make this notebook run.

I use the new Vega-Lite API just to test it out; it's still a little undocumented (there were some sorting operations I couldn't work out the syntax for), so I sometimes, but not always, compiled to normal Vega-Lite JSON.
`
import {vl} from '@vega/vega-lite-api'
import {slider, radio, select, text} from "@jashkenas/inputs"
d3Fetch = require('d3-fetch')

d3 = require('d3', 'd3-array')