Edited Jul 1, 2023
Traditional text analysis of complaints
// ADD CUSTOM STOPWORDS HERE
customStopwords = ['airtransat', 'rouge', 'tel', 'aircanada', 'miami', 'francfort', 'frankfurt', 'lima', 'air', 'canada', 'flight', 'newark', 'aviv', 'ons', 'yeah', 'faudra']
dat
matrix = similarity.getDistanceMatrix()
similarity = new tfidf.Similarity(corpus)
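The distance matrix above compares every airline's complaint document with every other one. As a minimal sketch of the idea, assuming each document is a sparse map from term to weight and that distance is one minus cosine similarity, the computation could look like the following. The helpers `cosineDistance` and `distanceMatrix` are hypothetical names written for illustration and are not part of tiny-tfidf; the library's own implementation may differ in detail.

```javascript
// Hypothetical helper (not from tiny-tfidf): cosine distance between two
// sparse term-weight vectors stored as Map<string, number>.
function cosineDistance(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (const [term, weight] of a) {
    normA += weight * weight;
    if (b.has(term)) dot += weight * b.get(term);
  }
  for (const weight of b.values()) {
    normB += weight * weight;
  }
  // Treat an empty vector as maximally distant from everything.
  if (normA === 0 || normB === 0) return 1;
  return 1 - dot / Math.sqrt(normA * normB);
}

// Hypothetical helper: pairwise distances for a list of document vectors.
function distanceMatrix(vectors) {
  return vectors.map(a => vectors.map(b => cosineDistance(a, b)));
}

// Illustrative vectors only; the terms and weights are made up.
const docA = new Map([["delay", 1], ["luggage", 2]]);
const docB = new Map([["delay", 1], ["luggage", 2]]);
const docC = new Map([["refund", 3]]);
const exampleMatrix = distanceMatrix([docA, docB, docC]);
```

Documents with the same term weights end up at distance 0, and documents that share no terms end up at distance 1, which is what makes the matrix usable for clustering or a heatmap.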
dat = db.query(`SELECT GROUP_CONCAT(${select2}) as text, airline FROM data WHERE lang = '${select1}' GROUP BY airline`)
airlines = dat.map(d => d.airline)
complaints = dat.map(d => d.text)
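The query and the two `map` cells above build one long document per airline. A plain JavaScript sketch of the same grouping may make the shape of `dat` easier to see. Everything here is an assumption for illustration: the rows are made up, the text column is called `text` (the notebook picks the column through `select2`), and `groupTextByAirline` is a hypothetical helper. Note also that SQL does not guarantee the order of groups, whereas this sketch keeps first-seen order.

```javascript
// Hypothetical rows standing in for the `data` table; not real reviews.
const rows = [
  { airline: "A", lang: "en", text: "late departure" },
  { airline: "A", lang: "en", text: "lost luggage" },
  { airline: "B", lang: "en", text: "rude staff" },
  { airline: "B", lang: "fr", text: "vol annulé" }
];

// Hypothetical helper: filter by language, then join all texts per airline,
// mirroring WHERE lang = ... GROUP BY airline with a concatenating aggregate.
function groupTextByAirline(rows, lang, separator = ",") {
  const grouped = new Map();
  for (const row of rows) {
    if (row.lang !== lang) continue;
    if (!grouped.has(row.airline)) grouped.set(row.airline, []);
    grouped.get(row.airline).push(row.text);
  }
  return Array.from(grouped, ([airline, texts]) => ({
    text: texts.join(separator),
    airline
  }));
}

const grouped = groupTextByAirline(rows, "en");
const airlineNames = grouped.map(d => d.airline); // document identifiers
const documents = grouped.map(d => d.text);       // one document per airline
```

The two resulting arrays line up index by index, which is the shape the `Corpus` constructor is given in this notebook.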
corpus = new tfidf.Corpus(
  airlines,
  complaints,
  toggle,
  customStopwords,
  K1,
  b
)
marginwidth = 80
K1 = 2.0
b = 0.75
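`K1` and `b` are the usual tuning parameters of BM25-style term weighting: `K1` controls how quickly repeated occurrences of a term stop adding weight, and `b` controls how strongly long documents are penalized. The sketch below assumes the standard BM25 combined-weight formula; `bm25Weight` is a hypothetical helper written for illustration, and whether tiny-tfidf computes exactly this expression is an assumption rather than something this notebook confirms.

```javascript
// Hypothetical helper (not from tiny-tfidf): standard BM25-style term weight.
//   tf           - occurrences of the term in the document
//   idf          - collection-level weight of the term (rarer terms score higher)
//   docLength    - length of this document
//   avgDocLength - average document length in the corpus
function bm25Weight(tf, idf, docLength, avgDocLength, K1 = 2.0, b = 0.75) {
  // Normalized document length: 1 for an average-length document.
  const ndl = docLength / avgDocLength;
  return (idf * tf * (K1 + 1)) / (K1 * (1 - b + b * ndl) + tf);
}

// Illustrative inputs only.
const averageDoc = bm25Weight(1, 1, 100, 100);   // average-length document
const longDoc = bm25Weight(1, 1, 200, 100);      // same term, document twice as long
const saturated = bm25Weight(1000, 1, 100, 100); // very frequent term
```

With these defaults the weight of a term can never exceed `idf * (K1 + 1)`, however often it occurs, and the same term counts for less in a longer document. Setting `b` to 0 switches length normalization off entirely.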
uniq_lang = db.query(`SELECT DISTINCT(lang) as lang from data`)
uniq_airline = db.query(`SELECT DISTINCT(airline) as airline from data`)
db = DuckDBClient.of({ data: FileAttachment("tripadvisor_reviews_multi_downsampled_trans.parquet") })
tfidf = import("tiny-tfidf")