Edited Feb 18, 2023
data = FileAttachment("powerplants_allUS_2010.csv").csv()
viewof table = Inputs.table(data)
## Prepping Data: Changing Text to Numbers
If you scan through the data, you'll see a mix of text and numbers, spread across many rows and many different dimensions (columns).

The table above doesn't show this well, but if you expand the triangle next to `data = Array(5587)` above, you'll see the data in its purer form, and you'll see that every value is brought in from the CSV as a text string, not a number. Further, many of these values have commas in them, which is great for human readability but not so good for a computer doing calculations. We need to parse these text strings and convert them to actual numbers.
mile = "5280" // a variable as a string, similar to how a csv file brings in data.
Number(mile) // change to a number
+mile // change to a number
mile2 = "5,280" // another string, this time with a comma in it.
mile2.replace(/,/g, '') // the regex /,/g matches every comma (the g means "global", i.e. all matches) and replaces each one with the empty string ''
toNum = (text) => Number(text.replace(/,/g,''))
toNum(mile2)
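One caveat: `toNum` assumes it is handed a string, so it throws if a value has already been converted to a number, and `Number("")` evaluates to `0`, which can quietly turn blank fields into zeros. Here is a minimal sketch of a more defensive variant. The name `safeToNum` and the decision to return `NaN` for blanks are assumptions made for illustration; they are not part of the original notebook.

```javascript
// Hypothetical helper (not from the original notebook).
function safeToNum(value) {
  // Already a number (for example, the conversion ran twice): return it unchanged.
  if (typeof value === "number") return value;
  // Missing or blank text: report NaN rather than a misleading 0.
  if (value == null || String(value).trim() === "") return NaN;
  // Strip thousands separators, then convert.
  return Number(String(value).replace(/,/g, ""));
}
```

Because it is a plain function declaration, it can live in a notebook cell of its own and be swapped in wherever `toNum` is used.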
data.forEach(d => {
  d["Plant annual net generation (MWh)"] = toNum(d["Plant annual net generation (MWh)"]);
  d["Plant annual CO2 equivalent emissions (tons)"] = toNum(d["Plant annual CO2 equivalent emissions (tons)"]);
})
data[1]
data[1]["Plant annual net generation (MWh)"]
data[1]["Plant annual CO2 equivalent emissions (tons)"]
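Spot-checking one row works, but with thousands of rows it can help to scan a whole column for values that did not convert. The sketch below uses a few made-up rows in place of the real data; `sampleRows` and `findBadValues` are hypothetical names introduced only for this example.

```javascript
// Hypothetical sample rows standing in for the converted data.
const sampleRows = [
  { name: "A", "Plant annual net generation (MWh)": 1200 },
  { name: "B", "Plant annual net generation (MWh)": NaN },     // text that failed to parse
  { name: "C", "Plant annual net generation (MWh)": "3,400" }  // a string that was never converted
];

// Return the rows whose value in `column` is not a finite number.
// Number.isFinite does not coerce, so strings are flagged as well.
function findBadValues(rows, column) {
  return rows.filter(d => !Number.isFinite(d[column]));
}

const badRows = findBadValues(sampleRows, "Plant annual net generation (MWh)");
console.log(badRows.map(d => d.name)); // ["B", "C"]
```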
data.forEach(d => {
  d["Plant CO2 factor"] = toNum(d["Plant annual CO2 emissions (tons)"]) * 1; // already in tons
  d["Plant CH4 factor"] = toNum(d["Plant annual CH4 emissions (lbs)"]) / 2000 * 25; // lbs to tons (divide by 2000), then weight by 25
})
data[5]["Plant CO2 factor"]
data[5]["Plant CH4 factor"]
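The CH4 line packs a unit conversion and a weighting into one expression. The notebook does not spell out the intent, so the sketch below rests on three assumptions: the target unit is tons of CO2 equivalent, a ton is a short ton of 2,000 lb, and 25 is the multiplier used to express methane as CO2 equivalent. The helper name `ch4LbsToCo2eTons` is hypothetical.

```javascript
// Assumptions (not stated in the notebook): 2000 lb per short ton,
// and 25 as the weighting that expresses CH4 as CO2 equivalent.
const LBS_PER_TON = 2000;
const CH4_WEIGHT = 25;

// Hypothetical helper: pounds of CH4 -> tons of CO2 equivalent.
function ch4LbsToCo2eTons(lbs) {
  const tonsCh4 = lbs / LBS_PER_TON; // pounds to tons
  return tonsCh4 * CH4_WEIGHT;       // tons of CH4 to tons of CO2 equivalent
}

console.log(ch4LbsToCo2eTons(4000)); // 50
```

Naming the two constants makes it easier to see which number is a unit conversion and which is a weighting.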
data.length
data.filter(d => d["Plant state abbreviation"] == "VA")
data.filter(d => d["Plant annual net generation (MWh)"] > 300000)
data.filter(d => d["Plant primary fuel generation category"] == "GAS")
data.filter(d => d["Plant primary fuel generation category"] == "GAS" && d["Plant annual net generation (MWh)"] > 300000)
filteredData = data.filter(d => d["Plant primary fuel generation category"] == "GAS" && d["Plant annual net generation (MWh)"] > 300000)
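When conditions pile up, it can be easier to read if each test gets a name. The sketch below shows the same "gas and large" idea on a few made-up rows; the short keys `fuel` and `netGen` and all variable names are stand-ins invented for this example, not columns from the CSV. It also shows that `filter` returns a new array and leaves the original alone.

```javascript
// Hypothetical toy rows with shortened column names.
const samplePlants = [
  { fuel: "GAS", netGen: 500000 },
  { fuel: "GAS", netGen: 1000 },
  { fuel: "COAL", netGen: 900000 }
];

// Each condition as a small named predicate.
const isGas = d => d.fuel === "GAS";
const isLarge = d => d.netGen > 300000;

// Combine predicates with && inside a single filter call.
const largeGasPlants = samplePlants.filter(d => isGas(d) && isLarge(d));

console.log(largeGasPlants.length); // 1
console.log(samplePlants.length);   // 3 (the original array is unchanged)
```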
minNetGen = d3.min(data, d => d["Plant annual net generation (MWh)"])
maxNetGen = d3.max(data, d => d["Plant annual net generation (MWh)"])
extentsNetGen = d3.extent(data, d => d["Plant annual net generation (MWh)"]) // both min and max
sumNetGen = d3.sum(data, d => d["Plant annual net generation (MWh)"]) // in true Tableau fashion!
meanNetGen = d3.mean(data, d => d["Plant annual net generation (MWh)"])
medianNetGen = d3.median(data, d => d["Plant annual net generation (MWh)"])
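To make the mean and median less of a black box, here is a rough plain-JavaScript sketch of what they compute. Like d3's summary functions, it skips `NaN` and missing values rather than letting them poison the result. The function names and the sample values are made up for illustration, and this is a simplification, not d3's actual implementation.

```javascript
// Keep only real numbers: drops undefined, null, strings and NaN.
function usableNumbers(values) {
  return values.filter(v => typeof v === "number" && !Number.isNaN(v));
}

// Arithmetic mean of the usable values (undefined if there are none).
function simpleMean(values) {
  const nums = usableNumbers(values);
  if (nums.length === 0) return undefined;
  return nums.reduce((total, v) => total + v, 0) / nums.length;
}

// Median: the middle value of the sorted list, or the average of the two middle values.
function simpleMedian(values) {
  const nums = usableNumbers(values).sort((a, b) => a - b);
  if (nums.length === 0) return undefined;
  const mid = Math.floor(nums.length / 2);
  return nums.length % 2 === 1 ? nums[mid] : (nums[mid - 1] + nums[mid]) / 2;
}

const sampleValues = [10, NaN, 30, undefined, 20, 100];
console.log(simpleMean(sampleValues));   // 40
console.log(simpleMedian(sampleValues)); // 25
```

The gap between 40 and 25 is the usual reminder that one large value pulls the mean up much more than the median.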
plantsByState = d3.group(data, d => d["Plant state abbreviation"])
netGenByState = d3.rollup(data, v => d3.sum(v, d => d["Plant annual net generation (MWh)"]), d => d["Plant state abbreviation"]);
netGenByState.get("VA")
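The result of a rollup is a `Map` keyed by state, which is why the lookup uses `.get("VA")`. As a rough mental model, a single-level sum rollup can be sketched in plain JavaScript as below. The helper `sumBy`, the toy rows, and the short keys `state` and `netGen` are all hypothetical, and this is an illustration of the idea, not d3's implementation. The last step shows one way to flatten a `Map` into an array of objects, which is handy for tables and charts.

```javascript
// Hypothetical toy rows with shortened column names.
const toyPlants = [
  { state: "VA", netGen: 100 },
  { state: "VA", netGen: 250 },
  { state: "VA", netGen: 50 },
  { state: "MD", netGen: 70 }
];

// Group rows by key(d) and add up value(d) within each group.
function sumBy(rows, key, value) {
  const totals = new Map();
  for (const d of rows) {
    const k = key(d);
    totals.set(k, (totals.get(k) || 0) + value(d));
  }
  return totals;
}

const toyNetGenByState = sumBy(toyPlants, d => d.state, d => d.netGen);
console.log(toyNetGenByState.get("VA")); // 400

// Flatten the Map into an array of { state, total } objects.
const toyStateRows = Array.from(toyNetGenByState, ([state, total]) => ({ state, total }));
console.log(toyStateRows.length); // 2
```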
netGenByStateAndFuel = d3.rollup(data, v => d3.sum(v, d => d["Plant annual net generation (MWh)"]), d => d["Plant state abbreviation"], d => d["Plant primary fuel generation category"]);
netGenByStateAndFuel.get("VA")
netGenByStateAndFuel.get("VA").get("GAS")
stateAvgLongs = d3.rollup(data, v => d3.mean(v, d => d["Plant longitude"]), d => d["Plant state abbreviation"]);
stateAvgLongs.get("VA")
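sortedData = d3.sort(data, d => d["Plant annual net generation (MWh)"]) // ascending by net generation; returns a sorted copy
// Equivalent comparator form, left commented out because a notebook can define sortedData only once:
// sortedData = d3.sort(data, (a, b) => d3.ascending(a["Plant annual net generation (MWh)"], b["Plant annual net generation (MWh)"]))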
viewof sortedtable = Inputs.table(sortedData)
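One thing to keep in mind: JavaScript's built-in `Array.prototype.sort` reorders the array in place, whereas `d3.sort` hands back a sorted copy. If you ever sort without d3, copying first keeps the original order intact for other cells. A minimal sketch, with a hypothetical `sortedBy` helper and made-up rows:

```javascript
// Hypothetical toy rows with a shortened column name.
const toyRows = [
  { name: "A", netGen: 300 },
  { name: "B", netGen: 100 },
  { name: "C", netGen: 200 }
];

// Return a sorted copy; the input array is never modified.
function sortedBy(rows, accessor, descending = false) {
  const copy = rows.slice();                       // copy first
  copy.sort((a, b) => accessor(a) - accessor(b));  // numeric ascending
  return descending ? copy.reverse() : copy;
}

const ascendingRows = sortedBy(toyRows, d => d.netGen);
const descendingRows = sortedBy(toyRows, d => d.netGen, true);

console.log(ascendingRows.map(d => d.name));  // ["B", "C", "A"]
console.log(descendingRows.map(d => d.name)); // ["A", "C", "B"]
console.log(toyRows.map(d => d.name));        // ["A", "B", "C"] (unchanged)
```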
data.slice(0,1000)
data.slice(1000,5000)
shuffledData = d3.shuffle(data.slice()) // shuffle a copy, because d3.shuffle reorders its argument in place
viewof shuffledtable = Inputs.table(shuffledData)
sample = d3.shuffle(data.slice()).slice(0, 1000) // a random sample of 1,000 rows, taken from a shuffled copy
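`d3.shuffle` randomizes the array you give it in place using the Fisher-Yates shuffle, so shuffling `data` directly would also reorder it for every other cell. Here is a plain-JavaScript sketch of the same idea that works on a copy and then takes the first `n` items as a sample. The names `shuffledCopy` and `sampleOf` are hypothetical.

```javascript
// Fisher-Yates shuffle on a copy; the input array is left untouched.
function shuffledCopy(items, random = Math.random) {
  const copy = items.slice();
  for (let i = copy.length - 1; i > 0; i--) {
    // Pick a random index from 0..i and swap it into position i.
    const j = Math.floor(random() * (i + 1));
    [copy[i], copy[j]] = [copy[j], copy[i]];
  }
  return copy;
}

// Random sample without replacement: shuffle, then keep the first n items.
function sampleOf(items, n) {
  return shuffledCopy(items).slice(0, n);
}

const toyNumbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
const toySample = sampleOf(toyNumbers, 3);
console.log(toySample.length); // 3
```

Because the order is random, the sample will differ each time the cell runs.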