**Sample datasets**
_Such datasets are very common in the data science community._
- [Iris Flower dataset](https://en.wikipedia.org/wiki/Iris_flower_data_set) ([csv](https://raw.githubusercontent.com/uiuc-cse/data-fa14/gh-pages/data/iris.csv))
<pre>
sepal_length,sepal_width,petal_length,petal_width,species
5.1,3.5,1.4,0.2,setosa
4.9,3,1.4,0.2,setosa
4.7,3.2,1.3,0.2,setosa
4.6,3.1,1.5,0.2,setosa
5,3.6,1.4,0.2,setosa
</pre>
- [Cars](https://ai.stanford.edu/~jkrause/cars/car_dataset.html) ([csv](https://gist.githubusercontent.com/noamross/e5d3e859aa0c794be10b/raw/b999fb4425b54c63cab088c0ce2c0d6ce961a563/cars.csv))
<pre>
"","mpg","cyl","disp","hp","drat","wt","qsec","vs","am","gear","carb"
"Mazda RX4",21,6,160,110,3.9,2.62,16.46,0,1,4,4
"Mazda RX4 Wag",21,6,160,110,3.9,2.875,17.02,0,1,4,4
"Datsun 710",22.8,4,108,93,3.85,2.32,18.61,1,1,4,1
"Hornet 4 Drive",21.4,6,258,110,3.08,3.215,19.44,1,0,3,1
"Hornet Sportabout",18.7,8,360,175,3.15,3.44,17.02,0,0,3,2
"Valiant",18.1,6,225,105,2.76,3.46,20.22,1,0,3,1
"Duster 360",14.3,8,360,245,3.21,3.57,15.84,0,0,3,4
</pre>
- Individuals (mock data generated with [Mockaroo](https://www.mockaroo.com/))
<pre>
id,first_name,last_name,email,gender,ip_address
1,Gale,Bernardini,gbernardini0@flickr.com,Female,236.165.167.229
2,Ravid,Magnar,rmagnar1@indiegogo.com,Male,127.20.137.234
3,Courtney,Simcox,csimcox2@hp.com,Male,25.250.201.67
4,Farlay,Killeley,fkilleley3@behance.net,Male,211.236.251.254
5,Ambros,Godier,agodier4@i2i.jp,Male,198.226.197.211
6,Melicent,Ahren,mahren5@thetimes.co.uk,Female,128.171.235.98
7,Freedman,Paullin,fpaullin6@posterous.com,Male,10.133.34.122
8,Jabez,Jonsson,jjonsson7@comsenz.com,Male,173.77.112.108
</pre>
- [Stocks data](https://raw.githubusercontent.com/LyonDataViz/MOS5.5-Dataviz/master/data/stocks.csv)
<pre>
symbol,date,price
MSFT,Jan 2000,39.81
MSFT,Feb 2000,36.35
MSFT,Mar 2000,43.22
MSFT,Apr 2000,28.37
MSFT,May 2000,25.45
MSFT,Jun 2000,32.54
MSFT,Jul 2000,28.4
MSFT,Aug 2000,28.4
MSFT,Sep 2000,24.53
</pre>
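
Samples like these usually need light type conversion before they can be plotted. The following is a minimal sketch, assuming Python and only its standard library, that parses the first rows of the stocks sample; the helper name `load_stocks` and the inlined `STOCKS_CSV` string are illustrative and not part of the dataset or any library.

```python
import csv
import io
from datetime import datetime

# First rows of the stocks sample shown above, inlined so the sketch runs offline.
STOCKS_CSV = """symbol,date,price
MSFT,Jan 2000,39.81
MSFT,Feb 2000,36.35
MSFT,Mar 2000,43.22
MSFT,Apr 2000,28.37
"""


def load_stocks(text):
    """Parse CSV text into dicts with a typed `date` and `price` (hypothetical helper)."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        rows.append({
            "symbol": row["symbol"],
            # "%b %Y" matches strings such as "Jan 2000".
            # Assumption: %b is locale-dependent, so English month names are expected.
            "date": datetime.strptime(row["date"], "%b %Y").date(),
            "price": float(row["price"]),
        })
    return rows


stocks = load_stocks(STOCKS_CSV)
prices = [r["price"] for r in stocks]
print(len(stocks), min(prices), max(prices))
```

Converting `date` and `price` up front means later sorting and scaling operate on real dates and numbers rather than on strings.
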
**What not to do**
- Stick with the same dataset for too long: the visualization may become too specific to it
- Forget to explore the actual dataset being used to find interesting patterns, properties, semantics, etc.
- Note: a generated dataset can then be aggregated, filtered, reduced, etc. like a regular dataset
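
To illustrate the last point, here is a hedged sketch of aggregating, filtering, and reducing the first rows of the generated individuals sample. It assumes Python with the standard library only; the variable names (`PEOPLE_CSV`, `by_gender`, `dot_com`, `slim`) are made up for this example.

```python
import csv
import io
from collections import Counter

# First rows of the generated individuals sample, inlined for a self-contained run.
PEOPLE_CSV = """id,first_name,last_name,email,gender,ip_address
1,Gale,Bernardini,gbernardini0@flickr.com,Female,236.165.167.229
2,Ravid,Magnar,rmagnar1@indiegogo.com,Male,127.20.137.234
3,Courtney,Simcox,csimcox2@hp.com,Male,25.250.201.67
4,Farlay,Killeley,fkilleley3@behance.net,Male,211.236.251.254
"""

people = list(csv.DictReader(io.StringIO(PEOPLE_CSV)))

# Aggregate: count rows per gender.
by_gender = Counter(p["gender"] for p in people)

# Filter: keep only rows whose email address ends in ".com".
dot_com = [p for p in people if p["email"].endswith(".com")]

# Reduce: project each row onto a subset of columns, converting id to int.
slim = [{"id": int(p["id"]), "gender": p["gender"]} for p in people]

print(dict(by_gender), len(dot_com), slim[0])
```

The same three operations apply unchanged to a real dataset, which is what makes generated data a convenient stand-in while prototyping.
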
**Other**
- https://www.kelp.nyc/
- https://tuftsvalt.github.io/snowcat/
- https://amnesia.openaire.eu/
- https://github.com/jiananlu/faked_csv
- API testing https://reqres.in/