The first satRdays conference was held in Auckland, New Zealand on 22 February 2020. We had an amazing day pulled together by a great group of volunteers and we hope that we can hold this event annually, and perhaps in cities around New Zealand.
Awesome organisers and volunteers.
Thanks team !!
Jonathan Ng, Wai Loon Tham, @Kim_Fitter @MeganGuidry_ @hackity_pat @snowflaksmasher Izzy Johnson, @earowang @selinthefirst @am_innocenter #rstats #satRdays #satRday pic.twitter.com/msTmii5UxJ
SatRday New Zealand (@SatRday_NZ), February 22, 2020
This blog covers:
Introduction
Data
Exploratory Data Analysis
Data Visualisation
Conclusions
References
Introduction
Mount Ruapehu is an active volcano on the North Island which is skied in winter and walked in summer.
This analysis looks at whether anyone should swim in the crater lake at the top of the mountain.
Data
We can access time series of volcano temperature observations from GeoNet.
Introduction
Data
Data Summary
Exploratory Data Analysis
Data Cleaning
Data Visualisation
Conclusions
References
Introduction
I was inspired to analyse and visualise global fishing open data by the story produced by Jon Olav Eikenes.
This is also a sequel to my previous shark analysis post Is Swimming with Sharks Dangerous?, digging further into the potential reduction in shark numbers due to commercial fishing.
# Load packages
library(tidyverse)
library(sf)
library(rnaturalearth)
library(biscale)
library(cowplot)
Data
We will use the Global Fishing Watch Vessel Identity open data, together with the Vessels metadata, available under the Creative Commons Attribution-ShareAlike 4.
Introduction
Data
Data Summary
Exploratory Data Analysis
Data Cleaning
Data Visualisation
Conclusions
References
Introduction
We recently went swimming with sharks with a marine biologist. During the dive brief on how to swim with sharks, we asked the question “Is swimming with sharks dangerous?”
This is a spatial analysis of historical shark attacks.
# Load packages
library(tidyverse)
library(sf)
library(rnaturalearth)
Data
We will use the International Shark Attack File (ISAF) global shark attack CSV dataset, covering 1580 until July 26, 2018.
Introduction
Data Summary
Exploratory Data Analysis
Data Cleaning
Data Visualisation
Conclusions
References
Introduction
The Geocomputation with R book has a great example of transport analysis in Bristol.
I decided this was a chance to get to know OpenStreetMap (OSM) and its data better, given the flurry of bike route construction.
This is a bicycle network analysis of Auckland using open source spatial and bike counter data for 2018.
Personalised logos
The new Data Visualisation Society has created personalised logos; see the behind-the-scenes post by Amy Cesal. Each logo is a visual representation of three key skill areas: data, visualisation and society.
See below the key to the logo:
There is also a timezone overlay in grey.
My personalised logo
This is my personalised logo:
Purple: I have a background in project management and I am involved in community group organisation, represented by the purple triangle.
Introduction
Data
Fundamentals of Data Visualisation
Final Visualisations
Conclusions
References
Introduction
This post creates and reviews visualisation objects in light of the book Fundamentals of Data Visualization:
Data visualization is part art and part science. The challenge is to get the art right without getting the science wrong and vice versa.
Learning R is a bit of a journey over time, so here goes Part 2 of an interactive visualisation project using tidyverse and ggplot-compatible packages.
Introduction
Data
Exploratory Data Analysis
First Roadmap
Conclusions
References
Introduction
Recently one of the problems I was trying to solve required matrix algebra.
This falls under the umbrella of need-to-know R basics, so I realised I needed a way to track and refresh my R skills, recognise gaps and ultimately create a reference cheat-sheet.
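As a small refresher of the kind of matrix algebra mentioned above, here is a minimal base R sketch. The matrix `A` and vector `b` are made-up illustrative values, not taken from the original problem.

```r
# Illustrative 2 x 2 matrix and right-hand side (made-up values).
# R fills matrices column by column.
A <- matrix(c(2, 1, 1, 3), nrow = 2)
b <- c(3, 5)

A_t   <- t(A)        # transpose
A_sq  <- A %*% A     # matrix product (note: * would be element-wise)
A_inv <- solve(A)    # matrix inverse
x     <- solve(A, b) # solve A x = b directly, without forming the inverse

# Check the solution by multiplying back
residual <- as.vector(A %*% x) - b
```

Using `solve(A, b)` rather than `solve(A) %*% b` is usually preferable because it avoids computing the inverse explicitly.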
One of the things to reconcile with working in a field is that what interests you and what you need to create or provide as a service commercially is not always the same thing.
If doing only one then it’s like throwing one ball - it is quite doable but monotonous.
Personal projects
Publishing personal side projects as blogs is great practice and supports ongoing personal learning. I use these to improve the efficiency of my end-to-end workflow and also to try out different tools, packages and functionality.
Introduction
Import Data
Data Summary
Exploratory Data Analysis
Data Cleaning
Visualisations
Conclusions
References
Introduction
While touring the New Zealand South Island with friends in January 2019, one of our stops was the University of Canterbury Mt John Observatory. Since this area is a dark sky reserve, with a clear night and a new moon the midnight sky was lit up with stars and the Milky Way.
Introduction
Data Overview
Load Data
Data Cleaning
Exploratory Data Analysis
Animation
Conclusions
References
Introduction
I went to my first basketball game on 27 January 2019 to watch the Skycity Breakers versus Brisbane Bullets. The final score was Breakers 109-96 Bullets.
Since it was a close game, I wondered whether the shots attempted got closer to the hoop under pressure near the end of the game, assuming that shorter-range shots have a better success rate.
This post covers the folder structure of a blogdown website, to get a basic understanding of how it works and also to remind myself next time I face a similar publishing issue!
Publishing issue
I have been updating my blogdown website with posts, and then I hit an issue. The about link would not update. I tried multiple amendments and multiple commits to no avail. I then re-created the whole website and reinstalled the theme.
Introduction
Data Overview
Load Data
Data Manipulation
Data Exploration
Data Visualisation
Conclusions
Introduction
The Kaikoura earthquake happened just after midnight on 14 November 2016.
After reading the article Astonishing Nasa photos show Kaikoura land raised by earthquake, I thought it would be interesting to look at what open data is available to view the land changes described in the article.
This analysis has the following objectives:
Introduction
Data
Data Cleaning
Visualisations
Conclusions
Introduction
People cycle around town for fitness, economic or environmental reasons. However, in Auckland the weather and the 53-odd volcanoes could be deterrents to would-be cyclists.
Auckland Transport (AT) publishes daily and monthly bike count data from counters around Auckland, now available under a CC BY 4.0 license.
Kia ora, us again!
This blog was using the creative portfolio theme until the number of posts grew. The tiled effect became busy and the website needed a different format, including an archive list.
I have changed the theme to Blackburn following this post by Mike Treglia.
So far so good….
Introduction
Load the data
Summary of data
Clean Data
Exploratory Data Analysis
Feature Engineering
Training
Create Scorecard
Conclusions
Introduction
Credit Risk modeling predicts whether a customer or applicant may or may not default on a loan. These models include predictor variables that are categorical or numeric. One of the outputs in the modeling process is a credit scorecard with attributes to allocate scores.
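To make the idea of allocating scores concrete, here is a minimal sketch of one common scaling convention, "points to double the odds" (PDO). It is an assumption that this matches the approach in the post; the function name `scale_score` and the default values (600 points at 50:1 good-to-bad odds, 20 points to double the odds) are purely illustrative.

```r
# Convert good:bad odds into a score using PDO scaling.
# All defaults are hypothetical, illustrative values.
scale_score <- function(odds, base_score = 600, base_odds = 50, pdo = 20) {
  factor <- pdo / log(2)                          # points added each time the odds double
  offset <- base_score - factor * log(base_odds)  # anchors base_odds at base_score
  offset + factor * log(odds)
}

scores <- scale_score(c(25, 50, 100))
```

With these settings, each doubling of the odds adds `pdo` points, so the three scores are spaced 20 points apart.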
The objectives of this post are as follows:
Introduction
This tutorial was inspired by the R Curious tutorial at useR! 2018, and follows on thematically from the R Curious workshop notes as an extension.
It is aimed at those with a background in Excel, who would also like to use R for data analysis. This tutorial compares the things you would normally do in Excel, but with an equivalent function in R.
This introductory-level tutorial assumes you have already installed R and RStudio and had a brief introduction to the R basics and R Markdown.
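As a flavour of the Excel-to-R comparison, here is a minimal base R sketch. The `sales` and `managers` tables are made-up illustrative data, and the Excel formulas in the comments are rough equivalents rather than exact translations.

```r
# Made-up illustrative data, as you might have in two Excel sheets
sales <- data.frame(
  region = c("North", "South", "North", "South"),
  units  = c(10, 5, 7, 3)
)
managers <- data.frame(
  region  = c("North", "South"),
  manager = c("A", "B")
)

total_units <- sum(sales$units)                            # =SUM(B2:B5)
mean_units  <- mean(sales$units)                           # =AVERAGE(B2:B5)
north_units <- sum(sales$units[sales$region == "North"])   # =SUMIF(A2:A5, "North", B2:B5)

# Roughly a pivot table: total units per region
by_region <- aggregate(units ~ region, data = sales, FUN = sum)

# Roughly a VLOOKUP: bring the manager column across by matching on region
sales_with_manager <- merge(sales, managers, by = "region")
```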
Introduction
Data
Clean Data
Success Factors
Specific Events Resulting in Internships
Social Impact
Conclusion
Introduction
We were looking for real data from a non-profit organisation for the R-Ladies Auckland Dataviz meetup. We were approached by Summer of Tech (SoT), who volunteered their data for the group to explore.
SoT is a non-profit organisation that connects employers with students and graduates for paid work experience and graduate jobs.
Recently I have been working on a Natural Language Processing (NLP) client project. This field appears to use Python packages extensively, so I used the opportunity to go on an NLP journey in Python, starting with a Jupyter notebook. The Python packages included here are the research tool NLTK, gensim, and the more recent spaCy.
The purpose of this post is the next step in the journey to produce a pipeline for the NLP areas of text mining and Named Entity Recognition (NER) using the Python spaCy NLP Toolkit, in R.
Version control of code seems similar to creating, updating and sharing recipes. Since we were trying out crepe recipes at home, I decided to use this personal connection to work through a "crepes made with Git" example to validate my understanding of Git from the command line.
Inspired by Reflections on 4 months of GitHub: my advice to beginners by Suzan Baert, I also started a personal Git cheatsheet, which will be incrementally updated with gems from future projects.
I have been experimenting with spatial data and looking at ways to combine data objects to get different views and perspectives. This post has the following objectives:
Import and load data direct to R without manually saving files to a local directory, so that the end to end process is reproducible.
Use vectors to efficiently clean the Excel sheet, reducing manual steps where possible so that the process could be leveraged to clean other Excel sheets in the Census data.
This R markdown post was inspired by the R-bloggers post Building your own blockchain in R.
Blockchain technology, implementations such as Bitcoin and Ethereum, and the associated terminology are ubiquitous in the media at the moment.
The objective here is to learn how a generic blockchain works and understand some of the terminology through implementing a simple example. This generic example could then be extended to various types of blockchains such as distributed ledgers, smart contracts or cryptocurrencies with network servers and appropriate privacy and security settings.
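To illustrate the core idea of a generic blockchain, each block storing the hash of the previous block, here is a minimal base R sketch. It is not the code from the R-bloggers post. In particular, `toy_hash` is a deliberately simple, non-cryptographic stand-in so that the example runs without extra packages; a real chain would use a cryptographic hash such as SHA-256. All function names here are hypothetical.

```r
# Toy, NON-cryptographic hash (stand-in for SHA-256, for illustration only)
toy_hash <- function(text) {
  h <- 7
  for (code in utf8ToInt(text)) {
    h <- (h * 31 + code) %% 1000003
  }
  sprintf("%07d", as.integer(h))
}

# Hash of a block's contents, including the previous block's hash
block_hash <- function(block) {
  toy_hash(paste(block$index, block$data, block$previous_hash, sep = "|"))
}

# Create a block that points back to the previous block via its hash
new_block <- function(index, data, previous_hash) {
  block <- list(index = index, data = data, previous_hash = previous_hash)
  block$hash <- block_hash(block)
  block
}

# A chain is valid if every stored hash matches the block's contents
# and every block points at the hash of the block before it
chain_is_valid <- function(chain) {
  for (i in seq_along(chain)) {
    block <- chain[[i]]
    if (block$hash != block_hash(block)) return(FALSE)
    if (i > 1 && block$previous_hash != chain[[i - 1]]$hash) return(FALSE)
  }
  TRUE
}

chain <- list(new_block(1, "genesis", "0"))
chain[[2]] <- new_block(2, "second block", chain[[1]]$hash)
chain[[3]] <- new_block(3, "third block", chain[[2]]$hash)
```

Calling `chain_is_valid(chain)` checks the links; editing the `data` of an earlier block without recomputing the hashes should make the check fail, which is the tamper-evidence property the terminology refers to.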
Welcome to my personal blog and portfolio of my data science journey.
Until now I have been publishing R code to GitHub and HTML to RPubs. I also have a WordPress professional website to reach out to both technical and non-technical audiences with summaries of projects. I realised I needed another, more personalised medium to publish the detailed code and HTML from projects, so I created this blog.
There are many tutorials, as well as the great book blogdown: Creating Websites with R Markdown, with detailed steps on how to create a website using blogdown and Hugo.
I used the fantastic tutorial by Tyler Clavelle to create this blog. This website uses the blogdown R package and the static site generator Hugo. I chose the Hugo creative portfolio theme, with a side bar and tiles layout. I use Windows and a GitHub repo to deploy the website.