R

The First SatRdays Conference in New Zealand

The first satRdays conference in New Zealand was held in Auckland on 22 February 2020. We had an amazing day, pulled together by a great group of volunteers, and we hope to hold this event annually, perhaps in other cities around New Zealand.

Awesome organisers and volunteers. Thanks team!! Jonathan Ng, Wai Loon Tham, @Kim_Fitter @MeganGuidry_ @hackity_pat @snowflaksmasher Izzy Johnson, @earowang @selinthefirst @am_innocenter #rstats #satRdays #satRday pic.twitter.com/msTmii5UxJ (SatRday New Zealand, @SatRday_NZ, February 22, 2020)

This blog covers:

Should You Swim in the Ruapehu Crater Lake?

Contents: Introduction, Data, Exploratory Data Analysis, Data Visualisation, Conclusions, References

Introduction: Mount Ruapehu is an active volcano on the North Island that is skied in winter and walked in summer. This analysis looks at whether anyone should swim in the crater lake at the top of the mountain.

Data: We can access time series of volcano field temperature observations from GeoNet.

Which Countries have Large Scale Fishing Vessels?

Contents: Introduction, Data, Data Summary, Exploratory Data Analysis, Data Cleaning, Data Visualisation, Conclusions, References

Introduction: I was inspired to analyse and visualise global fishing open data by the story produced by Jon Olav Eikenes. This is also a sequel to my previous shark analysis post Is Swimming with Sharks Dangerous?, digging further into the potential reduction in shark numbers due to commercial fishing.

    # Load packages
    library(tidyverse)
    library(sf)
    library(rnaturalearth)
    library(biscale)
    library(cowplot)

Data: We will use the Global Fishing Watch Vessel Identity open data available with the Vessels metadata available under the Creative Commons Attribution-ShareAlike 4.

Is Swimming with Sharks Dangerous?

Contents: Introduction, Data, Data Summary, Exploratory Data Analysis, Data Cleaning, Data Visualisation, Conclusions, References

Introduction: We recently went swimming with sharks with a marine biologist. During the dive brief on how to swim with sharks, we asked the question “Is swimming with sharks dangerous?” This is a spatial analysis of historical shark attacks.

    # Load packages
    library(tidyverse)
    library(sf)
    library(rnaturalearth)

Data: We will use the International Shark Attack File (ISAF) global shark attack CSV dataset from 1580 until July 26, 2018.

Spatial Bike Network Analysis in Auckland

Contents: Introduction, Data Summary, Exploratory Data Analysis, Data Cleaning, Data Visualisation, Conclusions, References

Introduction: The Geocomputation with R book has a great example of transport analysis in Bristol. Given the flurry of bike route construction in Auckland, I decided this was a chance to get to know OpenStreetMap (OSM) and its data better. This is a bicycle network analysis of Auckland using open source spatial and bike counter data for 2018.

Changing Favicon in the Blackburn Theme

Personalised logos: The new Data Visualisation Society has created personalised logos; see the behind-the-scenes post by Amy Cesal. Each logo is a visual representation of three key skill areas: data, visualisation and society. See below the key to the logo: There is also a timezone overlay in grey.

My personalised logo: This is my personalised logo:

Purple: I have a background in project management and I am involved in community group organisation, represented by the purple triangle.

Interactive Learning Roadmap - Part 2

Contents: Introduction, Data, Fundamentals of Data Visualisation, Final Visualisations, Conclusions, References

Introduction: This post creates and reviews visualisation objects in light of the Fundamentals of Data Visualization book: "Data visualization is part art and part science. The challenge is to get the art right without getting the science wrong and vice versa." Learning R is a bit of a journey over time, so here goes Part 2 of an interactive visualisation project using tidyverse and ggplot-compatible packages.

Interactive Learning Roadmap - Part 1

Contents: Introduction, Data, Exploratory Data Analysis, First Roadmap, Conclusions, References

Introduction: Recently one of the problems I was trying to solve required matrix algebra. This falls under the umbrella of need-to-know R basics, so I realised I needed a way to track and refresh my R skills, recognise gaps and ultimately create a reference cheat-sheet.
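As a quick refresher on the kind of matrix algebra basics this excerpt refers to, here is a minimal base R sketch. The matrix `A` and vector `b` are made-up values chosen for illustration and do not come from the post.

```r
# Illustrative values only (not taken from the post)
A <- matrix(c(2, 1,
              1, 3), nrow = 2, byrow = TRUE)
b <- c(3, 5)

At   <- t(A)         # transpose
AA   <- A %*% A      # matrix product (note: A * A would be element-wise)
Ainv <- solve(A)     # matrix inverse
x    <- solve(A, b)  # solves A %*% x = b without forming the inverse

# Check that the solution reproduces b
all.equal(as.vector(A %*% x), b)
```

Calling `solve(A, b)` directly is usually preferable to `solve(A) %*% b`, since it avoids computing the full inverse.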

Why Blog?

One of the things to reconcile when working in a field is that what interests you and what you need to create or provide commercially as a service are not always the same thing. Doing only one is like throwing one ball: quite doable but monotonous.

Personal projects: Publishing personal side projects as blog posts is great practice and supports ongoing personal learning. I use these to improve the efficiency of my end-to-end workflow and also to try out different tools, packages and functionality.

Celestial Maps

Contents: Introduction, Import Data, Data Summary, Exploratory Data Analysis, Data Cleaning, Visualisations, Conclusions, References

Introduction: While touring the New Zealand South Island with friends in January 2019, one of our stops was the University of Canterbury Mt John Observatory. Since this area is a dark sky reserve, with a clear night and a new moon the midnight sky was lit up with stars and the Milky Way.

Animation of Basketball Shots

Contents: Introduction, Data Overview, Load Data, Data Cleaning, Exploratory Data Analysis, Animation, Conclusions, References

Introduction: I went to my first basketball game on 27 January 2019 to watch the Skycity Breakers versus the Brisbane Bullets. The final score was Breakers 109-96 Bullets. Since it was a close game, I wondered whether the shots attempted got closer to the hoop under pressure near the end of the game, assuming that shorter range shots have a better success rate.

Who Would Have Thought The Slug Was The Bug

This post looks at the folder structure of a blogdown website, to get a basic understanding of how it works and to remind myself what to do the next time I face a similar publishing issue!

Publishing issue: I have been updating my blogdown website with posts, and then I hit an issue. The about link would not update. I tried multiple amendments and multiple commits to no avail. I then re-created the whole website and reinstalled the theme.

LiDAR Analysis of Kaikoura NZ

Contents: Introduction, Data Overview, Load Data, Data Manipulation, Data Exploration, Data Visualisation, Conclusions

Introduction: The Kaikoura earthquake happened just after midnight on 14 November 2016. After reading the article Astonishing Nasa photos show Kaikoura land raised by earthquake, I thought it would be interesting to look at what open data is available to view the land changes described in the article. This analysis has the following objectives:

Auckland Bike Counter Analysis

Contents: Introduction, Data, Data Cleaning, Visualisations, Conclusions

Introduction: The motivations for cycling around town can be fitness, economic or environmental. However, in Auckland the weather and the 53-odd volcanoes could be deterrents to would-be cyclists. Auckland Transport (AT) publishes daily and monthly bike count data from counters around Auckland, now available under a CC BY 4.0 license. Kia ora, us again!

Changing Theme with Blogdown

This blog was using the creative portfolio theme until the number of posts grew. The tiled effect became busy and the website needed a different format, including an archive list. I have changed the theme to Blackburn following this post by Mike Treglia. So far so good.

Credit Risk Modeling and Scorecard Example

Contents: Introduction, Load the data, Summary of data, Clean Data, Exploratory Data Analysis, Feature Engineering, Training, Create Scorecard, Conclusions

Introduction: Credit risk modeling predicts whether a customer or applicant may or may not default on a loan. These models include predictor variables that are categorical or numeric. One of the outputs of the modeling process is a credit scorecard with attributes used to allocate scores. The objectives of this post are as follows:
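Separately from the post's own objectives, the idea of a scorecard that allocates scores can be sketched with one common scaling convention: score = offset + factor × ln(odds), where the factor is set by the "points to double the odds" (PDO). This is an assumption for illustration and the post may scale its scorecard differently. The parameter values (`base_score`, `base_odds`, `pdo`) and the helper `score_from_pd` are hypothetical.

```r
# Hypothetical scaling parameters, chosen only for illustration
base_score <- 600   # score assigned at the base odds
base_odds  <- 50    # good:bad odds of 50:1 at the base score
pdo        <- 20    # points needed to double the odds

points_factor <- pdo / log(2)
points_offset <- base_score - points_factor * log(base_odds)

# Convert a predicted probability of default into a score
# (hypothetical helper, not from the post)
score_from_pd <- function(p_default) {
  odds_good <- (1 - p_default) / p_default
  points_offset + points_factor * log(odds_good)
}

score_at_base    <- score_from_pd(1 / 51)    # odds of 50:1
score_at_doubled <- score_from_pd(1 / 101)   # odds of 100:1, one doubling
c(score_at_base, score_at_doubled)
```

With these settings, each doubling of the good:bad odds adds `pdo` points, so a lower predicted probability of default maps to a higher score.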

Tutorial on Pivot Tables and other Excel things you can also do in R - Witch Trials data

Introduction: This tutorial was inspired by the R Curious tutorial at useR! 2018, and follows on thematically from the R Curious workshop notes as an extension. It is aimed at those with a background in Excel who would also like to use R for data analysis. It takes the things you would normally do in Excel and shows an equivalent function in R. This introductory level tutorial assumes you have already installed R and RStudio and have had a brief introduction to the R basics and R Markdown.
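To give a flavour of an Excel pivot table done in R, here is a minimal base R sketch using `xtabs()`. The `sales` data frame is made up for illustration and is not the witch trials data used in the tutorial, which may also use different functions.

```r
# Made-up example data (not the witch trials dataset)
sales <- data.frame(
  region  = c("North", "North", "South", "South", "South"),
  product = c("A", "B", "A", "A", "B"),
  amount  = c(10, 20, 5, 15, 30)
)

# Rows = region, columns = product, values = sum of amount
pivot <- xtabs(amount ~ region + product, data = sales)
pivot

# Add row and column totals, similar to Excel's "Grand Total"
pivot_totals <- addmargins(pivot)
pivot_totals
```

The formula `amount ~ region + product` plays the role of dragging `amount` into Values, `region` into Rows and `product` into Columns.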

Summer of Tech Event and Intern Analysis

Contents: Introduction, Data, Clean Data, Success Factors, Specific Events Resulting in Internships, Social Impact, Conclusion

Introduction: We were looking for real data from a non-profit organisation for the R-Ladies Auckland Dataviz meetup. We were approached by Summer of Tech (SoT), who volunteered their data for the group to explore. SoT is a non-profit organisation that connects employers with students and graduates for paid work experience and graduate jobs.

Named Entity Recognition using Python spaCy in R

Recently I have been working on a Natural Language Processing (NLP) client project. This field appears to use Python packages extensively, so I used the opportunity to go on an NLP journey in Python, starting with a Jupyter notebook. The Python packages included here are the research tool NLTK, gensim, and then the more recent spaCy. The purpose of this post is the next step in the journey: to produce a pipeline for the NLP areas of text mining and Named Entity Recognition (NER) using the Python spaCy NLP toolkit, in R.

Crepes made with Git

Version control of code seems similar to creating, updating and sharing recipes. Since we were trying out crepe recipes at home, I decided to use this personal connection to work through a "crepes made with Git" example to validate my understanding of Git from the command line. Inspired by Reflections on 4 months of GitHub: my advice to beginners by Suzan Baert, I also started a personal Git cheatsheet, which will be incrementally updated with gems from future projects.

Excel to Interactive Census Map

I have been experimenting with spatial data and looking at ways to combine data objects to get different views and perspectives. This post has the following objectives:

- Import and load data direct to R without manually saving files to a local directory, so that the end-to-end process is reproducible.
- Use vectors to efficiently clean the Excel sheet, reducing manual steps where possible so that the process could be leveraged to clean other Excel sheets in the Census data.

Simple Blockchain Example in R

This R Markdown post was inspired by the R-bloggers post Building your own blockchain in R. Blockchain technology, types such as Bitcoin and Ethereum, and the associated terminology are ubiquitous in the media at the moment. The objective here is to learn how a generic blockchain works and understand some of the terminology through implementing a simple example. This generic example could then be extended to various types of blockchains such as distributed ledgers, smart contracts or cryptocurrencies with network servers and appropriate privacy and security settings.
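To make the idea of a generic blockchain concrete, here is a minimal sketch of blocks chained by hashes. It is not the post's implementation: it assumes the `digest` package is installed, it leaves out proof-of-work entirely, and the helper names (`hash_block`, `new_block`, `is_valid`) are hypothetical.

```r
library(digest)  # assumed to be installed; digest() provides SHA-256 hashing

# Hash the fields of a block into a single SHA-256 string
hash_block <- function(index, timestamp, data, previous_hash) {
  digest(paste(index, timestamp, data, previous_hash, sep = "|"),
         algo = "sha256", serialize = FALSE)
}

# Build a block that stores its own hash and the previous block's hash
new_block <- function(index, data, previous_hash) {
  timestamp <- format(Sys.time(), "%Y-%m-%d %H:%M:%S")
  list(index = index, timestamp = timestamp, data = data,
       previous_hash = previous_hash,
       hash = hash_block(index, timestamp, data, previous_hash))
}

# Genesis block, then two blocks chained to it by hash
chain <- list(new_block(1, "Genesis", "0"))
for (txt in c("First entry", "Second entry")) {
  last <- chain[[length(chain)]]
  chain[[length(chain) + 1]] <- new_block(last$index + 1, txt, last$hash)
}

# A chain is valid if every stored hash matches the block's contents
# and every block points at the hash of the block before it
is_valid <- function(chain) {
  for (i in seq_along(chain)) {
    b <- chain[[i]]
    recomputed <- hash_block(b$index, b$timestamp, b$data, b$previous_hash)
    if (b$hash != recomputed) return(FALSE)
    if (i > 1 && b$previous_hash != chain[[i - 1]]$hash) return(FALSE)
  }
  TRUE
}

is_valid(chain)
```

Changing the `data` of any earlier block makes its stored hash stale, so `is_valid()` returns `FALSE`. That tamper-evidence is the core property the chaining provides.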

Hello Blog

Welcome to my personal blog and portfolio of my data science journey. Until now I have been publishing R code to GitHub and HTML to RPubs. I also have a WordPress professional website to reach out to both technical and non-technical audiences with summaries of projects. I realised I needed another, more personalised medium to publish the detailed code and HTML from projects, so I created this blog.

Creating A Website In Hugo Creative Portfolio Theme

There are many tutorials, as well as the great bookdown book blogdown: Creating Websites with R Markdown, with detailed steps on how to create a website using blogdown and Hugo. I used the fantastic tutorial by Tyler Clavelle to create this blog. This website uses the blogdown R package and the static site generator Hugo. I chose the Hugo creative portfolio theme, with a side bar and tiles layout. I use Windows and a GitHub repo to deploy the website.