# Background

In February 2020, Forrester predicted that, as a result of COVID-19, demand for hardware including computer and communications equipment would be quite weak due to…

# Rejection Sampling in R

## How to generate random numbers using rejection sampling

Generating random numbers from a particular distribution is a process that statistical software automates to a large extent.

For instance, if one wants to generate 100 random numbers that belong to a normal distribution in R, it is as simple as executing:

`rnorm(100)`

However, how does this process actually work “under the hood”? How can an algorithm know whether a random number belongs to a particular distribution or not?

One answer is rejection sampling.

Rejection sampling is a means of generating random numbers that belong to a particular distribution.

# How Rejection Sampling Works

A Cartesian graph consists of x and y-axes across a defined…

# Analysing Posterior Predictive Distributions with PyMC3

## Using PyMC3 to model posterior distributions

The primary purpose of Bayesian analysis is to model data given uncertainty.

Since one cannot access all the data about a population to determine its precise distribution, assumptions about that distribution are often made.

For instance, I might make an assumption regarding the mean height of a population in a particular country. This is a prior distribution, or a distribution that is founded on prior beliefs before looking at data that could prove or disprove that belief.

Upon analysing a new set of data (a likelihood function), prior beliefs and the likelihood function can then be combined to form the…
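The idea of combining a prior belief with newly observed data can be illustrated with a small closed-form case. The sketch below is not PyMC3 and does not use sampling: it assumes a normal prior on the mean height, normally distributed observations with a known standard deviation, and made-up height values. The function name `normal_posterior` and every number are hypothetical.

```python
def normal_posterior(prior_mean, prior_sd, data, data_sd):
    """Posterior mean and sd of a normal mean, assuming a known observation sd."""
    n = len(data)
    sample_mean = sum(data) / n
    prior_precision = 1.0 / prior_sd ** 2   # how strongly the prior is held
    data_precision = n / data_sd ** 2       # how much the data has to say
    post_precision = prior_precision + data_precision
    # Precision-weighted average of the prior mean and the sample mean
    post_mean = (prior_precision * prior_mean + data_precision * sample_mean) / post_precision
    post_sd = post_precision ** -0.5
    return post_mean, post_sd


# Made-up heights in centimetres, for illustration only
heights_cm = [172.0, 168.5, 181.0, 175.5, 169.0, 178.0]

post_mean, post_sd = normal_posterior(
    prior_mean=170.0, prior_sd=10.0, data=heights_cm, data_sd=7.0
)
print(post_mean, post_sd)
```

The posterior mean lands between the prior mean and the sample mean, and the posterior standard deviation is smaller than the prior's, which is the sense in which the data updates the prior belief.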

# Rejection Sampling with Python

## Examples using the Normal and Cauchy Distributions

Rejection sampling is a means of generating random numbers that belong to a particular distribution.

For instance, say one wishes to generate 1,000 random numbers that follow a normal distribution. In Python, with numpy imported as `np`, this takes a single call:

`np.random.randn(1000)`

However, how exactly does this process work? Upon generating random numbers in Python, how can an algorithm know whether a random number belongs to a particular distribution or not? This is where rejection sampling comes in.
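As a rough illustration of the idea, here is a minimal sketch of rejection sampling for a standard normal distribution using only the Python standard library. It is not how numpy generates its values internally. The uniform proposal, the truncation to the range [-5, 5], and the function names are all assumptions made for this example.

```python
import math
import random


def normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def rejection_sample_normal(n, lower=-5.0, upper=5.0, seed=None):
    """Draw n values from N(0, 1) truncated to [lower, upper] (assumed range)."""
    rng = random.Random(seed)
    ceiling = normal_pdf(0.0)  # the density peaks at x = 0
    samples = []
    while len(samples) < n:
        x = rng.uniform(lower, upper)  # candidate position on the x-axis
        y = rng.uniform(0.0, ceiling)  # random height below the ceiling
        if y <= normal_pdf(x):         # accept only points under the curve
            samples.append(x)
    return samples


samples = rejection_sample_normal(1000, seed=42)
print(len(samples), sum(samples) / len(samples))
```

Candidates that land above the density curve are thrown away, so the accepted x-values pile up where the density is high. A flat proposal over a wide range rejects most candidates, which is why practical implementations use tighter envelopes.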

# Rejection Sampling

There is a reason I provided an image of a darts board at the beginning…

# Analysing Power Law Distributions with R

## Using poweRlaw to test the power law hypothesis

A power law distribution (such as a Pareto distribution) describes the 80/20 rule that governs many phenomena around us.

For instance:

• 80% of a company’s sales often come from 20% of its customers
• 80% of a computer’s storage space is often taken up by 20% of the files
• 80% of the wealth in a country is owned by 20% of the people

These are just a few examples. While many believe that most datasets tend to follow a normal distribution, power law distributions are a lot more common than we realise. …
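The link between the 80/20 rule and the Pareto distribution can be checked with a short calculation. The sketch below uses the Lorenz curve of a Pareto (Type I) distribution with shape parameter alpha greater than 1, under which the top fraction q holds a share of q^(1 - 1/alpha) of the total. The function name `top_share` is hypothetical, and this is a standalone Python calculation rather than anything from the poweRlaw package.

```python
import math


def top_share(top_fraction, alpha):
    """Share of the total held by the top `top_fraction` under a Pareto(alpha), alpha > 1."""
    return top_fraction ** (1.0 - 1.0 / alpha)


# Shape parameter at which the top 20% hold exactly 80% of the total
alpha_80_20 = math.log(5) / math.log(4)  # roughly 1.16

share = top_share(0.20, alpha_80_20)
print(alpha_80_20, share)
```

Larger values of alpha give a thinner tail and a smaller share for the top 20%, so the 80/20 split corresponds to one specific, fairly heavy-tailed member of the Pareto family.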

# How Minesweeper Can Make Us Think Differently About Data

## We live in a world of uncertainty and imperfect information

I often like to play chess and minesweeper in my spare time (yes, don’t laugh).

Of these two games, I have always found minesweeper more difficult to understand, and the rules of play have always seemed very opaque.

However, the latter game more closely resembles how situations often unfold in the real world. Here is why that is relevant to data science.

# Perfect vs. Imperfect Information

Compare that to chess, where, regardless of playing ability, all players have perfect information at all times.

One can always see every piece on the board, and neither opponent possesses any informational advantage…

# Working With Time Series Using SQL

## Using SQL to manipulate time series data

Tools such as Python or R are most often used to conduct deep time series analysis.

However, knowledge of how to work with time series data using SQL is essential, particularly when working with very large datasets or data that is constantly being updated.

Here are some useful commands that can be invoked in SQL to better work with time series data within the data table itself.
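As a rough illustration of working with time series inside the database, the sketch below aggregates timestamped readings into daily averages. It uses Python's built-in `sqlite3` module as a stand-in for PostgreSQL, and the `weather` table, its columns, and its rows are all hypothetical. In PostgreSQL the grouping would more typically use `date_trunc('day', ...)` on a timestamp column instead of SQLite's `date()` function.

```python
import sqlite3

# In-memory SQLite database used as a stand-in; table and columns are hypothetical
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE weather (recorded_at TEXT, location TEXT, temperature REAL)"
)
conn.executemany(
    "INSERT INTO weather VALUES (?, ?, ?)",
    [
        ("2020-01-01 06:00:00", "A", 4.0),
        ("2020-01-01 18:00:00", "A", 8.0),
        ("2020-01-02 06:00:00", "A", 5.0),
        ("2020-01-02 18:00:00", "A", 7.0),
        ("2020-01-01 12:00:00", "B", 10.0),
    ],
)

# Collapse individual readings into one average per location per day
daily = conn.execute(
    """
    SELECT location, date(recorded_at) AS day, AVG(temperature) AS mean_temp
    FROM weather
    GROUP BY location, date(recorded_at)
    ORDER BY location, day
    """
).fetchall()
conn.close()

for row in daily:
    print(row)
```

Doing the aggregation in SQL means only the summarised rows leave the database, which matters when the underlying table is very large or constantly updated.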

# Background

In this example, we are going to work with weather data collected across a range of different times and locations.

The data types in the table of the PostgreSQL database are as below:

`…`

# Want To Get Good At Time Series Forecasting? Predict The Weather.

## Understanding the components of a time series

As someone who originally comes from an economics background, it might seem quite strange that I spend time building models that can predict weather patterns.

I have often questioned it myself, but there is a reason for it. Temperature patterns are among the easiest time series to forecast.

# Time Series Components

When a time series is decomposed (broken into its individual elements), it consists of the following components:

• Trend: The general direction of the time series over a significant period of time
• Seasonality: Patterns that repeat at regular intervals in a time series
• Random: Random fluctuations in…
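The components listed above can be made concrete with a synthetic series. The sketch below builds a series by adding a trend, a seasonal pattern with a period of 12, and random noise, then recovers the trend with a centred moving average. The additive structure, the period, and all values are assumptions chosen for illustration, not weather data.

```python
import math
import random

rng = random.Random(0)
period = 12  # assumed seasonal period, e.g. monthly data
n = 120

# Build the three components separately, then add them together
trend = [0.05 * t for t in range(n)]
seasonal = [2.0 * math.sin(2 * math.pi * t / period) for t in range(n)]
noise = [rng.gauss(0.0, 0.3) for _ in range(n)]
series = [trend[t] + seasonal[t] + noise[t] for t in range(n)]


def centred_moving_average(values, period):
    """Centred moving average for an even period; None where the window does not fit."""
    half = period // 2
    out = [None] * len(values)
    for t in range(half, len(values) - half):
        window = values[t - half : t + half + 1]
        # The two end points get half weight so the window spans one full period
        total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        out[t] = total / period
    return out


estimated_trend = centred_moving_average(series, period)
print(estimated_trend[60], trend[60])
```

Averaging over exactly one seasonal period cancels the seasonal pattern and dampens the noise, which leaves an estimate of the trend. Subtracting that estimate from the series would be the next step towards isolating the seasonal and random parts.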

# Hotel Revenue Forecasting: Predicting ADR Fluctuations with ARIMA

## Using ARIMA to predict average daily rates

Average daily rates (henceforth referred to as ADR) represent the average rate per day paid by a staying customer at a hotel.

This is an important metric for a hotel, as it represents the overall profitability of each customer.

In this example, average daily rates for each customer are averaged on a weekly basis and then forecast using an ARIMA model.

The below analysis is based on data from Antonio, Almeida and Nunes (2019): Hotel booking demand datasets.

# Data Manipulation

In this particular dataset, the year and week number for each customer (along with each customer’s recorded ADR value) are provided separately.
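One way to turn separately stored year and week numbers into a weekly ADR series is to group on the (year, week) pair and average within each group. The sketch below uses made-up rows rather than the actual dataset, and the variable names are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Made-up rows of (year, week number, ADR); not taken from the dataset
bookings = [
    (2015, 27, 75.0),
    (2015, 27, 85.0),
    (2015, 28, 90.0),
    (2015, 28, 110.0),
    (2015, 28, 100.0),
    (2016, 1, 60.0),
]

# Collect every ADR value under its (year, week) key
adr_by_week = defaultdict(list)
for year, week, adr in bookings:
    adr_by_week[(year, week)].append(adr)

# Average within each week, keeping the weeks in chronological order
weekly_adr = {key: mean(values) for key, values in sorted(adr_by_week.items())}

for key, value in weekly_adr.items():
    print(key, value)
```

Sorting on the (year, week) tuple keeps week 1 of a later year after week 28 of an earlier one, which a sort on week number alone would get wrong.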

# Predicting Electricity Consumption with XGBRegressor

## Time series analysis of kilowatt consumption patterns

In this example, XGBRegressor is used to predict kilowatt consumption patterns for the Dublin City Council Civic Offices, Ireland. The dataset in question is available from data.gov.ie.

# What is XGBRegressor?

Have you used XGBoost (Extreme Gradient Boosting) for classification tasks before? If so, you will be familiar with the workings of this model.

Essentially, a gradient boosting model works by adding predictors to an ensemble in a sequential fashion, with the new predictor being fit to the residual errors made by the previous predictor. …
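The sequential fitting described above can be sketched in plain Python. This is a toy version for squared-error loss, where each new predictor is a one-split decision stump fit to the current residuals. It is not the XGBoost implementation, which adds regularisation and other refinements. The function names, the learning rate, and the toy data are assumptions for illustration.

```python
def fit_stump(xs, residuals):
    """Find the single split on x that minimises squared error on the residuals."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # skip the maximum so both sides are non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = sum((r - left_mean) ** 2 for r in left)
        error += sum((r - right_mean) ** 2 for r in right)
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def boost(xs, ys, n_rounds=50, learning_rate=0.3):
    """Add stumps one at a time, each fit to the residuals of the ensemble so far."""
    base = sum(ys) / len(ys)  # start from a constant prediction
    stumps = []
    predictions = [base] * len(ys)
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * stump(x) for p, x in zip(predictions, xs)]

    def predict(x):
        return base + learning_rate * sum(s(x) for s in stumps)

    return predict


# Toy data: a simple curve the stumps have to approximate step by step
xs = [float(i) for i in range(20)]
ys = [x ** 2 for x in xs]
model = boost(xs, ys)
print(model(10.0), 10.0 ** 2)
```

Each round corrects part of what the previous rounds got wrong, and the learning rate controls how much of each correction is kept.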

## Michael Grogan

Data Science Consultant — Expertise in time series analysis, statistics, Bayesian modeling, and machine learning with TensorFlow | michael-grogan.com
