Data Visualization Using Seaborn

Seaborn is a Python data visualization library that provides stunning and informative statistical graphics. In this article I will briefly discuss three functions used for data visualization with seaborn:

  • seaborn.jointplot()
  • seaborn.distplot()
  • seaborn.boxplot()

seaborn.jointplot()

The seaborn.jointplot() function displays the relationship between two variables (bivariate), x and y, in the main plot, and the distribution of each variable on its own (univariate) in the margins.

Univariate describes data that observes a single attribute or characteristic, while bivariate describes data that observes two attributes that are usually related. For example, the number of tweets posted in a day versus the number of engagements on those tweets is bivariate. If you observed only one of them, the data would be univariate.

seaborn.jointplot() is intended to be a fairly lightweight wrapper [1].

There are a lot more parameters (see Seaborn official documentation) available but here are some of the important ones:

  1. x, y: vectors or keys in data
    • Variables that specify positions on the x and y axes
  2. data: pandas.DataFrame, numpy.ndarray, mapping or sequence
    • Input data structure. Either long-form or wide-form
  3. kind: {“scatter”, “kde”, “hist”, “hex”, “reg”, “resid”}
    • Kind of plot to draw

Here is an example of what a standard jointplot() looks like when data is plotted: a scatterplot with marginal histograms.

seaborn jointplot

You can load a data set of your own and pass its values to the parameters x and y. In this example, to show you what a jointplot() looks like, I have assigned values generated by randn(1000) to the variables data1 and data2, then passed those variables to the parameters x and y.

You can change the kind of plot by assigning a value from {“scatter”, “kde”, “hist”, “hex”, “reg”, “resid”} in the parameter kind.


seaborn.distplot()

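The seaborn.distplot() function displays univariate data as a histogram with a kernel density estimate (KDE) curve drawn over it. Note that distplot() has been deprecated since seaborn 0.11; displot() and histplot() are the suggested replacements.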

This function combines the matplotlib hist function (with automatic calculation of a good default bin size) with the seaborn kdeplot() and rugplot() functions [2].

seaborn distplot displot

There are a lot more parameters (see Seaborn official documentation) available but for this example, I have only passed in data1 (our observed data in this example) to distplot() function to show you what it looks like.


seaborn.boxplot()

The seaborn.boxplot() function displays how the values in the data are spread out. It divides the data into four sections that each contain approximately 25% of the values.

A boxplot displays the distribution of data based on a five-number summary: minimum score, lower quartile (Q1), median, upper quartile (Q3), and maximum score [3].

  1. Minimum score: lowest score, end of the left whisker
  2. Lower quartile (Q1): the value below which about 25% of the data fall
  3. Median: mid-point of the data
    • If the median is in the middle of the box and the whiskers are the same length on both sides, then the distribution is symmetric;
    • If the median is closer to the bottom and the whisker is shorter on the lower end, then the distribution is positively skewed;
    • If the median is closer to the top and the whisker is shorter on the upper end, then the distribution is negatively skewed.
  4. Upper quartile (Q3): the value below which about 75% of the data fall
  5. Maximum score: highest score, end of the right whisker

The longer the box the more dispersed the data is, and the shorter the box the less dispersed the data is.
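The five-number summary can be worked out by hand for a small data set. The sketch below uses only Python's standard library; the scores list is a made-up example, and the "inclusive" quantile method is one of several conventions, so a plotting library may compute slightly different quartiles on other data.

```python
import statistics

# Hypothetical sample, small enough to check by hand
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# quantiles(..., n=4) returns the three cut points Q1, median, Q3.
# method="inclusive" treats the sample as the whole population.
q1, median, q3 = statistics.quantiles(scores, n=4, method="inclusive")

five_number_summary = {
    "minimum": min(scores),
    "Q1": q1,
    "median": median,
    "Q3": q3,
    "maximum": max(scores),
}

iqr = q3 - q1  # interquartile range: the length of the box

print(five_number_summary)
print("IQR:", iqr)
```

Here the median sits exactly halfway between Q1 and Q3 and the two whiskers have equal length, which is the symmetric case described in the list above. Keep in mind that by default seaborn draws the whiskers only as far as 1.5 times the IQR from the box and plots anything beyond that as individual outlier points.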

Important parameters (see Seaborn official documentation for more):

  • x, y, hue: names of variables in data or vector data, optional
  • data: DataFrame, array, or list of arrays, optional

The example output of boxplot() shows the dispersion of the CombMPG data for each Mfr Name, both taken from a DataFrame called df.



Resources:

[1] Seaborn.jointplot()

[2] Seaborn.distplot()

[3] Understanding Boxplots

Basic HTML Code Every Blogger Should Know

HyperText Markup Language (HTML) is the standard markup language responsible for creating the structure of web pages; it tells the browser how to display the content.

The purpose of this article is to teach bloggers basic HTML codes that may help them understand what is going on under the hood when they write headings, use different fonts, or insert an image on their favorite blogging platform. In this article we will be tackling the following HTML codes:

  1. Headings
  2. Paragraph
  3. <span> Tag
    • Using <span> tag to change font style
    • Using <span> tag to change font color
  4. <br> Tag (line break)
  5. <hr> Tag (horizontal rule)

html-codes-cheatsheet

Headings

Headings guide readers through an article. They help readers get an idea of what a certain section of your text is about, and they give the text a clearer structure.

The <h1> tag is usually used to define the most important text in your article, which will most likely be the title. Then, as you write your content, the <h2> or <h3> tag is used as a subheading to introduce different sections of your article.

The following are the different HTML headings you can use:

<h1>h1 heading</h1>
<h2>h2 heading</h2>
<h3>h3 heading</h3>
<h4>h4 heading</h4>
<h5>h5 heading</h5>
<h6>h6 heading</h6>

The actual output of headings h1 to h6 looks like this:

h1 heading

h2 heading

h3 heading

h4 heading

h5 heading
h6 heading

Paragraph

What is an article without content? The content of your article is written inside the <p> tag, which defines a paragraph.

There are many different ways to style a paragraph using the Cascading Style Sheets (CSS), the design sheets for your website, such as manipulating its color, font, size, space between the letters, the position of your paragraph and more.

<p>Your text here</p>

<span> Tag

The <span> tag is an inline container used to mark up part of a text, or part of a document [1]. This tag can be used in many different ways such as changing a certain word’s color on a text, font size, or font style.

Changing the font style

We will change the style of a text using Grenze Gotisch font family that was taken from Google Fonts website. I chose this particular font to make the changes more obvious.

Once you have picked a font you like on Google Fonts, embed it by copying the <link> code they provide and pasting it into the <head> tag of your HTML. This step cannot be skipped; otherwise, the font will not work.

<head>

<link href="https://fonts.googleapis.com/css2?family=Grenze+Gotisch&display=swap" rel="stylesheet">

</head>
font-family: 'Grenze Gotisch', cursive;

After embedding the <link> code, we can now use the font. Here is an example of how to use <span> to change a certain text’s font style:

<p style="text-align: center; font-size:1.5rem;">

This is the normal font style and this is how I used

<span style="font-family: 'Grenze Gotisch', cursive;">

span to change the font style to Grenze Gotisch</span></p>

The actual output looks like this:

This is the normal font style and this is how I used span to change the font style to Grenze Gotisch


Changing the font color

I enjoy choosing colors from colorhunt.co because of its available color palettes or color schemes, and it is a great open platform for color inspiration.

For this example I have chosen this color because I think it is cute and change is obvious.

This is how it was achieved:

<p> For this example I have chosen 

<span style="color: #b83b5e; font-size:1.5rem"> 
this color</span> 

because I think it is cute and change is obvious.</p>

<br> Tag

  • The <br> tag inserts a single line break;
  • The <br> tag is useful for writing addresses or poems;
  • The <br> tag is an empty tag which means that it has no end tag [2].

<p style="text-align: center;">
This is how you force<br> line breaks <br> 
in a text</p>

The actual output of the code:

This is how you force
line breaks
in a text


<hr> Tag

Horizontal rules can be used to separate the contents of your article with a line, and this can be done with <hr> tag.

<hr>

The actual output of the code:


There are ways to customize the style of the horizontal rule using CSS. Some of the available border styles are dotted and dashed, among others. Its width and height can be changed as well.



Resources:

[1] HTML <span> Tag

[2] HTML <br> Tag

[3] CSS hr border-style

Create Digital Clock Using Tkinter

digital clock using tkinter python

A Graphical User Interface (GUI) is an interface that displays objects on screen that users can interact with. It is more user-friendly than a text-based command-line interface because it uses objects such as icons, buttons, cursors, and other graphical elements to represent actions. There are many GUI toolkits that can be used with Python, such as wxPython and JPython, but for this tutorial we will be creating a GUI application using Tkinter.

Tkinter is the standard GUI library for Python. Tkinter provides a variety of common GUI elements, or widgets, such as buttons, text boxes, labels, frames, and many more that can be used to build an interface. The following widgets are available in Tkinter [1]:

Containers: frame, label frame, top level, pane window.

Buttons: button, radio button, check button (checkbox), and menu button.

Text Widgets: label, message, text.

Entry Widgets: scale, scrollbar, list box, slider, spin box, entry (single line), option menu, text (multi line), and canvas (vector and pixel graphics).

In this tutorial, we will create a simple digital clock to get the hang of using Tkinter.

 

Getting Started

import tkinter as tk
import time
from datetime import datetime

Since we are going to create a GUI application with Tkinter, we must import the tkinter module. We also import the datetime class from the datetime module, together with the time module, to work with the date and time.

Before going further, it is good practice to plan the design layout first; it will act as a blueprint as you code. This way you already know where to put the widgets in the GUI application, and no time is wasted figuring out where to place them while coding.

digital clock design layout
 

Creating the Application Main Window

x = datetime.now()

window = tk.Tk()
window.title("Digital Clock")

canvas = tk.Canvas(window, height=200, width=500)
canvas.pack()

frame = tk.Frame(window, bg='#696969')
frame.place(relx=0, rely=0, relheight=1, relwidth=1)

#insert code here

window.mainloop()

Line 1: creates a datetime object containing the current date and time. Later on, we will call .strftime() on it to create a string representing the date and time in another format

Line 3: creates the GUI application main window

Line 4: sets the window title to “Digital Clock”

Line 6: creates the canvas, setting the height and width to height = 200, width = 500

Line 7: packs the canvas into the window

Line 9: creates the frame setting the background color to #696969

Line 10: places the frame in a specific position in the parent widget

  • relheight, relwidth − Height and width as a float between 0.0 and 1.0, as a fraction of the height and width of the parent widget [2]
  • relx, rely − Horizontal and vertical offset as a float between 0.0 and 1.0, as a fraction of the width and height of the parent widget [2]

Line 14: mainloop() method executes when GUI application is run, waiting for events from user

 

Inserting Label Widget

#Displays the 24-hour clock 00:00 
clock = tk.Label(frame, fg="#8FBC8F", bg='#696969', font="Verdana 110", anchor="nw")
clock.place(relx=0.05, rely=0.15, relheight=0.6, relwidth=0.7)

#Displays the seconds in clock
second = tk.Label(frame, fg="#8FBC8F", bg='#696969', font="Verdana 30", anchor="nw")
second.place(relx=0.7, rely=0.55, relheight=0.3, relwidth=0.1)

#Label for month
month = tk.Label(frame, fg='#BDB76B', bg='#696969', text="MONTH", font="Verdana 15")
month.place(relx=0.790, rely=0.1, relheight=0.15, relwidth=0.2)

#Displays month name, short version (e.g. FEB)
b = tk.Label(frame, fg='#8FBC8F', bg='#696969', text=x.strftime("%b"), font="Verdana 25 bold")
b.place(relx=0.790, rely=0.230, relheight=0.15, relwidth=0.2)

#Label for date
date = tk.Label(frame, fg='#BDB76B', bg='#696969', text="DATE", font="Verdana 15")
date.place(relx=0.790, rely=0.380, relheight=0.15, relwidth=0.2)

#Displays day of month 
d = tk.Label(frame, fg='#8FBC8F', bg='#696969', text=x.strftime("%d"), font="Verdana 25 bold")
d.place(relx=0.790, rely=0.51, relheight=0.15, relwidth=0.2)

#Label for weekday
day = tk.Label(frame, fg='#BDB76B', bg='#696969', text="DAY", font="Verdana 15")
day.place(relx=0.790, rely=0.650, relheight=0.15, relwidth=0.2)

#Displays weekday, short version (e.g. Wed)
a = tk.Label(frame, fg='#8FBC8F', bg='#696969', text=x.strftime("%a"), font="Verdana 25 bold")
a.place(relx=0.790, rely=0.77, relheight=0.15, relwidth=0.2)

The Label widget in Tkinter is used to display text or an image on the screen. The label widget uses double buffering, so you can update the contents at any time, without annoying flicker [3].

Label(master=None, **options)

master refers to the parent widget. In our case the master is the frame.

**options refers to the widget options, passed as keyword arguments.

One of the widget options we have used is text, which sets the text displayed in the label. Notice that on lines 14, 22, and 30 the text option is set to x.strftime("%b"), x.strftime("%d"), and x.strftime("%a"), which display the month, day of month, and weekday respectively in our GUI application.
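To see what these format codes produce without opening a window, here is a small sketch that applies them to a fixed timestamp. The date used is an arbitrary, made-up value chosen only so the result does not depend on when the code is run; the month and weekday names also depend on the system locale.

```python
from datetime import datetime

# Fixed, hypothetical timestamp instead of datetime.now()
x = datetime(2024, 2, 14, 9, 5, 7)

month_short = x.strftime("%b")    # abbreviated month name
day_of_month = x.strftime("%d")   # zero-padded day of the month
weekday_short = x.strftime("%a")  # abbreviated weekday name
hour_min = x.strftime("%H:%M")    # hours and minutes on a 24-hour clock
seconds = x.strftime("%S")        # zero-padded seconds

print(month_short, day_of_month, weekday_short, hour_min, seconds)
```

The "%H:%M" and "%S" codes are the ones the clock functions in the next section use; the time module's strftime() accepts the same codes.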

 

Adding Functions

def get_time():
    hour_min = time.strftime("%H:%M")
    clock.config(text=hour_min)
    clock.after(200, get_time)

'''
clock = tk.Label(frame, fg="#8FBC8F", bg='#696969', font="Verdana 110", anchor="nw")
clock.place(relx=0.05, rely=0.15, relheight=0.6, relwidth=0.7)
'''

get_time()


def get_second():
    sec = time.strftime("%S")
    second.config(text=sec)
    second.after(200, get_second)

'''
second = tk.Label(frame, fg="#8FBC8F", bg='#696969', font="Verdana 30", anchor="nw")
second.place(relx=0.7, rely=0.55, relheight=0.3, relwidth=0.1)
'''

get_second()

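The get_time() and get_second() functions display the hours and minutes, and the seconds, on their respective labels. Each function reschedules itself with after(200, ...), so the labels are refreshed every 200 milliseconds.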

 

Resources:

[1] Tkinter (Wikipedia)

[2] Python – Tkinter place() method

[3] The Tkinter Label Widget

Machine Learning: House Price Prediction Using Linear Regression

Teaching helps me understand and retain new material. As part of my #100DaysOfCode challenge on Twitter, I am learning Machine Learning algorithms, and I will share what I learn on this blog in the hope of helping fellow beginners in this subject.

 

Linear Regression is an approach to modeling the relationship between two variables by fitting a straight line to the observed data. It is a basic and commonly used type of predictive analysis for a continuous dependent variable using a given set of independent variables. Thus, Linear Regression is used for solving regression problems.

For better understanding of the definition:

Regression – statistical method that determines the strength and character of the relationship between a dependent variable and independent variable;

Continuous Variable – can take on unlimited number of values between its minimum and maximum value (e.g. price, salary, length, etc.)

An example of a relationship between an independent variable and dependent variable is shown below:

linear regression machine learning

This is simple bivariate data (data involving two variables) showing the time between two eruptions and the duration of the second eruption for 10 eruptions of the geyser Old Faithful, with y (dependent variable) being the duration of the eruption and x (independent variable) being the time between eruptions:

#x = Time between eruptions (in seconds)
#y = Duration of eruption (in seconds)

x = [272, 227, 237, 238, 203, 270, 218, 226, 250, 245]
y = [89, 79, 83, 82, 81, 85, 78, 81, 85, 79]

The regression line can be represented by the equation: y = mx + b

Where y and x are the variables describing a specific point on the graph, m is the slope of the line, and b the y-intercept describing where the line crosses the y-axis.

We calculate the correlation coefficient r to check whether there is a linear relationship between the two variables, because without one, linear regression is of little use for prediction. The value of r ranges from -1 to 1: 0 means there is no linear relationship, while values close to -1 or 1 mean a strong one. R-squared, the square of r, ranges from 0 to 1.

For the example above, r is about 0.76 (an R-squared of about 0.58), which shows that there is a relationship between the time between eruptions and the duration of the eruption, albeit not a perfect one.
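The figure of 0.76 can be checked directly from the ten data points. The sketch below uses plain Python and the textbook least-squares formulas rather than any particular library; on this data it gives a correlation coefficient r of about 0.76, whose square (R-squared) is about 0.58.

```python
from math import sqrt

# Data from the Old Faithful example above
x = [272, 227, 237, 238, 203, 270, 218, 226, 250, 245]
y = [89, 79, 83, 82, 81, 85, 78, 81, 85, 79]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Sums of squares and cross-products around the means
s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
s_xx = sum((xi - mean_x) ** 2 for xi in x)
s_yy = sum((yi - mean_y) ** 2 for yi in y)

m = s_xy / s_xx                # slope of the least-squares line
b = mean_y - m * mean_x        # y-intercept
r = s_xy / sqrt(s_xx * s_yy)   # Pearson correlation coefficient
r_squared = r ** 2             # share of the variance in y explained by x

print(f"y = {m:.3f}x + {b:.2f}")
print(f"r = {r:.2f}, R-squared = {r_squared:.2f}")
```

The values of m and b are the slope and intercept of the regression line y = mx + b for this data.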

 

Getting Started

The dataset that I will be using in this tutorial is from GitHub user huzaifsayed’s USA_Housing, and I will be using a Google Colaboratory notebook to write and execute the code.

 

Importing Dataset

import pandas as pd

url = 'https://raw.githubusercontent.com/huzaifsayed/Linear-Regression-Model-for-House-Price-Prediction/master/USA_Housing.csv'
dataset = pd.read_csv(url)

There are different ways to load a CSV file in a Google Colaboratory notebook; one of the easiest is to load it from a GitHub repository. Copy the link to the raw dataset and store it in a variable, then pass that variable to pandas read_csv() to get the DataFrame [1].

 
dataset.info()
dataset info linear regression machine learning

The pandas DataFrame.info() method prints a summary of the DataFrame, which is useful when analyzing the data. As we are using linear regression, we will not include the ‘Address’ column, because it holds text (dtype object) that will not be useful in our linear regression model:

X = dataset[['Avg. Area Income', 'Avg. Area House Age', 'Avg. Area Number of Rooms',
       'Avg. Area Number of Bedrooms', 'Area Population']]
y = dataset['Price']

 

Split Dataset into Train and Test

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=101)

We will use the train_test_split() function from sklearn's model_selection module, which splits arrays or matrices into random train and test subsets. (Documentation)

test_size – If float, should be between 0.0 and 1.0 and represent the proportion of the dataset to include in the test split. If int, represents the absolute number of test samples;

random_state – Controls the shuffling applied to the data before applying the split. Pass an int for reproducible output across multiple function calls. 
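To make the roles of test_size and random_state concrete, here is a rough pure-Python sketch of the idea behind a seeded random split. The simple_split() helper is hypothetical and is not how scikit-learn implements train_test_split(); it only illustrates that the data is shuffled with a fixed seed and then cut into two parts.

```python
import random


def simple_split(rows, test_size=0.4, random_state=101):
    """Hypothetical helper: shuffle a copy of rows, then cut it in two."""
    shuffled = list(rows)
    random.Random(random_state).shuffle(shuffled)  # seeded, so repeatable
    n_test = round(len(shuffled) * test_size)      # number of test samples
    return shuffled[n_test:], shuffled[:n_test]    # (train, test)


rows = list(range(10))
train, test = simple_split(rows)

print(len(train), len(test))                # 60% train, 40% test
print(simple_split(rows) == (train, test))  # same seed, same split
```

With ten rows and test_size=0.4, four rows end up in the test set and six in the training set, and calling the helper again with the same random_state reproduces exactly the same split.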

 

Training the Linear Regression Model

from sklearn.linear_model import LinearRegression

linmodel = LinearRegression()

#training the model using training set
linmodel.fit(X_train, y_train)

Create a linear regression object and store it in a variable called linmodel, then train the model using the training set X_train and y_train. (Documentation)

Output:

training linear regression machine learning output
 

Prediction from Linear Regression Model

import matplotlib.pyplot as plt

predictions = linmodel.predict(X_test)  
plt.scatter(y_test, predictions)

We will now use the trained model to predict the outcome of the test set. To visualize the result, we plot the data points of y_test against predictions.

The prediction from the test set:

prediction from linear regression model
 

Other real-life applications of Linear Regression include predicting one’s salary based on experience, predicting the gas money needed for a road trip based on miles driven, predicting product sales based on past buying behavior, or even predicting the economic growth of a country or state. It is a simple yet very useful algorithm for predictions.

Full code of what I have done // Full code from the original

 

References:

[1] Get Started: 3 Ways to Load CSV files into Colab

[2] Linear Regression Machine Learning Project for House Price Prediction

[3] Machine Learning Project 1: Predict Salary using Simple Linear Regression

Creating a Text-Based Game with Python

It takes some time for a novice programmer, such as myself, to learn the ins and outs of a new programming language. One of the quickest and most fun ways to understand the basics of a new language is to create a simple program, in order to gain a little experience and to showcase some of the important concepts of the language [1].

This, I believe, is the next step in getting acquainted and comfortable with a new language after successfully executing a much simpler program, “Hello World!”.

Inspired by BitLife – Life Simulator, Path of Adventure Text-based roguelike, and many other similar games, I wanted to make a simple text-based game that provides the user some choices that will lead to different scenarios.

   

Getting Started

I will apply the following concepts to build this simple text-based game:

  • Python If…Else Statement – Executes a block of code if the condition is true; otherwise, another block of code can be executed;
  • Python Functions – Block of code that only runs when called;
  • Python User Input – Asks user for input;
  • Python Time Module, time.sleep() – Adds delay in the execution of a program.

 

Organizing Ideas

It is important to organize ideas before coding. There are various techniques for this, such as writing pseudocode and creating flowcharts, but I will be focusing on the latter.

Creating a flowchart helps to visualize and understand the workflow of a program a little more easily. A lot of novice programmers and non-technical people can be intimidated by structured text, so a flowchart provides a visual aid that makes it more understandable and less intimidating to look at.

Moreover, a flowchart helps reduce unnecessary code, as it lists each step necessary to solve a given problem.

 

Text-Based Game Code Snippet

import time


def start():
    print("It took a few moments to realize that you\n"
          "weren't inside the comfort of your home but\n"
          "you were staring up at rows of darkened treetops.\n"
          "Anxiety sets in as you realize you weren't\n"
          "supposed to be there. Suddenly you hear a growl\n"
          "behind. You: \n")
    time.sleep(3)
    print("A. Turn towards the source\n"
          "B. Get up and run\n"
          "C. Lie down and accept the inevitable")

    choice = input("Choice: ")

    if choice == 'a' or choice == 'A':
        turn()
    elif choice == 'b' or choice == 'B':
        run()
    elif choice == 'c' or choice == 'C':
        over()

These are the first 23 lines of code for this simple text-based game; the rest is similar to this snippet.

The start() function prints two statements containing the beginning of the story and the available choices. I put a 3-second delay between them using time.sleep() from the time module, so as not to overwhelm the user and to give them time to read. After 3 seconds, the second statement prints, followed by the input prompt.

If…Else statements are used to make the code execute different tasks depending on the choice made by the user. If the user inputs ‘a’ or ‘A’, the code calls turn(); if the user inputs ‘b’ or ‘B’, it calls run(); and if the user inputs ‘c’ or ‘C’, it calls over().
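One thing to notice is that start() does nothing when the user types anything other than A, B, or C, so the game simply ends. A possible refinement, sketched below, is to normalize the input and keep asking until it is valid. The normalize_choice() and ask() helpers are hypothetical names introduced for this illustration and are not part of the original game.

```python
def normalize_choice(raw, valid="abc"):
    """Return the choice as a lowercase letter, or None if it is not valid."""
    choice = raw.strip().lower()
    if len(choice) == 1 and choice in valid:
        return choice
    return None


def ask(prompt="Choice: ", valid="abc"):
    """Keep asking until the player types one of the valid letters."""
    while True:
        choice = normalize_choice(input(prompt), valid)
        if choice is not None:
            return choice
        print("Please type one of: " + ", ".join(valid.upper()))


# normalize_choice() can be checked without any keyboard input
print(normalize_choice(" A "))  # surrounding spaces and upper case are accepted
print(normalize_choice("d"))    # not one of the offered choices
```

Because the letter is lowercased once, each branch could then compare against a single value, for example choice == 'a' instead of choice == 'a' or choice == 'A'.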

Output:

create a text based game python
 
def over():
    print("\nYou died. Wasn't exactly a wise decision.\n"
          "Do you want to play again?\n"
          "Press Y for yes\n"
          "Press N for no\n")

    choice = input("Choice: ")
    if choice == 'y' or choice == 'Y':
        start()
    else:
        print("Thanks for playing!")
        exit()

This is the over() function, which only executes when the user makes a choice that leads to this scenario.

Here I give the user the choice of whether to play again. If the user inputs ‘y’ or ‘Y’, start() is called and the game starts again from the beginning; if the user inputs ‘n’ or ‘N’ (or any other key), the program prints “Thanks for playing!” and stops.

Output:

create text based game python
 

Creating a simple text-based game with Python only scratches the surface of what can be achieved with this language, such as game development, machine learning and artificial intelligence, data science, etc. A simple text-based game is certainly a good starting point that covers a few important concepts of not only Python but other languages as well.

  

Resources:

[1] How to Learn a New Programming Language Fast

[2] When would I use pseudocode instead of flowchart

[3] Python Tutorial w3schools