Cloud Powering DH Research: References

Key Points

Introduction
  • Cloud computing is very flexible and has many diverse uses.

  • Setup of Compute Canada cloud environments is left to its users.

  • In this course we will set up a cloud environment to run WordPress.

  • We will see methods that ease the setup of cloud environments.

How the Internet works
  • Computers often have an IP address on a LAN and connect to the Internet through a router.

  • You can see what your computer’s IP address is by using the ipconfig command on Windows or the ifconfig command on Linux or Mac.

  • You can see your computer’s public IP using the site ipv4.icanhazip.com.

  • Ports allow computers and routers to differentiate between types of network traffic.
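
As a rough illustration of the difference between a LAN address and a public address, the sketch below classifies an IPv4 address as private or public. The function name is_private_ip is hypothetical, the ranges checked are the standard private ranges (10.x.x.x, 172.16.x.x through 172.31.x.x, and 192.168.x.x), and the sample addresses are made up for illustration.

```shell
# Hypothetical helper: succeeds when the IPv4 address given as $1
# falls in one of the standard private (LAN) address ranges.
is_private_ip() {
    case "$1" in
        10.*|192.168.*) return 0 ;;
        172.1[6-9].*|172.2[0-9].*|172.3[0-1].*) return 0 ;;
        *) return 1 ;;
    esac
}

# Illustrative addresses only, not taken from any real machine.
for ip in 192.168.1.20 172.20.5.4 203.0.113.10; do
    if is_private_ip "$ip"; then
        echo "$ip is a private (LAN) address"
    else
        echo "$ip is a public address"
    fi
done
```

An address reported by ifconfig will usually be private, while the one reported by ipv4.icanhazip.com is the public address of your router.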

Introduction to cloud computing
  • Elasticity refers to the ability to scale devices up or down to meet demand.

  • A Virtual Machine or Virtual Device is simulated with software running on physical hardware.

  • A cloud allows one to borrow or rent virtual devices on-demand.

  • Infrastructure as a Service (IaaS): the service provider provides you with the ability to create and manage virtual devices. You have complete control over VM configuration.

  • Platform as a Service (PaaS): the service provider provides you with an environment to build and set up your software.

  • Software as a Service (SaaS): the service provider provides the software and all the infrastructure and operating system configuration and management required to run the software (e.g. Gmail).

  • OpenStack is a cloud operating system which allows you to manage your virtual devices.

Creating a keypair
  • A shell is a text-based method for interacting with a computer.

  • SSH is a Secure SHell that allows remote interaction with a computer.

  • An SSH key pair allows a user to be authenticated on a remote computer.

  • The Linux filesystem is a tree with / at the root and directories creating branches.

  • The cd command is used to change directories.

  • The pwd command is used to display the current working directory.

  • The ls command is used to list the contents of a directory.

  • The ssh-keygen command is used to create key pairs.

  • The cat command is used to print the contents of a file to the terminal.

  • The chmod command is used to change the file mode or permissions.

  • The private key, id_rsa, must only be readable and writable by the file’s owner.
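
The commands above can be tried together in a throwaway directory. This is a minimal sketch, assuming ssh-keygen is installed; the empty passphrase and the temporary directory are for demonstration only, and the placeholder files in the fallback branch are not real keys.

```shell
# Work in a throwaway directory so no real keys in ~/.ssh are touched.
workdir=$(mktemp -d)
cd "$workdir"
pwd                      # display the current working directory

# Create a key pair named id_rsa / id_rsa.pub with an empty passphrase
# (demonstration only). If ssh-keygen cannot run, use placeholder files.
if ! ssh-keygen -q -t rsa -b 2048 -N "" -f "$workdir/id_rsa" 2>/dev/null; then
    echo "placeholder private key" > id_rsa
    echo "placeholder public key" > id_rsa.pub
fi

ls                       # lists id_rsa and id_rsa.pub
cat id_rsa.pub           # print the public key to the terminal

# Private key: readable and writable by the file's owner only.
chmod 600 id_rsa
key_perms=$(ls -l id_rsa | cut -c1-10)
echo "$key_perms"        # -rw-------
```

For real use you would normally let ssh-keygen write to its default location and protect the private key with a passphrase.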

Creating a virtual machine
  • The flavor of a VM prescribes the hardware profile of the VM.

  • A boot source specifies from what the VM should boot.

  • A public key must be injected into the VM in order to connect to it.

  • A floating IP must be added to a VM to connect to it from outside the local network in the cloud.

  • Port 22 must be opened in the security rules to allow SSH to connect to the VM.

  • A security group controls which ports allow traffic in and out.

Creating a persistent virtual machine
  • p flavors or persistent flavors can have their VCPUs oversubscribed by up to 8 times.

  • A volume is a virtual hard drive allowing its contents to persist from one VM to the next.

  • p flavors should typically boot from a volume.

  • A p flavor not booting from a volume will have an unusably small root disk.

Applying updates
  • Use sudo apt update to update the package list.

  • Use sudo apt upgrade to upgrade packages.

  • Reboot after updates have been installed.

  • You may need to repeat the apt update, apt upgrade, reboot process a few times to ensure all updates have been applied.
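
One way to decide whether another pass of the cycle is needed is to look for a pending-reboot marker. The sketch below assumes the marker file Ubuntu normally writes at /var/run/reboot-required; that path is an assumption about your image, and the function name reboot_needed is hypothetical.

```shell
# Hypothetical helper: succeeds when the marker file exists.
# With no argument it checks the path where Ubuntu normally records
# a pending reboot (an assumption about the image in use).
reboot_needed() {
    [ -f "${1:-/var/run/reboot-required}" ]
}

# The update cycle from the key points, shown as comments because it
# needs administrative rights and a network connection:
#   sudo apt update
#   sudo apt upgrade
#   sudo reboot

if reboot_needed; then
    echo "Reboot, then run apt update and apt upgrade again."
else
    echo "No reboot is pending."
fi
```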

Creating a web server
  • Use apt search to find a specific package name.

  • Back up configuration files before making changes.

  • The apache2ctl command is used to manage the Apache web service.

  • The systemctl command is used to manage system services.
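
A backup can be as simple as a copy kept next to the original. The sketch below uses a temporary stand-in file so it runs anywhere; the function name backup_config is hypothetical, and the Apache path mentioned in the comments is the usual Ubuntu location rather than something confirmed by this lesson.

```shell
# Hypothetical helper: copy a configuration file to <name>.bak,
# preserving its mode and timestamps, before it is edited.
backup_config() {
    cp -p "$1" "$1.bak"
}

# Stand-in file; on an Ubuntu web server the real target would be
# something like /etc/apache2/apache2.conf and the copy would need sudo.
conf=$(mktemp)
echo "ServerName example.org" > "$conf"
backup_config "$conf"
ls "$conf" "$conf.bak"

# After editing the real file, check it and restart the service:
#   sudo apache2ctl configtest
#   sudo systemctl restart apache2
```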

Installing MySQL
  • MySQL also has a ‘root’ account that is used to manage the MySQL server. This is different from the operating system ‘root’ account used to manage the operating system.

  • The mysql command allows you to view and modify MySQL databases.

  • The SHOW DATABASES command shows the available databases in your MySQL server.

  • The USE command switches which database you are actively working with.

  • The SHOW TABLES command shows the tables in the active database.

  • The DESCRIBE <table> command displays data columns present in the given table.

  • The SELECT command displays rows from a table.

  • The UPDATE command is used to modify rows in a table.

  • The EXIT command exits the mysql program.
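
The commands above are normally typed one at a time at the mysql prompt. As a sketch, the block below writes the same sequence to a file so it can be read as a whole; nothing is sent to a server. The database name wordpress, the table wp_options, and the new site title are assumptions chosen for illustration.

```shell
# Write an example session to a file; nothing is sent to a server here.
session=$(mktemp)
cat > "$session" <<'EOF'
SHOW DATABASES;
USE wordpress;
SHOW TABLES;
DESCRIBE wp_options;
SELECT option_name, option_value FROM wp_options WHERE option_name = 'blogname';
UPDATE wp_options SET option_value = 'My DH Site' WHERE option_name = 'blogname';
EXIT
EOF
cat "$session"

# To run it against a real server as the MySQL root account:
#   mysql -u root -p < "$session"
```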

Installing PHP
Creating a Self-Signed SSL Certificate
Creating a WordPress site
Installing OpenStack CL client
  • The CL client can be used to manage your OpenStack project from any computer connected to the Internet.

  • The OpenStack RC file provides settings to connect the CL client with your cloud project.

  • The source command is used to apply settings in a file to your shell environment.
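
To see what applying an RC file does, the sketch below builds a small stand-in file and loads it into the current shell. The variable names are ones a real OpenStack RC file sets, but every value here is a made-up placeholder; a real RC file is downloaded from your cloud project and contains more settings.

```shell
# Stand-in for a downloaded RC file; all values are placeholders.
rcfile=$(mktemp)
cat > "$rcfile" <<'EOF'
export OS_AUTH_URL="https://cloud.example.org:5000/v3"
export OS_PROJECT_NAME="dh-project"
export OS_USERNAME="student"
EOF

# Apply the settings to the current shell. "source" is the Bash
# spelling; "." is the portable equivalent used here.
. "$rcfile"
echo "Project: $OS_PROJECT_NAME"   # Project: dh-project

# With a real RC file applied, a quick connection check would be:
#   openstack server list
```

Because the settings live only in the shell where the file was applied, they must be applied again in each new terminal.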

Using the OpenStack CL client
  • openstack help shows the list of available commands.

  • openstack help <command-group> shows the list of sub commands matching <command-group>.

  • openstack help <command-group> <command> shows the help text for <command>.

  • It is important to keep track of the volume disk type.

Automating with cloud-init
  • User data provided to a VM can be either a cloud-config or script file.

  • User data can be set using the --user-data option on the command line or using the Post-Creation tab when launching a VM within Horizon.

  • Cloud-init can be used to automate the initial installation and configuration of software.

  • Cloud-init runs once after the first boot of a newly created VM.
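
As a sketch of the cloud-config form of user data, the block below writes a small file that asks cloud-init to apply updates and install Apache. The package choice is an assumption based on this course's web server, and the launch command in the comments is incomplete: the image, key, and server names are placeholders, and your cloud may require further options such as a network or boot volume.

```shell
# Write a small cloud-config file. The first line must be "#cloud-config"
# so cloud-init treats the file as cloud-config rather than a script.
userdata=$(mktemp)
cat > "$userdata" <<'EOF'
#cloud-config
package_update: true
package_upgrade: true
packages:
  - apache2
EOF
head -n 1 "$userdata"    # #cloud-config

# Passing it at launch (placeholders in angle brackets):
#   openstack server create --flavor p1-1.5gb --image <image> \
#       --key-name <key> --user-data "$userdata" <server-name>
```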

Bash Scripting
  • A for loop executes commands multiple times while changing the loop variable each iteration.

  • A stream is a sequence of data elements which are made available over time.

  • The echo command sends a string to a stream.

  • The head command displays the beginning of a file or stream.

  • Pipes | are used to pipe the output of one command to the input of the next.

  • tr translates characters in one set to corresponding characters in another set.

  • Redirects (< and >) are used to redirect input and output from and to files or streams.

  • sed is a stream editor which can be used to replace one string with another in a stream.
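
The pieces above can be combined in a few lines. This is a minimal sketch; the file name sites.txt and the strings it contains are made up for illustration.

```shell
workdir=$(mktemp -d)

# for loop: run echo once per value of n; ">" redirects the loop's
# output stream into a file.
for n in 1 2 3; do
    echo "site-$n"
done > "$workdir/sites.txt"

# head: show only the beginning (here the first line) of the file.
first=$(head -n 1 "$workdir/sites.txt")
echo "$first"            # site-1

# pipe + tr: send echo's output to tr, which maps lower case to upper case.
upper=$(echo "$first" | tr '[:lower:]' '[:upper:]')
echo "$upper"            # SITE-1

# "<" redirects the file into sed's input; sed replaces "site" with "blog".
renamed=$(sed 's/site/blog/' < "$workdir/sites.txt")
echo "$renamed"          # blog-1, blog-2, blog-3 on separate lines
```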

YAML
  • YAML is a format to store data in a way that is human readable.

  • A YAML file can be validated using yamllint.

  • White space is important in YAML.

  • Indentation indicates scope.

  • Block notation (indicated with a |) is used to preserve newline characters.
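
The following sketch writes a small YAML file that shows indentation used for scope and block notation used for multi-line text. The keys and values are invented for illustration, and the validation step only runs if yamllint happens to be installed.

```shell
yamlfile=$(mktemp)
cat > "$yamlfile" <<'EOF'
---
site:
  name: dh-blog
  ports:
    - 80
    - 443
  welcome: |
    Hello.
    This text keeps its line breaks.
EOF
cat "$yamlfile"

# Validate only if yamllint is available on this machine.
if command -v yamllint >/dev/null 2>&1; then
    yamllint "$yamlfile"
fi
```

Here name, ports, and welcome belong to site because they are indented beneath it, and the two lines after the | are stored with their newline characters intact.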

Cloud-config
Using Heat Orchestration Templates (HOT)
Creating HOTs
Your Project Part I
Your Project Part II
Your Project Part III

Glossary

apt
is a command which provides an interface to the Ubuntu package management system. Commonly used sub-commands are update for updating the package list and upgrade for upgrading already installed packages to the newest version. See Ubuntu manual page for apt for more details.
bash
is a replacement for the earlier Bourne shell and is the default shell for most Linux distributions. See also shell for a more general description of shells.
boot
or startup of a computer involves loading files and starting programs which are contained on a boot source.
boot source
what the virtual machine boots from. Examples of boot sources are volumes and images.
cat
a command to concatenate files and print on the standard output.
cd
a command to change directories.
chmod
a command to change file mode bits or permissions.
CIDR
stands for Classless Inter-Domain Routing which can be used for specifying ranges of IP addresses.
cloud computing
A computing paradigm that enables access to shared pools of configurable computing resources.
command
a series of characters entered in a shell indicating an action you would like the operating system to perform.
CPU
or central processing unit, is the electronic circuitry within a computer that carries out the instructions of a computer program.
CPU oversubscription
is when one physical CPU runs two or more VCPUs. In this case the one real physical CPU will switch back and forth between running tasks for the two or more VCPUs.
compute flavor
is a virtual machine flavor which is configured for short temporary usage. Because data safety is often less of a concern they are designed with a 20 GB root ephemeral disk and often have an extra ephemeral data disk attached. An example of a compute flavor name is c1-7.5gb-30 which has 1 VCPU, 7.5 GB of RAM, and a 30 GB extra ephemeral data disk in addition to the 20 GB ephemeral root disk.
computer network
is a digital telecommunications network which allows nodes in the network to share resources and exchange data.
decryption
the process of transforming an encrypted message into its original form before the encryption took place.
DNS
or Domain Name System is the system that matches domain names to IP addresses. A domain name server is a computer that performs this matching.
domain name
is an identification string that defines a realm of administrative autonomy, authority or control. In general, a domain name represents an IP resource.
encryption
a process of transforming a message into one which is only readable by authorized parties.
elasticity
The ability to quickly change the amount of resources being used based on demand.
ephemeral disk
a virtual disk residing on the physical node or hypervisor which runs the virtual machine. Ephemeral disks, as the name suggests, do not outlive their virtual machine: when the virtual machine is terminated or deleted, the disk is also deleted.
flavor
defines the virtual hardware specifications of a virtual machine.
floating ip
is an IP address which is publicly addressable from the Internet and which can be moved between virtual machines. Also referred to as a public IP.
FQDN
is a domain name that is completely specified with all labels in the hierarchy of the domain name system.
git
git is a free, open-source source code management tool. It keeps versioned snapshots of your code and easily displays the differences from one snapshot to the next.
github
github.com provides a web platform for hosting git version control repositories.
hardware virtualization
sometimes referred to simply as virtualization, is the presentation of simulated hardware by software. For example, this virtual hardware could be routers, computers, or disk drives.
hostname
is a label that is assigned to a device connected to a computer network.
hypervisor
is computer software or hardware that creates and runs virtual machines.
IaaS
Infrastructure as a Service is a service which provides computing infrastructure, often through use of cloud computing.
image
an image is a file which contains the contents of a virtual drive or volume. Images are, however, more portable than volumes as they can be downloaded and uploaded to various clouds and used with software such as VirtualBox.
instance
see virtual machine
IP address
Internet Protocol address is a numerical label assigned to each device connected to a computer network.
key pair
a pair of cryptographic keys used in asymmetric encryption. Asymmetric encryption uses one key to encrypt a message, in this case a public key, and a different key, in this case a private key, to decrypt it.
LAN
local area network is a computer network that interconnects computers within a limited area such as a residence, school, laboratory, university campus or office building.
Linux
is a family of free and open-source software operating systems.
ls
a command to list the file system structure in the bash shell and many other common shells.
node
often refers to a computer within a computer network.
OpenStack
is open source software for creating clouds. See cloud computing
operating system
software which runs on a computer to manage computer hardware, data and common services for computer programs.
PaaS
Platform as a Service
persistent flavor
is a virtual machine flavor which is configured for long running or persisting virtual machines. These machines are typically for web servers and may spend substantial portions of their time not doing anything. As such they may have their VCPUs oversubscribed by up to a factor of 8. They are also meant to boot from a volume for added robustness. An example of a persistent flavor name is p1-1.5gb, indicating a virtual machine with 1 VCPU and 1.5 GB of RAM.
port
identifies a specific process or a type of network service.
port-forwarding
also referred to as port mapping, redirects communication requests from one address and port number to another while the data is traversing a network gateway or router.
private IP
an IP address which is assigned to a device on a LAN and is only accessible from within the LAN. Private IP addresses often have the form 192.168.XXX.YYY.
private key
is a key which is part of a key pair, is intended to be kept private, and is used to decrypt messages encrypted by the public key.
prompt
a set of characters presented in a shell to indicate it is waiting for a command.
public key
is a key which is part of a key pair which is intended to be distributed publicly and is used to encrypt messages to be decrypted by the private key.
public IP
see floating IP
pwd
a command to print the current working directory in the bash shell and many other common shells.
RAM
or random-access memory is a form of computer data storage that stores data for quick access by the CPU.
reboot
is the act of shutting down and then booting an already running computer. It is also a Linux command which can be issued to cause a computer to reboot.
root
can refer to the root of a file system, the root drive (which contains the root of the file system), or a root or administrative user.
router
is a networking device that forwards data between computer networks, for example between a WAN and a LAN.
SaaS
Software as a Service
scalability
The ability to increase resources as needed.
security group
is a set of rules indicating how traffic can flow into and out of the virtual machines which are members of a security group.
security rule
is a rule for a particular port or range of ports dictating what IP address, range of IP addresses, or which security group are allowed to send or receive data.
shell
or more specifically a command-line interface, is a user interface for interacting with an operating system by typing commands. A common shell is the Bash shell. Sometimes the words terminal and shell are used interchangeably, but the shell defines which commands are used while a terminal is a means of interfacing with a shell, and different shells can be used within a single terminal.
SSH
is a cryptographic network protocol for operating network services securely over an unsecured network, commonly used for remote command execution in a shell. It uses key pairs for authentication.
ssh-keygen
a command for creating key pairs.
static website
a static website is a web page that is delivered to the user exactly as stored, in contrast to dynamic websites.
sudo
is a Linux command which runs the command following it, supplied as an argument, as the root user or administrative user. See Linux man page for sudo for more details.
terminal
is a program for entering and displaying text, see also shell.
Ubuntu
is an operating system in the Linux family. Ubuntu is one of the more popular Linux based operating systems and is widely used in cloud environments. See the official Ubuntu page for more details.
VCPU
is a virtual CPU.
virtual device
is an emulation of a real physical device usually through means of virtualization software.
virtual machine
is a virtual device emulating a computer system which provides the functionality of a physical computer. A virtual machine runs on a real underlying computer.
VirtualBox
a software tool for creating and running virtual machines. See VirtualBox website for more details.
virtualization
see hardware virtualization
volume
a volume is a virtual disk drive that can be attached to a virtual computer as you would a real drive to a real computer.
WAN
wide area network is a computer network that extends over a large geographical distance. The Internet may be considered a WAN.